[
https://issues.apache.org/jira/browse/IMPALA-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203666#comment-17203666
]
Fifteen edited comment on IMPALA-10193 at 9/29/20, 6:10 AM:
------------------------------------------------------------
[~tarmstrong] Yeah, actually I have fixed it locally and it works fine, but I
am not sure whether there are any factors I have overlooked. To address the
problem of the wrong memory limit in the startup options,
here is my fix:
# Add a new environment variable *MAX_MEM_GB*; it denotes the maximum memory
available for both the mini-cluster and the CDH cluster.
# When starting 'impalad', the algorithm takes *MAX_MEM_GB* rather than
*sys_mem* into account.
# When starting the 'yarn node manager', *MAX_MEM_GB* similarly substitutes for
*sys_mem*.
Implementation:
1. In file 'bin/impala-config.sh', I added a new environment variable:
{code:java}
# Maximum memory available for mini-cluster and CDH cluster
export MAX_MEM_GB=28
{code}
2. In file 'bin/start-impala-cluster.py', I made the local variable
`available_mem` equal to MAX_MEM_GB if it is set; otherwise it falls back to
`sys_mem`, so as not to change the default routine. The final mem_limit remains
0.7 * available_mem / cluster_size (capped at 12GB) in this case.
{code:python}
def compute_impalad_mem_limit(cluster_size):
  # Set mem_limit of each impalad to the smaller of 12GB or
  # 1/cluster_size (typically 1/3) of 70% of available memory.
  #
  # The default memory limit for an impalad is 80% of the total system memory. On a
  # mini-cluster with 3 impalads that means 240%. Since having an impalad be OOM killed
  # is very annoying, the mem limit will be reduced. This can be overridden using the
  # --impalad_args flag. virtual_memory().total returns the total physical memory.
  # The exact ratio to use is somewhat arbitrary. Peak memory usage during
  # tests depends on the concurrency of parallel tests as well as their ordering.
  # On the other hand, to avoid using too much memory, we limit the
  # memory choice here to max out at 12GB. This should be sufficient for tests.
  #
  # Beware that ASAN builds use more memory than regular builds.
  physical_mem_gb = psutil.virtual_memory().total / 1024 / 1024 / 1024
  available_mem = int(os.getenv("MAX_MEM_GB", str(physical_mem_gb)))
  mem_limit = int(0.7 * available_mem * 1024 * 1024 * 1024 / cluster_size)
  print("mem_limit=" + str(mem_limit))
  return min(12 * 1024 * 1024 * 1024, mem_limit)
{code}
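As a sanity check of the arithmetic above, here is a standalone sketch of my own (the helper name is hypothetical and psutil is dropped, since the available GB figure is passed in directly): with MAX_MEM_GB=28 and a 3-node mini-cluster, each impalad gets roughly 6.5GB, comfortably under the 12GB cap.

```python
# Standalone sketch of the mem_limit arithmetic above; the helper name
# is hypothetical, not from the patch.
GB = 1024 * 1024 * 1024

def impalad_mem_limit(available_mem_gb, cluster_size):
  # 70% of the available memory, split evenly, capped at 12GB per impalad.
  mem_limit = int(0.7 * available_mem_gb * GB / cluster_size)
  return min(12 * GB, mem_limit)

# With MAX_MEM_GB=28 and 3 impalads, each gets ~6.5GB (under the 12GB cap).
print(impalad_mem_limit(28, 3))
# On a 128GB host without the override, the 12GB cap kicks in.
print(impalad_mem_limit(128, 3))
```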
3. In file
'testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py',
an 'available_ram_gb' variable is similarly added, and all of the other
computation logic remains identical.
{code:python}
def _get_yarn_nm_ram_mb():
  sys_ram = _get_system_ram_mb()
  available_ram_gb = int(os.getenv("MAX_MEM_GB", str(sys_ram / 1024)))
  # Fit into the following envelope:
  # - need 4GB at a bare minimum
  # - leave at least 24G for other services
  # - don't need more than 48G
  ret = min(max(available_ram_gb * 1024 - 24 * 1024, 4096), 48 * 1024)
  print >>sys.stderr, "Configuring Yarn NM to use {0}MB RAM".format(ret)
  return ret
{code}
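To illustrate the envelope above, a standalone sketch (the helper name is hypothetical, and it takes the available GB directly rather than reading MAX_MEM_GB from the environment): a 28GB cap leaves exactly the 4GB floor after the 24GB reserve, a 16GB cap also hits the floor, and a 128GB host hits the 48GB ceiling.

```python
# Standalone sketch of the Yarn NM memory envelope above; the helper name
# is hypothetical and the available GB is passed in directly instead of
# being read from the MAX_MEM_GB environment variable.
def yarn_nm_ram_mb(available_ram_gb):
  # Reserve 24GB for other services, floor at 4GB, ceiling at 48GB.
  return min(max(available_ram_gb * 1024 - 24 * 1024, 4096), 48 * 1024)

print(yarn_nm_ram_mb(28))   # 28GB - 24GB reserve leaves exactly the 4GB floor
print(yarn_nm_ram_mb(16))   # below the reserve, so the 4GB floor applies
print(yarn_nm_ram_mb(128))  # plenty of RAM, so the 48GB ceiling applies
```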
I am still testing this fix in my 32GB Docker container, which runs on a 128GB
physical machine.
> Limit the memory usage of the whole mini-cluster
> ------------------------------------------------
>
> Key: IMPALA-10193
> URL: https://issues.apache.org/jira/browse/IMPALA-10193
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.4.0
> Reporter: Fifteen
> Priority: Minor
> Attachments: image-2020-09-28-17-18-15-358.png
>
>
> The mini-cluster contains 3 virtual nodes, and all of them run in a single
> 'Machine'. The quotes imply that the machine can be a Docker container. If
> the container is started with `--privileged` and the actual memory is limited
> by cgroups, then the total memory reported by `top` and the actual available
> memory can be different!
>
> For example, in the container below, `top` tells us the total memory is
> 128GB, while the total memory set in cgroups is actually 32GB. If the actual
> memory usage exceeds 32GB, processes (such as impalad, hivemaster2, etc.) get
> killed.
> !image-2020-09-28-17-18-15-358.png!
>
> So we may need a way to limit the whole mini-cluster's memory usage.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)