[
https://issues.apache.org/jira/browse/PIG-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558850#comment-14558850
]
Remi Catherinot commented on PIG-4555:
--------------------------------------
Hi, I've written my answers inline in the original mail below, prefixed with
"Remi: ", in addition to the comments that follow.
I'm not the creator of the initial JIRA, who is the one wanting to pass UseNUMA
to the Tez AM JVM through Pig. As you said, Tez's poor efficiency is a Tez
issue, not a Pig one. As for Pig, the 'real' problem is rather that a certain
level of expertise is needed to finely control which options end up being used
to launch the Tez AM. I made such a configuration mistake in my first tests
using +UseNUMA. It's hard to know, among all the possible ways to set
command-line options, which one will end up on the final command line.
For me, the case can be closed as not a bug, and if it still needs a fix, it
would be more of a documentation fix explaining command-line option control
with Tez/YARN.
Another point: I do finely tune my servers. I use interrupt pinning, a certain
level of process/CPU affinity and so on, Linux kernel module and driver
low-level settings, block device settings, sysctl settings, read/write disk
cache ratios, disabling hyper-threading, and so on. I also play a lot with NUMA
and some other -XX JVM options. Even though I screwed up my first tests, adding
UseNUMA, which splits the young generation across NUMA nodes, would be more
likely to trigger a real OOM than to solve one (because each young generation
part is smaller, one small amount per NUMA node; I'm not sure whether the JVM
agrees to use the young generation of another NUMA node when one is full),
unless there is a bug in the JVM itself when interleaving the heap's young
generation. UseNUMA does not change the amount of memory the JVM can use (and
so Tez inside the JVM). That is also why I reacted to the JIRA in the first
place: I'm pretty sure the real problem is not where the JIRA suggests it is.
Maybe the author had a problem like mine: when forcing the UseNUMA option, he
also forced some other options, and maybe it is those options that solved the
OOM issue.
-----Original Message-----
From: Rohini Palaniswamy (JIRA) [mailto:[email protected]]
Sent: Saturday, May 23, 2015 00:57
To: CATHERINOT Rémi Ext DTSI/DERS
Subject: [jira] [Commented] (PIG-4555) Add -XX:+UseNUMA for Tez jobs
[
https://issues.apache.org/jira/browse/PIG-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556958#comment-14556958
]
Rohini Palaniswamy commented on PIG-4555:
-----------------------------------------
bq. i end-up having my containers (the AM one) being killed because they use
too much virtual memory (about 17GB of virtual memory)
17GB is really bad. How much was the Xmx? What is the virtual memory without
NUMA?
Remi: When setting the -XX:+UseNUMA option, I ended up having some other default
options overridden, including my -Xmx option. So the JVM started to use its
default sizing corresponding to my hardware (which is 64-bit and is a 64 GB
server).
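A quick way to catch this kind of mistake is to inspect the final option string before launching. The following is only a minimal sketch: the helper name `summarize_jvm_opts` is hypothetical, and it assumes that when -Xmx appears more than once, the last occurrence is the one that takes effect.

```python
import shlex


def summarize_jvm_opts(cmd_opts):
    """Report the heap cap and NUMA flag found in a JVM option string.

    Hypothetical helper for illustration; not part of Pig, Tez or YARN.
    """
    tokens = shlex.split(cmd_opts)
    heap_flags = [t for t in tokens if t.startswith("-Xmx")]
    return {
        # Assumption: the last -Xmx on the command line is the effective one.
        "max_heap": heap_flags[-1] if heap_flags else None,
        "use_numa": "-XX:+UseNUMA" in tokens,
    }


# An option string that gained UseNUMA but lost its -Xmx, as described above.
print(summarize_jvm_opts("-XX:+UseNUMA -XX:+UseParallelGC"))
```

A result with `max_heap` set to `None` means the JVM will fall back to its own default heap sizing for the machine.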
bq. But for sure, in my case, setting -XX:+UseNUMA do trigger an OOM.
Are you sure it hits OOM or just the container being killed because of
yarn.nodemanager.vmem-pmem-ratio being breached?
Remi: "OOM" is a misuse of language there. It is effectively a container-killed
issue due to the container's virtual memory consumption estimation/limitation.
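The arithmetic behind such a kill can be sketched as follows. This is only an illustration of the ratio check, not the NodeManager's actual code: the function names are hypothetical, the 4096 MB container size is an assumed example (the thread does not give it), and 2.1 is used as the ratio on the assumption that yarn.nodemanager.vmem-pmem-ratio was left at its stock default.

```python
def vmem_limit_mb(container_mb, vmem_pmem_ratio=2.1):
    """Virtual memory allowed for a container of the given physical size."""
    return container_mb * vmem_pmem_ratio


def exceeds_vmem_limit(vmem_used_mb, container_mb, vmem_pmem_ratio=2.1):
    """True when the virtual memory in use is above the allowed limit."""
    return vmem_used_mb > vmem_limit_mb(container_mb, vmem_pmem_ratio)


# Assumed example: a 4096 MB container whose JVM maps about 17 GB of
# virtual memory, the figure reported above.
print(exceeds_vmem_limit(17 * 1024, 4096))
```

Under these assumed numbers the limit is far below 17 GB, so the container is killed even though the Java heap itself never ran out.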
bq. I'm pretty sure there is already some configuration variables one can set
in its tez-site.xml file to set this option so no need to have pig force this
setting by code. For what i understand, the real problem is not about
-XX:+UseNUMA. The real problem is more that some option from the tez
configuration are ignored.
TEZ_AM_LAUNCH_CMD_OPTS_DEFAULT is "-XX:+PrintGCDetails -verbose:gc
-XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC" . i.e -XX:+UseNUMA is
part of default tez AM options. In Pig, we give preference to mapreduce AM
settings (if tez.am.launch.cmd-opts is not overridden in tez-site.xml) and
translate them to tez instead of using the mentioned tez defaults. Since the
mapreduce AM settings are always there from mapred-default.xml or
mapred-site.xml, -XX:+UseNUMA is never there. So this is about making use of
the default tez settings in Pig. If in a particular environment -XX:+UseNUMA
is problematic, it can be overridden in tez-site.xml.
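The precedence described above can be sketched roughly like this. It is a simplified illustration, not Pig's actual implementation: the function `resolve_am_opts` and the dictionary-based configs are hypothetical, and the "-Xmx1024m" value is only an example of a MapReduce AM setting.

```python
# Default quoted above for the Tez AM launch options.
TEZ_AM_LAUNCH_CMD_OPTS_DEFAULT = (
    "-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps "
    "-XX:+UseNUMA -XX:+UseParallelGC"
)


def resolve_am_opts(tez_site, mapred_conf):
    """Return (options, source) for the Tez AM launch command.

    Hypothetical simplification of the behavior described in this thread.
    """
    # 1. An explicit value in tez-site.xml wins.
    if "tez.am.launch.cmd-opts" in tez_site:
        return tez_site["tez.am.launch.cmd-opts"], "tez-site.xml"
    # 2. Otherwise the MapReduce AM setting is translated to Tez.
    if "yarn.app.mapreduce.am.command-opts" in mapred_conf:
        return mapred_conf["yarn.app.mapreduce.am.command-opts"], "mapreduce"
    # 3. The Tez default applies only when neither is set.
    return TEZ_AM_LAUNCH_CMD_OPTS_DEFAULT, "tez default"


# Typical case: nothing in tez-site.xml, MapReduce setting always present.
opts, source = resolve_am_opts(
    {}, {"yarn.app.mapreduce.am.command-opts": "-Xmx1024m"}
)
print(source, "-XX:+UseNUMA" in opts)
```

Because step 2 almost always matches, step 3 is rarely reached, which is why -XX:+UseNUMA does not show up unless it is set explicitly.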
Remi: Using tez.am.launch.cmd-opts in tez-site.xml is the answer for the
original author. I'm no Tez expert, with or without Pig, because I use YARN and
pure MapReduce version 2 jobs. But I was pretty sure such configuration
variables already existed. My point was more to have the author use what
already exists rather than maybe having someone start working on a patch that
was not needed, or that could even have been a bad idea (forcing a parameter in
code rather than through simple user-environment configuration). It's just that
in the past, I've seen some JIRAs implemented (like the one for CMX support,
which is currently being pushed into the future Pig 0.15) which I really think
should not have been implemented the way they are right now (more or less using
the lzo name to pass the CMX codec, and a hack in the lzo/cmx false/true
encoding detection to make one call the other; I'm not sure that would be
stable for multiple jobs using both encodings in the same JVM, since lots of
compression codec configurations are static). I use -XX:+UseNUMA myself now
that I have set the right configuration variable so as not to lose my other Xmx
settings, and it works pretty well for MapReduce v2 YARN jobs too.
The real issue of why Tez AM performed poorly without NUMA is still there and
will be tracked in TEZ jira. You have some concerns raised and I don't have
knowledgeable answers for them at this point. So moved this to 0.16 and will
add this after we actually fully understand more about the NUMA behavior and
what is happening with and without NUMA in Tez AM.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
> Add -XX:+UseNUMA for Tez jobs
> -----------------------------
>
> Key: PIG-4555
> URL: https://issues.apache.org/jira/browse/PIG-4555
> Project: Pig
> Issue Type: Improvement
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
>
> For very big Tez jobs (~50K tasks), AM quickly goes OOM without
> -XX:+UseNUMA. tez.am.launch.cmd-opts default setting has that, but since pig
> gives preference to yarn.app.mapreduce.am.command-opts if present (which
> usually it is), -XX:+UseNUMA is not there. Need to add -XX:+UseNUMA if we
> are picking up mapreduce setting.