[jira] [Commented] (PIG-4555) Add -XX:+UseNUMA for Tez jobs

Remi Catherinot (JIRA) Tue, 26 May 2015 01:26:53 -0700

    [ 
https://issues.apache.org/jira/browse/PIG-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558850#comment-14558850
 ]


Remi Catherinot commented on PIG-4555:
--------------------------------------

Hi, I've written my answers in the original mail, prefixed with "Remi: ", in 
addition to what follows.

I'm not the creator of the initial JIRA, who is the one wanting to pass UseNUMA 
to the TEZ AM JVM through PIG. Like you said, the poor TEZ efficiency is a TEZ 
issue, not a PIG one. As for PIG, the 'real' problem is more that a certain 
level of expertise is needed to finely control which options end up being used 
to launch the TEZ AM. I made such a configuration mistake in my first tests 
using +UseNUMA. It's hard to know, among all the possible ways to set 
command-line options, which one will end up on the final command line.

For me, the case can be closed as not being a bug, and if it still needs a fix, 
it would be more of a documentation fix explaining command-line option control 
with tez/yarn.

Another point: I do finely tune my servers. I use interrupt pinning, a certain 
level of process/CPU affinity and so on, Linux kernel module and driver 
low-level settings, block device settings, sysctl settings, read/write disk 
cache ratios, disabling hyper-threading, etc. I also play a lot with NUMA and 
some other -XX JVM options. Even though I screwed up my first tests, adding 
UseNUMA, which splits the young generation across NUMA nodes, would more likely 
trigger a real OOM than solve one (because each part of the young generation 
is smaller, one small amount per NUMA node, and I'm not sure whether the JVM 
accepts using the young generation of another NUMA node when one is full), 
unless there is a bug in the JVM itself when interleaving the heap's young 
generation. UseNUMA does not change the amount of memory the JVM can use (and 
so TEZ inside the JVM). That is also why I reacted to the JIRA in the first 
place: I'm pretty sure the real problem is not where the JIRA suggests it is. 
Maybe the author had a problem like mine: when forcing the UseNUMA option, he 
also forced some other options, and maybe it is those options that solved the 
OOM issue.

-----Original Message-----
From: Rohini Palaniswamy (JIRA) [mailto:[email protected]] 
Sent: Saturday, 23 May 2015 00:57
To: CATHERINOT Rémi Ext DTSI/DERS
Subject: [jira] [Commented] (PIG-4555) Add -XX:+UseNUMA for Tez jobs


    [ 
https://issues.apache.org/jira/browse/PIG-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556958#comment-14556958
 ] 

Rohini Palaniswamy commented on PIG-4555:
-----------------------------------------

bq. i end-up having my containers (the AM one) being killed because they use 
too much virtual memory (about 17GB of virtual memory)
   17GB is really bad. How much was the Xmx? What is the virtual memory without 
NUMA?

Remi: When setting the -XX:+UseNUMA option, I ended up having some other 
default options overridden, including my -Xmx option. So the JVM started to 
use its default sizing corresponding to my hardware (a 64-bit server with 
64 GB of memory).
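
To make the sizing effect concrete, here is a minimal sketch. It assumes the 
usual HotSpot ergonomics, where the default maximum heap is one quarter of 
physical memory when no -Xmx is given; that rule is an assumption about the JVM 
in use, not something stated in this thread, and the function name is 
hypothetical.

```python
def default_max_heap_gb(physical_memory_gb, max_ram_fraction=4):
    """Estimate the default max heap when -Xmx is lost.

    Assumption: the JVM picks physical memory / 4 as its default
    maximum heap (HotSpot ergonomics); not taken from this thread.
    """
    return physical_memory_gb / max_ram_fraction


# A 64 GB server, as described above.
heap_gb = default_max_heap_gb(64)
print(heap_gb)  # 16.0
```

Under that assumption, a heap ceiling of about 16 GB is in the same range as 
the roughly 17 GB of virtual memory reported for the AM container.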

bq. But for sure, in my case, setting -XX:+UseNUMA do trigger an OOM.
   Are you sure it hits OOM or just the container being killed because of 
yarn.nodemanager.vmem-pmem-ratio being breached? 

Remi: OOM is an abuse of language there. It is effectively a container-killed 
issue due to the container's virtual memory consumption estimation/limitation.
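
As a rough illustration of that container-killed case, the sketch below models 
the check as "virtual memory used > container physical memory * 
yarn.nodemanager.vmem-pmem-ratio". The ratio of 2.1 is the stock YARN default; 
the 4096 MB container size and the function names are hypothetical, since the 
thread does not give the actual AM container size.

```python
def vmem_limit_mb(container_pmem_mb, vmem_pmem_ratio=2.1):
    """Virtual memory ceiling for a container, in MB.

    vmem_pmem_ratio mirrors yarn.nodemanager.vmem-pmem-ratio
    (2.1 is the stock default).
    """
    return container_pmem_mb * vmem_pmem_ratio


def would_be_killed(vmem_used_mb, container_pmem_mb, vmem_pmem_ratio=2.1):
    """True if the container breaches its virtual memory ceiling."""
    return vmem_used_mb > vmem_limit_mb(container_pmem_mb, vmem_pmem_ratio)


# Hypothetical AM container size; the real value is not in the thread.
container_mb = 4096
used_mb = 17 * 1024  # "about 17GB of virtual memory"
print(would_be_killed(used_mb, container_mb))
```

With these numbers the container is killed by the NodeManager even though the 
Java heap itself never ran out, which is why "OOM" is the wrong label.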

bq. I'm pretty sure there are already some configuration variables one can set 
in one's tez-site.xml file to set this option, so there is no need to have pig 
force this setting in code. From what I understand, the real problem is not 
about -XX:+UseNUMA. The real problem is more that some options from the tez 
configuration are ignored.
   TEZ_AM_LAUNCH_CMD_OPTS_DEFAULT is "-XX:+PrintGCDetails -verbose:gc 
-XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC", i.e. -XX:+UseNUMA is 
part of the default tez AM options. In Pig, we give preference to mapreduce AM 
settings (if tez.am.launch.cmd-opts is not overridden in tez-site.xml) and 
translate them to tez instead of using the mentioned tez defaults. Since the 
mapreduce AM settings are always there from mapred-default.xml or 
mapred-site.xml, -XX:+UseNUMA is never there. So this is about making use of 
the default tez settings in Pig. If in a particular environment -XX:+UseNUMA 
is problematic, it can be overridden in tez-site.xml.
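
The precedence described above can be sketched as follows. This is a simplified 
model for illustration, not Pig's actual code: plain dicts stand in for 
tez-site.xml and the mapred-*.xml files, and resolve_tez_am_opts is a 
hypothetical name.

```python
# Default Tez AM options as quoted above.
TEZ_AM_LAUNCH_CMD_OPTS_DEFAULT = (
    "-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps "
    "-XX:+UseNUMA -XX:+UseParallelGC"
)


def resolve_tez_am_opts(tez_site, mapred_conf):
    """Pick the AM JVM options using the precedence described above."""
    # 1. An explicit tez.am.launch.cmd-opts in tez-site.xml wins.
    if "tez.am.launch.cmd-opts" in tez_site:
        return tez_site["tez.am.launch.cmd-opts"]
    # 2. Otherwise the mapreduce AM setting is translated to Tez.
    if "yarn.app.mapreduce.am.command-opts" in mapred_conf:
        return mapred_conf["yarn.app.mapreduce.am.command-opts"]
    # 3. Only if neither is present do the Tez defaults apply.
    return TEZ_AM_LAUNCH_CMD_OPTS_DEFAULT


# The mapreduce setting is normally present, so the Tez default
# (and its -XX:+UseNUMA) is never reached. "-Xmx1024m" is an example value.
mapred = {"yarn.app.mapreduce.am.command-opts": "-Xmx1024m"}
opts = resolve_tez_am_opts({}, mapred)
print("-XX:+UseNUMA" in opts)  # False
```

Setting tez.am.launch.cmd-opts explicitly in tez-site.xml takes the first 
branch, which is how a user can add or drop -XX:+UseNUMA without any change in 
Pig.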

Remi: Using tez.am.launch.cmd-opts in tez-site.xml is the answer for the 
original author. I'm no tez expert, with or without pig, because I use yarn 
and pure mapreduce version 2 jobs. But I was pretty sure such configuration 
variables already existed. My point was more to have the author use what 
already existed rather than maybe having someone start work on a patch that 
was not needed, or that could even have been a bad idea (forcing a parameter 
in code rather than through simple user-environment configuration). It's just 
that in the past, I've seen some JIRAs get implemented (like the one for CMX 
support, which is currently being pushed into the future PIG 0.15) that I 
really think should not have been implemented the way they are right now (more 
or less using the lzo name to pass the CMX codec, plus a hack in the lzo/cmx 
false/true encoding detection to make one call the other; I'm not sure that 
would be stable for multiple jobs using both encodings in the same JVM, since 
lots of compression codec configurations are static). I use -XX:+UseNUMA 
myself now that I have set the right configuration variable so as not to lose 
my other Xmx settings, and it works pretty well for map-reduce v2 yarn jobs 
too.

The real issue of why Tez AM performed poorly without NUMA is still there and 
will be tracked in TEZ jira. You have some concerns raised and I don't have 
knowledgeable answers for them at this point. So moved this to 0.16 and will 
add this after we actually fully understand more about the NUMA behavior and 
what is happening with and without NUMA in Tez AM. 

     




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




> Add -XX:+UseNUMA for Tez jobs
> -----------------------------
>
>                 Key: PIG-4555
>                 URL: https://issues.apache.org/jira/browse/PIG-4555
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>
>     For very big Tez jobs (~50K tasks), AM quickly goes OOM without 
> -XX:+UseNUMA. tez.am.launch.cmd-opts default setting has that, but since pig 
> gives preference to yarn.app.mapreduce.am.command-opts if present (which 
> usually it is),  -XX:+UseNUMA is not there. Need to add -XX:+UseNUMA if we 
> are picking up mapreduce setting.


