[
https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060465#comment-15060465
]
Hitesh Shah commented on HIVE-12683:
------------------------------------
The Tez AM resource sizing has no relation to the task container sizing. That
said, in the various benchmarks done in the past, I don't believe anyone has
needed to go beyond 16 GB for the Tez AM, even for very large DAGs.
[~rohitgarg1989] What was the AM size configured to when the OOM happened? If
you are running a version older than Tez 0.7.0, there were some memory issues
that required a large AM size (say, 16 GB). For 0.7.0 and higher, even 4 GB
should be sufficient for a decent-sized DAG. You can set it to 8 GB to be safe
for now, with an Xmx of about 6.4 GB, and that should be sufficient. If you
still hit an OOM with 8 GB, a JIRA against Tez with the heap dump would be
helpful.
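To make the 8 GB suggestion concrete, here is a minimal Python sketch that prints the two AM settings. It assumes the heap is kept at 80% of the container, which is the ratio implied by 8 GB with an Xmx of 6.4 GB. The names `am_settings` and `heap_fraction` are hypothetical helpers for illustration; the two property names are the ones used in the issue description.

```python
def am_settings(container_mb, heap_fraction=0.8):
    """Build Hive `set` statements for a Tez AM container of `container_mb` MB.

    heap_fraction is an assumption (80%), inferred from the 8 GB / 6.4 GB
    figures in the comment; it is not a Tez default.
    """
    # Leave the remainder of the container for non-heap JVM memory.
    heap_mb = int(container_mb * heap_fraction)
    return [
        f"set tez.am.resource.memory.mb={container_mb};",
        f"set tez.am.launch.cmd-opts=-Xmx{heap_mb}m;",
    ]


if __name__ == "__main__":
    # 8 GB AM container, as suggested above.
    for statement in am_settings(8192):
        print(statement)
```

For an 8192 MB container this gives a heap of 6553 MB. These settings only size the AM; task containers are sized separately.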
[~gopalv] Anything to add? Are there any Hive configs that need to be tuned or
turned off because they end up using more memory in the AM? Any implicit
caching of splits, etc.?
> Does Tez run slower than hive on larger dataset (~2.5 TB)?
> ----------------------------------------------------------
>
> Key: HIVE-12683
> URL: https://issues.apache.org/jira/browse/HIVE-12683
> Project: Hive
> Issue Type: Bug
> Reporter: rohit garg
>
> We have started testing the Tez query engine. From initial results, we are
> getting a 30% performance boost over Hive on smaller data sets (1-10 GB),
> but Hive starts to perform better than Tez as data size increases. For
> example, when we run a Hive query with Tez on about 2.3 TB of data, it
> performs about 20% worse than Hive alone. Details are in the post below.
> On a cluster with 1.3 TB RAM, I set the following properties:
> set tez.task.resource.memory.mb=10000;
> set tez.am.resource.memory.mb=59205;
> set tez.am.launch.cmd-opts=-Xmx47364m;
> set hive.tez.container.size=59205;
> set hive.tez.java.opts=-Xmx47364m;
> set tez.am.grouping.max-size=36700160000;
> Is this normal, or am I missing or misconfiguring some property? Also, I am
> using an older version of Tez as of now. Could that be the issue too? I
> still have to bootstrap the latest version of Tez on EMR and test it to see
> whether it does any better.
> Thought of asking here too
> http://www.jwplayer.com/blog/hive-with-tez-on-emr/