Thank you!


[email protected]
 
From: Rajesh Balamohan
Date: 2015-04-24 17:24
To: user
Subject: Re: How to Tuning Tez Task Performance
Listing some details at a very high level:

- Set "tez.task.generate.counters.per.io=true" to get more details on the task
counters. This prints the counters per edge, which can be a lot more useful
for debugging.

- To avoid container launch overhead when you analyze a query for the first
time, try hive.prewarm.enabled=true and hive.prewarm.numcontainers=<no. of
containers you want prewarmed in your session>

- Container reuse is enabled by default in Tez.
(tez.am.container.idle.release-timeout-min.millis and
tez.am.container.idle.release-timeout-max.millis control how long a
container is held by the AM before it is released.)

- Set tez.runtime.io.sort.mb appropriately to avoid spills (you can check task 
counters in the logs to find out the spills and adjust it accordingly)

- Set tez.runtime.sort.threads=2 to enable PipelinedSorter, which is a lot
more performant than DefaultSorter. (This is the default in the master branch;
on earlier releases you have to set it explicitly.)

- Set tez.runtime.compress=true and set tez.runtime.compress.codec (SnappyCodec
is preferred, but it is up to you to choose)

- Set tez.runtime.shuffle.keep-alive.enabled=true if you have a shuffle-heavy
workload. This reduces the number of connections made during shuffle.

- Adjust the memory allocated to different inputs/outputs via
tez.task.scale.memory.ratios (this is more of an expert-level setting, which
you might want to touch only after nailing down a memory pressure issue)

- Adjusting shuffle buffers is also possible, but I would advise it only once
you have nailed down an issue in the shuffle/merge codepath.

- Set "tez.runtime.optimize.local.fetch=true" to bypass http fetches (when data 
is locally present)
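
As a rough sketch of how the settings above could be applied together, the
Python snippet below renders them as Hive "set key=value;" statements that can
be pasted at the top of a session. The property names come from the list
above; every value (the prewarm container count, the sort buffer size, the
fully qualified codec class) is an assumption for illustration and should be
adjusted against your own task counters. The names TEZ_TUNING and
hive_set_statements are hypothetical helpers, not part of Hive or Tez.

```python
# Settings named in the list above. Every value is an illustrative
# assumption, not a recommendation; tune against your own task counters.
TEZ_TUNING = {
    "tez.task.generate.counters.per.io": "true",
    "hive.prewarm.enabled": "true",
    "hive.prewarm.numcontainers": "10",   # assumed number of prewarmed containers
    "tez.runtime.io.sort.mb": "512",      # assumed; raise if counters show spills
    "tez.runtime.sort.threads": "2",      # enables PipelinedSorter
    "tez.runtime.compress": "true",
    "tez.runtime.compress.codec": "org.apache.hadoop.io.compress.SnappyCodec",
    "tez.runtime.shuffle.keep-alive.enabled": "true",
    "tez.runtime.optimize.local.fetch": "true",
}


def hive_set_statements(settings):
    """Render each key/value pair as a Hive 'set key=value;' line."""
    return [f"set {key}={value};" for key, value in settings.items()]


if __name__ == "__main__":
    # Print the statements so they can be pasted into a Hive session.
    print("\n".join(hive_set_statements(TEZ_TUNING)))
```

Keeping the values in one dictionary makes it easy to change a single setting
(for example tez.runtime.io.sort.mb) between runs and compare the counters.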


Feel free to refer to 
https://github.com/t3rmin4t0r/tez-autobuild/blob/master/tez-site.xml for any 
commonly used settings for benchmarks.

On Fri, Apr 24, 2015 at 1:52 PM, [email protected] <[email protected]> wrote:
I want to tune Tez task performance. The Tez tasks are created by Hive. How
do I tune Tez task performance?
Should I analyze performance using the Tez task counters in the Tez log? Any
suggestions?



[email protected]



-- 
~Rajesh.B
