Rajesh,

What are the problems with always setting tez.runtime.shuffle.keep-alive.enabled and tez.runtime.optimize.local.fetch to true by default?
Regards,
Rohini

On Fri, Apr 24, 2015 at 1:54 AM, Rajesh Balamohan <[email protected]> wrote:

> Listing some details at a very high level:
>
> - Set "tez.task.generate.counters.per.io=true" to get more details on the
>   task counters. Basically this starts printing the counters per edge, which
>   can be a lot more useful for debugging.
>
> - In case you want to avoid container launches etc. when you analyze for
>   the first time, try hive.prewarm.enabled=true and
>   hive.prewarm.numcontainers=<no of containers you want in your session to
>   be prewarmed>.
>
> - Container reuse is enabled by default in Tez.
>   (tez.am.container.idle.release-timeout-min.millis and
>   tez.am.container.idle.release-timeout-max.millis control the amount of
>   time a container is held by the AM before releasing it.)
>
> - Set tez.runtime.io.sort.mb appropriately to avoid spills (you can check
>   the task counters in the logs to find the spills and adjust it accordingly).
>
> - Set tez.runtime.sort.threads=2 to enable PipelinedSorter, which is a lot
>   more performant than DefaultSorter (this is the default in the master
>   branch, but if you are using earlier releases, you can turn it on by
>   setting tez.runtime.sort.threads=2).
>
> - Set tez.runtime.compress=true and set tez.runtime.compress.codec
>   (SnappyCodec is preferred, but it is up to you to choose).
>
> - Set tez.runtime.shuffle.keep-alive.enabled=true in case you have a
>   shuffle-heavy workload. This reduces the number of connections in shuffle.
>
> - Adjust the memory allocated to different inputs/outputs based on
>   tez.task.scale.memory.ratios (but this is more of an expert-level setting,
>   which you might want to touch only after nailing down any memory pressure).
>
> - Adjusting shuffle buffers is also possible, but I would advise it only when
>   you nail down an issue related to the shuffle/merge codepath.
>
> - Set "tez.runtime.optimize.local.fetch=true" to bypass HTTP fetches (when
>   data is locally present).
>
> Feel free to refer to
> https://github.com/t3rmin4t0r/tez-autobuild/blob/master/tez-site.xml for
> commonly used settings for benchmarks.
>
> On Fri, Apr 24, 2015 at 1:52 PM, [email protected] <[email protected]>
> wrote:
>
>> I want to tune the performance of a Tez task. This Tez task is created by Hive.
>> How do I tune Tez task performance?
>> Should I analyze performance using the task counters in the Tez log? Any suggestions?
>>
>> ------------------------------
>> [email protected]
>
> --
> ~Rajesh.B
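
For anyone who wants the settings from the list above in one place, here is a minimal sketch that collects only the ones quoted with an explicit value and prints them both as a tez-site.xml fragment and as Hive "set" lines. The helper names (render_site_xml, render_hive_set) are hypothetical, and the fully qualified codec class is my assumption of what "SnappyCodec" refers to (Hadoop's org.apache.hadoop.io.compress.SnappyCodec). Settings such as tez.runtime.io.sort.mb are left out because the right value depends on the spill counters you observe, and whether a given property can be changed per Hive session rather than in tez-site.xml is something to verify on your release.

```python
from xml.sax.saxutils import escape

# Only the settings that the thread quotes with an explicit value.
# The codec class name is an assumption (Hadoop's Snappy codec).
SETTINGS = {
    "tez.task.generate.counters.per.io": "true",
    "tez.runtime.sort.threads": "2",
    "tez.runtime.compress": "true",
    "tez.runtime.compress.codec": "org.apache.hadoop.io.compress.SnappyCodec",
    "tez.runtime.shuffle.keep-alive.enabled": "true",
    "tez.runtime.optimize.local.fetch": "true",
}


def render_site_xml(settings):
    """Render the settings as a Hadoop-style <configuration> fragment."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


def render_hive_set(settings):
    """Render the same settings as per-session Hive 'set' statements."""
    return "\n".join(f"set {name}={value};" for name, value in settings.items())


if __name__ == "__main__":
    print(render_site_xml(SETTINGS))
    print()
    print(render_hive_set(SETTINGS))
```

Keeping the values in a single dictionary makes it easy to toggle one setting at a time and compare the per-edge task counters between runs.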
