>   What are the problems with having
>tez.runtime.shuffle.keep-alive.enabled and
>tez.runtime.optimize.local.fetch set to true always by default?

Nothing has failed due to these so far - we've gone through one entire
release where we tested both heavily and found that they work very well at
scale.

local.fetch is already enabled by default in 0.7.x (TEZ-2333).
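
For anyone on a build where these are not yet the defaults, here is a
minimal sketch of generating the matching <property> entries for a
tez-site.xml. The two property names are the ones quoted above; the
helper name build_site_xml and the SHUFFLE_FLAGS dict are made up for
illustration, and this only emits the XML text, it does not touch a
live cluster config.

```python
import xml.etree.ElementTree as ET

# Property names as quoted in this thread; Hadoop-style configs take
# string values. SHUFFLE_FLAGS is a hypothetical name for illustration.
SHUFFLE_FLAGS = {
    "tez.runtime.shuffle.keep-alive.enabled": "true",
    "tez.runtime.optimize.local.fetch": "true",
}


def build_site_xml(props):
    """Render a dict of name -> value as a Hadoop-style <configuration>."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_site_xml(SHUFFLE_FLAGS))
```

The output is a single-line <configuration> element; merge the
<property> entries into your existing tez-site.xml rather than
replacing the file.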

shared.fetch isn't getting flipped right now because last release it
didn't get enough coverage on customer setups (for my liking) to bake it
in (the broadcast edge didn't whitelist that config).

The keep-alive shuffle was tested on 350 nodes, with 10,000 mappers. And
the advantage of these was significant - between those three options a
broadcast JOIN went from about 30 minutes of shuffle time to around 2 1/2
minutes.

You do need a 64-bit OS (not sandbox) with a modern kernel to safely flip
these on - system configs on CentOS need to roughly correspond to the
ktune settings for RHEL (other than THP & numad/zone_reclaim).

These configs help shuffle in general - off the top of my head,
tcp_fin_timeout and somaxconn come to mind immediately as being the
relevant configs to always tune.
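
To see what a node is currently running with, a quick sketch that reads
those two settings from /proc/sys on Linux. The full sysctl names
(net.ipv4.tcp_fin_timeout, net.core.somaxconn) are the standard kernel
ones; the function names are hypothetical, and no target values are
suggested here since those depend on the cluster.

```python
from pathlib import Path

# The two kernel settings mentioned above, by their full sysctl names.
SYSCTLS = ("net.ipv4.tcp_fin_timeout", "net.core.somaxconn")


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys.

    Naive dot-splitting is fine for these two names, but would break
    for sysctls whose components themselves contain dots.
    """
    return Path(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if unreadable."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in SYSCTLS:
        print(name, "=", read_sysctl(name))
```

Running it on each NodeManager host gives a baseline to compare against
before changing anything.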

There's a certain inflection point we hit in shuffle, where it's worse to
be faster - fixes like HADOOP-11226 help there, but they need
router/switch configs as well.

Cheers,
Gopal

