> Anyway, I wanted to ask if there is any Tez configuration or future
> release (we are running Tez 0.6) which might improve disk utilisation during
> such heavyweight sorts?

Make sure compression is turned on. Every time I've seen this issue, it had
to do with someone turning off compression due to a bad libsnappy install.

The settings to check are tez.runtime.compress and tez.runtime.compress.codec.

If you happen to use DefaultCodec, remember to set
zlib.compress.level=BEST_SPEED as well in the job conf (the value is the
enum name, not an int).
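
As a rough sketch of what that looks like in the job conf: the codec class
names below are the stock Hadoop ones and are my assumption about your
setup, so swap in whatever codec your cluster actually has installed.

```
# Option A: Snappy (assumes a working native libsnappy on every node)
tez.runtime.compress=true
tez.runtime.compress.codec=org.apache.hadoop.io.compress.SnappyCodec

# Option B: DefaultCodec (zlib) at its fastest level; uncomment to use
# instead of Option A
# tez.runtime.compress=true
# tez.runtime.compress.codec=org.apache.hadoop.io.compress.DefaultCodec
# zlib.compress.level=BEST_SPEED
```

How you pass these (tez-site.xml, -D flags, or a session-level "set")
depends on what is submitting the job.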

Further up in 0.8.x land, we don't do full merges if the pipelined shuffle
is turned on, which for a 6-disk system allows a single skewed task to be
about 12x bigger before hitting this exception.
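
For reference, a sketch of the switch on 0.8.x. The property name is from
memory of TezRuntimeConfiguration, so treat it as an assumption and check it
against the release you deploy; it will not help on 0.6.

```
# Assumed property name; verify against your Tez version before relying on it
tez.runtime.pipelined-shuffle.enabled=true
```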

Cheers,
Gopal

