Hi All,

Is it possible to enforce a maximum on the disk space consumed by a
map/reduce job's intermediate output?  It looks like you can impose limits
on HDFS consumption, or, via the capacity scheduler, on the RAM that a
map/reduce slot uses or on the number of slots a job occupies.

But if I'm worried that a job might exhaust the cluster's disk capacity
during the shuffle, my sense is that I'd have to quarantine the job on a
separate cluster.  Am I wrong?  Do you have any suggestions for me?

Thanks,
Matt
