Hi All,

Is it possible to enforce a maximum on the disk space consumed by a
map/reduce job's intermediate output?  It looks like you can impose limits
on HDFS consumption, or, via the capacity scheduler, on the RAM that a
map/reduce slot uses or on the number of slots a job occupies.

But if I'm worried that a job might exhaust the cluster's disk capacity
during the shuffle, my sense is that I'd have to quarantine the job on a
separate cluster.  Am I wrong?  Do you have any suggestions for me?

Thanks,
Matt
