Hi all,

Is it possible to enforce a maximum on the disk space consumed by a map/reduce job's intermediate output? It looks like you can impose limits on HDFS consumption, or, via the capacity scheduler, limits on the RAM a map/reduce slot uses or on the number of slots used.
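For concreteness, the knobs I've found so far look something like this (the directory, jar, driver class, and numbers below are just placeholders, not our real setup, and the -D properties assume the job uses ToolRunner):

    # Cap raw HDFS usage (including replication) under a directory at ~1 TB
    hadoop dfsadmin -setSpaceQuota 1099511627776 /user/matt/bigjob

    # Check the quota and current usage for that directory
    hadoop fs -count -q /user/matt/bigjob

    # Per-task memory limits the capacity scheduler can enforce (Hadoop 1.x-era names)
    hadoop jar bigjob.jar BigJobDriver \
        -Dmapred.job.map.memory.mb=2048 \
        -Dmapred.job.reduce.memory.mb=4096 ...

As far as I can tell, none of these touches the local disk (mapred.local.dir) where spilled map output and shuffle data actually land, which is the part I'm worried about.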
But if a job might exhaust the cluster's local disk capacity during the shuffle, my sense is that the only real safeguard is to quarantine that job on a separate cluster. Am I wrong? Do you have any suggestions for me?

Thanks,
Matt