Is there a way to tell Pig to restrict the size of map/reduce output that can 
be saved to the DFS? For example, if a job generates data over the limit, it 
would not be allowed to save the result to the DFS and the job would fail.

This would help prevent unexpectedly large data from being written to the DFS 
by the mappers/reducers a Pig script creates. The assumption is that we can 
estimate in advance how much data a Pig script will generate; with that quota 
set, any over-sized result is not saved and the job fails.
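For illustration, I have in mind something like HDFS directory space quotas, 
which make writes fail once a directory exceeds its cap. A rough sketch of 
that workaround (the output path and script name below are just placeholders):

    # Cap the job's output directory at 10 GB (quota value is in bytes).
    # Note: the quota is charged per replica, so with replication factor 3
    # a 10 GB quota holds roughly 3.3 GB of actual data.
    hadoop dfsadmin -setSpaceQuota 10737418240 /user/michael/pig_output

    # Run the Pig script; if its STORE tries to write past the quota, the
    # write fails with a DSQuotaExceededException and the job aborts.
    pig -f myscript.pig

    # Clear the quota once it is no longer needed.
    hadoop dfsadmin -clrSpaceQuota /user/michael/pig_output

But that is per-directory rather than per-job, so a limit enforced by Pig 
itself would still be preferable.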

Thanks,
Michael
