Cheolsoo Park created PIG-3288:
----------------------------------

             Summary: Kill jobs if the number of output files is over a configurable limit
                 Key: PIG-3288
                 URL: https://issues.apache.org/jira/browse/PIG-3288
             Project: Pig
          Issue Type: Wish
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
             Fix For: 0.12


I ran into a situation where a Pig job tried to create too many files on HDFS 
and overloaded the NameNode (NN). To prevent such events, it would be nice if 
we could set an upper limit on the number of files that a Pig job can create.

Hive already has a property for this called "hive.exec.max.created.files". The 
idea is that each mapper/reducer increments a counter every time it creates a 
file. Then, MRLauncher periodically checks whether the number of files created 
so far has exceeded the upper limit. If so, we kill the running jobs and exit.
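
A rough sketch of the counter-and-check idea above, written as a plain Python 
simulation rather than actual Pig or Hadoop code. Everything named here (Job, 
record_created_file, check_created_files, FileLimitExceeded) is hypothetical 
and only illustrates the control flow; in a real implementation the count 
would presumably come from a Hadoop counter aggregated across tasks.

```python
from dataclasses import dataclass


class FileLimitExceeded(Exception):
    """Raised when the jobs have created more files than the limit allows."""


@dataclass
class Job:
    """Hypothetical stand-in for a running MapReduce job."""
    name: str
    created_files: int = 0  # stands in for a counter aggregated across tasks
    killed: bool = False

    def record_created_file(self, count: int = 1) -> None:
        # What a mapper/reducer would do each time it creates a file.
        self.created_files += count

    def kill(self) -> None:
        self.killed = True


def check_created_files(jobs: list[Job], max_created_files: int) -> int:
    """One polling step of the launcher: total the counters and enforce the limit.

    Returns the current total if it is within the limit. Otherwise kills
    every job and raises FileLimitExceeded.
    """
    total = sum(job.created_files for job in jobs)
    if total > max_created_files:
        for job in jobs:
            job.kill()
        raise FileLimitExceeded(
            f"created {total} files, limit is {max_created_files}"
        )
    return total
```

The launcher would call check_created_files from its existing progress-polling 
loop, so the limit is enforced while the jobs are still running instead of 
after they finish.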
