[
https://issues.apache.org/jira/browse/PIG-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708292#comment-13708292
]
Aniket Mokashi commented on PIG-3288:
-------------------------------------
[~cheolsoo], how about taking an approach similar to MonitoredUDF? That way,
instead of a common property for all sorts of errors, you can configure your
own property inside your EvalFunc/LoadFunc with annotations, and Pig will kill
the job if the UDF misbehaves (with respect to the contract of the UDF rather
than the contract of the Pig installation, i.e. pig.properties).
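For illustration, here is a minimal sketch of what such an annotation-driven contract could look like, loosely modeled on the MonitoredUDF pattern; the @MaxCreatedFiles annotation and its enforcement are hypothetical, not existing Pig APIs:

{code:java}
// Hypothetical sketch only: @MaxCreatedFiles and its enforcement do not exist
// in Pig. The idea is to declare the contract next to the UDF/StoreFunc itself
// rather than in a cluster-wide pig.properties file.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface MaxCreatedFiles {
    // Upper bound on the number of output files this func may create;
    // the framework would kill the job once the counter passes this value.
    long value() default 10000;
}

// A UDF author opts in per UDF instead of relying on a global property.
@MaxCreatedFiles(50000)
class MyStoreFunc /* extends org.apache.pig.StoreFunc */ {
    // ...
}
{code}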
I have another use case that could utilize this framework (if we build one):
assertions. I could annotate the assert UDF and kill the job immediately if
the assertion fails.
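A rough sketch of that assertion use case under the same hypothetical framework; @KillJobOnError and the fail-fast behavior are assumptions, not existing Pig annotations:

{code:java}
// Hypothetical sketch: annotate an assert-style UDF so that a thrown error
// terminates the whole job immediately instead of only failing the task attempt.
@interface KillJobOnError {}

@KillJobOnError
class AssertTrue /* extends org.apache.pig.EvalFunc<Boolean> */ {
    public Boolean exec(Object input) {
        if (!Boolean.TRUE.equals(input)) {
            // With the proposed framework, this exception would kill the job
            // rather than being retried on another task attempt.
            throw new RuntimeException("Assertion failed for input: " + input);
        }
        return Boolean.TRUE;
    }
}
{code}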
> Kill jobs if the number of output files is over a configurable limit
> --------------------------------------------------------------------
>
> Key: PIG-3288
> URL: https://issues.apache.org/jira/browse/PIG-3288
> Project: Pig
> Issue Type: Wish
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.12
>
> Attachments: PIG-3288-2.patch, PIG-3288-3.patch, PIG-3288-4.patch,
> PIG-3288-5.patch, PIG-3288.patch
>
>
> I ran into a situation where a Pig job tried to create too many files on HDFS
> and overloaded the NN. To prevent such events, it would be nice if we could set
> an upper limit on the number of files that a Pig job can create.
> In fact, Hive has a property called "hive.exec.max.created.files". The idea
> is that each mapper/reducer increments a counter every time it creates a
> file. Then, MRLauncher periodically checks whether the number of files
> created so far has exceeded the upper limit. If so, we kill the running jobs
> and exit.
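For context, a minimal sketch of the counter-based check described above, assuming the Hadoop MapReduce counter API; the counter group/name and the checker class are illustrative, not Pig's actual implementation:

{code:java}
// Illustrative sketch of the counter-based approach, not Pig's real code.
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

class FileCreatingMapper extends Mapper<Object, Object, Object, Object> {
    // Each task bumps a shared counter whenever it opens a new output file.
    private void onFileCreated(Context context) {
        context.getCounter("pig", "created_files").increment(1);
    }
}

class OutputLimitChecker {
    // Periodically invoked by the launcher while jobs are running; if it
    // returns true, the launcher would call job.killJob() and exit.
    static boolean overLimit(Job job, long maxCreatedFiles) throws Exception {
        Counter c = job.getCounters().findCounter("pig", "created_files");
        return c != null && c.getValue() > maxCreatedFiles;
    }
}
{code}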