[ https://issues.apache.org/jira/browse/PIG-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115040#comment-13115040 ]
Daniel Dai commented on PIG-2208:
---------------------------------
I cannot agree more on this. Users will have to run a handicapped Pig in some
cases, which is not good. However, I cannot find a workaround that is easier
than the proposed option.
"Local aggregation" is a good addition to this approach. When counters are
disabled, users can at least check the logs. But counters are handier than
logs, so keeping counters makes sense.
"with [no]counters" also makes sense when users want finer control. But an
overall control should still be there in case users do not want to change
their scripts. I will add a comment in pig.properties to explain the
"pig.disable.counter" option.
> Restrict number of PIG generated Hadoop counters
> -------------------------------------------------
>
> Key: PIG-2208
> URL: https://issues.apache.org/jira/browse/PIG-2208
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.1, 0.9.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Fix For: 0.9.1
>
> Attachments: PIG-2208.patch
>
>
> Pig 0.8.0 implemented Hadoop counters to track the number of records read
> for each input and the number of records written for each output (PIG-1389 &
> PIG-1299). On the other hand, Hadoop has imposed a limit on per-job counters
> (MAPREDUCE-1943), and jobs will fail if the counters exceed the limit.
> Therefore we need a way to cap the number of Pig-generated counters.
> Here are the two options:
> 1. Add an integer property (e.g., pig.counter.limit, set to 20) to the Pig
> property file. If the number of inputs of a job exceeds this number, the
> input counters are disabled. Similarly, if the number of outputs of a job
> exceeds this number, the output counters are disabled.
> 2. Add a boolean property (e.g., pig.disable.counters) to the pig property
> file (default: false). If this property is set to true, then the PIG
> generated counters are disabled.
>
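The two options quoted above could be sketched roughly as follows. This is only an illustration of the decision logic, not the code in PIG-2208.patch: the class CounterPolicy and its methods are hypothetical, the property names are the examples given in the description, and treating a missing limit as "no cap" is an assumption.

```java
import java.util.Properties;

/** Hypothetical helper illustrating the two proposed options; not Pig's actual code. */
public final class CounterPolicy {
    // Example property names from the issue description (assumed, not confirmed).
    static final String LIMIT_KEY = "pig.counter.limit";
    static final String DISABLE_KEY = "pig.disable.counters";

    private CounterPolicy() {}

    /** Option 2: one boolean switch, defaulting to false (counters stay on). */
    static boolean countersDisabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(DISABLE_KEY, "false"));
    }

    /** Option 1: counters are kept only while the count does not exceed the limit. */
    static boolean withinLimit(Properties props, int count) {
        String raw = props.getProperty(LIMIT_KEY);
        if (raw == null) {
            return true; // assumption: no limit configured means no cap
        }
        return count <= Integer.parseInt(raw.trim());
    }

    /** Input counters are emitted only if neither option rules them out. */
    static boolean inputCountersEnabled(Properties props, int numInputs) {
        return !countersDisabled(props) && withinLimit(props, numInputs);
    }

    /** Output counters are decided independently of input counters. */
    static boolean outputCountersEnabled(Properties props, int numOutputs) {
        return !countersDisabled(props) && withinLimit(props, numOutputs);
    }
}
```

With a limit of 20, a job with 25 inputs and 3 outputs would lose only its input counters under this sketch, while the boolean switch turns off both kinds regardless of the limit.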