[ https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552261#comment-13552261 ]

Jarek Jarcec Cecho commented on PIG-3002:
-----------------------------------------

Hi Bill,
thank you again for your comment; I appreciate your input. Please don't get me wrong: I'll be more than happy to change my patch to incorporate your suggestions. However, I would like to make sure that we're on the same page first.

I've dug into the Hadoop source code: the call getCounter(Enum) is defined in [1] and is simply forwarded to findCounter(Enum), defined in AbstractCounters in [2]. That method first checks whether the counter already exists in its internal "cache" object; if so, the cached counter is returned without any exception being raised. Only if the counter is not present does it call findCounter(String, String), which creates the counter and throws CountersExceededException if the configured limit would be exceeded.
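To make that control flow concrete, here is a simplified sketch of the behavior I'm describing. This is just an illustration with made-up class and exception names, not the actual Hadoop code:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of the lookup path described above: an existing
// counter is served from the internal "cache", while a missing one goes
// through creation, which is the only place the counter limit is checked.
class CounterGroupSketch {

  /** Stand-in for Hadoop's CountersExceededException. */
  static class LimitExceededException extends RuntimeException {
    LimitExceededException(String msg) { super(msg); }
  }

  private final Map<String, Long> cache = new HashMap<String, Long>();
  private final int limit;

  CounterGroupSketch(int limit) {
    this.limit = limit;
  }

  /** Returns the counter's value, creating the counter lazily. */
  long findCounter(String name) {
    Long existing = cache.get(name);
    if (existing != null) {
      return existing;                // cached counter: no exception possible
    }
    if (cache.size() >= limit) {      // only a *new* counter can hit the limit
      throw new LimitExceededException(
          "Exceeded limits on number of counters - Limit=" + limit);
    }
    cache.put(name, 0L);              // create the counter with value 0
    return 0L;
  }

  /** Increments (and if necessary creates) the named counter. */
  void increment(String name, long delta) {
    cache.put(name, findCounter(name) + delta);
  }
}
{code}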

Let's assume, for example, that we have the following counters, with the maximum counter limit set to 2:

* A => 1
* B => 2

If we now call getCounter(Enum) with the values A, B and C, we will get:

* A => 1
* B => 2
* C => CountersExceededException, because C is not in the "cache" map, so the code tries to create a new counter and fails on the configured limit (a runnable illustration follows below).
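Here is a hypothetical driver against the sketch above that reproduces this exact sequence; again, this is only an illustration, not Hadoop or Pig code:

{code:java}
public class CounterLimitDemo {
  public static void main(String[] args) {
    CounterGroupSketch counters = new CounterGroupSketch(2);

    // A and B are created on first access and incremented to the values above.
    counters.increment("A", 1);
    counters.increment("B", 2);

    // Both lookups are served from the cache, so no exception is possible.
    System.out.println("A => " + counters.findCounter("A"));   // prints 1
    System.out.println("B => " + counters.findCounter("B"));   // prints 2

    try {
      counters.findCounter("C");   // third distinct counter: creation fails
    } catch (CounterGroupSketch.LimitExceededException e) {
      System.out.println("C => " + e.getMessage());
    }
  }
}
{code}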

With this example I'm trying to explain that getCounter(Enum) won't throw CountersExceededException for every call after the counters fill up, but only for counters that do not yet exist. I'm also trying to show that computeWarningAggregate by itself won't affect the generated aggregates.

You're definitely right that the newly introduced function getCounterValue() is driven by a single caller and that this isn't ideal. I tried to put the code directly inside computeWarningAggregate(), but it failed to compile because CountersExceededException is not available in Hadoop 1.0. Therefore I moved the code into the shim layer.

I've defined getCounterValue() to return a long rather than a Counter object in order to narrow its purpose down to retrieving the counter's long value, not the counter object itself. I believe that swallowing the exception is reasonable here, because when the exception is thrown the counter does not exist and its value is therefore 0. Maybe adding descriptive javadoc to getCounterValue() would help here?
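Something along the lines of the rough sketch below is what I have in mind. The class name is just a placeholder for the Hadoop 2.x shim and the exact signature is illustrative, so please read it as the intent rather than the patch itself:

{code:java}
import org.apache.hadoop.mapred.Counters;

// Placeholder class; in the patch this logic lives in the Hadoop 2.x shim layer.
public final class CounterShimSketch {

  /**
   * Returns the value of the given counter, or 0 if the counter does not exist.
   *
   * Looking up a counter that was never created forces Hadoop to create it,
   * which can exceed the configured counter limit and raise
   * CountersExceededException. A missing counter simply means nothing was
   * counted, so that case is reported as 0 instead of propagating the exception.
   */
  public static long getCounterValue(Counters counters, Enum<?> key) {
    try {
      return counters.getCounter(key);
    } catch (Counters.CountersExceededException e) {
      return 0L;
    }
  }
}
{code}

computeWarningAggregate() would then simply call this helper and add the returned long into its aggregate, without ever referencing the exception type that Hadoop 1.0 doesn't have.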

Jarcec

Links:
1: https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Counters.java#L516
2: https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java#L163
                
> Pig client should handle CountersExceededException
> --------------------------------------------------
>
>                 Key: PIG-3002
>                 URL: https://issues.apache.org/jira/browse/PIG-3002
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Bill Graham
>            Assignee: Jarek Jarcec Cecho
>              Labels: newbie, simple
>         Attachments: PIG-3002.patch
>
>
> Running a pig job that uses more than 120 counters will succeed, but a grunt 
> exception will occur when trying to output counter info to the console. This 
> exception should be caught and handled with friendly messaging:
> {noformat}
> org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
>         at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
>         at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
>         at org.apache.pig.PigServer.execute(PigServer.java:1239)
>         at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
>         at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>         at org.apache.pig.Main.run(Main.java:604)
>         at org.apache.pig.Main.main(Main.java:154)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120
>         at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312)
>         at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431)
>         at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442)
>         at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
> {noformat}
