[ https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552261#comment-13552261 ]
Jarek Jarcec Cecho commented on PIG-3002:
-----------------------------------------

Hi Bill,

again, thank you for your comment, I appreciate your input. Please don't get me wrong: I'll be more than happy to change my patch to include your suggestions. However, I would like to make sure that we're on the same page first.

I've dug into the Hadoop source code. The call getCounter(Enum) is defined in [1] and is simply forwarded to findCounter(Enum), defined in AbstractCounters in [2]. That method first checks whether the counter already exists in the "cache" object; if so, the cached object is returned without any exception being raised. Only if the object is not present does it call findCounter(String, String), which will eventually create the counter and throw CountersExceededException if the limit is exceeded.

Let's assume, for example, that we have the following counters with the maximum counter limit set to 2:

* A => 1
* B => 2

If we now call getCounter(Enum) with values A, B and C, we will get:

* A => 1
* B => 2
* C => CountersExceededException, because C is not in the "cache" map, so the code tries to create a new counter and fails on the configured limit.

With this example I'm trying to explain that getCounter(Enum) won't throw CountersExceededException for every call after the Counters object gets full, only for counters that do not yet exist. I'm also trying to show that computeWarningAggregate by itself won't affect the generated aggregates. There is a small sketch of this behaviour below the links.

You're definitely right that the newly introduced method getCounterValue() is driven by a single caller and that this is not the best way to do it. I tried to put the code directly inside computeWarningAggregate(), but that failed with a compilation error because CountersExceededException is not available in Hadoop 1.0, so I moved the code into the shim layer. I've defined getCounterValue() to return a long rather than a Counter object in order to narrow its purpose to just getting the long value of a counter, not the counter object itself. I believe that swallowing the exception is reasonable in this case, because the counter does not exist and thus its value is 0; a rough sketch of what I mean is also below. Maybe providing descriptive javadoc for getCounterValue() would help here?

Jarcec

Links:
1: https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Counters.java#L516
2: https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java#L163
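To make the A => 1 / B => 2 example above more concrete, here is a minimal, self-contained toy model of the lookup-then-create behaviour I described. It is deliberately not the real org.apache.hadoop.mapred.Counters class; the class name, exception name and limit handling are made up just to illustrate why only not-yet-existing counters fail:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the counter "cache" plus limit check described above.
public class CounterLimitSketch {

    // Illustrative stand-in for CountersExceededException.
    static class LimitExceededException extends RuntimeException {
        LimitExceededException(String msg) { super(msg); }
    }

    private final Map<String, Long> cache = new HashMap<String, Long>();
    private final int limit;

    CounterLimitSketch(int limit) { this.limit = limit; }

    // Mirrors the findCounter() flow: return the cached counter if it exists,
    // otherwise create it, failing when the configured limit is already reached.
    long getCounter(String name) {
        Long existing = cache.get(name);
        if (existing != null) {
            return existing;              // already in the "cache": no exception
        }
        if (cache.size() >= limit) {
            throw new LimitExceededException("Exceeded limits on number of counters"
                    + " - Counters=" + cache.size() + " Limit=" + limit);
        }
        cache.put(name, 0L);              // created while still below the limit
        return 0L;
    }

    public static void main(String[] args) {
        CounterLimitSketch counters = new CounterLimitSketch(2);
        counters.getCounter("A");         // created
        counters.getCounter("B");         // created, limit now reached
        counters.getCounter("A");         // fine, A already exists in the cache
        try {
            counters.getCounter("C");     // C would be a third counter ...
        } catch (LimitExceededException e) {
            System.out.println("C fails: " + e.getMessage());   // ... so only this call fails
        }
    }
}
{code}

Only the very last call fails, which is the point I was trying to make: once A and B exist, reading them keeps working even though the limit has been reached.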
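And this is roughly the idea behind getCounterValue() in the shim layer. It is only a sketch written against a Hadoop version where Counters.CountersExceededException is available; the wrapper class name is mine and the actual code in PIG-3002.patch may read slightly differently:

{code:java}
import org.apache.hadoop.mapred.Counters;

// Sketch of the shim-layer helper; not a verbatim copy of PIG-3002.patch.
public final class CounterShimSketch {

    /**
     * Returns the value of the given counter, or 0 when the counter was never
     * created because the configured counter limit had already been reached.
     */
    public static long getCounterValue(Counters counters, Enum<?> key) {
        try {
            return counters.getCounter(key);
        } catch (Counters.CountersExceededException e) {
            // The counter does not exist (creating it would exceed the limit),
            // so for the purpose of warning aggregation its value is 0.
            return 0L;
        }
    }

    private CounterShimSketch() {
    }
}
{code}

In the shim implementation for the Hadoop version that does not have CountersExceededException, the method would presumably just return counters.getCounter(key) directly, which is exactly why the try/catch cannot live in computeWarningAggregate() itself.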
> Pig client should handle CountersExceededException
> --------------------------------------------------
>
>                 Key: PIG-3002
>                 URL: https://issues.apache.org/jira/browse/PIG-3002
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Bill Graham
>            Assignee: Jarek Jarcec Cecho
>              Labels: newbie, simple
>         Attachments: PIG-3002.patch
>
>
> Running a pig job that uses more than 120 counters will succeed, but a grunt
> exception will occur when trying to output counter info to the console. This
> exception should be caught and handled with friendly messaging:
> {noformat}
> org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
> at org.apache.pig.PigServer.execute(PigServer.java:1239)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:604)
> at org.apache.pig.Main.main(Main.java:154)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120
> at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312)
> at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431)
> at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442)
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira