[ https://issues.apache.org/jira/browse/OOZIE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Thistlethwaite deleted OOZIE-3603:
----------------------------------------

> Oozie Launcher & Map-Reduce Action Complete Successfully However Oozie Still Fails the Action
> ---------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-3603
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3603
>             Project: Oozie
>          Issue Type: Bug
>         Environment: Oozie Version 5.1.0-CDH6.3.1
>            Reporter: houman babai
>            Priority: Major
>
> I am using oozie 5.1.0-cdh6.3.1.
> In my workflow I have a mapreduce action, which generates over 300 counters.
> The oozie launcher and the mapreduce job complete successfully; however, oozie reports that:
> {code}
> Error Code: LimitExceededException
> LimitExceededException: Too many counters: 121 max=120
> {code}
> I have updated mapred-site.xml.
> The log for the mapreduce job reports success; in fact, I can see all the counters and the actual output of the mapreduce job on hdfs.
> In the oozie launcher log I can see:
> * mapreduce.job.counters.max : 8192
> * mapreduce.job.counters.groups.max : 100
> Furthermore, the oozie launcher log ends with:
> {code:java}
> --------------------
> Submitting Oozie action Map-Reduce job
> =======================
> <<< Invocation of Main class completed <<<
> Oozie Launcher, propagating new Hadoop job id to Oozie
> =======================
> job_1594765755382_0035
> =======================
> Oozie Launcher, uploading action data to HDFS sequence file: hdfs://HDFS/user/MY-ID/oozie-oozi/0000012-200714223028181-oozie-oozi-W/ACTION-NAME--map-reduce/action-data.seq
> Stopping AM
> Callback notification attempts left 0
> Callback notification trying http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING
> Callback notification to http://OOZIE-URL:11000/oozie/callback?id=0000012-200714223028181-oozie-oozi-W@ACTION-NAME&status=RUNNING succeeded
> Callback notification succeeded
> {code}
> I dug out the following from the oozie logs:
> {code}
> 2020-07-15 17:57:02,253 TRACE org.apache.oozie.command.wf.ActionEndXCommand: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Precondition check for command [action.end] key [0000012-200714223028181-oozie-oozi-W]
> 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Execute command [action.end] key [0000012-200714223028181-oozie-oozi-W]
> 2020-07-15 17:57:02,253 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] STARTED ActionEndXCommand for action 0000012-200714223028181-oozie-oozi-W@ACTION-NAME
> 2020-07-15 17:57:02,259 DEBUG org.apache.oozie.command.wf.ActionEndXCommand: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] End, name [ACTION-NAME] type [map-reduce] status[DONE] external status [SUCCEEDED] signal value [null]
> 2020-07-15 17:57:02,260 INFO org.apache.oozie.action.hadoop.MapReduceActionExecutor: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Action ended with external status [SUCCEEDED]
> 2020-07-15 17:57:02,260 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
> 2020-07-15 17:57:02,261 DEBUG org.apache.oozie.service.HadoopAccessorService: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Checking if filesystem hdfs is supported
> 2020-07-15 17:57:02,340 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Error ending action [ACTION-NAME]. ErrorType [ERROR], ErrorCode [LimitExceededException], Message [LimitExceededException: Too many counters: 121 max=120]
> 2020-07-15 17:57:02,341 WARN org.apache.oozie.command.wf.ActionEndXCommand: SERVER[SERVER-NAME] USER[MY-NAME] GROUP[-] TOKEN[] APP[APP-NAME] JOB[0000012-200714223028181-oozie-oozi-W] ACTION[0000012-200714223028181-oozie-oozi-W@ACTION-NAME] Setting Action Status to [ERROR]
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)