[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490348#comment-17490348 ]

Raman Chodźka commented on MAPREDUCE-4950:
------------------------------------------

I am also experiencing this issue. The culprit seems to be an exception that happens earlier. In my case, an exception is thrown inside eventHandlingThread in org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler:
{code}
2022-02-10 12:21:58,913 ERROR [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error writing History Event: org.apache.hadoop.mapreduce.jobhistory.MapAttemptFinishedEvent@5da2cfca
java.io.IOException: All datanodes [DatanodeInfoWithStorage[195.201.110.185:50010,DS-fe52ee42-b47a-4ad1-8d4c-8400d6c95b18,DISK]] are bad. Aborting...
    at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1537)
    at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1472)
    at org.apache.hadoop.hdfs.DataStreamer.processDatanodeError(DataStreamer.java:1244)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:663)
{code}
The exception is thrown in eventHandlingThread while writing to EventWriter, which passes the event to its DatumWriter. The DatumWriter is used together with a JsonEncoder, and JsonEncoder uses a Parser for some kind of validation during serialization. Apparently the IOException above leaves the Parser in an invalid state (eventHandlingThread also probably finishes execution at that point). Finally, when all tasks are complete, JobHistoryEventHandler tries to write an event via EventWriter in serviceStop(), which results in:
{code}
2022-02-10 12:21:58,994 WARN [Thread-71] org.apache.hadoop.service.CompositeService: When stopping the service JobHistoryEventHandler : org.apache.avro.AvroTypeException: Attempt to process a enum when a item-end was expected.
org.apache.avro.AvroTypeException: Attempt to process a enum when a item-end was expected.
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:93)
    at org.apache.avro.io.JsonEncoder.writeEnum(JsonEncoder.java:234)
    at org.apache.avro.specific.SpecificDatumWriter.writeEnum(SpecificDatumWriter.java:59)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:67)
    at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
    at org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:95)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:1607)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:645)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:443)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:222)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:104)
    at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
    at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1855)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:222)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1293)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:653)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:732)
{code}
In my case I increased the replication factor from 1 to 2 (it was that small because those datanodes belong to a QA environment), which made the "IOException: All datanodes .., are bad." error less likely.

One might also try setting {{mapreduce.jobhistory.jhist.format}} to {{binary}}, since BinaryEncoder doesn't seem to perform validation during serialization, but I haven't checked whether that works. Even if it does, an exception thrown while writing an event to HDFS might leave the event partially written, potentially leaving the events file in a corrupt state.

> MR App Master fails to write the history due to AvroTypeException
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-4950
> URL:
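The failure sequence described in the comment above (an I/O error in the middle of an event, followed by a type error on the next write) can be illustrated with a toy model. This is only a sketch under stated assumptions: `StickyValidatorDemo`, `FlakySink`, `ValidatingEncoder` and the two-symbol grammar are hypothetical names invented for illustration, not Avro or Hadoop code, and the event names are placeholders. The error wording merely imitates the messages in the logs. The point it shows is that an encoder which advances its validation state before the write reaches the sink stays "mid-record" when the sink throws, so the next event is rejected even though the sink has recovered.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class StickyValidatorDemo {

    /** Symbols of a tiny hypothetical grammar: every event is an ENUM followed by a STRING. */
    enum Symbol { ENUM, STRING }

    /** Hypothetical sink that can be told to fail once, standing in for a failing output stream. */
    static class FlakySink {
        boolean failNext = false;
        final StringBuilder out = new StringBuilder();

        void write(String s) throws IOException {
            if (failNext) {
                failNext = false;
                throw new IOException("All datanodes are bad (simulated)");
            }
            out.append(s);
        }
    }

    /** Toy encoder that checks each write against the symbol it expects next. */
    static class ValidatingEncoder {
        private final FlakySink sink;
        private Symbol expected = Symbol.ENUM; // position inside the current event

        ValidatingEncoder(FlakySink sink) {
            this.sink = sink;
        }

        private void advance(Symbol actual, Symbol next) {
            if (actual != expected) {
                throw new IllegalStateException("Attempt to process a "
                        + actual.name().toLowerCase(Locale.ROOT) + " when a "
                        + expected.name().toLowerCase(Locale.ROOT) + " was expected.");
            }
            expected = next;
        }

        void writeEnum(String name) throws IOException {
            advance(Symbol.ENUM, Symbol.STRING); // state moves before the I/O happens
            sink.write(name + ":");
        }

        void writeString(String value) throws IOException {
            advance(Symbol.STRING, Symbol.ENUM);
            sink.write(value + "\n");
        }

        void writeEvent(String type, String payload) throws IOException {
            writeEnum(type);
            writeString(payload);
        }
    }

    /** Runs the scenario and returns a description of each exception seen, in order. */
    public static List<String> run() {
        FlakySink sink = new FlakySink();
        ValidatingEncoder encoder = new ValidatingEncoder(sink);
        List<String> errors = new ArrayList<>();

        sink.failNext = true; // first write hits the simulated I/O failure
        try {
            encoder.writeEvent("MAP_ATTEMPT_FINISHED", "attempt_1");
        } catch (IOException e) {
            errors.add("IOException: " + e.getMessage());
        }

        // The sink is healthy again, but the encoder still expects a STRING.
        try {
            encoder.writeEvent("JOB_FINISHED", "job_1");
        } catch (IOException | IllegalStateException e) {
            errors.add(e.getClass().getSimpleName() + ": " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model the second failure has nothing to do with the second event itself; it is a leftover of the first, which is consistent with the observation that making the earlier IOException less likely also made the later exception less likely.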
[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077088#comment-16077088 ]

Raul Gutierrez Segales commented on MAPREDUCE-4950:
---------------------------------------------------

Seeing this again with 2.7.1; trying to confirm whether {{mapreduce.map.speculative=false}} makes it go away.

> MR App Master fails to write the history due to AvroTypeException
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-4950
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4950
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver, mr-am
> Reporter: Devaraj K
> Priority: Critical
>
> {code:xml}
> 2013-01-19 19:31:27,269 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, writing event MAP_ATTEMPT_STARTED
> 2013-01-19 19:31:27,269 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.service.CompositeService: Error stopping JobHistoryEventHandler
> org.apache.avro.AvroTypeException: Attempt to process a enum when a array-start was expected.
>     at org.apache.avro.io.parsing.Parser.advance(Parser.java:93)
>     at org.apache.avro.io.JsonEncoder.writeEnum(JsonEncoder.java:210)
>     at org.apache.avro.specific.SpecificDatumWriter.writeEnum(SpecificDatumWriter.java:54)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
>     at org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:66)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:825)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:517)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.stop(JobHistoryEventHandler.java:346)
>     at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
>     at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:445)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler.handle(MRAppMaster.java:406)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
>     at java.lang.Thread.run(Thread.java:662)
> 2013-01-19 19:31:27,271 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://hacluster /root/staging-dir/root/.staging/job_1358603069474_0135
> {code}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338740#comment-15338740 ]

Raul Gutierrez Segales commented on MAPREDUCE-4950:
---------------------------------------------------

[~fengshen], [~banasc], [~devaraj.k]: did you ever get to the root cause? I am seeing this too, and I suspect it has something to do with speculative maps on big jobs. Any luck with disabling speculative execution?
[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188349#comment-14188349 ]

Christopher Banas commented on MAPREDUCE-4950:
----------------------------------------------

We have recently started seeing this message on our local cluster as well.
[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041574#comment-14041574 ]

caolong commented on MAPREDUCE-4950:
------------------------------------

me too
[jira] [Commented] (MAPREDUCE-4950) MR App Master fails to write the history due to AvroTypeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910215#comment-13910215 ]

Rohith commented on MAPREDUCE-4950:
-----------------------------------

In my Hadoop-2.3 cluster I also encountered a similar exception: writing the JobHistory events failed. I am not sure what the cause is. :-(
{noformat}
2014-02-21 22:10:33,841 INFO [Thread-355] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: org.apache.avro.AvroTypeException: Attempt to process a enum when a string was expected.
org.apache.avro.AvroTypeException: Attempt to process a enum when a string was expected.
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:93)
    at org.apache.avro.io.JsonEncoder.writeEnum(JsonEncoder.java:217)
    at org.apache.avro.specific.SpecificDatumWriter.writeEnum(SpecificDatumWriter.java:54)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:67)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
    at org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:66)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:870)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:517)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:332)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
    at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:159)
    at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1386)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:550)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:602)
{noformat}