[ https://issues.apache.org/jira/browse/MAPREDUCE-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032541#comment-15032541 ]
Jason Lowe commented on MAPREDUCE-6554:
---------------------------------------
Thanks for the report and patch, Bibin! However, I'm not sure this is the
appropriate fix. EventReader is in mapreduce core, and its use isn't
restricted to the AM reading a prior attempt's history file. Rumen also uses
this class, for example.

I'm wondering if the generalized exception handling should be restricted to
just the use case in MRAppMaster, where it tries to read the prior history
and AMInfos. That way we target the use case we care about and don't affect
the others.
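
To illustrate the idea, here is a minimal, self-contained sketch rather than the
actual patch. The names RecoverySketch, readSchemaLine and tryRecover are
hypothetical stand-ins (they are not Hadoop APIs), and BufferedReader stands in
for the history input stream. The shared reader stays strict, and only the
recovery caller treats an unreadable prior history as "nothing to recover".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Objects;

public class RecoverySketch {

    // Hypothetical stand-in for the shared reader: it stays strict and fails
    // loudly, so other callers (trace tools, for example) still see bad input.
    static String readSchemaLine(BufferedReader in) throws IOException {
        in.readLine();                  // first line: the version string
        String schema = in.readLine();  // second line: the schema, null at end of stream
        // Mimics the NPE from the report when the schema line is missing.
        return Objects.requireNonNull(schema, "schema line missing");
    }

    // Hypothetical stand-in for the recovery path: the broad catch lives here,
    // at the one call site that can safely skip an unreadable history.
    static boolean tryRecover(String historyContents) {
        try (BufferedReader in = new BufferedReader(new StringReader(historyContents))) {
            readSchemaLine(in);
            return true;
        } catch (IOException | RuntimeException e) {
            System.err.println("Unable to parse prior job history, skipping recovery: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRecover("Avro-Json\n{\"type\":\"record\"}\n"));
        System.out.println(tryRecover(""));
    }
}
```

With this shape, an empty or truncated history only makes the recovery attempt
report failure, while any other user of the reader keeps the original
exception.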
> MRAppMaster servicestart failing with NPE in
> MRAppMaster#parsePreviousJobHistory
> ---------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6554
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6554
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Priority: Critical
> Attachments: 0001-MAPREDUCE-6554.patch
>
>
> Create a scenario in which the MR app master gets preempted.
> On the next launch, MRAppMaster tries to recover the previous job history
> file in {{MRAppMaster#parsePreviousJobHistory}} and fails:
> {noformat}
> 2015-11-21 13:52:27,722 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED; cause: java.lang.NullPointerException
> java.lang.NullPointerException
> at java.io.StringReader.<init>(StringReader.java:50)
> at org.apache.avro.Schema$Parser.parse(Schema.java:917)
> at org.apache.avro.Schema.parse(Schema.java:966)
> at org.apache.hadoop.mapreduce.jobhistory.EventReader.<init>(EventReader.java:75)
> at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:139)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.parsePreviousJobHistory(MRAppMaster.java:1256)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.processRecovery(MRAppMaster.java:1225)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1087)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1566)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1499)
> 2015-11-21 13:52:27,725 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 0
> {noformat}
> EventReader(DataInputStream in)
> {noformat}
> this.version = in.readLine();
> ...
> Schema myschema = new SpecificData(Event.class.getClassLoader()).getSchema(Event.class);
> this.schema = Schema.parse(in.readLine());
> {noformat}
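
The stack trace above is consistent with the second readLine() call returning
null, which happens when the stream ends before a schema line is present, and
that null then reaching java.io.StringReader. The sketch below reproduces only
that mechanism with JDK classes. NullSchemaLineSketch and its methods are
hypothetical names, and BufferedReader is an assumed stand-in for the stream
that EventReader actually reads.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class NullSchemaLineSketch {

    // Returns the second line of the input, or null if the input ends first.
    static String secondLine(String contents) throws IOException {
        try (BufferedReader in = new BufferedReader(new StringReader(contents))) {
            in.readLine();         // first line: the version
            return in.readLine();  // second line: the schema, null at end of stream
        }
    }

    // Returns true if handing the line to StringReader throws NullPointerException.
    static boolean failsWithNpe(String line) {
        try {
            new StringReader(line).close();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        String schemaLine = secondLine("");  // an empty history file
        System.out.println("schema line: " + schemaLine);
        System.out.println("NPE on parse: " + failsWithNpe(schemaLine));
    }
}
```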