[
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000336#comment-14000336
]
Jason Lowe commented on MAPREDUCE-5309:
---------------------------------------
Thanks for the patch, Rushabh! Some comments:
- Findbugs warning needs to be addressed.
- To match existing parsing behavior for missing fields, strings should default
to null rather than an empty string. The difference is subtle and could cause
hard-to-find bugs, so we should match existing behavior here. I think that
means we need to have them be of type ["null", "string"] and "default": null.
- If the task info or attempt info is null we should log a warning since
something is wrong with the history file.
- In the test cases we don't need to catch the IOException and translate it to
a YarnException; we can just let the IOException bubble up and fail the test
directly.
- The test of the three separate jhist files should ideally be three separate
unit tests. There's basically no common setup, and that way when a test fails
the test name immediately conveys which version of the history is failing to
parse.
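
To illustrate the null versus empty-string point above, here is a minimal
standalone sketch. It does not use Avro or the real JobHistoryParser: the class
MissingFieldDefaults, the helpers readField and describe, and the field names
are hypothetical, and the map only stands in for a decoded history record
written by an older version that lacks the field.

```java
import java.util.HashMap;
import java.util.Map;

public class MissingFieldDefaults {

    // Hypothetical stand-in for reading a string field from a decoded record.
    // A field that is absent from the record yields the supplied default.
    public static String readField(Map<String, String> record, String name, String dflt) {
        return record.containsKey(name) ? record.get(name) : dflt;
    }

    // Caller code written against the existing behavior: null means
    // "this field was never recorded".
    public static String describe(String value) {
        return value == null ? "<not recorded>" : "'" + value + "'";
    }

    public static void main(String[] args) {
        // Simulates a record from an older writer: no "error" field present.
        Map<String, String> oldRecord = new HashMap<String, String>();
        oldRecord.put("taskid", "task_1");

        String withNullDefault = readField(oldRecord, "error", null);
        String withEmptyDefault = readField(oldRecord, "error", "");

        System.out.println(describe(withNullDefault));  // prints <not recorded>
        System.out.println(describe(withEmptyDefault)); // prints ''
    }
}
```

With a null default the caller still sees the field as missing; with an empty
string default the same caller reports an empty value, which is the kind of
behavior change the comment above is trying to avoid.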
> 2.0.4 JobHistoryParser can't parse certain failed job history files generated
> by 2.0.3 history server
> -----------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver, mrv2
> Affects Versions: 2.0.4-alpha
> Reporter: Vrushali C
> Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-5309-v2.patch, MAPREDUCE-5309-v3.patch,
> MAPREDUCE-5309.patch, Test20JobHistoryParsing.java, job_2_0_3-KILLED.jhist
>
>
> When the 2.0.4 JobHistoryParser tries to parse a job history file generated
> by hadoop 2.0.3, the JobHistoryParser throws an error:
> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array
> cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
> at
> org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
> at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
> at
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at
> org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
> at
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
> at
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
> at
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
> at
> com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
> at
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
> at
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> at
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> at
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> at
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> at
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Test code and the job history file are attached.
> Test code:
> package com.twitter.somepackage;
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
> import org.junit.Test;
> import org.apache.hadoop.yarn.YarnException;
> public class Test20JobHistoryParsing {
>
> @Test
> public void testFileAvro() throws IOException
> {
> Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
> JobHistoryParser parser2 = new JobHistoryParser(FileSystem.getLocal(new
> Configuration()), local_path2);
> try {
> JobInfo ji2 = parser2.parse();
> System.out.println(" job info: " + ji2.getJobname() + " "
> + ji2.getFinishedMaps() + " "
> + ji2.getTotalMaps() + " "
> + ji2.getJobId() ) ;
> }
> catch (IOException e) {
> throw new YarnException("Could not load history file "
> + local_path2.getName(), e);
> }
> }
> }
> This seems to stem from the fix in
> https://issues.apache.org/jira/browse/MAPREDUCE-4693
> that added counters to the historyserver for failed tasks.
> This breaks backward compatibility with JobHistoryServer.
--
This message was sent by Atlassian JIRA
(v6.2#6252)