[jira] [Updated] (MAPREDUCE-7076) TestNNBench#testNNBenchCreateReadAndDelete failing in our internal build

2018-04-10 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-7076:
--
Description: 
TestNNBench#testNNBenchCreateReadAndDelete failed a couple of times in our 
internal Jenkins build.
{noformat}
java.lang.AssertionError: create_write should create the file
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.TestNNBench.testNNBenchCreateReadAndDelete(TestNNBench.java:55)
{noformat}
Below is my analysis of why it didn't create the file.
{code:java|title=NNBench.java|borderStyle=solid}
// Some comments here
  public void map(Text key, 
LongWritable value,
OutputCollector output,
Reporter reporter) throws IOException {
  if (barrier()) {
String fileName = "file_" + value;
if (op.equals(OP_CREATE_WRITE)) {
  startTimeTPmS = System.currentTimeMillis();
  doCreateWriteOp(fileName, reporter);
} ...
  } else {
output.collect(new Text("l:latemaps"), new Text("1"));
  }
  // Below are the relevant parts of barrier() method
  private boolean barrier() {
..
// If the sleep time is greater than 0, then sleep and return
...
LOG.info("Waiting in barrier for: " + sleepTime + " ms");
return retVal;
  }
  // Below are the relevant parts of the doCreateWriteOp
  private void doCreateWriteOp(String name,
 Reporter reporter) {
FSDataOutputStream out;
byte[] buffer = new byte[bytesToWrite];  
for (long l = 0l; l < numberOfFiles; l++) {
  Path filePath = new Path(new Path(baseDir, dataDirName), 
  name + "_" + l);
}
  
  }
{code}
The file {{BASE_DIR/data/file_0_0}} is created only if the map task starts 
before the time specified by {{startTime}}.
 Refer to the code chunk pasted above:
 {{map(..)}} --> {{barrier()}}, and *only if* {{barrier()}} evaluates to true 
does it call {{doCreateWriteOp}}, which eventually creates the file.
 In the test case, the delay is 3 seconds, per {{"-startTime", "" + 
(Time.now() / 1000 + 3)}}.
 In this failing test case, I can see the task starting at least 6 seconds 
after the test case started.
{noformat}
2017-01-27 03:11:15,387 INFO  [Thread-4] mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(289)) - Submitting tokens for job: 
job_local1711545156_0001
2017-01-27 03:11:23,405 INFO  [Thread-4] mapreduce.Job (Job.java:submit(1345)) 
- The url to track the job: http://localhost:8080/
{noformat}
Also, when I run this test on my laptop, I see the following line printed.
{noformat}
2017-01-27 17:09:27,982 INFO  [LocalJobRunner Map Task Executor #0] 
hdfs.NNBench (NNBench.java:barrier(676)) - Waiting in barrier for: 1018 ms
{noformat}
This line is printed only by the {{barrier()}} method, and I don't see it in 
the logs of the failed test.

 
 In our environment, the Jenkins server was very slow and took more than 6 
seconds to launch a map task.
 The correct fix, in my opinion, is for {{barrier()}} to return true when no 
sleep is needed; it should return false only when an exception occurs.
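For illustration, a minimal sketch of that suggestion (the field names and the 
millisecond handling here are my assumptions, not the actual NNBench code):
{code:java}
// Sketch: default to true so that a map task which starts after
// startTime (no sleep needed) still performs its operation; return
// false only when the wait itself fails.
private boolean barrier() {
  long sleepTime = startTime - System.currentTimeMillis(); // assumes ms
  boolean retVal = true;
  if (sleepTime > 0) {
    LOG.info("Waiting in barrier for: " + sleepTime + " ms");
    try {
      Thread.sleep(sleepTime);
    } catch (InterruptedException e) {
      retVal = false;
    }
  }
  return retVal;
}
{code}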

  was:
TestNNBench#testNNBenchCreateReadAndDelete failed couple of times in our 
internal jenkins build.
{noformat}
java.lang.AssertionError: create_write should create the file
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.TestNNBench.testNNBenchCreateReadAndDelete(TestNNBench.java:55)
{noformat}

Below is my analysis for why it didn't create the file.
{code:title=NNBench.java|borderStyle=solid}
// Some comments here
  public void map(Text key, 
LongWritable value,
OutputCollector output,
Reporter reporter) throws IOException {
  if (barrier()) {
String fileName = "file_" + value;
if (op.equals(OP_CREATE_WRITE)) {
  startTimeTPmS = System.currentTimeMillis();
  doCreateWriteOp(fileName, reporter);
} ...
  } else {
output.collect(new Text("l:latemaps"), new Text("1"));
  }
  // Below are the relevant parts of barrier() method
  private boolean barrier() {
..
// If the sleep time is greater than 0, then sleep and return
...
LOG.info("Waiting in barrier for: " + sleepTime + " ms");
return retVal;
  }
  // Below are the relevant parts of the doCreateWriteOp
  private void doCreateWriteOp(String name,
 Reporter reporter) {
FSDataOutputStream out;
byte[] buffer = new byte[bytesToWrite];  
for (long l = 0l; l < numberOfFiles; l++) {
  Path filePath = new Path(new Path(baseDir, dataDirName), 
  name + "_" + l);
}
  
  }
{code}   

This file {{BASE_DIR/data/file_0_0}} is getting created 

[jira] [Updated] (MAPREDUCE-7076) TestNNBench#testNNBenchCreateReadAndDelete failing in our internal build

2018-04-10 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-7076:
--
Labels: newbie  (was: )

> TestNNBench#testNNBenchCreateReadAndDelete failing in our internal build
> 
>
> Key: MAPREDUCE-7076
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7076
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Rushabh S Shah
>Priority: Minor
>  Labels: newbie






[jira] [Created] (MAPREDUCE-7076) TestNNBench#testNNBenchCreateReadAndDelete failing in our internal build

2018-04-10 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created MAPREDUCE-7076:
-

 Summary: TestNNBench#testNNBenchCreateReadAndDelete failing in our 
internal build
 Key: MAPREDUCE-7076
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7076
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.8.0
Reporter: Rushabh S Shah


TestNNBench#testNNBenchCreateReadAndDelete failed a couple of times in our 
internal Jenkins build.
{noformat}
java.lang.AssertionError: create_write should create the file
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.TestNNBench.testNNBenchCreateReadAndDelete(TestNNBench.java:55)
{noformat}

Below is my analysis of why it didn't create the file.
{code:title=NNBench.java|borderStyle=solid}
// Some comments here
  public void map(Text key, 
LongWritable value,
OutputCollector output,
Reporter reporter) throws IOException {
  if (barrier()) {
String fileName = "file_" + value;
if (op.equals(OP_CREATE_WRITE)) {
  startTimeTPmS = System.currentTimeMillis();
  doCreateWriteOp(fileName, reporter);
} ...
  } else {
output.collect(new Text("l:latemaps"), new Text("1"));
  }
  // Below are the relevant parts of barrier() method
  private boolean barrier() {
..
// If the sleep time is greater than 0, then sleep and return
...
LOG.info("Waiting in barrier for: " + sleepTime + " ms");
return retVal;
  }
  // Below are the relevant parts of the doCreateWriteOp
  private void doCreateWriteOp(String name,
 Reporter reporter) {
FSDataOutputStream out;
byte[] buffer = new byte[bytesToWrite];  
for (long l = 0l; l < numberOfFiles; l++) {
  Path filePath = new Path(new Path(baseDir, dataDirName), 
  name + "_" + l);
}
  
  }
{code}   

The file {{BASE_DIR/data/file_0_0}} is created only if the map task starts 
before the time specified by {{startTime}}.
Refer to the code chunk pasted above:
{{map(..)}} --> {{barrier()}}, and *only if* {{barrier()}} evaluates to true 
does it call {{doCreateWriteOp}}, which eventually creates the file.
In the test case, the delay is 3 seconds, per {{"-startTime", "" + 
(Time.now() / 1000 + 3)}}.
In this failing test case, I can see the task starting at least 6 seconds 
after the test case started.
{noformat}
2017-01-27 03:11:15,387 INFO  [Thread-4] mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(289)) - Submitting tokens for job: 
job_local1711545156_0001
2017-01-27 03:11:23,405 INFO  [Thread-4] mapreduce.Job (Job.java:submit(1345)) 
- The url to track the job: http://localhost:8080/
{noformat}

Also, when I run this test on my laptop, I see the following line printed.
{noformat}
2017-01-27 17:09:27,982 INFO  [LocalJobRunner Map Task Executor #0] 
hdfs.NNBench (NNBench.java:barrier(676)) - Waiting in barrier for: 1018 ms
{noformat}
This line is printed only by the {{barrier()}} method, and I don't see it in 
the logs of the failed test.
In our environment, the Jenkins server was very slow and took more than 6 
seconds to launch a map task.
The correct fix, in my opinion, is for {{barrier()}} to return true when no 
sleep is needed; it should return false only when an exception occurs.







[jira] [Commented] (MAPREDUCE-7059) Compatibility issue: job submission fails with RpcNoSuchMethodException when submitting to 2.x cluster

2018-02-26 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376952#comment-16376952
 ] 

Rushabh S Shah commented on MAPREDUCE-7059:
---

In the future, we could add support for all servers to report the version they 
are currently running, and then make a decision based on that.
For example, we could add support in FsServerDefaults to return the version 
number the server is running with.
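For illustration only, a rough sketch of that idea 
({{FsServerDefaults#getVersion()}} is hypothetical and does not exist today; it 
would first have to be added on the server side):
{code:java}
FsServerDefaults defaults = fs.getServerDefaults(path);
// Hypothetical API: the server reports the version it is running.
String serverVersion = defaults.getVersion();
if (supportsErasureCoding(serverVersion)) { // hypothetical helper
  dfs.setErasureCodingPolicy(path,
      SystemErasureCodingPolicies.getReplicationPolicy().getName());
}
{code}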

> Compatibility issue: job submission fails with RpcNoSuchMethodException when 
> submitting to 2.x cluster
> --
>
> Key: MAPREDUCE-7059
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7059
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 3.0.0
>Reporter: Jiandan Yang 
>Priority: Minor
>
> Running teragen with a hadoop-3.1 client failed when the HDFS server is 2.8.
> {code:java}
> bin/hadoop jar 
> share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0-SNAPSHOT.jar  teragen  
> 10 /teragen
> {code}
> The reason for the failure is that 2.8 HDFS does not have 
> setErasureCodingPolicy.
> One solution is to parse the RemoteException in 
> JobResourceUploader#disableErasureCodingForPath like this:
> {code:java}
> private void disableErasureCodingForPath(FileSystem fs, Path path)
>   throws IOException {
> try {
>   if (jtFs instanceof DistributedFileSystem) {
> LOG.info("Disabling Erasure Coding for path: " + path);
> DistributedFileSystem dfs = (DistributedFileSystem) jtFs;
> dfs.setErasureCodingPolicy(path,
> SystemErasureCodingPolicies.getReplicationPolicy().getName());
>   }
> } catch (RemoteException e) {
>   if (!(e.getCause() instanceof RpcNoSuchMethodException)) {
> throw e;
>   }
> }
>   }
> {code}
> Does anyone have a better solution?
> The detailed exception trace is:
> {code:java}
> 2018-02-26 11:22:53,178 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1518615699369_0006
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcNoSuchMethodException):
>  Unknown method setErasureCodingPolicy called on 
> org.apache.hadoop.hdfs.protocol.ClientProtocol protocol.
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:436)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:846)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:789)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1804)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2457)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.setErasureCodingPolicy(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setErasureCodingPolicy(ClientNamenodeProtocolTranslatorPB.java:1583)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.setErasureCodingPolicy(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.DFSClient.setErasureCodingPolicy(DFSClient.java:2678)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$63.doCall(DistributedFileSystem.java:2665)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$63.doCall(DistributedFileSystem.java:2662)
>   at 
> 

[jira] [Created] (MAPREDUCE-6996) FileInputFormat#getBlockIndex should include file name in the exception.

2017-11-01 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created MAPREDUCE-6996:
-

 Summary: FileInputFormat#getBlockIndex should include file name in 
the exception.
 Key: MAPREDUCE-6996
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6996
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Rushabh S Shah
Priority: Minor


{code:title=FileInputFormat.java|borderStyle=solid}
// Some comments here
  protected int getBlockIndex(BlockLocation[] blkLocations,
                              long offset) {
    ...
    ...
    BlockLocation last = blkLocations[blkLocations.length - 1];
    long fileLength = last.getOffset() + last.getLength() - 1;
    throw new IllegalArgumentException("Offset " + offset +
        " is outside of file (0.." +
        fileLength + ")");
  }
{code}
When the file is open for writing, {{last.getLength()}} and 
{{last.getOffset()}} will be zero, so {{fileLength}} evaluates to 
0 + 0 - 1 = -1, and we see the following exception stack trace.
{noformat}
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:288)
Caused by: java.lang.IllegalArgumentException: Offset 0 is outside of file 
(0..-1)
at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getBlockIndex(FileInputFormat.java:453)
at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:413)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
... 18 more
{noformat}
It's difficult to tell which file was open, so I am creating this ticket to 
include the file name in the exception.
Since {{FileInputFormat#getBlockIndex}} is protected, we can't change that 
method's signature to add the file name to its arguments.
The only way I can think of to fix this is:
{code:title=FileInputFormat.java|borderStyle=solid}
  public InputSplit[] getSplits(JobConf job, int numSplits)
      throws IOException {
    ...
    ...
    for (FileStatus file: files) {
      Path path = file.getPath();
      long length = file.getLen();
      if (length != 0) {
        FileSystem fs = path.getFileSystem(job);
        BlockLocation[] blkLocations;
        if (file instanceof LocatedFileStatus) {
          blkLocations = ((LocatedFileStatus) file).getBlockLocations();
        } else {
          blkLocations = fs.getFileBlockLocations(file, 0, length);
        }
        if (isSplitable(fs, path)) {
          long blockSize = file.getBlockSize();
          long splitSize = computeSplitSize(goalSize, minSize, blockSize);

          long bytesRemaining = length;
          while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
            String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
                length-bytesRemaining, splitSize, clusterMap);
            splits.add(makeSplit(path, length-bytesRemaining, splitSize,
                splitHosts[0], splitHosts[1]));
            bytesRemaining -= splitSize;
          }

          if (bytesRemaining != 0) {
            String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
                length - bytesRemaining, bytesRemaining, clusterMap);
            splits.add(makeSplit(path, length - bytesRemaining, bytesRemaining,
                splitHosts[0], splitHosts[1]));
          }
        } else {
          String[][] splitHosts =
              getSplitHostsAndCachedHosts(blkLocations, 0, length, clusterMap);
          splits.add(makeSplit(path, 0, length, splitHosts[0], splitHosts[1]));
        }
      } else {
        // Create empty hosts array for zero length files
        splits.add(makeSplit(path, 0, length, new String[0]));
      }
    }
{code}
Wrap the above code chunk in a try-catch block, catch 
{{IllegalArgumentException}}, and check for the message {{Offset 0 is outside 
of file (0..-1)}}.
If it matches, add the file name and rethrow the {{IllegalArgumentException}}.
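For illustration, a minimal sketch of that approach (the exact message check 
and the wording of the rethrown exception are mine, not committed code):
{code:title=FileInputFormat.java|borderStyle=solid}
for (FileStatus file : files) {
  Path path = file.getPath();
  try {
    // ... the per-file split computation shown above ...
  } catch (IllegalArgumentException e) {
    // Decorate only the zero-length-last-block case; rethrow others as-is.
    if ("Offset 0 is outside of file (0..-1)".equals(e.getMessage())) {
      throw new IllegalArgumentException(e.getMessage() + ". File " + path
          + " may still be open for writing.", e);
    }
    throw e;
  }
}
{code}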






[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168504#comment-16168504
 ] 

Rushabh S Shah commented on MAPREDUCE-6958:
---

+1 lgtm (non-binding).

> Shuffle audit logger should log size of shuffle transfer
> 
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch, MAPREDUCE-6958.002.patch
>
>
> The shuffle audit logger currently logs the job ID and reducer ID but nothing 
> about the size of the requested transfer.  It calculates this as part of the 
> HTTP response headers, so it would be trivial to log the response size.  This 
> would be very valuable for debugging network traffic storms from the shuffle 
> handler.






[jira] [Comment Edited] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168013#comment-16168013
 ] 

Rushabh S Shah edited comment on MAPREDUCE-6958 at 9/15/17 3:19 PM:


Overall the patch looks good.
Just one very minor nit.
{noformat}
for (String mapId : mapIds) {
  sb.append(" ");
  sb.append(mapId);
}
{noformat}
Instead of looping over each {{mapId}}, we can just print {{mapIds}} directly; 
it's less code.
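For example, a one-line replacement (assuming {{mapIds}} is a 
{{java.util.List}}, which is my assumption here):
{code:java}
// Relies on List#toString(), so the output format changes slightly,
// e.g. " [attempt_0, attempt_1]" instead of " attempt_0 attempt_1".
sb.append(" ").append(mapIds);
{code}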


was (Author: shahrs87):
Overall the patch looks good.
Just one very minor nit.
{quote}
+for (String mapId : mapIds) {
+  sb.append(" ");
+  sb.append(mapId);
+}
{quote}
Instead of going through {{mapId}}, we can just print {{mapIds}}. It will be 
just less code.

> Shuffle audit logger should log size of shuffle transfer
> 
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch
>






[jira] [Commented] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-15 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168013#comment-16168013
 ] 

Rushabh S Shah commented on MAPREDUCE-6958:
---

Overall the patch looks good.
Just one very minor nit.
{quote}
+for (String mapId : mapIds) {
+  sb.append(" ");
+  sb.append(mapId);
+}
{quote}
Instead of going through {{mapId}}, we can just print {{mapIds}}. It will be 
just less code.

> Shuffle audit logger should log size of shuffle transfer
> 
>
> Key: MAPREDUCE-6958
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Minor
> Attachments: MAPREDUCE-6958.001.patch
>






[jira] [Resolved] (MAPREDUCE-6938) Question

2017-08-15 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah resolved MAPREDUCE-6938.
---
Resolution: Invalid

[~remil] This JIRA board is a bug/improvement/feature tracking system, not a 
place for asking random questions or requesting programs.
Please send an email to {{gene...@hadoop.apache.org}} or 
{{u...@hadoop.apache.org}}, and hopefully someone will reply.

Thanks,
Rushabh Shah.


> Question
> 
>
> Key: MAPREDUCE-6938
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6938
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>Reporter: Remil
>Priority: Minor
>
> I need help with two things:
> 1) a sample Java MapReduce program where multiple parameters are passed 
> from the mapper to the reducer.
> 2) a Java MapReduce program that writes to and reads from files inside the 
> HDFS filesystem, other than the normal input and output files mentioned in 
> the mapper and reducer.






[jira] [Updated] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-04-18 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-6633:
--
Target Version/s: 3.0.0, 2.8.0, 2.7.3

> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6633.patch
>
>
> When a reduce task encounters compression-related errors, the AM doesn't 
> retry the corresponding map task.
> Here is the stack trace from one of the cases we encountered.
> {noformat}
> 2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in fetcher#29
>   at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at 
> com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
>   at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>   at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
>   at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
> {noformat}
> In this case, the node on which the map task ran had a bad drive.
> If the AM had retried running that map task somewhere else, the job 
> definitely would have succeeded.





[jira] [Commented] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-04-11 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235139#comment-15235139
 ] 

Rushabh S Shah commented on MAPREDUCE-6633:
---

[~eepayne]: Thanks for the reviews and for committing.
Does it make sense to fix this in the 2.7 branch as well?

> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6633.patch
>





[jira] [Commented] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-04-06 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229066#comment-15229066
 ] 

Rushabh S Shah commented on MAPREDUCE-6633:
---

I re-ran the failed JUnit test on both jdk7 and jdk8; both passed fine on my 
machine. For reference, the reported failure was:
{noformat}
Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.54 sec <<< 
FAILURE! - in org.apache.hadoop.mapreduce.tools.TestCLI
testGetJob(org.apache.hadoop.mapreduce.tools.TestCLI)  Time elapsed: 0.084 sec  
<<< FAILURE!
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.mapreduce.tools.TestCLI.testGetJob(TestCLI.java:181)
{noformat}

> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-6633.patch
>





[jira] [Commented] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-04-06 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229032#comment-15229032
 ] 

Rushabh S Shah commented on MAPREDUCE-6633:
---

bq.  If there is a runtime exception on the reducer (memory error, NPE, etc.), 
maps would be re-run unnecessarily. 
In this case the decompressor threw a RuntimeException 
({{ArrayIndexOutOfBoundsException}} is a subclass).
If we had re-run the map on another node, the job would have succeeded.

bq. I am a little nervous about re-fetching for any exception.
I understand your concern, but I still think it's a good change.

> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-6633.patch
>





[jira] [Updated] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-03-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-6633:
--
Status: Patch Available  (was: In Progress)

In the {{Fetcher#copyMapOutput}} method, I added {{Exception}} to the catch 
block so that it will retry on any compression-related exception. The existing 
code caught only {{java.lang.InternalError}}:
{noformat}
  try {
// Go!
LOG.info("fetcher#" + id + " about to shuffle output of map "
+ mapOutput.getMapId() + " decomp: " + decompressedLength
+ " len: " + compressedLength + " to " + 
mapOutput.getDescription());
mapOutput.shuffle(host, is, compressedLength, decompressedLength,
metrics, reporter);
  } catch (java.lang.InternalError e) {
LOG.warn("Failed to shuffle for fetcher#"+id, e);
throw new IOException(e);
  }
{noformat}
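In other words, something like the following sketch (the actual change is in 
the attached patch):
{code:java}
try {
  mapOutput.shuffle(host, is, compressedLength, decompressedLength,
      metrics, reporter);
} catch (java.lang.InternalError | Exception e) {
  // Widened from InternalError alone: a RuntimeException thrown by a
  // decompressor (e.g. ArrayIndexOutOfBoundsException on a corrupt map
  // output) is now also converted to an IOException, so the fetch is
  // retried and the map can eventually be re-run elsewhere.
  LOG.warn("Failed to shuffle for fetcher#" + id, e);
  throw new IOException(e);
}
{code}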


> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-6633.patch
>





[jira] [Updated] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-03-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-6633:
--
Attachment: MAPREDUCE-6633.patch

> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-6633.patch
>





[jira] [Work started] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-03-07 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-6633 started by Rushabh S Shah.
-
> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>





[jira] [Created] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-02-10 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created MAPREDUCE-6633:
-

 Summary: AM should retry map attempts if the reduce task 
encounters compression related errors.
 Key: MAPREDUCE-6633
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah


When a reduce task encounters compression-related errors, the AM doesn't retry 
the corresponding map task.
Here is the stack trace from one of the cases we encountered.
{noformat}
2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : 
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle 
in fetcher#29
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at 
com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
{noformat}
In this case, the node on which the map task ran had a bad drive.
If the AM had retried running that map task somewhere else, the jib definitely 
would have succeeded.





[jira] [Updated] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-02-10 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-6633:
--
Description: 
When a reduce task encounters compression-related errors, the AM doesn't retry 
the corresponding map task.
Here is the stack trace from one of the cases we encountered.
{noformat}
2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : 
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle 
in fetcher#29
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at 
com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
{noformat}
In this case, the node on which the map task ran had a bad drive.
If the AM had retried running that map task somewhere else, the job definitely 
would have succeeded.

  was:
When reduce task encounters compression related errors, AM  doesn't retry the 
corresponding map task.
In one of the case we encountered, here is the stack trace.
{noformat}
2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : 
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle 
in fetcher#29
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at 
com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
{noformat}
In this case, the node on which the map task ran had a bad drive.
If the AM had retried running that map task somewhere else, the jib definitely 
would have succeeded.


> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>

[jira] [Commented] (MAPREDUCE-5948) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2015-06-02 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569101#comment-14569101
 ] 

Rushabh S Shah commented on MAPREDUCE-5948:
---

Sorry, this fell off my radar too.
I don't have enough cycles to work on this right now, so we can move this to 
the next release.
Or, if someone is interested in working on this, I am more than happy to let 
them take it.

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: MAPREDUCE-5948
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5948
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Rushabh S Shah
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch, HADOOP-9867.patch, 
 HADOOP-9867.patch


 Defining a record delimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a record 
 separator. The starting point for the next split would then be non-zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1, and 
 the text until the first record delimiter is ignored (on the presumption 
 that this record was already handled by the previous map task). Since the 
 record delimiter is multibyte, the seek brings only the last byte of the 
 delimiter into scope, so it is not recognized as a full delimiter, and the 
 text is skipped until the next delimiter (ignoring a full record!!).
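To make the failure mode concrete, here is a small self-contained sketch (my 
own illustration of the mechanism, not LineRecordReader code):
{code:java}
import java.nio.charset.StandardCharsets;

public class MultibyteDelimiterSkip {
  public static void main(String[] args) {
    byte[] data = "rec1|#rec2|#rec3".getBytes(StandardCharsets.UTF_8);
    String delim = "|#";  // two-byte record delimiter
    int start = 6;        // split boundary falls right after "rec1|#"
    // The reader seeks to start - 1 and discards everything up to the
    // first full delimiter.  From offset 5 the stream begins with "#",
    // only the LAST byte of "|#", so no delimiter is recognized there;
    // the next full "|#" is the one after rec2, and rec2 is lost.
    String tail = new String(data, start - 1, data.length - (start - 1),
        StandardCharsets.UTF_8);                   // "#rec2|#rec3"
    int idx = tail.indexOf(delim);                 // delimiter after rec2
    System.out.println(tail.substring(idx + delim.length())); // "rec3"
  }
}
{code}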





[jira] [Commented] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-20 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003526#comment-14003526
 ] 

Rushabh S Shah commented on MAPREDUCE-5309:
---

Thanks, Jason, for reviewing and committing the patch.

 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
 by 2.0.3 history server
 -

 Key: MAPREDUCE-5309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Vrushali C
Assignee: Rushabh S Shah
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5309-v2.patch, MAPREDUCE-5309-v3.patch, 
 MAPREDUCE-5309-v4.patch, MAPREDUCE-5309-v5.patch, MAPREDUCE-5309.patch, 
 Test20JobHistoryParsing.java, job_2_0_3-KILLED.jhist


 When the 2.0.4 JobHistoryParser tries to parse a job history file generated 
 by hadoop 2.0.3, it throws an error:
 java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array 
 cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
 at 
 org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
 at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
 at 
 org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at 
 org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at 
 org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
 at 
 org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at 
 org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at 
 org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
 at 
 org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
 at 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
 at 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
 at 
 org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
 at 
 com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at 
 org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
 at 
 org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
 at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
 at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 Test code and the job history file are attached.
 Test code:
 package com.twitter.somepackagel;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
 import org.junit.Test;
 import org.apache.hadoop.yarn.YarnException;
 public class Test20JobHistoryParsing {
 
   @Test
   public void testFileAvro() throws IOException
   {
   Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
   JobHistoryParser parser2 = new 

[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-19 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Open  (was: Patch Available)

Current patch has a typo in one of the log statements.


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-19 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Attachment: MAPREDUCE-5309-v5.patch

Initially EventReader#reader was initialized like:
this.reader = new SpecificDatumReader(schema, schema);
This assumed the reader schema and the writer schema are the same.
But when the schema was upgraded from 2.0.3 to 2.0.4, new fields were added in 
2.0.4 that were not present in 2.0.3. When the parser tried to parse 2.0.3 
logs (which don't have the new fields), it failed with errors.
So basically we need to differentiate between the new schema and the schema of 
the input jhist files, and Avro will do the rest of the mapping by field name 
(sketched below).
For the fields that were recently added, we need to assign default values, so 
when we parse old-schema jhist files, the defaults are used.
[~vinodkv]: I hope this helps.
[~viraj]: Yes, this patch will parse both 0.23.x and 2.4.x logs.
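For illustration, here is a minimal, self-contained sketch of the Avro 
schema-resolution mechanism described above. The schemas are hypothetical and 
it uses the generic API (EventReader itself uses SpecificDatumReader), but the 
two-schema resolution works the same way: fields are matched by name, and a 
field missing from the writer schema takes its declared default.
{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

public class SchemaResolutionDemo {

  // "old" writer schema, as in a 2.0.3-era jhist file: no 'counters' field
  static final String WRITER =
      "{\"type\":\"record\",\"name\":\"Attempt\",\"fields\":["
    + "{\"name\":\"taskid\",\"type\":\"string\"}]}";

  // "new" reader schema: adds 'counters' with a default value
  static final String READER =
      "{\"type\":\"record\",\"name\":\"Attempt\",\"fields\":["
    + "{\"name\":\"taskid\",\"type\":\"string\"},"
    + "{\"name\":\"counters\",\"type\":[\"null\",\"string\"],\"default\":null}]}";

  public static void main(String[] args) throws Exception {
    Schema writer = new Schema.Parser().parse(WRITER);
    Schema reader = new Schema.Parser().parse(READER);

    // write a record using only the old schema
    GenericRecord rec = new GenericData.Record(writer);
    rec.put("taskid", "attempt_0001");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    Encoder enc = EncoderFactory.get().binaryEncoder(out, null);
    new GenericDatumWriter<GenericRecord>(writer).write(rec, enc);
    enc.flush();

    // read it back with BOTH schemas: the missing 'counters' field is
    // filled in from its default instead of the read failing
    Decoder dec = DecoderFactory.get().binaryDecoder(
        new ByteArrayInputStream(out.toByteArray()), null);
    GenericRecord resolved =
        new GenericDatumReader<GenericRecord>(writer, reader).read(null, dec);
    System.out.println(resolved);
    // prints: {"taskid": "attempt_0001", "counters": null}
  }
}
{code}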


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-19 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Patch Available  (was: Open)


[jira] [Commented] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998768#comment-13998768
 ] 

Rushabh S Shah commented on MAPREDUCE-5309:
---

Hey Viraj,
This is not a stable fix; it will not parse the history files generated since 
2.0.4.


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Attachment: MAPREDUCE-5309-v2.patch


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Patch Available  (was: Open)

This patch will parse all the logs.


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Open  (was: Patch Available)

Broke a couple of test cases.


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Attachment: MAPREDUCE-5309-v3.patch


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Patch Available  (was: Open)

Attaching a new patch correcting the previous test failures.


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Open  (was: Patch Available)


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Patch Available  (was: Open)

Jason, 
Thanks for reviewing my patch.
Submitting a new patch incorporating all of the comments.


[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Attachment: MAPREDUCE-5309-v4.patch

 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
 by 2.0.3 history server
 -

 Key: MAPREDUCE-5309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Vrushali C
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5309-v2.patch, MAPREDUCE-5309-v3.patch, 
 MAPREDUCE-5309-v4.patch, MAPREDUCE-5309.patch, Test20JobHistoryParsing.java, 
 job_2_0_3-KILLED.jhist


 When the 2.0.4 JobHistoryParser tries to parse a job history file generated
 by hadoop 2.0.3, the JobHistoryParser throws an error:
 java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
 at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
 at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
 at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
 at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
 at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 Test code and the job history file are attached.
 Test code:
 package com.twitter.somepackagel;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
 import org.junit.Test;
 import org.apache.hadoop.yarn.YarnException;
 public class Test20JobHistoryParsing {

   @Test
   public void testFileAvro() throws IOException
   {
   Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
  

[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-15 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Patch Available  (was: Open)

Changed the order of the counters field in Events.avpr.

 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
 by 2.0.3 history server
 -

 Key: MAPREDUCE-5309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Vrushali C
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5309.patch, Test20JobHistoryParsing.java, 
 job_2_0_3-KILLED.jhist


 When the 2.0.4 JobHistoryParser tries to parse a job history file generated
 by hadoop 2.0.3, the JobHistoryParser throws an error:
 java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
 at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
 at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
 at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
 at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
 at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 Test code and the job history file are attached.
 Test code:
 package com.twitter.somepackagel;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
 import org.junit.Test;
 import org.apache.hadoop.yarn.YarnException;
 public class Test20JobHistoryParsing {

   @Test
   public void testFileAvro() throws IOException
   {
   Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
  JobHistoryParser parser2 

[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-15 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Status: Open  (was: Patch Available)

This fixes the history files that were generated before 2.4.0 but breaks the
history files that are generated since 2.4.0.

 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
 by 2.0.3 history server
 -

 Key: MAPREDUCE-5309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Vrushali C
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5309.patch, Test20JobHistoryParsing.java, 
 job_2_0_3-KILLED.jhist


 When the 2.0.4 JobHistoryParser tries to parse a job history file generated
 by hadoop 2.0.3, the JobHistoryParser throws an error:
 java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
 at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
 at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
 at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
 at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
 at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 Test code and the job history file are attached.
 Test code:
 package com.twitter.somepackagel;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
 import org.junit.Test;
 import org.apache.hadoop.yarn.YarnException;
 public class Test20JobHistoryParsing {

   @Test
   public void testFileAvro() throws IOException
   {
   Path local_path2 = 

[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

 Description: 
When the 2.0.4 JobHistoryParser tries to parse a job history file generated by
hadoop 2.0.3, the JobHistoryParser throws an error:

java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)


Test code and the job history file are attached.

Test code:
package com.twitter.somepackagel;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
import org.junit.Test;
import org.apache.hadoop.yarn.YarnException;

public class Test20JobHistoryParsing {

  @Test
  public void testFileAvro() throws IOException {
    // Parse a job history file written by a 2.0.3 history server.
    Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
    JobHistoryParser parser2 = new JobHistoryParser(
        FileSystem.getLocal(new Configuration()), local_path2);
    try {
      JobInfo ji2 = parser2.parse();
      System.out.println("job info: " + ji2.getJobname() + " "
          + ji2.getFinishedMaps() + " "
          + ji2.getTotalMaps() + " "
          + ji2.getJobId());
    } catch (IOException e) {
      throw new YarnException("Could not load history file "
          + local_path2.getName(), e);
    }
  }
}

This seems to stem from the fix in
https://issues.apache.org/jira/browse/MAPREDUCE-4693
that added counters to the history server for failed tasks.

This breaks backward compatibility with JobHistoryServer. 
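
To make the incompatibility concrete, here is a minimal, self-contained sketch under stated assumptions: the Evt and JhCounters schemas below are toy stand-ins, not the real definitions in Events.avpr, and they only mimic the shape change in the counters field. Avro raises an AvroTypeException when resolving the mismatched field; Hadoop's generated specific-record classes surface the same mismatch as the ClassCastException quoted above.
{code:java}
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaEvolutionSketch {
  // Toy stand-in for the old (2.0.3-era) layout: counters serialized as an array.
  static final String WRITER_SCHEMA =
      "{\"type\":\"record\",\"name\":\"Evt\",\"fields\":["
      + "{\"name\":\"counters\",\"type\":{\"type\":\"array\",\"items\":\"string\"}}]}";
  // Toy stand-in for the new layout: counters is now a nested record.
  static final String READER_SCHEMA =
      "{\"type\":\"record\",\"name\":\"Evt\",\"fields\":["
      + "{\"name\":\"counters\",\"type\":{\"type\":\"record\",\"name\":\"JhCounters\","
      + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}}]}";

  public static void main(String[] args) throws Exception {
    Schema writer = new Schema.Parser().parse(WRITER_SCHEMA);
    Schema reader = new Schema.Parser().parse(READER_SCHEMA);

    // Serialize a record under the old layout.
    GenericRecord rec = new GenericData.Record(writer);
    rec.put("counters", java.util.Arrays.asList("c1", "c2"));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
    new GenericDatumWriter<GenericRecord>(writer).write(rec, enc);
    enc.flush();

    // Resolving the old array field against the new record field fails,
    // mirroring the incompatibility JobHistoryParser.parse() runs into.
    BinaryDecoder dec = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
    new GenericDatumReader<GenericRecord>(writer, reader).read(null, dec);
  }
}
{code}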



  was:

When the 2.0.4 JobHistoryParser tries to parse a job history file generated by 
hadoop 2.0.3, the 

[jira] [Assigned] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-13 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah reassigned MAPREDUCE-5309:
-

Assignee: Rushabh S Shah

 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
 by 2.0.3 history server
 -

 Key: MAPREDUCE-5309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Vrushali C
Assignee: Rushabh S Shah
 Attachments: Test20JobHistoryParsing.java, job_2_0_3-KILLED.jhist


 When the 2.0.4 JobHistoryParser tries to parse a job history file generated
 by hadoop 2.0.3, the JobHistoryParser throws an error:
 java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
 at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
 at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
 at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
 at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
 at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 Test code and the job history file are attached.
 Test code:
 package com.twitter.somepackagel;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
 import org.junit.Test;
 import org.apache.hadoop.yarn.YarnException;
 public class Test20JobHistoryParsing {

   @Test
   public void testFileAvro() throws IOException
   {
   Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
  JobHistoryParser parser2 = new JobHistoryParser(FileSystem.getLocal(new Configuration()), local_path2);
   

[jira] [Updated] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

2014-05-13 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5309:
--

Attachment: MAPREDUCE-5309.patch

 2.0.4 JobHistoryParser can't parse certain failed job history files generated 
 by 2.0.3 history server
 -

 Key: MAPREDUCE-5309
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.0.4-alpha
Reporter: Vrushali C
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5309.patch, Test20JobHistoryParsing.java, 
 job_2_0_3-KILLED.jhist


 When the 2.0.4 JobHistoryParser tries to parse a job history file generated
 by hadoop 2.0.3, the JobHistoryParser throws an error:
 java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
 at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
 at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
 at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
 at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
 at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
 at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
 at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
 at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
 Test code and the job history file are attached.
 Test code:
 package com.twitter.somepackagel;
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
 import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
 import org.junit.Test;
 import org.apache.hadoop.yarn.YarnException;
 public class Test20JobHistoryParsing {

   @Test
   public void testFileAvro() throws IOException
   {
   Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
  JobHistoryParser parser2 = new JobHistoryParser(FileSystem.getLocal(new 
 

[jira] [Updated] (MAPREDUCE-4766) in diagnostics task ids and task attempt ids should become clickable links

2014-04-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-4766:
--

Target Version/s: 3.0.0  (was: 3.0.0, 0.23.11)

Talked with [~revans2] and removing the target version 0.23.11

 in diagnostics task ids and task attempt ids should become clickable links
 --

 Key: MAPREDUCE-4766
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4766
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Affects Versions: 0.23.5
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans

 It would be great if when we see a task id or a task attempt id in the 
 diagnostics that we change it to be a clickable link.





[jira] [Updated] (MAPREDUCE-4901) JobHistoryEventHandler errors should be fatal

2014-04-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-4901:
--

Target Version/s: 3.0.0  (was: 3.0.0, 0.23.11)

Talked to [~revans2] offline and removing 0.23.11 from target version.

 JobHistoryEventHandler errors should be fatal
 -

 Key: MAPREDUCE-4901
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4901
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0, 2.0.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4901-trunk.txt


 To be able to truly fix issues like MAPREDUCE-4819 and MAPREDUCE-4832, we
 need a two-phase commit where a subsequent AM can be sure that, at a specific
 point in time, it knows exactly whether any tasks/jobs are committing. The job
 history log is already used for similar functionality, so we would like to
 reuse it, but we need to be sure that errors while writing out to the job
 history log are treated as fatal.
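
A hedged sketch of what "fatal" means here, with invented names (the real handler is JobHistoryEventHandler; this is a sketch of the behavior change, not the actual patch): on a write failure the handler stops swallowing the error and instead tears the AM down, so a recovering AM can trust the log's contents.
{code:java}
import java.io.IOException;

// Illustrative only: names and shapes are invented for this sketch.
public class FatalHistoryWriteSketch {
  interface HistoryWriter {
    void write(String event) throws IOException;
  }

  private final HistoryWriter writer;

  FatalHistoryWriteSketch(HistoryWriter writer) {
    this.writer = writer;
  }

  void handleEvent(String event) {
    try {
      writer.write(event);
    } catch (IOException e) {
      // Before: log and continue, leaving the history log silently
      // incomplete. After: escalate, because a recovering AM relies on
      // this log to know which tasks/jobs were committing.
      throw new RuntimeException("Error writing job history event", e);
    }
  }

  public static void main(String[] args) {
    FatalHistoryWriteSketch handler = new FatalHistoryWriteSketch(ev -> {
      throw new IOException("simulated disk failure");
    });
    handler.handleEvent("JOB_COMMIT_STARTED"); // throws: the error is now fatal
  }
}
{code}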





[jira] [Updated] (MAPREDUCE-4775) Reducer will never commit suicide

2014-04-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-4775:
--

Target Version/s: 3.0.0  (was: 3.0.0, 0.23.11)

Talked to [~revans2] offline and removing 0.23.11 from target version.

 Reducer will never commit suicide
 ---

 Key: MAPREDUCE-4775
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4775
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans

 In 1.0 there are a number of conditions that will cause a reducer to commit
 suicide and exit, for example if it is stalled or if the error percentage of
 total fetches is too high. In the new code it will only commit suicide when
 the total number of failures for a single task attempt is >= max(30,
 totalMaps/10). In the best case, with the quadratic back-off, getting a
 single map attempt to reach 30 failures would take 20.5 hours, and unless
 there is only one reducer running, the map task would have been restarted
 before then.
 We should go back to including the same reducer suicide checks that are in 1.0.
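
For concreteness, a small sketch of the threshold described above. The max(30, totalMaps/10) formula comes from the description; the back-off base constant below is invented purely to show the order of magnitude, not taken from the shuffle code.
{code:java}
public class FetchFailureThresholdSketch {
  // A reducer only kills itself once some single map attempt has failed
  // at least max(30, totalMaps / 10) times.
  static int failureThreshold(int totalMaps) {
    return Math.max(30, totalMaps / 10);
  }

  public static void main(String[] args) {
    System.out.println(failureThreshold(100));   // 30
    System.out.println(failureThreshold(5000));  // 500

    // With a quadratic back-off between retries of the same map output,
    // the wait before retry k grows like k^2. Using an illustrative 10 s
    // base penalty, 30 consecutive failures of one attempt take:
    long seconds = 0;
    for (int k = 1; k <= 30; k++) {
      seconds += 10L * k * k;
    }
    // ~26 hours with these toy numbers -- the same order of magnitude as
    // the 20.5 hour best case quoted above.
    System.out.printf("%.1f hours%n", seconds / 3600.0);
  }
}
{code}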





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Patch Available  (was: Open)

Thanks Jonathan for your comments.
Changes incorporated in the new patch

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Open  (was: Patch Available)

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Attachment: patch-MapReduce-5797-v2.patch

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Open  (was: Patch Available)

Indentation errors; need to update the patch.

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Attachment: MAPREDUCE-5797-v3.patch

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Patch Available  (was: Open)

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Commented] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13937967#comment-13937967
 ] 

Rushabh S Shah commented on MAPREDUCE-5797:
---

Corrected indentation errors.

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
 patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5570:
--

Attachment: patch-MapReduce-5570-v2.patch

Thanks Jason for the comments.
Incorporated them in the new patch.

 Map task attempt with fetch failure has incorrect attempt finish time
 -

 Key: MAPREDUCE-5570
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.9, 2.1.1-beta
Reporter: Jason Lowe
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5570-v2.patch, patch-MapReduce-5570.patch


 If a map task attempt is retroactively failed due to excessive fetch failures 
 reported by reducers then the attempt's finish time is set to the time the 
 task was retroactively failed rather than when the task attempt completed.  
 This causes the map task attempt to appear to have run for much longer than 
 it actually did.





[jira] [Updated] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5570:
--

Status: Patch Available  (was: Open)

 Map task attempt with fetch failure has incorrect attempt finish time
 -

 Key: MAPREDUCE-5570
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 2.1.1-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5570-v2.patch, patch-MapReduce-5570.patch


 If a map task attempt is retroactively failed due to excessive fetch failures 
 reported by reducers then the attempt's finish time is set to the time the 
 task was retroactively failed rather than when the task attempt completed.  
 This causes the map task attempt to appear to have run for much longer than 
 it actually did.





[jira] [Updated] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5570:
--

Status: Open  (was: Patch Available)

 Map task attempt with fetch failure has incorrect attempt finish time
 -

 Key: MAPREDUCE-5570
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 2.1.1-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5570.patch


 If a map task attempt is retroactively failed due to excessive fetch failures 
 reported by reducers then the attempt's finish time is set to the time the 
 task was retroactively failed rather than when the task attempt completed.  
 This causes the map task attempt to appear to have run for much longer than 
 it actually did.





[jira] [Created] (MAPREDUCE-5797) The elapsed time for tasks in a failed job that were never started can be way off.

2014-03-14 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created MAPREDUCE-5797:
-

 Summary: The elapsed time for tasks in a failed job that were 
never started can be way off. 
 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah


The elapsed time for tasks in a failed job that were never
started can be way off.  It looks like we're marking the start time as the
beginning of the epoch (i.e.: start time = -1) but the finish time is when the
task was marked as failed when the whole job failed.  That causes the
calculated elapsed time of the task to be a ridiculous number of hours.

Tasks that fail without any attempts shouldn't have start/finish/elapsed times.





[jira] [Updated] (MAPREDUCE-5797) The elapsed time for tasks in a failed job that were never started can be way off.

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Attachment: patch-MapReduce-5797.patch

 The elapsed time for tasks in a failed job that were never started can be way 
 off. 
 ---

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) The elapsed time for tasks in a failed job that were never started can be way off.

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Patch Available  (was: Open)

Added a new check in the JavaScript: if the returned date is '-1', return N/A.
Also made minor changes to Times.java and added a test case to confirm that.
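
Roughly, the guard looks like the sketch below. The method shape mirrors the description above, not the actual patch; the class here is a stand-in for the elapsed-time helper in Times.java.
{code:java}
public final class ElapsedTimeSketch {
  private ElapsedTimeSketch() {}

  // -1 means "never started" / "not finished yet" in the history data.
  public static long elapsed(long started, long finished) {
    if (started == -1) {
      // Never-started task: return -1 so the web UI can render "N/A"
      // instead of finished - (-1), a bogus multi-hour duration.
      return -1;
    }
    if (finished == -1) {
      // Still running: measure against the current time.
      return System.currentTimeMillis() - started;
    }
    return finished - started;
  }

  public static void main(String[] args) {
    System.out.println(elapsed(-1L, 1394000000000L));             // -1 -> "N/A"
    System.out.println(elapsed(1394000000000L, 1394000005000L));  // 5000 ms
  }
}
{code}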

 The elapsed time for tasks in a failed job that were never started can be way 
 off. 
 ---

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) The elapsed time for tasks in a failed job is wrong

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Summary: The elapsed time for tasks in a failed job is  wrong   (was: The 
elapsed time for tasks in a failed job that were never started can be way off. )

 The elapsed time for tasks in a failed job is  wrong 
 -

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Summary:  Elapsed time for failed tasks that never started is  wrong   
(was: The elapsed time for tasks in a failed job is  wrong )

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Open  (was: Patch Available)

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Attachment: patch-MapReduce-5797-v2.patch

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5797) Elapsed time for failed tasks that never started is wrong

2014-03-14 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5797:
--

Status: Patch Available  (was: Open)

Added the Apache license header to TestTimes.java.

  Elapsed time for failed tasks that never started is  wrong 
 

 Key: MAPREDUCE-5797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5797
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.9
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch


 The elapsed time for tasks in a failed job that were never
 started can be way off.  It looks like we're marking the start time as the
 beginning of the epoch (i.e.: start time = -1) but the finish time is when the
 task was marked as failed when the whole job failed.  That causes the
 calculated elapsed time of the task to be a ridiculous number of hours.
 Tasks that fail without any attempts shouldn't have start/finish/elapsed 
 times.





[jira] [Updated] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-12 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5789:
--

Status: Open  (was: Patch Available)

 Average Reduce time is incorrect on Job Overview page
 -

 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 2.3.0, 0.23.10
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5789-v2.patch, patch-MapReduce-5789.patch


 The Average Reduce time displayed on the job overview page is incorrect.
 Previously, Reduce time was calculated as the difference between finishTime
 and shuffleFinishTime.
 It should be the difference between finishTime and sortFinishTime.
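
In other words (a sketch with invented field holders; a reduce attempt moves through shuffle, then sort, then reduce, so only finishTime - sortFinishTime measures the reduce phase itself):
{code:java}
import java.util.Arrays;
import java.util.List;

public class AvgReduceTimeSketch {
  static class ReduceAttempt {
    final long shuffleFinishTime, sortFinishTime, finishTime;
    ReduceAttempt(long shuffleFinishTime, long sortFinishTime, long finishTime) {
      this.shuffleFinishTime = shuffleFinishTime;
      this.sortFinishTime = sortFinishTime;
      this.finishTime = finishTime;
    }
  }

  // Buggy: finishTime - shuffleFinishTime also counts the sort phase.
  // Fixed: finishTime - sortFinishTime covers only the reduce phase.
  static long avgReduceMillis(List<ReduceAttempt> attempts) {
    if (attempts.isEmpty()) {
      return 0;
    }
    long total = 0;
    for (ReduceAttempt a : attempts) {
      total += a.finishTime - a.sortFinishTime;
    }
    return total / attempts.size();
  }

  public static void main(String[] args) {
    List<ReduceAttempt> done = Arrays.asList(
        new ReduceAttempt(1000, 4000, 10000),
        new ReduceAttempt(2000, 5000, 9000));
    System.out.println(avgReduceMillis(done)); // (6000 + 4000) / 2 = 5000 ms
  }
}
{code}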





[jira] [Commented] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-12 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932274#comment-13932274
 ] 

Rushabh S Shah commented on MAPREDUCE-5789:
---

Thanks Jason for the comments.
Changes incorporated.


 Average Reduce time is incorrect on Job Overview page
 -

 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.10, 2.3.0
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5789-v2.patch, patch-MapReduce-5789.patch


 The Average Reduce time displayed on the job overview page is incorrect.
 Previously, Reduce time was calculated as the difference between finishTime
 and shuffleFinishTime.
 It should be the difference between finishTime and sortFinishTime.





[jira] [Updated] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-12 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5789:
--

Attachment: patch-MapReduce-5789-v2.patch

 Average Reduce time is incorrect on Job Overview page
 -

 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.10, 2.3.0
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5789-v2.patch, patch-MapReduce-5789.patch


 The Average Reduce time displayed on the job overview page is incorrect.
 Previously, Reduce time was calculated as the difference between finishTime
 and shuffleFinishTime.
 It should be the difference between finishTime and sortFinishTime.





[jira] [Updated] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-12 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5789:
--

Status: Patch Available  (was: Open)

Changes incorporated

 Average Reduce time is incorrect on Job Overview page
 -

 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 2.3.0, 0.23.10
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5789-v2.patch, patch-MapReduce-5789.patch


 The Average Reduce time displayed on the job overview page is incorrect.
 Previously, the reduce time was calculated as the difference between
 finishTime and shuffleFinishTime. It should be the difference between
 finishTime and sortFinishTime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time

2014-03-12 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5570:
--

Attachment: patch-MapReduce-5570.patch

 Map task attempt with fetch failure has incorrect attempt finish time
 -

 Key: MAPREDUCE-5570
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.9, 2.1.1-beta
Reporter: Jason Lowe
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5570.patch


 If a map task attempt is retroactively failed due to excessive fetch
 failures reported by reducers, the attempt's finish time is set to the
 time the task was retroactively failed rather than the time the attempt
 actually completed. This makes the map task attempt appear to have run
 for much longer than it actually did.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time

2014-03-12 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5570:
--

Status: Patch Available  (was: Open)

Removed the code that was updating the finishTime, and added a test case
to verify the fix.
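
As a rough illustration of that change, the sketch below leaves finishTime
untouched on the retroactive-failure path so the attempt keeps the
timestamp recorded when it actually completed. The class and method names
are hypothetical, not the real TaskAttemptImpl code.
{code:java}
// Hypothetical sketch of the fix: the retroactive-failure path no longer
// touches finishTime, so the attempt keeps the timestamp recorded when it
// actually completed. Names are illustrative only.
public class FetchFailureFinishTimeSketch {
  private long finishTime; // 0 means the attempt has not finished yet

  // Called when the map attempt actually completes.
  void onAttemptCompleted(long now) {
    finishTime = now;
  }

  // Called much later, when reducers report excessive fetch failures.
  void onTooManyFetchFailures(long now) {
    // Before the fix this transition also did: finishTime = now;
    // overwriting the real completion time. The fix removes that update,
    // so nothing here touches finishTime.
  }

  long getFinishTime() {
    return finishTime;
  }

  public static void main(String[] args) {
    FetchFailureFinishTimeSketch a = new FetchFailureFinishTimeSketch();
    a.onAttemptCompleted(1_000L);          // finished at t = 1s
    a.onTooManyFetchFailures(600_000L);    // failed retroactively at t = 10min
    System.out.println(a.getFinishTime()); // still 1000
  }
}
{code}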

 Map task attempt with fetch failure has incorrect attempt finish time
 -

 Key: MAPREDUCE-5570
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 2.1.1-beta, 0.23.9
Reporter: Jason Lowe
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5570.patch


 If a map task attempt is retroactively failed due to excessive fetch
 failures reported by reducers, the attempt's finish time is set to the
 time the task was retroactively failed rather than the time the attempt
 actually completed. This makes the map task attempt appear to have run
 for much longer than it actually did.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-5570) Map task attempt with fetch failure has incorrect attempt finish time

2014-03-11 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah reassigned MAPREDUCE-5570:
-

Assignee: Rushabh S Shah

 Map task attempt with fetch failure has incorrect attempt finish time
 -

 Key: MAPREDUCE-5570
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5570
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am, mrv2
Affects Versions: 0.23.9, 2.1.1-beta
Reporter: Jason Lowe
Assignee: Rushabh S Shah

 If a map task attempt is retroactively failed due to excessive fetch
 failures reported by reducers, the attempt's finish time is set to the
 time the task was retroactively failed rather than the time the attempt
 actually completed. This makes the map task attempt appear to have run
 for much longer than it actually did.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-10 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created MAPREDUCE-5789:
-

 Summary: Average Reduce time is incorrect on Job Overview page
 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 2.3.0, 0.23.10
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah


The Average Reduce time displayed on the job overview page is incorrect.
Previously, the reduce time was calculated as the difference between
finishTime and shuffleFinishTime. It should be the difference between
finishTime and sortFinishTime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-10 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5789:
--

Status: Patch Available  (was: Open)

Fixed the issue and confirmed the fix with a test case.
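
A test along these lines could exercise the change; this is a hypothetical
sketch against a stand-in helper, not the test from the attached patch.
{code:java}
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical test sketch: verifies the average reduce time is measured
// from sortFinishTime rather than shuffleFinishTime. The helper below is
// a stand-in for the job overview page's computation, not the real code.
public class TestAvgReduceTimeSketch {

  // attempts[i] = {sortFinishTime, finishTime} for one successful reduce.
  private static long avgReduceTime(long[][] attempts) {
    long sum = 0;
    for (long[] a : attempts) {
      sum += a[1] - a[0];
    }
    return sum / attempts.length;
  }

  @Test
  public void averageReduceTimeExcludesSortPhase() {
    long[][] attempts = { {15_000L, 40_000L}, {20_000L, 30_000L} };
    // (25s + 10s) / 2 = 17.5s
    assertEquals(17_500L, avgReduceTime(attempts));
  }
}
{code}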

 Average Reduce time is incorrect on Job Overview page
 -

 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 2.3.0, 0.23.10
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5789.patch


 The Average Reduce time displayed on the job overview page is incorrect.
 Previously, the reduce time was calculated as the difference between
 finishTime and shuffleFinishTime. It should be the difference between
 finishTime and sortFinishTime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5789) Average Reduce time is incorrect on Job Overview page

2014-03-10 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-5789:
--

Attachment: patch-MapReduce-5789.patch

 Average Reduce time is incorrect on Job Overview page
 -

 Key: MAPREDUCE-5789
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5789
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 0.23.10, 2.3.0
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Attachments: patch-MapReduce-5789.patch


 The Average Reduce time displayed on the job overview page is incorrect.
 Previously, the reduce time was calculated as the difference between
 finishTime and shuffleFinishTime. It should be the difference between
 finishTime and sortFinishTime.



--
This message was sent by Atlassian JIRA
(v6.2#6252)