Rushabh S Shah created MAPREDUCE-7076:

             Summary: TestNNBench#testNNBenchCreateReadAndDelete failing in our 
internal build
                 Key: MAPREDUCE-7076
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: test
    Affects Versions: 2.8.0
            Reporter: Rushabh S Shah

TestNNBench#testNNBenchCreateReadAndDelete failed couple of times in our 
internal jenkins build.
java.lang.AssertionError: create_write should create the file
        at org.junit.Assert.assertTrue(

Below is my analysis for why it didn't create the file.
// Some comments here
  public void map(Text key, 
            LongWritable value,
            OutputCollector<Text, Text> output,
            Reporter reporter) throws IOException {
      if (barrier()) {
        String fileName = "file_" + value;
        if (op.equals(OP_CREATE_WRITE)) {
          startTimeTPmS = System.currentTimeMillis();
          doCreateWriteOp(fileName, reporter);
        } ...
      } else {
        output.collect(new Text("l:latemaps"), new Text("1"));
  // Below are the relevant parts of barrier() method
  private boolean barrier() {
    // If the sleep time is greater than 0, then sleep and return
    ..."Waiting in barrier for: " + sleepTime + " ms");
    return retVal;
  // Below are the relevant parts of the doCreateWriteOp
  private void doCreateWriteOp(String name,
                                 Reporter reporter) {
    FSDataOutputStream out;
    byte[] buffer = new byte[bytesToWrite];  
    for (long l = 0l; l < numberOfFiles; l++) {
      Path filePath = new Path(new Path(baseDir, dataDirName), 
              name + "_" + l);

This file {{BASE_DIR/data/file_0_0}} is getting created only if the map task 
starts before the time mentioned by {{startTime}}.
Refer the chunk which I pasted above.
{{map(..)}} --> {{barrier()}} and *only if* {{barrier()}} evaluates to true it 
will call {{doCreateWriteOp}} which will eventually create the file.
In test case, the delay value is 3 seconds as per {{"-startTime", "" + 
( / 1000 + 3)}}
In this failing test case, I can see the task starting minimum 6 seconds after 
the test case started.
2017-01-27 03:11:15,387 INFO  [Thread-4] mapreduce.JobSubmitter 
( - Submitting tokens for job: 
2017-01-27 03:11:23,405 INFO  [Thread-4] mapreduce.Job ( 
- The url to track the job: http://localhost:8080/

Also when I run this test on my laptop, I see the following line being printed.
2017-01-27 17:09:27,982 INFO  [LocalJobRunner Map Task Executor #0] 
hdfs.NNBench ( - Waiting in barrier for: 1018 ms
This line will be printed only in {{barrier()}} method and I don't see this 
line in the logs of failed test.
In our environment, the jenkins server was very slow and it took more than 6 
seconds to launch a map task.
The correct fix in my opinion would be to return true in case there is no sleep 
in {{barrier() method}}. Only in exception, it will return false.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

Reply via email to