[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-7076:
--------------------------------------
    Description: 
TestNNBench#testNNBenchCreateReadAndDelete failed couple of times in our 
internal jenkins build.
{noformat}
java.lang.AssertionError: create_write should create the file
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at 
org.apache.hadoop.hdfs.TestNNBench.testNNBenchCreateReadAndDelete(TestNNBench.java:55)
{noformat}
Below is my analysis for why it didn't create the file.
{code:java|title=NNBench.java|borderStyle=solid}
// Some comments here
  public void map(Text key, 
            LongWritable value,
            OutputCollector<Text, Text> output,
            Reporter reporter) throws IOException {
      if (barrier()) {
        String fileName = "file_" + value;
        if (op.equals(OP_CREATE_WRITE)) {
          startTimeTPmS = System.currentTimeMillis();
          doCreateWriteOp(fileName, reporter);
        } ...
      } else {
        output.collect(new Text("l:latemaps"), new Text("1"));
      }
  // Below are the relevant parts of barrier() method
  private boolean barrier() {
    ..
    // If the sleep time is greater than 0, then sleep and return
    ...
    LOG.info("Waiting in barrier for: " + sleepTime + " ms");
    return retVal;
  }
  // Below are the relevant parts of the doCreateWriteOp
  private void doCreateWriteOp(String name,
                                 Reporter reporter) {
    FSDataOutputStream out;
    byte[] buffer = new byte[bytesToWrite];  
    for (long l = 0l; l < numberOfFiles; l++) {
      Path filePath = new Path(new Path(baseDir, dataDirName), 
              name + "_" + l);
    }
  ....
  }
{code}
This file {{BASE_DIR/data/file_0_0}} is getting created only if the map task 
starts before the time mentioned by {{startTime}}.
 Refer the chunk which I pasted above.
 {{map(..)}} --> {{barrier()}} and *only if* {{barrier()}} evaluates to true it 
will call {{doCreateWriteOp}} which will eventually create the file.
 In test case, the delay value is 3 seconds as per {{"-startTime", "" + 
(Time.now() / 1000 + 3)}}
 In this failing test case, I can see the task starting minimum 6 seconds after 
the test case started.
{noformat}
2017-01-27 03:11:15,387 INFO  [Thread-4] mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(289)) - Submitting tokens for job: 
job_local1711545156_0001
2017-01-27 03:11:23,405 INFO  [Thread-4] mapreduce.Job (Job.java:submit(1345)) 
- The url to track the job: http://localhost:8080/
{noformat}
Also when I run this test on my laptop, I see the following line being printed.
{noformat}
2017-01-27 17:09:27,982 INFO  [LocalJobRunner Map Task Executor #0] 
hdfs.NNBench (NNBench.java:barrier(676)) - Waiting in barrier for: 1018 ms
{noformat}
This line will be printed only in {{barrier()}} method and I don't see this 
line in the logs of failed test.

 
 In our environment, the jenkins server was very slow and it took more than 6 
seconds to launch a map task.
 The correct fix in my opinion would be to return true in case there is no 
sleep in {{barrier() method}}. Only in exception, it should return false.

  was:
TestNNBench#testNNBenchCreateReadAndDelete failed couple of times in our 
internal jenkins build.
{noformat}
java.lang.AssertionError: create_write should create the file
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at 
org.apache.hadoop.hdfs.TestNNBench.testNNBenchCreateReadAndDelete(TestNNBench.java:55)
{noformat}

Below is my analysis for why it didn't create the file.
{code:title=NNBench.java|borderStyle=solid}
// Some comments here
  public void map(Text key, 
            LongWritable value,
            OutputCollector<Text, Text> output,
            Reporter reporter) throws IOException {
      if (barrier()) {
        String fileName = "file_" + value;
        if (op.equals(OP_CREATE_WRITE)) {
          startTimeTPmS = System.currentTimeMillis();
          doCreateWriteOp(fileName, reporter);
        } ...
      } else {
        output.collect(new Text("l:latemaps"), new Text("1"));
      }
  // Below are the relevant parts of barrier() method
  private boolean barrier() {
    ..
    // If the sleep time is greater than 0, then sleep and return
    ...
    LOG.info("Waiting in barrier for: " + sleepTime + " ms");
    return retVal;
  }
  // Below are the relevant parts of the doCreateWriteOp
  private void doCreateWriteOp(String name,
                                 Reporter reporter) {
    FSDataOutputStream out;
    byte[] buffer = new byte[bytesToWrite];  
    for (long l = 0l; l < numberOfFiles; l++) {
      Path filePath = new Path(new Path(baseDir, dataDirName), 
              name + "_" + l);
    }
  ....
  }
{code}   

This file {{BASE_DIR/data/file_0_0}} is getting created only if the map task 
starts before the time mentioned by {{startTime}}.
Refer the chunk which I pasted above.
{{map(..)}} --> {{barrier()}} and *only if* {{barrier()}} evaluates to true it 
will call {{doCreateWriteOp}} which will eventually create the file.
In test case, the delay value is 3 seconds as per {{"-startTime", "" + 
(Time.now() / 1000 + 3)}}
In this failing test case, I can see the task starting minimum 6 seconds after 
the test case started.
{noformat}
2017-01-27 03:11:15,387 INFO  [Thread-4] mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(289)) - Submitting tokens for job: 
job_local1711545156_0001
2017-01-27 03:11:23,405 INFO  [Thread-4] mapreduce.Job (Job.java:submit(1345)) 
- The url to track the job: http://localhost:8080/
{noformat}

Also when I run this test on my laptop, I see the following line being printed.
{noformat}
2017-01-27 17:09:27,982 INFO  [LocalJobRunner Map Task Executor #0] 
hdfs.NNBench (NNBench.java:barrier(676)) - Waiting in barrier for: 1018 ms
{noformat}
This line will be printed only in {{barrier()}} method and I don't see this 
line in the logs of failed test.
In our environment, the jenkins server was very slow and it took more than 6 
seconds to launch a map task.
The correct fix in my opinion would be to return true in case there is no sleep 
in {{barrier() method}}. Only in exception, it will return false.



> TestNNBench#testNNBenchCreateReadAndDelete failing in our internal build
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7076
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7076
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.8.0
>            Reporter: Rushabh S Shah
>            Priority: Minor
>              Labels: newbie
>
> TestNNBench#testNNBenchCreateReadAndDelete failed couple of times in our 
> internal jenkins build.
> {noformat}
> java.lang.AssertionError: create_write should create the file
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at 
> org.apache.hadoop.hdfs.TestNNBench.testNNBenchCreateReadAndDelete(TestNNBench.java:55)
> {noformat}
> Below is my analysis for why it didn't create the file.
> {code:java|title=NNBench.java|borderStyle=solid}
> // Some comments here
>   public void map(Text key, 
>             LongWritable value,
>             OutputCollector<Text, Text> output,
>             Reporter reporter) throws IOException {
>       if (barrier()) {
>         String fileName = "file_" + value;
>         if (op.equals(OP_CREATE_WRITE)) {
>           startTimeTPmS = System.currentTimeMillis();
>           doCreateWriteOp(fileName, reporter);
>         } ...
>       } else {
>         output.collect(new Text("l:latemaps"), new Text("1"));
>       }
>   // Below are the relevant parts of barrier() method
>   private boolean barrier() {
>     ..
>     // If the sleep time is greater than 0, then sleep and return
>     ...
>     LOG.info("Waiting in barrier for: " + sleepTime + " ms");
>     return retVal;
>   }
>   // Below are the relevant parts of the doCreateWriteOp
>   private void doCreateWriteOp(String name,
>                                  Reporter reporter) {
>     FSDataOutputStream out;
>     byte[] buffer = new byte[bytesToWrite];  
>     for (long l = 0l; l < numberOfFiles; l++) {
>       Path filePath = new Path(new Path(baseDir, dataDirName), 
>               name + "_" + l);
>     }
>   ....
>   }
> {code}
> This file {{BASE_DIR/data/file_0_0}} is getting created only if the map task 
> starts before the time mentioned by {{startTime}}.
>  Refer the chunk which I pasted above.
>  {{map(..)}} --> {{barrier()}} and *only if* {{barrier()}} evaluates to true 
> it will call {{doCreateWriteOp}} which will eventually create the file.
>  In test case, the delay value is 3 seconds as per {{"-startTime", "" + 
> (Time.now() / 1000 + 3)}}
>  In this failing test case, I can see the task starting minimum 6 seconds 
> after the test case started.
> {noformat}
> 2017-01-27 03:11:15,387 INFO  [Thread-4] mapreduce.JobSubmitter 
> (JobSubmitter.java:printTokens(289)) - Submitting tokens for job: 
> job_local1711545156_0001
> 2017-01-27 03:11:23,405 INFO  [Thread-4] mapreduce.Job 
> (Job.java:submit(1345)) - The url to track the job: http://localhost:8080/
> {noformat}
> Also when I run this test on my laptop, I see the following line being 
> printed.
> {noformat}
> 2017-01-27 17:09:27,982 INFO  [LocalJobRunner Map Task Executor #0] 
> hdfs.NNBench (NNBench.java:barrier(676)) - Waiting in barrier for: 1018 ms
> {noformat}
> This line will be printed only in {{barrier()}} method and I don't see this 
> line in the logs of failed test.
>  
>  In our environment, the jenkins server was very slow and it took more than 6 
> seconds to launch a map task.
>  The correct fix in my opinion would be to return true in case there is no 
> sleep in {{barrier() method}}. Only in exception, it should return false.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

Reply via email to