Edwina Lu created MAPREDUCE-6913:
------------------------------------

             Summary: TestDFSIO: error if file size is >= 2G for random read
                 Key: MAPREDUCE-6913
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6913
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: benchmarks
    Affects Versions: 2.7.4
            Reporter: Edwina Lu


For the TestDFSIO benchmark, if the test files created are 2 GB or larger:

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4.1-tests.jar \
  TestDFSIO -Dtest.build.data=/user/edlu/DFSIO-8 -write -nrFiles 1024 -size 2GB

And TestDFSIO is run with options "-read -random":

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4.1-tests.jar \
  TestDFSIO -Dtest.build.data=/user/edlu/DFSIO-8 -read -random -nrFiles 1024 -size 1GB

Then the following error is raised:

17/07/14 21:20:55 INFO mapreduce.Job: Task Id : attempt_1496991431717_9344_m_000226_0, Status : FAILED
Error: java.lang.IllegalArgumentException: bound must be positive
        at java.util.Random.nextInt(Random.java:388)
        at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.nextOffset(TestDFSIO.java:615)
        at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:594)
        at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:560)
        at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:134)
        at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

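The code casts fileSize to int when generating a random offset. For a file of 
2 GB (2^31 bytes) or larger the cast wraps around and can produce a zero or 
negative bound, which Random.nextInt rejects. It should generate a random long 
instead: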

    /**
     * Get next offset for reading.
     * If current < 0 then choose initial offset according to the read type.
     *
     * @param current offset
     * @return
     */
    private long nextOffset(long current) {
      if(skipSize == 0)
        return rnd.nextInt((int)(fileSize));
      if(skipSize > 0)
        return (current < 0) ? 0 : (current + bufferSize + skipSize);
      // skipSize < 0
      return (current < 0) ? Math.max(0, fileSize - bufferSize) :
                             Math.max(0, current + skipSize);
    }
  }
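
A minimal, standalone sketch of the problem and one possible fix. It is not 
the committed patch: the class name RandomOffsetSketch and both helper methods 
are hypothetical, and using ThreadLocalRandom instead of the mapper's existing 
rnd field is an assumption made only to keep the example short.

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomOffsetSketch {

  /** Reproduces the cast from nextOffset(): wraps for sizes >= 2^31. */
  static int castToInt(long fileSize) {
    return (int) fileSize;
  }

  /**
   * Hypothetical replacement for the skipSize == 0 branch:
   * a uniformly distributed offset in [0, fileSize).
   * fileSize must be positive.
   */
  static long nextRandomOffset(long fileSize) {
    return ThreadLocalRandom.current().nextLong(fileSize);
  }

  public static void main(String[] args) {
    long twoGB = 2L * 1024 * 1024 * 1024; // 2^31 bytes

    // The int cast of 2^31 is Integer.MIN_VALUE, an invalid bound for nextInt.
    System.out.println("cast bound: " + castToInt(twoGB));

    // A long-based bound stays positive, so the offset is always in range.
    long offset = nextRandomOffset(twoGB);
    System.out.println("in range: " + (offset >= 0 && offset < twoGB));
  }
}
```

Note that the cast does not always fail loudly: a 5 GB file would yield a 
positive int bound of 1 GB, so random reads would silently cover only the first 
part of the file.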




