mingleizhang created MAPREDUCE-6729:
---------------------------------------

             Summary: Performance and accuracy issues when writing or reading 
many files
                 Key: MAPREDUCE-6729
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: benchmarks, performance, test
            Reporter: mingleizhang
            Priority: Minor


When running TestDFSIO as a distributed I/O benchmark, writing a large number 
of files to disk (or reading them back) causes both a performance problem and 
inaccurate results. The issue is that the existing code deletes the previous 
data directories before running the job, and this deletion happens inside the 
timed section: with many files the delete itself takes a long time, which 
inflates the measured execution time and makes the reported throughput 
imprecise. We should replace or improve this behavior so the cleanup no longer 
skews the benchmark results.

{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  long tStart = System.currentTimeMillis();
  // writeTest() deletes the output of the previous run before submitting
  // the job, so that deletion time is included in execTime below.
  bench.writeTest(fs);
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  // Clean up old data; with many files these deletes can take a long time
  // and are counted as part of the measured benchmark time.
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);
  runIOTest(WriteMapper.class, writeDir);
}
{code}

[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
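One possible direction, sketched below purely as an illustration (not a patch): 
hoist the directory cleanup out of the timed section, so that only the actual 
map/reduce I/O is measured. The method and field names simply mirror the 
excerpt above; whether the cleanup can be moved this way is an assumption.

{code}
// Sketch only (not a patch): perform the cleanup before the timer starts,
// so the delete cost is excluded from execTime. Names mirror the excerpt
// above and are assumed, not taken from an actual change.
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();

  // Delete the previous run's output before taking tStart, so removing
  // many old files no longer inflates the measured execution time.
  Path writeDir = getWriteDir(config);
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);

  long tStart = System.currentTimeMillis();
  bench.writeTest(fs);   // writeTest() would then only run the IO job
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}
{code}

Alternatively, the cleanup time could be measured and reported separately, so 
the throughput numbers reflect only the read/write phase.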






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
