mingleizhang created MAPREDUCE-6729:
---------------------------------------
Summary: Performance degradation and measurement error when writing
or reading many files
Key: MAPREDUCE-6729
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: benchmarks, performance, test
Reporter: mingleizhang
Priority: Minor
When running the DFSIO test as a distributed I/O benchmark, writing a large
number of files to disk, or reading them back, causes both a performance
problem and imprecise results. The root cause is that the existing
implementation deletes the leftover files from a previous run inside the timed
section of the job; that deletion consumes time, which inflates the measured
execution time and produces statistical timing error and imprecise throughput
figures. We should replace or improve this hack to prevent this from happening
in the future.
{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  long tStart = System.currentTimeMillis();
  // The deletes inside writeTest() run after the timer has started,
  // so their cost is counted in execTime.
  bench.writeTest(fs);
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  fs.delete(getDataDir(config), true); // recursive delete of previous data files
  fs.delete(writeDir, true);           // recursive delete of previous results
  runIOTest(WriteMapper.class, writeDir);
}
{code}
[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
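A minimal sketch of one possible direction, not a committed patch: hoist the
cleanup out of the timed region so that execTime reflects only the I/O job
itself. The {{cleanupWriteDirs}} helper below is hypothetical; {{bench}},
{{cluster}}, {{config}}, {{getDataDir}}, {{getWriteDir}}, and {{runIOTest}} are
the existing TestDFSIO members shown above.
{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  // Untimed: remove leftovers from a previous run before the clock starts.
  bench.cleanupWriteDirs(fs);
  long tStart = System.currentTimeMillis();
  // Timed: only the actual write job contributes to execTime now.
  bench.writeTest(fs);
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

// Hypothetical helper holding the deletes previously inside writeTest().
private void cleanupWriteDirs(FileSystem fs) throws IOException {
  fs.delete(getDataDir(config), true);
  fs.delete(getWriteDir(config), true);
}

private void writeTest(FileSystem fs) throws IOException {
  runIOTest(WriteMapper.class, getWriteDir(config));
}
{code}
An alternative would be to time the deletion separately and subtract it from
execTime, but hoisting the cleanup keeps analyzeResult() unchanged.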