[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mingleizhang updated MAPREDUCE-6729:
    Description:
When running DFSIO as a distributed I/O benchmark, writing a large number of files to disk, or reading them back, can cause performance issues and imprecise results. The problem is that the existing code deletes files before running a job, which takes extra time and in turn causes performance issues, timing errors, and imprecise throughput, since there are many files. So we need to replace or improve this hack to prevent this from happening in the future.

{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  long tStart = System.currentTimeMillis();
  bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);
  runIOTest(WriteMapper.class, writeDir);
}
{code}
[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]

  was:
When running DFSIO as a distributed I/O benchmark, writing a large number of files to disk, or reading them back, can cause performance issues and imprecise results. The problem is that the existing code deletes files before running a job, which takes extra time and in turn causes performance issues, timing errors, and imprecise throughput when there are many files. So we need to replace or improve this hack to prevent this from happening in the future.

{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  long tStart = System.currentTimeMillis();
  bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);
  runIOTest(WriteMapper.class, writeDir);
}
{code}
[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]

> Accurately compute the test execute time in DFSIO
>
>                 Key: MAPREDUCE-6729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: benchmarks, performance, test
>    Affects Versions: 2.9.0
>            Reporter: mingleizhang
>            Assignee: mingleizhang
>            Priority: Minor
>              Labels: performance, test
>             Fix For: 2.8.0, 3.0.0-alpha1
>         Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch
>
> When running DFSIO as a distributed I/O benchmark, writing a large number of
> files to disk, or reading them back, can cause performance issues and
> imprecise results. The problem is that the existing code deletes files before
> running a job, which takes extra time and in turn causes performance issues,
> timing errors, and imprecise throughput, since there are many files. So we
> need to replace or improve this hack to prevent this from happening in the
> future.
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   long tStart = System.currentTimeMillis();
>   bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
> }
> {code}
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
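The timing problem described in the issue above can be illustrated with a small standalone sketch. It is not taken from TestDFSIO and is not necessarily what the attached patches do: the `Benchmark` interface and the helpers `timeCleanupAndRun` and `timeRunOnly` are hypothetical names, and `Thread.sleep` stands in for the `fs.delete` calls and the I/O job. The idea is simply to finish the cleanup before the clock starts, so that only the job itself is measured.

```java
import java.util.concurrent.TimeUnit;

public class TimedBenchmarkSketch {

    /** Hypothetical stand-in for a benchmark with separate cleanup and run phases. */
    interface Benchmark {
        void cleanup() throws Exception; // e.g. deleting old output directories
        void run() throws Exception;     // e.g. the actual I/O job
    }

    /** Mirrors the pattern quoted in the issue: cleanup happens inside the timed region. */
    static long timeCleanupAndRun(Benchmark bench) throws Exception {
        long start = System.nanoTime();
        bench.cleanup();
        bench.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Performs cleanup first, then measures only the run phase. */
    static long timeRunOnly(Benchmark bench) throws Exception {
        bench.cleanup();
        long start = System.nanoTime();
        bench.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // Sleeps simulate a slow delete and a short job; the durations are arbitrary.
        Benchmark bench = new Benchmark() {
            public void cleanup() throws Exception { Thread.sleep(200); }
            public void run() throws Exception { Thread.sleep(50); }
        };
        System.out.println("cleanup + run: " + timeCleanupAndRun(bench) + " ms");
        System.out.println("run only:      " + timeRunOnly(bench) + " ms");
    }
}
```

The sketch uses `System.nanoTime()` rather than `System.currentTimeMillis()` because it is intended for measuring elapsed time and is not affected by wall-clock adjustments; that choice is independent of where the cleanup happens.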
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Description:
When running DFSIO as a distributed I/O benchmark, writing a large number of files to disk, or reading them back, can cause performance issues and imprecise results. The problem is that the existing code deletes files before running a job, which takes extra time and in turn causes performance issues, timing errors, and imprecise throughput when there are many files. So we need to replace or improve this hack to prevent this from happening in the future.

{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  long tStart = System.currentTimeMillis();
  bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);
  runIOTest(WriteMapper.class, writeDir);
}
{code}
[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]

  was:
When running DFSIO as a distributed I/O benchmark, writing a large number of files to disk, or reading them back, can cause performance issues and imprecise results. The problem is that the existing code deletes files before running a job, which takes extra time and in turn causes performance issues, timing errors, and imprecise throughput when there are many files. So we need to replace or improve this hack to prevent this from happening in the future.

{code}
public static void testWrite() throws Exception {
  FileSystem fs = cluster.getFileSystem();
  long tStart = System.currentTimeMillis();
  bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) by the writeTest method
  long execTime = System.currentTimeMillis() - tStart;
  bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
}

private void writeTest(FileSystem fs) throws IOException {
  Path writeDir = getWriteDir(config);
  fs.delete(getDataDir(config), true);
  fs.delete(writeDir, true);
  runIOTest(WriteMapper.class, writeDir);
}
{code}
[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
Andrew Wang updated MAPREDUCE-6729:
    Flags: (was: Important)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
Akira Ajisaka updated MAPREDUCE-6729:
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.0.0-alpha2
                   2.8.0
           Status: Resolved (was: Patch Available)

Committed this to trunk, branch-2, and branch-2.8. Thanks [~mingleizhang] for the contribution and thanks [~ozawa] and [~drankye] for the reviews.
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Patch Available (was: Open)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Open (was: Patch Available)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
Kai Zheng updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729.002.patch

Re-uploaded the same patch to trigger the building.
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
Tsuyoshi Ozawa updated MAPREDUCE-6729:
    Status: Patch Available (was: Open)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Open (was: Patch Available)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729.001.patch
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment: (was: MAPREDUCE-6729.001.patch)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Open (was: Patch Available)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729.001.patch
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment:     (was: MAPREDUCE-6729-v1.patch)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729-v1.patch
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Open  (was: Patch Available)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment:     (was: MAPREDUCE-6729-v1.patch)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729-v1.patch
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Open  (was: Patch Available)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment:     (was: MAPREDUCE-6729-v1.patch)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Status: Open  (was: Patch Available)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
mingleizhang updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729-v1.patch
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MR-6729.txt) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. 
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   long tStart = System.currentTimeMillis();
>   // this line of code will cause extra time consumption
>   // because of fs.delete(*,*) by the writeTest method
>   bench.writeTest(fs);
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
>
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
> }
> {code}
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
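One way to picture the measurement problem described in the issue is to read the clock separately around the cleanup and around the job, so that only the job's elapsed time is reported. The following is a minimal, self-contained sketch under that assumption. It is not the attached patch and not code from TestDFSIO: `PhaseTimer`, `Timing`, and `run` are hypothetical names, and the two `Runnable`s merely stand in for the `fs.delete(...)` calls and `runIOTest(...)`.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper (not part of TestDFSIO): times cleanup and job separately. */
final class PhaseTimer {

    /** Elapsed time of each phase, in the units of the supplied clock. */
    static final class Timing {
        final long cleanup;
        final long job;

        Timing(long cleanup, long job) {
            this.cleanup = cleanup;
            this.job = job;
        }
    }

    /**
     * Runs cleanup, then job, reading the clock around each phase so that
     * the job's elapsed time can be reported without the cleanup cost.
     */
    static Timing run(LongSupplier clock, Runnable cleanup, Runnable job) {
        long beforeCleanup = clock.getAsLong();
        cleanup.run();
        long beforeJob = clock.getAsLong();
        job.run();
        long afterJob = clock.getAsLong();
        return new Timing(beforeJob - beforeCleanup, afterJob - beforeJob);
    }

    public static void main(String[] args) {
        // Stand-ins for fs.delete(...) and runIOTest(...): each one just sleeps.
        Timing t = run(System::currentTimeMillis,
                () -> sleep(200),
                () -> sleep(50));
        System.out.println("cleanup ms: " + t.cleanup);
        System.out.println("job ms (the value to report as execTime): " + t.job);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Passing the clock in as a `LongSupplier` is only a convenience for this sketch: it lets the split be checked with a fake clock instead of real sleeps.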
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mingleizhang updated MAPREDUCE-6729:
    Attachment: (was: MAPREDUCE-6729-v1.patch.txt)

> Accurately compute the test execute time in DFSIO
> --------------------------------------------------
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: benchmarks, performance, test
> Affects Versions: 2.9.0
> Reporter: mingleizhang
> Assignee: mingleizhang
> Priority: Minor
> Labels: performance, test
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mingleizhang updated MAPREDUCE-6729:
    Affects Version/s: 2.9.0
    Status: Patch Available (was: Open)

> Accurately compute the test execute time in DFSIO
> --------------------------------------------------
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: benchmarks, performance, test
> Affects Versions: 2.9.0
> Reporter: mingleizhang
> Assignee: mingleizhang
> Priority: Minor
> Labels: performance, test
> Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mingleizhang updated MAPREDUCE-6729:
    Attachment: MAPREDUCE-6729-v1.patch.txt

> Accurately compute the test execute time in DFSIO
> --------------------------------------------------
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: benchmarks, performance, test
> Reporter: mingleizhang
> Assignee: mingleizhang
> Priority: Minor
> Labels: performance, test
> Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Zheng updated MAPREDUCE-6729:
---------------------------------
    Summary: Accurately compute the test execute time in DFSIO (was: Hitting performance and error when lots of files to write or read)

> Accurately compute the test execute time in DFSIO
> --------------------------------------------------
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: benchmarks, performance, test
> Reporter: mingleizhang
> Assignee: mingleizhang
> Priority: Minor
> Labels: performance, test
> Attachments: MR-6729.txt