[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput as the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput as the files are lots of. So we need to replace or > improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption as fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption as fs.delete(*,*) by the writeTest method long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) by the writeTest method long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption as fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Comment: was deleted (was: Thanks to all and I am glad to enjoy this time.) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400674#comment-15400674 ] mingleizhang commented on MAPREDUCE-6729: - Thanks to all and I am glad to enjoy this time. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398848#comment-15398848 ] mingleizhang commented on MAPREDUCE-6729: - [~ajisakaa] Thanks for your review and I am looking forward this commit coming soon. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374421#comment-15374421 ] mingleizhang commented on MAPREDUCE-6729: - Thanks to Kai's help. And I've done this. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Patch Available (was: Open) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Open (was: Patch Available) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Open (was: Patch Available) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MAPREDUCE-6729.001.patch > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MAPREDUCE-6729.001.patch) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Open (was: Patch Available) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Patch Available (was: Open) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MAPREDUCE-6729.001.patch > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729.001.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MAPREDUCE-6729-v1.patch) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367204#comment-15367204 ] mingleizhang commented on MAPREDUCE-6729: - I am always happy to help. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367200#comment-15367200 ] mingleizhang commented on MAPREDUCE-6729: - Thanks Tsuyoshi Ozawa for review and I will try it soon. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Patch Available (was: Open) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MAPREDUCE-6729-v1.patch > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Open (was: Patch Available) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MAPREDUCE-6729-v1.patch) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Patch Available (was: Open) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MAPREDUCE-6729-v1.patch > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Open (was: Patch Available) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MAPREDUCE-6729-v1.patch) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Patch Available (was: Open) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Status: Open (was: Patch Available) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MAPREDUCE-6729-v1.patch > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MR-6729.txt) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: (was: MAPREDUCE-6729-v1.patch.txt) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Affects Version/s: 2.9.0 Status: Patch Available (was: Open) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Affects Versions: 2.9.0 >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366117#comment-15366117 ] mingleizhang commented on MAPREDUCE-6729: - Thanks Kai Zheng for the useful comment. And I have renamed the patch name just now. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Comment: was deleted (was: Thanks @Kai Zheng for the useful comment. And I have renamed the patch name just now.) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366115#comment-15366115 ] mingleizhang commented on MAPREDUCE-6729: - Thanks @Kai Zheng for the useful comment. And I have renamed the patch name just now. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Comment: was deleted (was: Thanks Kai Zheng for the useful comment. And I have renamed the patch name just now.) > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366106#comment-15366106 ] mingleizhang commented on MAPREDUCE-6729: - Thanks Kai Zheng for the useful comment. And I have renamed the patch name just now. > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MAPREDUCE-6729-v1.patch.txt > Accurately compute the test execute time in DFSIO > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MAPREDUCE-6729-v1.patch.txt, MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Comment: was deleted (was: Thanks Kai Zheng for review this code.) > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366081#comment-15366081 ] mingleizhang commented on MAPREDUCE-6729: - Thanks Kai Zheng for review this code. > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366080#comment-15366080 ] mingleizhang commented on MAPREDUCE-6729: - Thanks Kai Zheng for review this code. > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Assignee: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Attachment: MR-6729.txt > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > Attachments: MR-6729.txt > > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365456#comment-15365456 ] mingleizhang commented on MAPREDUCE-6729: - Is there anyone wanna get this jira ? If not, I will work on this soon. > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) by the writeTest method long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) by the writeTest method > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. We need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time consumption because of fs.delete(*,*) long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time because of fs.delete(*,*) long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time > consumption because of fs.delete(*,*) > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); // this line of code will cause extra time because of fs.delete(*,*) long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. So we need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. So we need to replace > or improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); // this line of code will cause extra time because > of fs.delete(*,*) > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput while the files are lots of. We need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as a distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput for us. We need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as a distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput while the files are lots of. We need to replace or > improve this hack to prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-6729: Description: When doing DFSIO test as distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause extra time consumption and furthermore cause performance issue, statistical time error and imprecise throughput for us. We need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] was: When doing DFSIO test as distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause time consumption and furthermore cause performance issue, statistical time error and imprecise throughput for us. We need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] > Hitting performance and error when lots of files to write or read > - > > Key: MAPREDUCE-6729 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: benchmarks, performance, test >Reporter: mingleizhang >Priority: Minor > Labels: performance, test > > When doing DFSIO test as distributed i/o benchmark tool. Then especially > writes plenty of files to disk or read from, both can cause performance issue > and imprecise value in a way. The question is that existing practices needs > to delete files when before running a job and that will cause extra time > consumption and furthermore cause performance issue, statistical time error > and imprecise throughput for us. We need to replace or improve this hack to > prevent this from happening in the future. > {code} > public static void testWrite() throws Exception { > FileSystem fs = cluster.getFileSystem(); > long tStart = System.currentTimeMillis(); > bench.writeTest(fs); > long execTime = System.currentTimeMillis() - tStart; > bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); > } > private void writeTest(FileSystem fs) throws IOException { > Path writeDir = getWriteDir(config); > fs.delete(getDataDir(config), true); > fs.delete(writeDir, true); > runIOTest(WriteMapper.class, writeDir); > } > {code} > [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read
mingleizhang created MAPREDUCE-6729: --- Summary: Hitting performance and error when lots of files to write or read Key: MAPREDUCE-6729 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729 Project: Hadoop Map/Reduce Issue Type: Improvement Components: benchmarks, performance, test Reporter: mingleizhang Priority: Minor When doing DFSIO test as distributed i/o benchmark tool. Then especially writes plenty of files to disk or read from, both can cause performance issue and imprecise value in a way. The question is that existing practices needs to delete files when before running a job and that will cause time consumption and furthermore cause performance issue, statistical time error and imprecise throughput for us. We need to replace or improve this hack to prevent this from happening in the future. {code} public static void testWrite() throws Exception { FileSystem fs = cluster.getFileSystem(); long tStart = System.currentTimeMillis(); bench.writeTest(fs); long execTime = System.currentTimeMillis() - tStart; bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime); } private void writeTest(FileSystem fs) throws IOException { Path writeDir = getWriteDir(config); fs.delete(getDataDir(config), true); fs.delete(writeDir, true); runIOTest(WriteMapper.class, writeDir); } {code} [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-3919) Redirecting to job history server takes hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang updated MAPREDUCE-3919: Assignee: (was: mingleizhang) > Redirecting to job history server takes hours > - > > Key: MAPREDUCE-3919 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3919 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.0 >Reporter: Daniel Dai >Priority: Critical > > Saw the following message happening regularly, the job end up success, but > reconnecting job history server takes a long time (>10 hours sometimes). > 2012-02-24 03:49:05,226 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: hrt11n31.cc1.ygridcore.net/98.137.234.159:44716. Already > tried 0 time(s). > 2012-02-24 03:49:05,229 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 03:49:06,233 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 03:49:07,236 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s). > 2012-02-24 03:49:08,239 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s). > 2012-02-24 03:49:09,242 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 03:49:10,245 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 03:49:11,248 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s). > 2012-02-24 03:49:12,251 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s). > 2012-02-24 03:49:13,254 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 03:49:14,257 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 03:49:15,260 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > .. > 2012-02-24 18:10:35,711 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:36,714 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 18:10:37,717 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:38,784 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).2012-02-24 > 18:10:39,787 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 18:10:40,791 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 18:10:41,793 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s).2012-02-24 > 18:10:42,796 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s).2012-02-24 > 18:10:43,799 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 18:10:44,802 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 18:10:45,805 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > 2012-02-24 18:10:45,808 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:46,810 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).2012-02-24 > 18:10:47,813 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:48,815 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 2
[jira] [Assigned] (MAPREDUCE-3919) Redirecting to job history server takes hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mingleizhang reassigned MAPREDUCE-3919: --- Assignee: mingleizhang > Redirecting to job history server takes hours > - > > Key: MAPREDUCE-3919 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3919 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.0 >Reporter: Daniel Dai >Assignee: mingleizhang >Priority: Critical > > Saw the following message happening regularly, the job end up success, but > reconnecting job history server takes a long time (>10 hours sometimes). > 2012-02-24 03:49:05,226 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: hrt11n31.cc1.ygridcore.net/98.137.234.159:44716. Already > tried 0 time(s). > 2012-02-24 03:49:05,229 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 03:49:06,233 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 03:49:07,236 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s). > 2012-02-24 03:49:08,239 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s). > 2012-02-24 03:49:09,242 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 03:49:10,245 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 03:49:11,248 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s). > 2012-02-24 03:49:12,251 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s). > 2012-02-24 03:49:13,254 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 03:49:14,257 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 03:49:15,260 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > .. > 2012-02-24 18:10:35,711 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:36,714 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 18:10:37,717 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:38,784 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).2012-02-24 > 18:10:39,787 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 18:10:40,791 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 18:10:41,793 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s).2012-02-24 > 18:10:42,796 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s).2012-02-24 > 18:10:43,799 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 18:10:44,802 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 18:10:45,805 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > 2012-02-24 18:10:45,808 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:46,810 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).2012-02-24 > 18:10:47,813 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:48,815 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server:
[jira] [Commented] (MAPREDUCE-3919) Redirecting to job history server takes hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742323#comment-14742323 ] mingleizhang commented on MAPREDUCE-3919: - I have the same question. How do you solve it ? > Redirecting to job history server takes hours > - > > Key: MAPREDUCE-3919 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3919 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.0 >Reporter: Daniel Dai >Priority: Critical > > Saw the following message happening regularly, the job end up success, but > reconnecting job history server takes a long time (>10 hours sometimes). > 2012-02-24 03:49:05,226 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: hrt11n31.cc1.ygridcore.net/98.137.234.159:44716. Already > tried 0 time(s). > 2012-02-24 03:49:05,229 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 03:49:06,233 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 03:49:07,236 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s). > 2012-02-24 03:49:08,239 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s). > 2012-02-24 03:49:09,242 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 03:49:10,245 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 03:49:11,248 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s). > 2012-02-24 03:49:12,251 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s). > 2012-02-24 03:49:13,254 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 03:49:14,257 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 03:49:15,260 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > .. > 2012-02-24 18:10:35,711 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:36,714 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 18:10:37,717 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:38,784 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).2012-02-24 > 18:10:39,787 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 18:10:40,791 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 18:10:41,793 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s).2012-02-24 > 18:10:42,796 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s).2012-02-24 > 18:10:43,799 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 18:10:44,802 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 18:10:45,805 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > 2012-02-24 18:10:45,808 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:46,810 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).2012-02-24 > 18:10:47,813 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:48,815 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to >
[jira] [Commented] (MAPREDUCE-3919) Redirecting to job history server takes hours
[ https://issues.apache.org/jira/browse/MAPREDUCE-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742324#comment-14742324 ] mingleizhang commented on MAPREDUCE-3919: - I have the same question. How do you solve it ? > Redirecting to job history server takes hours > - > > Key: MAPREDUCE-3919 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3919 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 0.23.0 >Reporter: Daniel Dai >Priority: Critical > > Saw the following message happening regularly, the job end up success, but > reconnecting job history server takes a long time (>10 hours sometimes). > 2012-02-24 03:49:05,226 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: hrt11n31.cc1.ygridcore.net/98.137.234.159:44716. Already > tried 0 time(s). > 2012-02-24 03:49:05,229 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 03:49:06,233 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 03:49:07,236 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s). > 2012-02-24 03:49:08,239 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s). > 2012-02-24 03:49:09,242 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 03:49:10,245 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 03:49:11,248 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s). > 2012-02-24 03:49:12,251 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s). > 2012-02-24 03:49:13,254 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 03:49:14,257 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 03:49:15,260 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > .. > 2012-02-24 18:10:35,711 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:36,714 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s). > 2012-02-24 18:10:37,717 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:38,784 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).2012-02-24 > 18:10:39,787 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s). > 2012-02-24 18:10:40,791 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s). > 2012-02-24 18:10:41,793 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s).2012-02-24 > 18:10:42,796 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s).2012-02-24 > 18:10:43,799 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s). > 2012-02-24 18:10:44,802 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s). > 2012-02-24 18:10:45,805 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s). > 2012-02-24 18:10:45,808 [main] INFO > org.apache.hadoop.mapred.ClientServiceDelegate - Application state is > completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server > 2012-02-24 18:10:46,810 [main] INFO org.apache.hadoop.ipc.Client - Retrying > connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).2012-02-24 > 18:10:47,813 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to > server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).2012-02-24 > 18:10:48,815 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to >