[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367340#comment-15367340
 ] 

ASF GitHub Bot commented on MAPREDUCE-6729:
-------------------------------------------

GitHub user zhangminglei opened a pull request:

    https://github.com/apache/hadoop/pull/112

    MAPREDUCE-6729. Accurately compute the test execute time in DFSIO

    Update the GitHub-side PR so that it works correctly.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhangminglei/hadoop trunk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/112.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #112
    
----
commit 2a295d0a1e80df0f9153b7600ff3f38b7c3faee5
Author: zhangminglei <[email protected]>
Date:   2016-07-08T03:29:04Z

    MAPREDUCE-6729. Accurately compute the test execute time in DFSIO

----


> Accurately compute the test execute time in DFSIO
> -------------------------------------------------
>
>                 Key: MAPREDUCE-6729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: benchmarks, performance, test
>    Affects Versions: 2.9.0
>            Reporter: mingleizhang
>            Assignee: mingleizhang
>            Priority: Minor
>              Labels: performance, test
>         Attachments: MAPREDUCE-6729.001.patch
>
>
> DFSIO is used as a distributed I/O benchmark tool. When a test writes or 
> reads a large number of files, the reported results can be imprecise. The 
> problem is that the current implementation deletes existing files before 
> running the job, and it does so after the timer has started. The deletion 
> adds to the measured execution time, and when there are many files it 
> distorts both the reported time and the computed throughput. We should 
> replace or improve this approach so that cleanup time is no longer counted.
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   long tStart = System.currentTimeMillis();
>   // writeTest() calls fs.delete(*,*), so the deletion time is included in execTime
>   bench.writeTest(fs);
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
>
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
> }
> {code} 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
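
One possible way to address the problem described above is to perform the cleanup before the timer starts, so that only the workload is measured. The following is a minimal, self-contained sketch of that idea and is not the actual patch: `TimedRegionSketch`, `Step`, and `timeRunOnly` are hypothetical names introduced for illustration, and the injected clock stands in for `System.currentTimeMillis()` so the behavior can be checked without a cluster.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration only; none of these names exist in TestDFSIO.
public class TimedRegionSketch {

    /** A unit of work that may throw, e.g. a cleanup step or a benchmark job. */
    public interface Step {
        void run() throws Exception;
    }

    /**
     * Runs cleanup first, then times only the workload.
     * In TestDFSIO terms, cleanup would correspond to the fs.delete(...) calls
     * and workload to runIOTest(...); that mapping is an assumption.
     */
    public static long timeRunOnly(Step cleanup, Step workload, LongSupplier clockMillis)
            throws Exception {
        cleanup.run();                           // outside the timed region
        long tStart = clockMillis.getAsLong();   // timer starts after cleanup
        workload.run();
        return clockMillis.getAsLong() - tStart;
    }

    public static void main(String[] args) throws Exception {
        // Fake clock: each step advances "time" by a fixed amount.
        long[] fakeNow = {0};
        long execTime = timeRunOnly(
                () -> fakeNow[0] += 500,   // pretend cleanup takes 500 ms
                () -> fakeNow[0] += 100,   // pretend the workload takes 100 ms
                () -> fakeNow[0]);
        System.out.println("execTime = " + execTime + " ms"); // prints "execTime = 100 ms"
    }
}
```

With the ordering shown in the issue description (timer started before the deletes), the same fake clock would report 600 ms instead of 100 ms, which is the kind of overcount the issue describes.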



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
