[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2018-04-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450753#comment-16450753
 ] 

Hudson commented on HADOOP-14407:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
HADOOP-14407. DistCp - Introduce a configurable copy buffer size. (Omkar (xyao: rev 1252aa37811892a269f3feb298cf66faee81d9c0)
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpContext.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptionSwitch.java
* (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpOptions.java
* (edit) hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm


> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> HADOOP-14407.002.patch, HADOOP-14407.003.patch, 
> HADOOP-14407.004.branch2.patch, HADOOP-14407.004.patch, 
> HADOOP-14407.004.patch, HADOOP-14407.branch2.002.patch, 
> TotalTime-vs-CopyBufferSize.jpg
>
>
> Currently, RetriableFileCopyCommand has a fixed copy buffer size of just
> 8 KB. In our performance tests, larger buffer sizes gave up to a ~3x
> performance boost. Hence, this change makes the copy buffer size
> configurable via a new parameter.
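
For illustration, the role of the buffer size in a stream copy loop can be sketched as below. This is a minimal sketch under stated assumptions, not the DistCp code: the class name CopyBufferSketch, the method copy, and the constant DEFAULT_BUFFER_SIZE are hypothetical, and only the 8 KB figure comes from the issue description.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical illustration; not the RetriableFileCopyCommand implementation. */
final class CopyBufferSketch {
  /** 8 KB, the fixed size mentioned in the issue description. */
  static final int DEFAULT_BUFFER_SIZE = 8 * 1024;

  private CopyBufferSketch() {
  }

  /**
   * Copies everything from in to out using a buffer of bufferSize bytes.
   * A larger buffer means fewer read/write calls for the same amount of data.
   *
   * @return the number of bytes copied
   */
  static long copy(InputStream in, OutputStream out, int bufferSize)
      throws IOException {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    byte[] buffer = new byte[bufferSize];
    long total = 0;
    int read;
    // read() returns -1 at end of stream; otherwise the count of bytes read.
    while ((read = in.read(buffer)) != -1) {
      out.write(buffer, 0, read);
      total += read;
    }
    return total;
  }
}
```

A caller would pass DEFAULT_BUFFER_SIZE to keep the old behaviour, or a larger value (for example 64 * 1024) to trade memory for fewer I/O calls. Whether that helps in practice depends on the source and target file systems.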



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-25 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024882#comment-16024882
 ] 

Yongjun Zhang commented on HADOOP-14407:


Welcome [~omkarksa]. 

Thanks [~ste...@apache.org], I already committed to branch-2 yesterday.





[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-25 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024791#comment-16024791
 ] 

Steve Loughran commented on HADOOP-14407:
-

Re-opened the issue while a branch-2 version is done. Alternatively, create a 
new JIRA, "backport HADOOP-14407 to branch-2", and work on things there.

We now have distcp tests for S3 and Azure; it'd be good to run those, since 
Yetus doesn't run them automatically.




[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-24 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024138#comment-16024138
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


Thanks [~yzhangal].




[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-24 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024063#comment-16024063
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa],

Thanks for looking into it, and good point! This helped me find a similar 
problem with HADOOP-11794's branch-2 patch. In trunk, the test lives in 
TestOptionParser; in branch-2, however, a similar test lives in 
TestDistCpOptions.

I just updated the branch-2 patches for both HADOOP-11794 and HADOOP-14407, so 
the issues are resolved.

Thanks.





[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-24 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022643#comment-16022643
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


[~yzhangal] Thanks for pointing this out! I have a couple of questions:
1. "/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpOptions.java" 
doesn't have "blocksPerChunk=0", so how was it passing earlier?
2. Can I re-open this issue and add the patch here?
3. What is the procedure for testing branch-2 patches? I see a "-1" from Hadoop 
QA for the branch-2 patch above!






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022336#comment-16022336
 ] 

Yongjun Zhang commented on HADOOP-14407:


Welcome [~omkarksa].

My bad, did not catch an issue in your branch-2 patch in time.
{code}
--
Running org.apache.hadoop.tools.TestDistCpOptions
Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec <<< FAILURE! - in org.apache.hadoop.tools.TestDistCpOptions
testToString(org.apache.hadoop.tools.TestDistCpOptions)  Time elapsed: 0.018 sec  <<< FAILURE!
org.junit.ComparisonFailure: expected:<..., filtersFile='null'[]}> but was:<..., filtersFile='null'[, blocksPerChunk=0, copyBufferSize=8192]}>
at org.junit.Assert.assertEquals(Assert.java:115)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.hadoop.tools.TestDistCpOptions.testToString(TestDistCpOptions.java:317)
{code}

TestDistCpOptions.java is somehow missing from the branch-2 patch. Would you 
please create a new JIRA for branch-2 only and submit the patch as soon as 
possible?

Thanks.
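
The failure above is the kind that appears when toString() output gains new fields while a test's expected string is left unchanged. A minimal sketch of that mismatch follows, assuming a hypothetical class OptionsToStringSketch (not the real DistCpOptions) that carries only the three fields visible in the failure message:

```java
/** Hypothetical stand-in for an options class; not the real DistCpOptions. */
final class OptionsToStringSketch {
  private final String filtersFile;
  private final int blocksPerChunk;
  private final int copyBufferSize;

  OptionsToStringSketch(String filtersFile, int blocksPerChunk,
      int copyBufferSize) {
    this.filtersFile = filtersFile;
    this.blocksPerChunk = blocksPerChunk;
    this.copyBufferSize = copyBufferSize;
  }

  /** Old-style rendering, before the two new fields were appended. */
  String toStringWithoutNewFields() {
    return "OptionsToStringSketch{filtersFile='" + filtersFile + "'}";
  }

  /** New-style rendering: the added fields change the exact string. */
  @Override
  public String toString() {
    return "OptionsToStringSketch{filtersFile='" + filtersFile + "'"
        + ", blocksPerChunk=" + blocksPerChunk
        + ", copyBufferSize=" + copyBufferSize + "}";
  }
}
```

A test that compares toString() against a hard-coded string has to be updated in the same patch that adds the fields; otherwise an exact-match assertion such as assertEquals fails in the way the log shows.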







[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-21 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019155#comment-16019155
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


Thanks [~yzhangal] for the quick reviews and commits.




[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-19 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018325#comment-16018325
 ] 

Yongjun Zhang commented on HADOOP-14407:


Committed to trunk and branch-2. Many thanks to [~omkarksa] for the 
contribution!







[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017773#comment-16017773
 ] 

Hadoop QA commented on HADOOP-14407:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 10s{color} | {color:red} HADOOP-14407 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14407 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12869020/HADOOP-14407.004.branch2.patch |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12363/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-19 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017770#comment-16017770
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


[~yzhangal] I have backported the feature to branch-2 and uploaded the patch 
(HADOOP-14407.004.branch2.patch). Please do the needful.




[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-19 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017038#comment-16017038
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


{quote}
I committed to trunk. When trying to backport to branch-2, I saw quite a few 
conflicts. Would you please help with a branch-2 version, and any other 
branches you prefer?
{quote}

[~yzhangal] Thanks. OK, I will work on the branch-2 patch today.





[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016644#comment-16016644
 ] 

Hudson commented on HADOOP-14407:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11752 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11752/])
HADOOP-14407. DistCp - Introduce a configurable copy buffer size. (Omkar (yzhang: rev b4adc8392c1314d6d6fbdd00f2afb306ef20a650)
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
* (edit) hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptionSwitch.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpContext.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
* (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpOptions.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java





[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-18 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016606#comment-16016606
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa],

I committed to trunk. When trying to backport to branch-2, I saw quite a few 
conflicts. Would you please help with a branch-2 version, and any other 
branches you prefer?

Thanks much.





[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-18 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016569#comment-16016569
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa],

Thanks for the updated patch. +1 and I will commit soon.






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016171#comment-16016171
 ] 

Hadoop QA commented on HADOOP-14407:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} hadoop-tools/hadoop-distcp: The patch generated 0 new + 52 unchanged - 2 fixed = 52 total (was 54) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 42s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 7s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14407 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12868796/HADOOP-14407.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux ca75e484de72 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 40e6a85 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12359/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12359/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015369#comment-16015369
 ] 

Hadoop QA commented on HADOOP-14407:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} hadoop-tools/hadoop-distcp: The patch generated 0 new + 51 unchanged - 2 fixed = 51 total (was 53) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
54s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14407 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12868693/HADOOP-14407.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 25e4f3c9034e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b46cd31 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12353/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12353/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> 

[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015234#comment-16015234
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa],

Thanks [~ajisakaa] for pointing out that the build was using the "jpg" file you 
uploaded as the patch, which is why the test failed. I uploaded the patch rev2 
again, and it triggered a new run. The result shows the checkstyle issues I 
commented on earlier.

https://builds.apache.org/job/PreCommit-HADOOP-Build/12351/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-distcp.txt

Would you please address them and upload a new rev?

Thanks.




> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> HADOOP-14407.002.patch, TotalTime-vs-CopyBufferSize.jpg
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015158#comment-16015158
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


{quote}
Thanks for the updated patch Omkar Aradhya K S, are you guys still looking into 
setting input and output buffer to different size? Or any chance we need to do 
that in the future?
{quote}
[~yzhangal] Thanks for checking the patch. As explained in the previous comment, 
we don't need to make this change, since even a small copybuffersize can give a 
huge boost in performance.

{quote}
Somehow your submitting the patch did not trigger a jenkins test, maybe there 
is an infra issue.
{quote}
Thanks for pointing this out. Yes, I also waited for quite some time, but there 
was no result! Do I need any additional permissions for this?
Also, can you point out how exactly you triggered the build? Maybe I missed 
something?

> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> HADOOP-14407.002.patch, TotalTime-vs-CopyBufferSize.jpg
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015124#comment-16015124
 ] 

Hadoop QA commented on HADOOP-14407:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m  7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 12s{color} | {color:orange} hadoop-tools/hadoop-distcp: The patch generated 5 new + 52 unchanged - 1 fixed = 57 total (was 53) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 13s{color} | {color:green} hadoop-distcp in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 58s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14407 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12868655/HADOOP-14407.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux d540388b81b0 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ef9e536 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/12351/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-distcp.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12351/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12351/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>   

[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014635#comment-16014635
 ] 

Yongjun Zhang commented on HADOOP-14407:


I triggered the jenkins test manually at 
https://builds.apache.org/job/PreCommit-HADOOP-Build//12347.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> TotalTime-vs-CopyBufferSize.jpg
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014519#comment-16014519
 ] 

Yongjun Zhang commented on HADOOP-14407:


Thanks [~steve_l]. 

{quote}
...that's something whoever commits it needs to do
{quote}
I personally think it's better for the developer to clean it up so the jenkins 
test reports a clean result. Especially for new contributors, once they are 
aware of this, they can easily do so before submitting the patch. It'd be nice 
if the last patch rev attached to the jira is what gets committed without change.

Thanks for the updated patch [~omkarksa], are you guys still looking into 
setting input and output buffer to different size? Or any chance we need to do 
that in the future? Somehow your submitting the patch did not trigger a jenkins 
test, maybe there is an infra issue.
 




> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> TotalTime-vs-CopyBufferSize.jpg
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014375#comment-16014375
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


{quote}
Also, we will do more investigations into introducing both input and output 
copybuffersize configurations.
{quote}
We did a small set of runs to see at what copybuffersize the performance gain 
levels off:
!TotalTime-vs-CopyBufferSize.jpg!
In this case, with copybuffersize set to just 128KB, we get >3x performance!

{quote}
If there is benefit of doing this I will submit a new patch with the changes or 
else we will go ahead with this patch.
{quote}
Since even a small increase in the copybuffersize gives the desired performance, 
we will not need two separate copybuffersize settings for input and output.
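
To make the single-buffer idea concrete, here is a minimal Java sketch of a stream copy loop whose buffer size is a parameter. It is not the RetriableFileCopyCommand code: the class name CopyBufferSketch and the method copy are hypothetical, and only the 8KB default and the 128KB value come from this thread. One buffer serves both the read and the write side, and a larger buffer means fewer read/write calls for the same number of bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical illustration only; not the DistCp implementation. */
final class CopyBufferSketch {
    /** The fixed size mentioned in the issue description (8KB). */
    static final int DEFAULT_BUFFER_SIZE = 8 * 1024;

    private CopyBufferSketch() {}

    /**
     * Copies all bytes from in to out through one shared buffer.
     * Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException(
                "bufferSize must be positive: " + bufferSize);
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 at end of stream; each pass moves up to
        // bufferSize bytes, so bigger buffers need fewer passes.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Calling CopyBufferSketch.copy(in, out, 128 * 1024) would correspond to the 128KB setting discussed above; the measured speedup itself depends on the storage and network involved and is not reproduced by this sketch.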

> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch, 
> TotalTime-vs-CopyBufferSize.jpg
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013830#comment-16013830
 ] 

Steve Loughran commented on HADOOP-14407:
-

(I wouldn't worry about trailing whitespace...that's something whoever commits 
it needs to do)

git alias for a "git apx" command to do the merge
{code}
alias.apx=apply -3 --verbose --whitespace=fix
{code}

> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-17 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013785#comment-16013785
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


[~yzhangal] Thanks for reviewing. Please find attached the new patch with the 
fixes.

> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch, HADOOP-14407.002.patch
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-16 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013280#comment-16013280
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa],

The patch looks good, except for a few cosmetic things.

- The patch failed to apply. Try using "git diff --no-prefix HEAD~ " to 
generate the patch.
- There is some trailing whitespace; use "git apply --whitespace=fix" to remove it.
- DistCpOptions.java has a line exceeding 80 chars.

Thanks.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-15 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011736#comment-16011736
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa], 

Sorry for the delayed review. Would you please update the patch to fix the 
issue reported here:
https://issues.apache.org/jira/browse/HADOOP-14407?focusedCommentId=16011312=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16011312

Thanks.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011312#comment-16011312
 ] 

Hadoop QA commented on HADOOP-14407:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} | {color:red} HADOOP-14407 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14407 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12867756/HADOOP-14407.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12319/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-12 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007968#comment-16007968
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


[~yzhangal] I have submitted the patch. Could you please check it 
(HADOOP-14407.001.patch)?
Also, we will investigate further whether to introduce separate input and output 
copybuffersize configurations.
If there is a benefit to doing this, I will submit a new patch with the changes; 
otherwise we will go ahead with this patch.

> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14407.001.patch
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-11 Thread Omkar Aradhya K S (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007606#comment-16007606
 ] 

Omkar Aradhya K S commented on HADOOP-14407:


Thanks [~yzhangal]. 
We found that we don't need 2 buffers. 
Just making the existing copy buffer size configurable should do. 
I will submit the patch soon.
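
As a rough illustration of what "configurable" could mean here, the Java sketch below resolves a buffer size from a settings object and falls back to the current 8KB when nothing is set. Everything named in it is an assumption: the class CopyBufferSizeConfigSketch, the property key example.copy.buffer.size, and the choice to reject non-positive values are hypothetical and are not taken from the patch, which uses Hadoop's own configuration and option classes.

```java
import java.util.Properties;

/** Hypothetical illustration only; not the code from the patch. */
final class CopyBufferSizeConfigSketch {
    /** Made-up key for this example; the real parameter name is not shown here. */
    static final String HYPOTHETICAL_KEY = "example.copy.buffer.size";

    /** The fixed size mentioned in the issue description (8KB). */
    static final int DEFAULT_COPY_BUFFER_SIZE = 8 * 1024;

    private CopyBufferSizeConfigSketch() {}

    /** Returns the configured size in bytes, or the 8KB default if unset. */
    static int resolve(Properties props) {
        String raw = props.getProperty(HYPOTHETICAL_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_COPY_BUFFER_SIZE;
        }
        // parseInt throws NumberFormatException on non-numeric input.
        int size = Integer.parseInt(raw.trim());
        if (size <= 0) {
            // Assumption: a non-positive size is treated as a user error.
            throw new IllegalArgumentException(
                "copy buffer size must be positive: " + size);
        }
        return size;
    }
}
```

With no property set, resolve returns 8192, so existing behaviour would be unchanged unless a user opts in to a larger buffer.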

> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Omkar Aradhya K S
> Fix For: 2.9.0, 3.0.0-alpha3
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .






[jira] [Commented] (HADOOP-14407) DistCp - Introduce a configurable copy buffer size

2017-05-10 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004854#comment-16004854
 ] 

Yongjun Zhang commented on HADOOP-14407:


Hi [~omkarksa], thanks for the good finding. Do we want to have separate configs 
for the input and output buffer sizes? In your test, did you use the same buffer 
size for both?

Thanks.



> DistCp - Introduce a configurable copy buffer size
> --
>
> Key: HADOOP-14407
> URL: https://issues.apache.org/jira/browse/HADOOP-14407
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Omkar Aradhya K S
>Assignee: Yongjun Zhang
> Fix For: 2.9.0, 3.0.0-alpha3
>
>
> Currently, the RetriableFileCopyCommand has a fixed copy buffer size of just 
> 8KB. We have noticed in our performance tests that with bigger buffer sizes 
> we saw upto ~3x performance boost. Hence, making the copy buffer size a 
> configurable setting via the new parameter .


