[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365139#comment-16365139
 ] 

Aaron Fabbri commented on HADOOP-15076:
---

{noformat}
It is very easy to run out of memory when buffering to disk; the option
+`fs.s3a.fast.upload.active.blocks"` exists to tune how many active blocks
+a single output stream writing to S3 may have queued at a time.
{noformat}

s/when buffering to disk/when buffering to memory/, right?

Other than that minor typo, +1 LGTM (I'll leave running `mvn site` and 
checking links and formatting to you; I need to run).
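
To put rough numbers on the memory concern in the quoted text, here is a minimal Python sketch. It assumes every queued block is held fully in memory and ignores any other overhead. The function name, the 64 MB block size, and the stream count are illustrative assumptions, not values taken from the patch and not Hadoop defaults.

```python
MB = 1024 * 1024


def worst_case_buffer_bytes(block_size_bytes: int,
                            active_blocks: int,
                            open_streams: int) -> int:
    """Rough upper bound on memory held by queued upload blocks.

    Assumption: each output stream may hold `active_blocks` blocks of
    `block_size_bytes` in memory at once (the limit that
    fs.s3a.fast.upload.active.blocks is described as tuning).
    """
    return block_size_bytes * active_blocks * open_streams


# Hypothetical workload: 8 streams, 4 active blocks each, 64 MB blocks.
estimate = worst_case_buffer_bytes(64 * MB, active_blocks=4, open_streams=8)
print(estimate // MB, "MB")
```

Under these assumptions the product grows linearly in each factor, so lowering the active block count is one direct way to cap the per-stream footprint.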



> Enhance s3a troubleshooting docs, add perf section
> --
>
> Key: HADOOP-15076
> URL: https://issues.apache.org/jira/browse/HADOOP-15076
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 2.8.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15076-001.patch, HADOOP-15076-002.patch, 
> HADOOP-15076-003.patch, HADOOP-15076-004.patch, HADOOP-15076-005.patch, 
> HADOOP-15076-006.patch
>
>
> A recurrent theme in s3a-related JIRAs, support calls etc. is "tried upgrading 
> the AWS SDK JAR and then I got the error ...". We know here "don't do that", 
> but it's not something immediately obvious to lots of downstream users who 
> want to be able to drop in the new JAR to fix things/add new features.
> We need to spell this out quite clearly: "you cannot safely expect to do 
> this. If you want to upgrade the SDK, you will need to rebuild the whole of 
> hadoop-aws with the maven POM updated to the latest version, ideally 
> rerunning all the tests to make sure something hasn't broken."
> Maybe near the top of the index.md file, along with "never share your AWS 
> credentials with anyone".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365103#comment-16365103
 ] 

Wangda Tan commented on HADOOP-15076:
-

Looks good, thanks [~ste...@apache.org], please feel free to commit whenever 
you think it's ready.




[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-14 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363773#comment-16363773
 ] 

Steve Loughran commented on HADOOP-15076:
-

Reviews please! 




[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-13 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363153#comment-16363153
 ] 

genericqa commented on HADOOP-15076:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 27m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 48 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 22s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HADOOP-15076 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910454/HADOOP-15076-006.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux b5311ef18644 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 332269d |
| maven | version: Apache Maven 3.3.9 |
| whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/14110/artifact/out/whitespace-eol.txt |
| Max. process+thread count | 312 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14110/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363104#comment-16363104
 ] 

Steve Loughran commented on HADOOP-15076:
-

patch 006: 
* pull out AccessDenied into its own section, 
* fix performance ToC
* pull in more stacks and explanations, from other JIRAs




[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362465#comment-16362465
 ] 

Steve Loughran commented on HADOOP-15076:
-

messages in HADOOP-14531 should be covered, even if the stacks are no longer 
what you'd see

> Enhance s3a troubleshooting docs, add perf section
> --
>
> Key: HADOOP-15076
> URL: https://issues.apache.org/jira/browse/HADOOP-15076
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 2.8.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15076-001.patch, HADOOP-15076-002.patch, 
> HADOOP-15076-003.patch, HADOOP-15076-004.patch, HADOOP-15076-005.patch



[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-02-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362458#comment-16362458
 ] 

Steve Loughran commented on HADOOP-15076:
-

Include stacks from HADOOP-14621 and HADOOP-14530




[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-01-12 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324359#comment-16324359
 ] 

genericqa commented on HADOOP-15076:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m 34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 27m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 25s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 33 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m  9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HADOOP-15076 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12905896/HADOOP-15076-005.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 6174a3f56dc5 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b278f7b |
| maven | version: Apache Maven 3.3.9 |
| mvnsite | https://builds.apache.org/job/PreCommit-HADOOP-Build/13962/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/13962/artifact/out/whitespace-eol.txt |
| Max. process+thread count | 301 (vs. ulimit of 5000) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/13962/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Enhance s3a troubleshooting docs, add perf section
> --
>
> Key: HADOOP-15076
> URL: https://issues.apache.org/jira/browse/HADOOP-15076
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 2.8.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-15076-001.patch, HADOOP-15076-002.patch, 
> HADOOP-15076-003.patch, HADOOP-15076-004.patch, HADOOP-15076-005.patch



[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-01-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324290#comment-16324290
 ] 

Steve Loughran commented on HADOOP-15076:
-

bq.  Isn't the commit protocol being slow part of "performance"? Can this be 
rephrased?

More that "people complain that their jobs are slow", when they should be 
worrying that the data may actually be invalid. That is, yes, slow is the 
performance, but the real problem is that commit algorithms based on atomic 
renames don't work if renames aren't atomic, even if they are fast.
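
The point about non-atomic renames can be illustrated with a toy Python sketch. It assumes a rename that is carried out as a per-object copy followed by a delete, modelled on a plain in-memory dict. The store, the key names, and the `rename_by_copy` helper are all hypothetical; this is not the S3A implementation, only a model of why a reader can see a half-committed directory.

```python
def rename_by_copy(store: dict, src_prefix: str, dest_prefix: str):
    """Move every key under src_prefix to dest_prefix, one object at a time.

    Yields after each object so the caller can inspect the intermediate
    state, which is exactly what a concurrent reader could observe.
    """
    keys = sorted(k for k in store if k.startswith(src_prefix))
    for key in keys:
        store[dest_prefix + key[len(src_prefix):]] = store[key]  # copy
        del store[key]                                           # delete
        yield


# Hypothetical task output waiting to be committed by "rename".
store = {"tmp/part-0": b"a", "tmp/part-1": b"b", "tmp/part-2": b"c"}

observed = []
for _ in rename_by_copy(store, "tmp/", "out/"):
    # Snapshot of what is visible under the destination at this instant.
    observed.append(sorted(k for k in store if k.startswith("out/")))

for snapshot in observed:
    print(snapshot)
```

Every snapshot before the last one shows only part of the output under `out/`. A commit algorithm that treats "the destination exists" as "the commit completed" would be wrong at those moments, however quickly each step runs.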

Attached Patch 5

* intro has a table
* moved commit section down past I/O
* fadvise options reordered
* added a bit on good code too
* spell check




[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-01-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323851#comment-16323851
 ] 

Steve Loughran commented on HADOOP-15076:
-

thanks: always good to have feedback,

I'll take on all the perf stuff

w.r.t the "heavyhanded" change in troubleshooting, yes it is that blunt. You 
can look at all the JIRAs off the previous S3a changes and see that something 
invariably fails, including changed APIs stopping linking working. I know minor 
diffs are more nuanced, but I don't want to encourage people to go that way, as 
even point releases change things (HADOOP-14283, HADOOP-13050, HADOOP-13044)

Saying  "a point release may work" isn't valid unless we say "a point release 
may update but you can't be sure without running all the hadoop-aws test suite 
with s3guard and scale enabled then build downstream projects and run any tests 
you have to push them against s3 to see that all is well". 

I'm trying to make clear that no, JAR-swap-debugging isn't going to do any 
good. Which I believe, based on experience, is generally the case.

> Enhance s3a troubleshooting docs, add perf section
> --
>
> Key: HADOOP-15076
> URL: https://issues.apache.org/jira/browse/HADOOP-15076
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 2.8.2
>Reporter: Steve Loughran
>Assignee: Abraham Fine
> Attachments: HADOOP-15076-001.patch, HADOOP-15076-002.patch, 
> HADOOP-15076-003.patch, HADOOP-15076-004.patch



[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-01-11 Thread Abraham Fine (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323239#comment-16323239
 ] 

Abraham Fine commented on HADOOP-15076:
---

I'm new to this codebase so I think I was able to point out a few parts of the 
documentation that may be confusing to new users.

h3. performance.md
* Would it be possible to change the introduction setting from two sequential 
lists to a table? That may make it easier to compare S3 and HDFS.
* {{list files a lot. This includes the setup of all queries agains data:}} 
typo in agains
* {{The MapReduce `FileOutputCommitter`. This also used by Apache Spark.}} I'm 
not sure what this sentence is trying to express
* {{Your problem may appear to be performance, but really it is that the commit 
protocol is both slow and unreliable}} Isn't the commit protocol being slow 
part of "performance"? Can this be rephrased? 
* {{This is leads to maximum read throughput}} "This will lead to..."?
* Perhaps describe the {{random}} policy before {{normal}} as one needs to 
understand {{random}} before understanding {{normal}}.
* {{may consume large amounts of resources if each query is working with a 
different set of s3 buckets}} Why wouldn't a large amount of resources be 
consumed if working with the same set of s3 buckets?
* {{When uploading data, it is uploaded in blocks set by the option}} Consider 
changing to "Data is uploaded in blocks set by the option..."
* Extra newline on 451

h3. troubleshooting_s3a.md
* {{Whatever problem you have, changing the AWS SDK version will not fix 
things, only change the stack traces you see.}} Again, I'm new here so I'm not 
sure about the history of this issue, but this section seems a little 
heavy-handed to me. Does Amazon never release "bug fix" versions of their 
client that are API compatible? How can we make this statement with such 
certainty?




[jira] [Commented] (HADOOP-15076) Enhance s3a troubleshooting docs, add perf section

2018-01-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322933#comment-16322933
 ] 

genericqa commented on HADOOP-15076:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 26m 44s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 24s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 26 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 18s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HADOOP-15076 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12905737/HADOOP-15076-004.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux e1cf2c98091a 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bc285da |
| maven | version: Apache Maven 3.3.9 |
| mvnsite | https://builds.apache.org/job/PreCommit-HADOOP-Build/13956/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/13956/artifact/out/whitespace-eol.txt |
| Max. process+thread count | 341 (vs. ulimit of 5000) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/13956/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Enhance s3a troubleshooting docs, add perf section
> --
>
> Key: HADOOP-15076
> URL: https://issues.apache.org/jira/browse/HADOOP-15076
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs/s3
>Affects Versions: 2.8.2
>Reporter: Steve Loughran
>Assignee: Abraham Fine
> Attachments: HADOOP-15076-001.patch, HADOOP-15076-002.patch, 
> HADOOP-15076-003.patch, HADOOP-15076-004.patch