[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035743#comment-16035743
 ] 

Hadoop QA commented on HADOOP-14478:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
43s{color} | {color:green} hadoop-azure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14478 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12871098/HADOOP-14478.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 3895ea5516dd 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 73ecb19 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12439/testReport/ |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12439/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch, 
> HADOOP-14478.003.patch
>
>
> Azure's 

[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035724#comment-16035724
 ] 

Rajesh Balamohan commented on HADOOP-14478:
---

Thanks [~liuml07]

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch, 
> HADOOP-14478.003.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read x bytes, seek back 
> to the original location) it may not be, and it might even download a lot of 
> data that is never used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's {{BlobInputStream}}.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> {{BlobInputStream}} could later use this hint to decide how much data to 
> read ahead. Changes to BlobInputStream are not addressed in this JIRA.
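For illustration, a minimal sketch of the proposed override (a hypothetical
shape, not the attached patch; it assumes the usual FSInputStream seek/read
semantics and that the wrapped blob stream field is named {{in}} and honours
mark()):

{code}
// Hypothetical sketch inside NativeAzureFsInputStream (extends FSInputStream).
// Requires java.io.EOFException and java.io.IOException.
@Override
public synchronized void readFully(long position, byte[] buffer, int offset,
    int length) throws IOException {
  long oldPos = getPos();
  try {
    seek(position);
    // Pass the requested length as a mark() hint so the underlying
    // BlobInputStream need not buffer a full 4 MB for a small read.
    in.mark(length);
    int nread = 0;
    while (nread < length) {
      int nbytes = read(buffer, offset + nread, length - nread);
      if (nbytes < 0) {
        throw new EOFException("End of stream reached before reading fully");
      }
      nread += nbytes;
    }
  } finally {
    // Restore the original position so sequential readers are unaffected.
    seek(oldPos);
  }
}
{code}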



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14478:
--
Attachment: HADOOP-14478.003.patch

Attaching the .003 patch to address the checkstyle issue (removed an unused import statement).

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch, 
> HADOOP-14478.003.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read x bytes, seek back 
> to the original location) it may not be, and it might even download a lot of 
> data that is never used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's {{BlobInputStream}}.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> {{BlobInputStream}} could later use this hint to decide how much data to 
> read ahead. Changes to BlobInputStream are not addressed in this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14476) make InconsistentAmazonS3Client usable in downstream tests

2017-06-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035691#comment-16035691
 ] 

Aaron Fabbri commented on HADOOP-14476:
---

I started on this, but shout if you want to work on it, [~ste...@apache.org].

I'd like to make it configurable, but I'm not sure we want to pollute the 
config space with failure-injection stuff. Maybe leave the values out of 
core-default.xml and just document them in testing.md? Thoughts?

I'm currently thinking of adding a couple of knobs (sketched below):

1. Delay time in milliseconds (how long the inconsistency lasts).
2. Substring for matching paths to be delayed.
3. A probability for random failure injection.
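A hedged illustration of what those knobs might look like (the config key
names and class shape are made up here for illustration, not taken from any
actual patch):

{code}
import java.util.Random;
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch of the three knobs; names are illustrative only.
public class FailureInjectionPolicy {
  private final long delayMsec;          // 1. how long the inconsistency lasts
  private final String delaySubstring;   // 2. only paths containing this are delayed
  private final float delayProbability;  // 3. chance a matching path is delayed
  private final Random random = new Random();

  public FailureInjectionPolicy(Configuration conf) {
    delayMsec = conf.getLong("fs.s3a.failinject.inconsistency.msec", 5000L);
    delaySubstring = conf.get("fs.s3a.failinject.inconsistency.substring",
        "DELAY_LISTING_ME");
    delayProbability = conf.getFloat(
        "fs.s3a.failinject.inconsistency.probability", 1.0f);
  }

  /** Should the given object key be reported inconsistently for a while? */
  public boolean shouldDelay(String key) {
    return key.contains(delaySubstring)
        && random.nextFloat() < delayProbability;
  }
}
{code}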



> make InconsistentAmazonS3Client usable in downstream tests
> --
>
> Key: HADOOP-14476
> URL: https://issues.apache.org/jira/browse/HADOOP-14476
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Aaron Fabbri
>
> It's important for downstream apps to be able to verify that s3guard works, 
> by making the AWS client inconsistent (to demonstrate the problems), then 
> turning s3guard on to verify that they go away. 
> This can be done by exposing the {{InconsistentAmazonS3Client}}:
> # move the factory to the production source
> # make the delay configurable, for when you want a really long delay
> # have the factory code log at WARN when a non-default factory is used
> # mention it in the s3a testing.md
> I think we could look at the name of the option, 
> {{fs.s3a.s3.client.factory.impl}}, too. I'd like something which has 
> "internal" in it, and without the duplication of "s3a.s3".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035677#comment-16035677
 ] 

Hadoop QA commented on HADOOP-14478:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} hadoop-tools/hadoop-azure: The patch generated 1 
new + 62 unchanged - 0 fixed = 63 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
21s{color} | {color:green} hadoop-azure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14478 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870981/HADOOP-14478.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f484abf72143 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 73ecb19 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12438/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-azure.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12438/testReport/ |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12438/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue 

[jira] [Commented] (HADOOP-14459) SerializationFactory shouldn't throw a NullPointerException if the serializations list is not defined

2017-06-02 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035678#comment-16035678
 ] 

Daniel Templeton commented on HADOOP-14459:
---

That works for me. The last thing that would make it really crisp for me 
would be to add a statement to the log warning saying that the default 
settings are being used. Other than that, LGTM.
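A minimal sketch of the suggested behaviour (hedged: the fallback list and
log wording here are illustrative, not the actual patch):

{code}
// Hypothetical sketch for SerializationFactory: fall back to the default
// serializations instead of throwing an NPE, and warn that defaults are used.
String[] serializations = conf.getStrings(
    CommonConfigurationKeys.IO_SERIALIZATIONS_KEY);
if (serializations == null || serializations.length == 0) {
  LOG.warn(CommonConfigurationKeys.IO_SERIALIZATIONS_KEY
      + " is not set; using the default serializations instead.");
  serializations = new String[] {
      WritableSerialization.class.getName(),
      AvroSpecificSerialization.class.getName(),
      AvroReflectSerialization.class.getName() };
}
for (String serializerName : serializations) {
  add(conf, serializerName);
}
{code}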

> SerializationFactory shouldn't throw a NullPointerException if the 
> serializations list is not defined
> -
>
> Key: HADOOP-14459
> URL: https://issues.apache.org/jira/browse/HADOOP-14459
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
> Attachments: HADOOP-14459_2.patch, HADOOP-14459.patch
>
>
> The SerializationFactory throws an NPE if 
> CommonConfigurationKeys.IO_SERIALIZATIONS_KEY is not defined in the config.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14394) Provide Builder pattern for DistributedFileSystem.create

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035657#comment-16035657
 ] 

Hadoop QA commented on HADOOP-14394:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 
45s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
27s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 51s{color} | {color:orange} root: The patch generated 2 new + 257 unchanged 
- 0 fixed = 259 total (was 257) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 11s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
27s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}185m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestRaceWhenRelogin |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14394 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12871055/HADOOP-14394.03.patch 
|
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux a3f78222c919 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 73ecb19 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12437/artifact/patchprocess/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
 |
| checkstyle | 

[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035644#comment-16035644
 ] 

Mingliang Liu commented on HADOOP-14478:


+1 pending Jenkins.

Will commit next Monday if there is no more input.

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read x bytes, seek back 
> to the original location) it may not be, and it might even download a lot of 
> data that is never used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's {{BlobInputStream}}.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> {{BlobInputStream}} could later use this hint to decide how much data to 
> read ahead. Changes to BlobInputStream are not addressed in this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-14476) make InconsistentAmazonS3Client usable in downstream tests

2017-06-02 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri reassigned HADOOP-14476:
-

Assignee: Aaron Fabbri

> make InconsistentAmazonS3Client usable in downstream tests
> --
>
> Key: HADOOP-14476
> URL: https://issues.apache.org/jira/browse/HADOOP-14476
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Aaron Fabbri
>
> It's important for downstream apps to be able to verify that s3guard works, 
> by making the AWS client inconsistent (to demonstrate the problems), then 
> turning s3guard on to verify that they go away. 
> This can be done by exposing the {{InconsistentAmazonS3Client}}:
> # move the factory to the production source
> # make the delay configurable, for when you want a really long delay
> # have the factory code log at WARN when a non-default factory is used
> # mention it in the s3a testing.md
> I think we could look at the name of the option, 
> {{fs.s3a.s3.client.factory.impl}}, too. I'd like something which has 
> "internal" in it, and without the duplication of "s3a.s3".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-14478:
---
Status: Patch Available  (was: Open)

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read x bytes, seek back 
> to the original location) it may not be, and it might even download a lot of 
> data that is never used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's {{BlobInputStream}}.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> {{BlobInputStream}} could later use this hint to decide how much data to 
> read ahead. Changes to BlobInputStream are not addressed in this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12360) Create StatsD metrics2 sink

2017-06-02 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035619#comment-16035619
 ] 

Dave Marion edited comment on HADOOP-12360 at 6/2/17 11:27 PM:
---

Honestly, I don't remember, it's been too long. Can you provide an example of 
how it's broken for the JVM metrics?


was (Author: dlmarion):
Honestly, I don't remember, it's been too long. 

> Create StatsD metrics2 sink
> ---
>
> Key: HADOOP-12360
> URL: https://issues.apache.org/jira/browse/HADOOP-12360
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 2.7.1
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, 
> HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, 
> HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, 
> HADOOP-12360.009.patch, HADOOP-12360.010.patch
>
>
> Create a metrics sink that pushes to a StatsD daemon.
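For context, a StatsD sink ultimately just sends "name:value|type" lines over
UDP. A minimal hedged sketch of such a push (host, port, and metric name are
illustrative; this is not the attached patch):

{code}
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a StatsD push: one gauge in the "name:value|g" format.
public class StatsDPushExample {
  public static void main(String[] args) throws Exception {
    byte[] payload = "namenode.FilesTotal:12345|g"
        .getBytes(StandardCharsets.UTF_8);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.send(new DatagramPacket(payload, payload.length,
          new InetSocketAddress("localhost", 8125)));  // default StatsD port
    }
  }
}
{code}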



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12360) Create StatsD metrics2 sink

2017-06-02 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035619#comment-16035619
 ] 

Dave Marion edited comment on HADOOP-12360 at 6/2/17 11:26 PM:
---

Honestly, I don't remember, it's been too long. 


was (Author: dlmarion):
Honestly, I don't remember, it's been too long. Is this causing an issue?

> Create StatsD metrics2 sink
> ---
>
> Key: HADOOP-12360
> URL: https://issues.apache.org/jira/browse/HADOOP-12360
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 2.7.1
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, 
> HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, 
> HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, 
> HADOOP-12360.009.patch, HADOOP-12360.010.patch
>
>
> Create a metrics sink that pushes to a StatsD daemon.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12360) Create StatsD metrics2 sink

2017-06-02 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035619#comment-16035619
 ] 

Dave Marion commented on HADOOP-12360:
--

Honestly, I don't remember, it's been too long. Is this causing an issue?

> Create StatsD metrics2 sink
> ---
>
> Key: HADOOP-12360
> URL: https://issues.apache.org/jira/browse/HADOOP-12360
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 2.7.1
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, 
> HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, 
> HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, 
> HADOOP-12360.009.patch, HADOOP-12360.010.patch
>
>
> Create a metrics sink that pushes to a StatsD daemon.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14481) Print stack trace when native bzip2 library does not load

2017-06-02 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035586#comment-16035586
 ] 

Chen Liang commented on HADOOP-14481:
-

Thanks [~jojochuang] for the catch! v001 patch LGTM.

> Print stack trace when native bzip2 library does not load
> -
>
> Key: HADOOP-14481
> URL: https://issues.apache.org/jira/browse/HADOOP-14481
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Attachments: HADOOP-14481.001.patch
>
>
> When I ran hadoop checknative on my machine, it was not able to load the 
> system bzip2 library and printed the following message:
> 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
> native-bzip2 library system-native, will use pure-Java version
> Reviewing the relevant code, the load fails because of an exception, but that 
> exception is never logged. We should print the stack trace, at least at debug 
> log level.
> {code:title=Bzip2Factory#isNativeBzip2Loaded()}
> try {
>   // Initialize the native library.
>   Bzip2Compressor.initSymbols(libname);
>   Bzip2Decompressor.initSymbols(libname);
>   nativeBzip2Loaded = true;
>   LOG.info("Successfully loaded & initialized native-bzip2 library " +
>libname);
> } catch (Throwable t) {
>   LOG.warn("Failed to load/initialize native-bzip2 library " + 
>libname + ", will use pure-Java version");
> }
> {code}
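A minimal sketch of the proposed change (hedged; the attached v001 patch may
do this differently):

{code}
} catch (Throwable t) {
  LOG.warn("Failed to load/initialize native-bzip2 library " +
           libname + ", will use pure-Java version");
  // Surface the underlying cause, at least at debug level:
  LOG.debug("native-bzip2 load failure", t);
}
{code}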



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035546#comment-16035546
 ] 

Hadoop QA commented on HADOOP-13786:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
51s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 41 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
23s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
38s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
10s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
41s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
51s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
14s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 13m 36s{color} 
| {color:red} root generated 1 new + 777 unchanged - 1 fixed = 778 total (was 
778) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  3s{color} | {color:orange} root: The patch generated 43 new + 120 unchanged 
- 23 fixed = 163 total (was 143) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use 
git apply --whitespace=fix <<patch_file>>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-registry 
generated 0 new + 45 unchanged - 3 fixed = 45 total (was 48) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
36s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
52s{color} | {color:green} hadoop-yarn-registry in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
50s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Updated] (HADOOP-14394) Provide Builder pattern for DistributedFileSystem.create

2017-06-02 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HADOOP-14394:
---
Attachment: HADOOP-14394.03.patch

Attaching a new patch to fix the test failures.

> Provide Builder pattern for DistributedFileSystem.create
> 
>
> Key: HADOOP-14394
> URL: https://issues.apache.org/jira/browse/HADOOP-14394
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: 2.9.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HADOOP-14394.00.patch, HADOOP-14394.01.patch, 
> HADOOP-14394.02.patch, HADOOP-14394.03.patch
>
>
> This JIRA continues to refine the {{FSOutputStreamBuilder}} interface 
> introduced in HDFS-11170. 
> It should also provide a spec for the Builder API.
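For illustration, usage of such a builder might look roughly like this
(method names follow the FSDataOutputStreamBuilder shape under discussion,
but treat this as a sketch rather than the final API; {{fs}} and {{path}} are
assumed to be in scope):

{code}
// Hypothetical usage sketch of the create() builder, not the committed API.
FSDataOutputStream out = fs.createFile(path)
    .permission(FsPermission.getFileDefault())
    .bufferSize(4096)
    .replication((short) 3)
    .blockSize(128L * 1024 * 1024)
    .overwrite(true)
    .build();
try {
  out.writeBytes("hello");
} finally {
  out.close();
}
{code}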



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files

2017-06-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035396#comment-16035396
 ] 

Aaron Fabbri commented on HADOOP-14457:
---

If we end up adding an "ancestor is a directory" check to create() in the 
future, we could accumulate the list of missing parents during the ancestor 
checks and pass them through the operation to finishedWrite() as a precomputed 
list of the things to create in the metadatastore.  It widens some race 
conditions around other clients modifying our directory tree, but it seems like 
it would be optimal WRT round trips.  We'd have MetadataStore reads, then 
writing the outputstream, then close()->finishedWrite() does MetadataStore 
writes.
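A rough sketch of that idea (hedged: the helper shape and names are
illustrative, and it assumes the s3guard MetadataStore.get()/PathMetadata
API; {{path}} and {{metadataStore}} are assumed to be in scope):

{code}
// Hypothetical sketch: collect ancestors unknown to the MetadataStore while
// checking them, so close() -> finishedWrite() can create them in one pass.
List<Path> missingParents = new ArrayList<>();
for (Path p = path.getParent(); p != null && !p.isRoot(); p = p.getParent()) {
  PathMetadata md = metadataStore.get(p);
  if (md == null) {
    missingParents.add(p);     // directory entry to create later
  } else if (md.getFileStatus().isFile()) {
    throw new FileAlreadyExistsException("Ancestor " + p + " is a file");
  } else {
    break;                     // known directory: everything above it exists
  }
}
// ...write the output stream; finishedWrite() then puts missingParents
// into the MetadataStore along with the new file entry.
{code}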

> create() does not notify metadataStore of parent directories or ensure 
> they're not existing files
> -
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch, 
> HADOOP-14457-HADOOP-13345.002.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. 
> LocalMetadataStore will sometimes erroneously report that a directory is 
> empty with isAuthoritative = true when it *definitely* has children the 
> metadatastore should know about. It doesn't appear to happen if the children 
> are just directory. The fact that it's returning an empty listing is 
> concerning, but the fact that it says it's authoritative *might* be a second 
> bug.
> {code}
> diff --git 
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
>  
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- 
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ 
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>}
>  
>@VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>  return metadataStore;
>}
>  
> diff --git 
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
>  
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- 
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ 
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws 
> Throwable {
>  boolean rename = fs.rename(srcDir, destDir);
>  assertFalse("s3a doesn't support rename to non-empty directory", rename);
>}
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +final FileSystem fs = getFileSystem();
> +final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +try {
> +  fs.mkdirs(parent);
> +  final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +  byte[] srcDataset = dataset(256, 'a', 'z');
> +  writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +  1024, false);
> +
> +  DirListingMetadata list = ms.listChildren(parent);
> +  assertTrue("MetadataStore falsely reports authoritative empty list",
> +  list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +} finally {
> +  fs.delete(parent, true);
> +}
> +  }
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035358#comment-16035358
 ] 

Steve Loughran commented on HADOOP-13786:
-

Patch 030: an evolution based on integration testing with the 
InconsistentAmazonS3Client enabled, s3guard on/off, in Spark, so exercising 
its workflow.

* the _SUCCESS marker contains more information & diagnostics
* various bits of tuning shown (making cleanup resilient to inconsistencies in 
list vs actual)
* docs

It's in sync with commit 0fbb4aa in 
[https://github.com/hortonworks-spark/cloud-integration]; as is [the 
documentation|https://github.com/hortonworks-spark/cloud-integration/blob/master/cloud-committer/src/main/site/markdown/index.md]

The core integration tests are working; more are always welcome. I plan to 
scale things up and create one or more tests designed to work on large 
clusters. This is all just querying data, but it adds validation of the data 
from the _SUCCESS marker, which is new.

Example printing of success marker data
{code}
2017-06-02 19:59:19,780 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
s3.S3AOperations (Logging.scala:logInfo(54)) - success data at 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
 : SuccessData{committer='PartitionedStagingCommitter', 
hostname='HW13176.cotham.uk', description='Task committer 
attempt_20170602195913__m_00_0', date='Fri Jun 02 19:59:17 BST 2017', 
filenames=[/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/part-0-f22d488c-dad0-4fa5-8ca4-8d00b058c77c-c000.snappy.orc]}
2017-06-02 19:59:19,781 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
s3.S3AOperations (Logging.scala:logInfo(54)) - Metrics:
  S3guard_metadatastore_put_path_latency50thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency75thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency90thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency95thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency99thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latencyNumOps = 1
  committer_bytes_committed = 384
  committer_commits_aborted = 0
  committer_commits_completed = 1
  committer_commits_created = 1
  committer_commits_failed = 0
  committer_commits_reverted = 0
  committer_jobs_completed = 1
  committer_jobs_failed = 0
  committer_tasks_completed = 1
  committer_tasks_failed = 0
  directories_created = 1
  directories_deleted = 0
  fake_directories_deleted = 6
  files_copied = 0
  files_copied_bytes = 0
  files_created = 0
  files_deleted = 2
  ignored_errors = 1
  object_continue_list_requests = 0
  object_copy_requests = 0
  object_delete_requests = 2
  object_list_requests = 5
  object_metadata_requests = 8
  object_multipart_aborted = 0
  object_put_bytes = 384
  object_put_bytes_pending = 0
  object_put_requests = 2
  object_put_requests_active = 0
  object_put_requests_completed = 2
  op_copy_from_local_file = 0
  op_exists = 2
  op_get_file_status = 4
  op_glob_status = 0
  op_is_directory = 0
  op_is_file = 0
  op_list_files = 0
  op_list_located_status = 0
  op_list_status = 0
  op_mkdirs = 0
  op_rename = 0
  s3guard_metadatastore_initialization = 0
  s3guard_metadatastore_put_path_request = 2
  stream_aborted = 0
  stream_backward_seek_operations = 0
  stream_bytes_backwards_on_seek = 0
  stream_bytes_discarded_in_abort = 0
  stream_bytes_read = 0
  stream_bytes_read_in_close = 0
  stream_bytes_skipped_on_seek = 0
  stream_close_operations = 0
  stream_closed = 0
  stream_forward_seek_operations = 0
  stream_opened = 0
  stream_read_exceptions = 0
  stream_read_fully_operations = 0
  stream_read_operations = 0
  stream_read_operations_incomplete = 0
  stream_seek_operations = 0
  stream_write_block_uploads = 0
  stream_write_block_uploads_aborted = 0
  stream_write_block_uploads_active = 0
  stream_write_block_uploads_committed = 0
  stream_write_block_uploads_data_pending = 0
  stream_write_block_uploads_pending = 0
  stream_write_failures = 0
  stream_write_total_data = 0
  stream_write_total_time = 0

2017-06-02 19:59:19,782 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
s3.S3AOperations (Logging.scala:logInfo(54)) - Diagnostics:
  fs.s3a.committer.magic.enabled = true
  fs.s3a.metadatastore.authoritative = false
  fs.s3a.metadatastore.impl = 
org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore
{code}



> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> 
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>

[jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

2017-06-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13786:

Status: Open  (was: Patch Available)

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> 
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, 
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, 
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, 
> HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch, 
> HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, 
> HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch, 
> HADOOP-13786-HADOOP-13345-015.patch, HADOOP-13786-HADOOP-13345-016.patch, 
> HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch, 
> HADOOP-13786-HADOOP-13345-019.patch, HADOOP-13786-HADOOP-13345-020.patch, 
> HADOOP-13786-HADOOP-13345-021.patch, HADOOP-13786-HADOOP-13345-022.patch, 
> HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch, 
> HADOOP-13786-HADOOP-13345-025.patch, HADOOP-13786-HADOOP-13345-026.patch, 
> HADOOP-13786-HADOOP-13345-027.patch, HADOOP-13786-HADOOP-13345-028.patch, 
> HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch, 
> HADOOP-13786-HADOOP-13345-030.patch, objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (That is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly.)
> I consider us free to expose the blobstore-ness of the s3 output streams 
> (i.e. not visible until close()), if we need to use that to allow us to 
> abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

2017-06-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13786:

Status: Patch Available  (was: Open)

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> 
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, 
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, 
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, 
> HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch, 
> HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, 
> HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch, 
> HADOOP-13786-HADOOP-13345-015.patch, HADOOP-13786-HADOOP-13345-016.patch, 
> HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch, 
> HADOOP-13786-HADOOP-13345-019.patch, HADOOP-13786-HADOOP-13345-020.patch, 
> HADOOP-13786-HADOOP-13345-021.patch, HADOOP-13786-HADOOP-13345-022.patch, 
> HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch, 
> HADOOP-13786-HADOOP-13345-025.patch, HADOOP-13786-HADOOP-13345-026.patch, 
> HADOOP-13786-HADOOP-13345-027.patch, HADOOP-13786-HADOOP-13345-028.patch, 
> HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch, 
> HADOOP-13786-HADOOP-13345-030.patch, objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (That is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly.)
> I consider us free to expose the blobstore-ness of the s3 output streams 
> (i.e. not visible until close()), if we need to use that to allow us to 
> abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

2017-06-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13786:

Attachment: HADOOP-13786-HADOOP-13345-030.patch

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> 
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, 
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, 
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, 
> HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch, 
> HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, 
> HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch, 
> HADOOP-13786-HADOOP-13345-015.patch, HADOOP-13786-HADOOP-13345-016.patch, 
> HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch, 
> HADOOP-13786-HADOOP-13345-019.patch, HADOOP-13786-HADOOP-13345-020.patch, 
> HADOOP-13786-HADOOP-13345-021.patch, HADOOP-13786-HADOOP-13345-022.patch, 
> HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch, 
> HADOOP-13786-HADOOP-13345-025.patch, HADOOP-13786-HADOOP-13345-026.patch, 
> HADOOP-13786-HADOOP-13345-027.patch, HADOOP-13786-HADOOP-13345-028.patch, 
> HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch, 
> HADOOP-13786-HADOOP-13345-030.patch, objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (That is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly.)
> I consider us free to expose the blobstore-ness of the s3 output streams 
> (i.e. not visible until close()), if we need to use that to allow us to 
> abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14481) Print stack trace when native bzip2 library does not load

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035330#comment-16035330
 ] 

Hadoop QA commented on HADOOP-14481:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 
35s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
25s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
27s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14481 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12871017/HADOOP-14481.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 39cb0683d304 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 73ecb19 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12435/artifact/patchprocess/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12435/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12435/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Print stack trace when native bzip2 library does not load
> -
>
> Key: HADOOP-14481
> URL: https://issues.apache.org/jira/browse/HADOOP-14481
> Project: Hadoop Common
>

[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files

2017-06-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035327#comment-16035327
 ] 

Sean Mackrory commented on HADOOP-14457:


I filed HADOOP-14484 for the missing test case (and will fix it if it does 
indeed fail on Local or Dynamo).

I'll look at moving this to S3Guard.java - although we should be able to save 
some operations by solving this problem and the file-as-parent-dir check in the 
same loop, rather than doing an S3Guard-specific pass in one place and then 
always a second check elsewhere.
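
To illustrate, a rough sketch of such a combined walk. MetadataStore and 
PathMetadata are the real s3guard types, but makeDirStatus() is a hypothetical 
helper and none of this is the actual patch:

{code}
// One pass over the ancestors of the file being created: reject a file
// sitting where a parent directory should be, and record any missing
// ancestor directory in the MetadataStore in the same loop.
Path ancestor = path.getParent();
while (ancestor != null && !ancestor.isRoot()) {
  PathMetadata md = metadataStore.get(ancestor);
  if (md != null && md.getFileStatus().isFile()) {
    throw new FileAlreadyExistsException(
        ancestor + " is a file, not a directory");
  }
  if (md == null) {
    // makeDirStatus() is hypothetical: builds a directory FileStatus.
    metadataStore.put(new PathMetadata(makeDirStatus(ancestor)));
  }
  ancestor = ancestor.getParent();
}
{code}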

> create() does not notify metadataStore of parent directories or ensure 
> they're not existing files
> -
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch, 
> HADOOP-14457-HADOOP-13345.002.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. 
> LocalMetadataStore will sometimes erroneously report that a directory is 
> empty with isAuthoritative = true when it *definitely* has children the 
> metadatastore should know about. It doesn't appear to happen if the children 
> are just directories. The fact that it's returning an empty listing is 
> concerning, but the fact that it says it's authoritative *might* be a second 
> bug.
> {code}
> diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>    }
>  
>    @VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>      return metadataStore;
>    }
>  
> diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws Throwable {
>      boolean rename = fs.rename(srcDir, destDir);
>      assertFalse("s3a doesn't support rename to non-empty directory", rename);
>    }
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +    final FileSystem fs = getFileSystem();
> +    final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +    final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +    try {
> +      fs.mkdirs(parent);
> +      final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +      byte[] srcDataset = dataset(256, 'a', 'z');
> +      writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +          1024, false);
> +
> +      DirListingMetadata list = ms.listChildren(parent);
> +      assertTrue("MetadataStore falsely reports authoritative empty list",
> +          list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +    } finally {
> +      fs.delete(parent, true);
> +    }
> +  }
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-14484) Ensure deleted parent directory tombstones are overwritten when implicitly recreated

2017-06-02 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory reassigned HADOOP-14484:
--

Assignee: Sean Mackrory

> Ensure deleted parent directory tombstones are overwritten when implicitly 
> recreated
> 
>
> Key: HADOOP-14484
> URL: https://issues.apache.org/jira/browse/HADOOP-14484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>
> As discussed on HADOOP-13998, there may be a test missing (and possibly 
> broken metadata store implementations) for the case where a directory is 
> deleted but is later implicitly recreated by creating a file inside it, where 
> the tombstone is not overwritten. In such a case, listing the parent 
> directory would result in an error.
> This may also be happening because of HADOOP-14457, but we should add a test 
> for this other possibility anyway and fix it if it fails with any 
> implementations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14484) Ensure deleted parent directory tombstones are overwritten when implicitly recreated

2017-06-02 Thread Sean Mackrory (JIRA)
Sean Mackrory created HADOOP-14484:
--

 Summary: Ensure deleted parent directory tombstones are 
overwritten when implicitly recreated
 Key: HADOOP-14484
 URL: https://issues.apache.org/jira/browse/HADOOP-14484
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sean Mackrory


As discussed on HADOOP-13998, there may be a test missing (and possibly broken 
metadata store implementations) for the case where a directory is deleted but 
is later implicitly recreated by creating a file inside it, where the tombstone 
is not overwritten. In such a case, listing the parent directory would result 
in an error.

This may also be happening because of HADOOP-14457, but we should add a test 
for this other possibility anyway and fix it if it fails with any 
implementations.
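
A hedged sketch of what that test could look like, using the FS contract 
helpers (getFileSystem(), path(), ContractTestUtils.touch()); illustrative 
only, not the eventual patch:

{code}
@Test
public void testTombstoneOverwrittenByImplicitRecreate() throws Exception {
  FileSystem fs = getFileSystem();
  Path parent = path("tombstones/parent");
  fs.mkdirs(parent);
  fs.delete(parent, true);             // should write a tombstone for parent
  Path child = new Path(parent, "child");
  ContractTestUtils.touch(fs, child);  // implicitly recreates parent
  fs.getFileStatus(child);             // must succeed
  fs.listStatus(parent);               // must not fail on a stale tombstone
}
{code}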



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files

2017-06-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035313#comment-16035313
 ] 

Aaron Fabbri commented on HADOOP-14457:
---

Thanks for the detail here [~ste...@apache.org].

{quote}
FWIW, I'd like the creation code to be kept out of S3AFS if possible, just because 
it's getting so big & complex. I've pulled writeOperationsHelper out in the 
committer branch, but there's still a lot of complexity in the core FS now that 
everything is metastore-guarded.
{quote}
By "creation code", I assume you mean the part where create() results in all 
ancestor dirs getting created in the MetadataStore.  I generally agree.

Can this live in S3Guard.java, [~mackrorysd], or is it awkward?

> create() does not notify metadataStore of parent directories or ensure 
> they're not existing files
> -
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch, 
> HADOOP-14457-HADOOP-13345.002.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. 
> LocalMetadataStore will sometimes erroneously report that a directory is 
> empty with isAuthoritative = true when it *definitely* has children the 
> metadatastore should know about. It doesn't appear to happen if the children 
> are just directories. The fact that it's returning an empty listing is 
> concerning, but the fact that it says it's authoritative *might* be a second 
> bug.
> {code}
> diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>    }
>  
>    @VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>      return metadataStore;
>    }
>  
> diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws Throwable {
>      boolean rename = fs.rename(srcDir, destDir);
>      assertFalse("s3a doesn't support rename to non-empty directory", rename);
>    }
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +    final FileSystem fs = getFileSystem();
> +    final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +    final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +    try {
> +      fs.mkdirs(parent);
> +      final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +      byte[] srcDataset = dataset(256, 'a', 'z');
> +      writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +          1024, false);
> +
> +      DirListingMetadata list = ms.listChildren(parent);
> +      assertTrue("MetadataStore falsely reports authoritative empty list",
> +          list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +    } finally {
> +      fs.delete(parent, true);
> +    }
> +  }
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2017-06-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035306#comment-16035306
 ] 

Daryn Sharp commented on HADOOP-14445:
--

I had a feeling "nameservice" alluded to the hdfs HA configuration – which is 
horrible for the reasons I've detailed and why we don't use it.  I'll politely 
stress and repeat: *Updating configs of tens of thousands of nodes, launchers, 
oozie, storm, spark, etc and restarting the services is just not logistically 
possible*.

bq. Although, IIRC, the tokens are renewed only if they are expired, so if they 
are renewed serially, it should not be a problem.

The RM renews immediately to verify token validity and to determine the next 
renewal time.  If they are expired, it's too late.  Any kp token using just a 
service authority cannot determine the kp uri and is only renewable via the kp 
uri in the config – enforcing one and only one KMS cluster.  If the kp client 
can be instantiated via the service, then multi-KMS setups are possible.

bq.  I do like the idea of using a nameservice though, as Yongjun Zhang 
suggested which will ensure that we will still have only 1 single entry.

There must be a disconnect here.  A single entry is precisely the advantage of 
just setting the service to the provider uri.  Adding an extra layer of 
indirection through the config creates a logistical mess with no added 
benefits.  I'm not going to bounce all my services and RMs because I added or 
changed a KMS cluster.

Here's the big picture we are trying to achieve:
* client requests kp uri from NN
* client creates kp client from kp uri
* client gets tokens and sets service to kp uri
* RM calls kms token renewer which uses kp uri in service to create kp client
* tasks use the NN->kp uri mapping established at job submission to locate 
tokens

It's +config-less+ other than a setting on the NN.  This is what we are running 
internally because the current kms client design is completely broken.  We now 
have the ability to enable EZs on a NN and/or change kms cluster configuration 
without changing configs or restarting services.

We only care about this load balancing provider because we need to ensure the 
kp client can be instantiated from the service.
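
To make that concrete, a sketch of the client/renewer halves of this flow. 
Token.setService() and KeyProviderFactory.get() are real Hadoop APIs; 
fetchKpToken() and the example URI are hypothetical:

{code}
// Client side, at job submission: stamp the kp uri into the token service.
URI kpUri = URI.create("kms://https@kms.example.com:9600/kms");
Token<?> token = fetchKpToken(kpUri);          // hypothetical helper
token.setService(new Text(kpUri.toString()));

// Renewer side (e.g. the RM): rebuild the provider from the service alone,
// with no client-side config lookup at all.
URI fromService = URI.create(token.getService().toString());
KeyProvider provider = KeyProviderFactory.get(fromService, new Configuration());
{code}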

> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation, kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14445-branch-2.8.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider does 
> not share delegation tokens (a client uses the KMS address/port as the key for 
> the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
>   InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
>       url.getPort());
>   Text service = SecurityUtil.buildTokenService(serviceAddr);
>   dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, a KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14483) increase default value of fs.s3a.multipart.size to 128M

2017-06-02 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14483:
---

 Summary: increase default value of fs.s3a.multipart.size to 128M
 Key: HADOOP-14483
 URL: https://issues.apache.org/jira/browse/HADOOP-14483
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 2.8.0
Reporter: Steve Loughran
Priority: Minor


Increment the default value of {{fs.s3a.multipart.size}} from "100M" to "128M".

Why? AWS S3 throttles clients making too many requests; going to a larger part 
size will reduce the request count. Also: document the throttling issue.
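
For context, a client can already override the default without waiting for this 
change; a minimal sketch via the Configuration API, with a hypothetical bucket 
name:

{code}
// Raise the multipart size for one client; "128M" uses the same
// size-suffix form as the default value.
Configuration conf = new Configuration();
conf.set("fs.s3a.multipart.size", "128M");
FileSystem fs = FileSystem.get(URI.create("s3a://example-bucket/"), conf);
{code}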



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035283#comment-16035283
 ] 

Steve Loughran commented on HADOOP-14457:
-

Update: looked at {{finishedWrite}} in more detail. It does

# call {{deleteUnnecessaryFakeDirectories(p.getParent());}}
# if s3guard is enabled, update the metastore with the new value of the file.

We can/should still have the safety checks in the create call for parents being 
files, but can maybe postpone the path creation until the file is written (or do 
it again). FWIW, I'd like the creation code to be kept out of S3AFS if possible, 
just because it's getting so big & complex. I've pulled writeOperationsHelper 
out in the committer branch, but there's still a lot of complexity in the core 
FS now that everything is metastore-guarded.

I think there's another test missing here: a sequence of:

# mkdir(parent)
# delete(parent)
# touch(child)
# stat(child)
# ls(parent)

Similarly, do one for calling create() on a path whose parent hasn't been 
created and then deleted, but simply doesn't exist (both sequences are sketched 
below):

# touch(child)
# stat(child)
# ls(parent)
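
A hedged sketch of both sequences as contract-style tests; getFileSystem(), 
path() and ContractTestUtils.touch() are the existing contract-test helpers, 
everything else is illustrative rather than an actual patch:

{code}
@Test
public void testCreateAfterParentDeleted() throws Exception {
  // mkdir(parent); delete(parent); touch(child); stat(child); ls(parent)
  FileSystem fs = getFileSystem();
  Path parent = path("sequences/parent");
  Path child = new Path(parent, "child");
  fs.mkdirs(parent);
  fs.delete(parent, true);
  ContractTestUtils.touch(fs, child);
  fs.getFileStatus(child);   // must find the child
  fs.listStatus(parent);     // must not trip over a tombstone
}

@Test
public void testCreateUnderNonexistentParent() throws Exception {
  // touch(child); stat(child); ls(parent) -- parent never created at all
  FileSystem fs = getFileSystem();
  Path parent = path("sequences/never-created");
  Path child = new Path(parent, "child");
  ContractTestUtils.touch(fs, child);
  fs.getFileStatus(child);
  fs.listStatus(parent);
}
{code}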



> create() does not notify metadataStore of parent directories or ensure 
> they're not existing files
> -
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch, 
> HADOOP-14457-HADOOP-13345.002.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. 
> LocalMetadataStore will sometimes erroneously report that a directory is 
> empty with isAuthoritative = true when it *definitely* has children the 
> metadatastore should know about. It doesn't appear to happen if the children 
> are just directories. The fact that it's returning an empty listing is 
> concerning, but the fact that it says it's authoritative *might* be a second 
> bug.
> {code}
> diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>    }
>  
>    @VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>      return metadataStore;
>    }
>  
> diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws Throwable {
>      boolean rename = fs.rename(srcDir, destDir);
>      assertFalse("s3a doesn't support rename to non-empty directory", rename);
>    }
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +    final FileSystem fs = getFileSystem();
> +    final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +    final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +    try {
> +      fs.mkdirs(parent);
> +      final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +      byte[] srcDataset = dataset(256, 'a', 'z');
> +      writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +          1024, false);
> +
> +      DirListingMetadata list = ms.listChildren(parent);
> +      assertTrue("MetadataStore falsely reports authoritative empty list",
> +          list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +    } finally {
> +      fs.delete(parent, true);
> +    }
> +  }
>  }
> {code}



--
This 

[jira] [Comment Edited] (HADOOP-13998) initial s3guard preview

2017-06-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035280#comment-16035280
 ] 

Sean Mackrory edited comment on HADOOP-13998 at 6/2/17 7:23 PM:


[~ste...@apache.org] - regarding that test issue, that would happen if a 
directory was deleted and a file inside it was then created without correctly 
overwriting or removing the tombstone of the parent directories. If you're 
using the DynamoDB implementation, it should definitely be replacing the 
tombstone for the parent directory when the file is created. If you're using 
the Local implementation, I wonder if that's happening as a result of 
HADOOP-14457. I'll take a closer look at that again and see if I can reproduce, 
though I thought I had added test cases for that sequence.


was (Author: mackrorysd):
[~ste...@apache.org] - regarding that test issue, that would happen if a 
directory was deleted, and a file inside it was then created. If you're using 
the DynamoDB implementation, it should definitely be replacing the tombstone 
for the parent directory when the file is created. If you're using the Local 
implementation, I wonder if that's happening as a result of HADOOP-14457. I'll 
take a closer look at that again and see if I can reproduce, though I thought I 
had added test cases for that sequence.

> initial s3guard preview
> ---
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Steve Loughran
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13998) initial s3guard preview

2017-06-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035280#comment-16035280
 ] 

Sean Mackrory commented on HADOOP-13998:


[~ste...@apache.org] - regarding that test issue, that would happen if a 
directory was deleted, and a file inside it was then created. If you're using 
the DynamoDB implementation, it should definitely be replacing the tombstone 
for the parent directory when the file is created. If you're using the Local 
implementation, I wonder if that's happening as a result of HADOOP-14457. I'll 
take a closer look at that again and see if I can reproduce, though I thought I 
had added test cases for that sequence.

> initial s3guard preview
> ---
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Steve Loughran
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13998) initial s3guard preview

2017-06-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035268#comment-16035268
 ] 

Sean Mackrory commented on HADOOP-13998:


{quote}If these tests were working before you turned s3guard on then they 
weren't catching inconsistencies & so were lucky (as mine were){quote}

Actually I believe a few of those tests had transient failures at a fairly 
consistent rate (something like 1 in 4 or 1 in 6 test runs if I remember 
correctly) that had always been assumed to be the result of inconsistency. They 
stopped failing entirely once the initial work for list-after-put consistency 
was incorporated.

> initial s3guard preview
> ---
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Steve Loughran
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035262#comment-16035262
 ] 

Steve Loughran commented on HADOOP-14457:
-

OK, I am effectively seeing this in my committer tests where the file 
{{s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS}}
 exists, but an attempt to list the parent dir fails as a delete marker is 
being found instead.

{code}
2017-06-02 19:59:19,791 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path 
status for 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
  
(cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS)
2017-06-02 19:59:19,791 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - 
get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS)
 -> file  
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
 3404UNKNOWN  false 
S3AFileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS;
 isDirectory=false; length=3404; replication=1; blocksize=1048576; 
modification_time=1496429958524; access_time=0; owner=stevel; group=stevel; 
permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; 
isErasureCoded=false} isEmptyDirectory=FALSE
2017-06-02 19:59:19,792 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerListStatus(1660)) - List status for 
path: 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
2017-06-02 19:59:19,792 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path 
status for 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
  
(cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc)
2017-06-02 19:59:19,792 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - 
get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc)
 -> file  
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
 0   UNKNOWN  true  
FileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc;
 isDirectory=false; length=0; replication=0; blocksize=0; 
modification_time=1496429951655; access_time=0; owner=; group=; 
permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; 
isErasureCoded=false}
2017-06-02 19:59:19,801 [dispatcher-event-loop-6] INFO  
spark.MapOutputTrackerMasterEndpoint (Logging.scala:logInfo(54)) - 
MapOutputTrackerMasterEndpoint stopped!
2017-06-02 19:59:19,811 [dispatcher-event-loop-3] INFO  
scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint 
(Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped!
2017-06-02 19:59:19,814 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
spark.SparkContext (Logging.scala:logInfo(54)) - Successfully stopped 
SparkContext
- Dataframe+partitioned *** FAILED ***
  java.io.FileNotFoundException: Path 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
 is recorded as deleted by S3Guard
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1906)
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1881)
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1664)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1640)
  at 
com.hortonworks.spark.cloud.ObjectStoreOperations$class.validateRowCount(ObjectStoreOperations.scala:340)
  at 
com.hortonworks.spark.cloud.CloudSuite.validateRowCount(CloudSuite.scala:37)
  at 
com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite.testOneFormat(S3ACommitDataframeSuite.scala:111)
  at 
com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite$$anonfun$1$$anonfun$apply$2.apply$mcV$sp(S3ACommitDataframeSuite.scala:71)
  at 
com.hortonworks.spark.cloud.CloudSuiteTrait$$anonfun$ctest$1.apply$mcV$sp(CloudSuiteTrait.scala:66)
  at 
com.hortonworks.spark.cloud.CloudSuiteTrait$$anonfun$ctest$1.apply(CloudSuiteTrait.scala:64)
{code}

The 

[jira] [Updated] (HADOOP-14283) S3A may hang due to bug in AWS SDK 1.11.86

2017-06-02 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-14283:
--
Attachment: ITestS3AConcurrentRename.java

Attaching a related Hadoop scale test, but I'm leaning towards not including it 
in the codebase because:

1. It is slow as heck.  One of those things you need to run 4+ hours to get 
some confidence on.
2. The direct-to-SDK [test|https://github.com/ajfabbri/awstest] I posted in the 
description is an easier way to reproduce the SDK hang.
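
For the record, the rough shape of such a stressor (paths, pool size and 
timeout here are illustrative, not the attached code):

{code}
// Fire a batch of renames in parallel; under the 1.11.86 SDK, one of the
// Future.get() calls below can hang instead of completing or failing.
final FileSystem fs = getFileSystem();   // an initialized S3A filesystem
ExecutorService pool = Executors.newFixedThreadPool(8);
List<Future<Boolean>> renames = new ArrayList<>();
for (int i = 0; i < 8; i++) {
  final Path src = new Path("/concurrent/src-" + i);
  final Path dst = new Path("/concurrent/dst-" + i);
  renames.add(pool.submit(() -> fs.rename(src, dst)));
}
for (Future<Boolean> r : renames) {
  r.get(10, TimeUnit.MINUTES);
}
pool.shutdown();
{code}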



> S3A may hang due to bug in AWS SDK 1.11.86
> --
>
> Key: HADOOP-14283
> URL: https://issues.apache.org/jira/browse/HADOOP-14283
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0-alpha2
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
>Priority: Critical
> Attachments: HADOOP-14283.001.patch, ITestS3AConcurrentRename.java
>
>
> We hit a hang bug when testing S3A with parallel renames.  
> I narrowed this down to the newer AWS Java SDK.  It only happens under load, 
> and appears to be a failure to wake up a waiting thread on timeout/error.
> I've created a github issue here:
> https://github.com/aws/aws-sdk-java/issues/1102
> I can post a Hadoop scale test which reliably reproduces this after some 
> cleanup.  I have posted an SDK-only test here which reproduces the issue 
> without Hadoop:
> https://github.com/ajfabbri/awstest
> I have a support ticket open and am working with Amazon on this bug so I'll 
> take this issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14283) S3A may hang due to bug in AWS SDK 1.11.86

2017-06-02 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-14283:
--
Attachment: HADOOP-14283.001.patch

Attaching patch which bumps SDK from 1.11.86 to 1.11.134.  There are now newer 
versions but I've done a good amount of testing on .134.

I ran unit, integration, and scale tests in us-west-2.

> S3A may hang due to bug in AWS SDK 1.11.86
> --
>
> Key: HADOOP-14283
> URL: https://issues.apache.org/jira/browse/HADOOP-14283
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0-alpha2
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
>Priority: Critical
> Attachments: HADOOP-14283.001.patch
>
>
> We hit a hang bug when testing S3A with parallel renames.  
> I narrowed this down to the newer AWS Java SDK.  It only happens under load, 
> and appears to be a failure to wake up a waiting thread on timeout/error.
> I've created a github issue here:
> https://github.com/aws/aws-sdk-java/issues/1102
> I can post a Hadoop scale test which reliably reproduces this after some 
> cleanup.  I have posted an SDK-only test here which reproduces the issue 
> without Hadoop:
> https://github.com/ajfabbri/awstest
> I have a support ticket open and am working with Amazon on this bug so I'll 
> take this issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14482) Update BUILDING.txt to include the correct steps to install zstd library

2017-06-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-14482:
-
Description: 
The current BUILDING.txt includes the following steps for installing zstd 
library:
$ sudo apt-get install zstd

This is incorrect. On my Ubuntu 16 machine, the zstd package is not a library:
{quote}
apt-cache search zstd
libzstd-dev - fast lossless compression algorithm -- development files
libzstd0 - fast lossless compression algorithm
zstd - fast lossless compression algorithm -- CLI tool
{quote}
On an Ubuntu 14 machine, I couldn't even find anything related to zstd.

In fact, to build Hadoop with the ZStandard library, I have to install libzstd-dev.
I will also need to install the runtime to use it. libzstd0 is the older 
version; libzstd1 is for zstd 1.x. It's not clear to me whether libzstd0 is 
compatible. CentOS does have libzstd1, though.

Perhaps we can provide instructions to compile/install libzstd from source.

{quote}
  * Use -Dzstd.prefix to specify a nonstandard location for the libzstd
header files and library files. You do not need this option if you have
installed zstandard using a package manager.

  * Use -Dzstd.lib to specify a nonstandard location for the libzstd library
files.  Similarly to zstd.prefix, you do not need this option if you have
installed using a package manager.
{quote}
At least on CentOS, the library installed by rpm was not located, and I had to 
specify -Dzstd.prefix to get it picked up.

  was:
The current BUILDING.txt includes the following steps for installing zstd 
library:
$ sudo apt-get install zstd

This is incorrect. On my Ubuntu machine, the zstd package is not a library:
{quote}
apt-cache search zstd
libzstd-dev - fast lossless compression algorithm -- development files
libzstd0 - fast lossless compression algorithm
zstd - fast lossless compression algorithm -- CLI tool
{quote}
In fact, to build Hadoop with the ZStandard library, I have to install libzstd-dev.
I will also need to install the runtime to use it. libzstd0 is the older 
version; libzstd1 is for zstd 1.x. It's not clear to me whether libzstd0 is 
compatible. CentOS does have libzstd1, though.

{quote}
  * Use -Dzstd.prefix to specify a nonstandard location for the libzstd
header files and library files. You do not need this option if you have
installed zstandard using a package manager.

  * Use -Dzstd.lib to specify a nonstandard location for the libzstd library
files.  Similarly to zstd.prefix, you do not need this option if you have
installed using a package manager.
{quote}
At least on CentOS, the library installed by rpm was not located, and I had to 
specify -Dzstd.prefix to get it picked up.


> Update BUILDING.txt to include the correct steps to install zstd library
> 
>
> Key: HADOOP-14482
> URL: https://issues.apache.org/jira/browse/HADOOP-14482
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Priority: Minor
>
> The current BUILDING.txt includes the following steps for installing zstd 
> library:
> $ sudo apt-get install zstd
> This is incorrect. On my Ubuntu 16 machine, the zstd package is not a library:
> {quote}
> apt-cache search zstd
> libzstd-dev - fast lossless compression algorithm -- development files
> libzstd0 - fast lossless compression algorithm
> zstd - fast lossless compression algorithm -- CLI tool
> {quote}
> On an Ubuntu 14 machine, I couldn't even find anything related to zstd.
> In fact, to build Hadoop with the ZStandard library, I have to install 
> libzstd-dev.
> I will also need to install the runtime to use it. libzstd0 is the older 
> version; libzstd1 is for zstd 1.x. It's not clear to me whether libzstd0 is 
> compatible. CentOS does have libzstd1, though.
> Perhaps we can provide instructions to compile/install libzstd from source.
> {quote}
>   * Use -Dzstd.prefix to specify a nonstandard location for the libzstd
> header files and library files. You do not need this option if you have
> installed zstandard using a package manager.
>   * Use -Dzstd.lib to specify a nonstandard location for the libzstd library
> files.  Similarly to zstd.prefix, you do not need this option if you have
> installed using a package manager.
> {quote}
> At least on CentOS, the library installed by rpm was not located, and I had 
> to specify -Dzstd.prefix to get it picked up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13998) initial s3guard preview

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035217#comment-16035217
 ] 

Steve Loughran commented on HADOOP-13998:
-

regarding tests, I'm seeing something up with the combination of s3guard and 
the partition committer (and only it): a newly created file is where it should 
be, but the parent dir is still tagged as missing. I can GET the file, but if 
I try to list the parent I get rejected:
{code}
2017-06-02 18:19:10,709 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
s3.S3AOperations (Logging.scala:logInfo(54)) - 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/part-0-7573c876-38e5-4024-8a53-51fa1aa9c9c2-c000.snappy.orc
 size=384
2017-06-02 18:19:10,709 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path 
status for 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
  
(cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS)
2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - 
get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS)
 -> file  
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS
 3400UNKNOWN  false 
S3AFileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS;
 isDirectory=false; length=3400; replication=1; blocksize=1048576; 
modification_time=1496423948811; access_time=0; owner=stevel; group=stevel; 
permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; 
isErasureCoded=false} isEmptyDirectory=FALSE
2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerListStatus(1660)) - List status for 
path: 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path 
status for 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
  
(cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc)
2017-06-02 18:19:10,711 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG 
s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - 
get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc)
 -> file  
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
 0   UNKNOWN  true  
FileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc;
 isDirectory=false; length=0; replication=0; blocksize=0; 
modification_time=1496423936532; access_time=0; owner=; group=; 
permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; 
isErasureCoded=false}
2017-06-02 18:19:10,719 [dispatcher-event-loop-6] INFO  
spark.MapOutputTrackerMasterEndpoint (Logging.scala:logInfo(54)) - 
MapOutputTrackerMasterEndpoint stopped!
2017-06-02 18:19:10,727 [dispatcher-event-loop-3] INFO  
scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint 
(Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped!
2017-06-02 18:19:10,729 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  
spark.SparkContext (Logging.scala:logInfo(54)) - Successfully stopped 
SparkContext
- Dataframe+partitioned *** FAILED ***
  java.io.FileNotFoundException: Path 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc
 is recorded as deleted by S3Guard
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1906)
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1881)
  at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1664)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1640)
  at 
com.hortonworks.spark.cloud.ObjectStoreOperations$class.validateRowCount(ObjectStoreOperations.scala:340)
  at 
com.hortonworks.spark.cloud.CloudSuite.validateRowCount(CloudSuite.scala:37)
  at 
com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite.testOneFormat(S3ACommitDataframeSuite.scala:107)
  at 

[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2017-06-02 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035218#comment-16035218
 ] 

Arun Suresh commented on HADOOP-14445:
--

[~daryn], agreed that duplicating the entries is bad if the RM will blindly 
renew all of them. Although, IIRC, the tokens are renewed only if they are 
expired, so if they are renewed serially, it should not be a problem. But I do 
agree: since the RM also renews DTs on other app events (app recovery etc.), 
duplicate renewal might not be preventable.

bq. The cleanest way to manage a kms cluster is transparently via a cname or 
multi-A record
True, in which case one would not even need the LoadBalancingKMSClientProvider.

I do like the idea of using a nameservice though, as [~yzhangal] suggested, 
which would ensure that we still have only a single entry.

> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation, kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14445-branch-2.8.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider does 
> not share delegation tokens (a client uses the KMS address/port as the key for 
> the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
>   InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
>       url.getPort());
>   Text service = SecurityUtil.buildTokenService(serviceAddr);
>   dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, a KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14468) S3Guard: make short-circuit getFileStatus() configurable

2017-06-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035211#comment-16035211
 ] 

Aaron Fabbri commented on HADOOP-14468:
---

I created this JIRA to follow up on [your 
comment|https://issues.apache.org/jira/browse/HADOOP-13345?focusedCommentId=16019741=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16019741]
  and the discussion about failing fast when file is not visible in S3 in the 
read path.

I'm not 100% convinced we want this, but it could be useful for:

1. Failing fast on open() instead of later, when we read the stream.
2. A "safe mode" or fallback that can be enabled.  When this is set to false, 
we could collect stats on any time the MetadataStore differs from S3, which 
would be interesting, e.g. "s3 / metastore length differs" or "visible in 
metastore but not s3".

In general we do not support a mixed mode where some clients use S3Guard and 
others do not: it is not safe.  However, if there is a well-known path where 
only an external process (e.g. ETL) is dropping files for ingest, it may be 
nice to be able to support that narrower case.  I think the existing behavior 
of list checking S3 + MetadataStore is sufficient without this change though.
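
To pin down the proposal, a hedged sketch of where the knob would sit. The 
property name comes from the description below; the surrounding method is 
illustrative, and s3GetFileStatus() stands in for the direct-to-S3 probe:

{code}
private S3AFileStatus guardedGetFileStatus(Path path) throws IOException {
  boolean authoritative = getConf().getBoolean(
      "fs.s3a.metadatastore.getfilestatus.authoritative", true);
  PathMetadata md = metadataStore.get(path);
  if (md != null && authoritative) {
    return (S3AFileStatus) md.getFileStatus();  // today's short-circuit
  }
  S3AFileStatus fromS3 = s3GetFileStatus(path); // always check S3 when false
  // When not authoritative, this is where md and fromS3 could be compared
  // and any "metastore differs from s3" stats collected.
  return fromS3;
}
{code}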

> S3Guard: make short-circuit getFileStatus() configurable
> 
>
> Key: HADOOP-14468
> URL: https://issues.apache.org/jira/browse/HADOOP-14468
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
>
> Currently, when S3Guard is enabled, getFileStatus() will skip S3 if it gets a 
> result from the MetadataStore (e.g. dynamodb) first.
> I would like to add a new parameter 
> {{fs.s3a.metadatastore.getfilestatus.authoritative}} which, when true, keeps 
> the current behavior.  When false, S3AFileSystem will check both S3 and the 
> MetadataStore.
> I'm not sure yet if we want to have this behavior the same for all callers of 
> getFileStatus(), or if we only want to check both S3 and MetadataStore for 
> some internal callers such as open().



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14481) Print stack trace when native bzip2 library does not load

2017-06-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-14481:
-
Status: Patch Available  (was: Open)

> Print stack trace when native bzip2 library does not load
> -
>
> Key: HADOOP-14481
> URL: https://issues.apache.org/jira/browse/HADOOP-14481
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Attachments: HADOOP-14481.001.patch
>
>
> When I ran hadoop checknative on my machine, it was not able to load the 
> system bzip2 library and printed the following message:
> 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
> native-bzip2 library system-native, will use pure-Java version
> Reviewing the relevant code, it fails because of an exception; however, that 
> exception is not logged. We should print the stack trace, at least at debug 
> log level.
> {code:title=Bzip2Factory#isNativeBzip2Loaded()}
> try {
>   // Initialize the native library.
>   Bzip2Compressor.initSymbols(libname);
>   Bzip2Decompressor.initSymbols(libname);
>   nativeBzip2Loaded = true;
>   LOG.info("Successfully loaded & initialized native-bzip2 library " +
>       libname);
> } catch (Throwable t) {
>   LOG.warn("Failed to load/initialize native-bzip2 library " +
>       libname + ", will use pure-Java version");
> }
> {code}
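
One possible shape of the fix, sketched against the catch block quoted above 
(keep the WARN, surface the swallowed cause at debug level):

{code}
} catch (Throwable t) {
  LOG.warn("Failed to load/initialize native-bzip2 library " +
      libname + ", will use pure-Java version");
  // Sketch of the proposed change: log the swallowed cause at debug.
  LOG.debug("Failed to load/initialize native-bzip2 library " + libname, t);
}
{code}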



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14482) Update BUILDING.txt to include the correct steps to install zstd library

2017-06-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HADOOP-14482:


 Summary: Update BUILDING.txt to include the correct steps to 
install zstd library
 Key: HADOOP-14482
 URL: https://issues.apache.org/jira/browse/HADOOP-14482
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 3.0.0-alpha2
Reporter: Wei-Chiu Chuang
Priority: Minor


The current BUILDING.txt includes the following steps for installing zstd 
library:
$ sudo apt-get install zstd

This is incorrect. On my Ubuntu machine, the zstd package is not a library:
{quote}
apt-cache search zstd
libzstd-dev - fast lossless compression algorithm -- development files
libzstd0 - fast lossless compression algorithm
zstd - fast lossless compression algorithm -- CLI tool
{quote}
In fact, to build Hadoop with the ZStandard library, I have to install libzstd-dev.
I will also need to install the runtime to use it. libzstd0 is the older 
version; libzstd1 is for zstd 1.x. It's not clear to me whether libzstd0 is 
compatible. CentOS does have libzstd1, though.

{quote}
  * Use -Dzstd.prefix to specify a nonstandard location for the libzstd
header files and library files. You do not need this option if you have
installed zstandard using a package manager.

  * Use -Dzstd.lib to specify a nonstandard location for the libzstd library
files.  Similarly to zstd.prefix, you do not need this option if you have
installed using a package manager.
{quote}
At least on CentOS, the library installed via rpm was not located by the build, 
and I had to specify -Dzstd.prefix for it to be found.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2017-06-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035169#comment-16035169
 ] 

Yongjun Zhang commented on HADOOP-14445:


Thanks [~daryn].

When I say it's unavoidable, I mean that if we use a nameservice, we need to 
consult the config to know what's associated with that nameservice. For 
example, with NN nameservices, if we do "distcp hdfs://nameservice1:/xyz 
hdfs://nameservice2:/abc", we need to look up nameservice1/2 in the config to 
know the associated NNs.

Similarly, if we use a shared delegation token for all KMS servers, we could 
define a kms-nameservice associated with the set of KMS servers; the 
tokenService can then just be the kms-nameservice, and from the config we can 
find the associated KMS server information.

Does that make sense?
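
For illustration only, here is a hedged sketch of such a mapping; the 
{{hadoop.kms.nameservice*}} keys below are invented for this example and are 
not existing Hadoop configuration:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;

public class KmsNameserviceSketch {
  public static void main(String[] args) {
    // Hypothetical kms-nameservice wiring, by analogy with HDFS nameservices.
    Configuration conf = new Configuration();
    conf.set("hadoop.kms.nameservices", "kms-ns1");
    conf.set("hadoop.kms.nameservice.kms-ns1.providers",
        "kms://https@kms1.example.com:9600/kms,"
            + "kms://https@kms2.example.com:9600/kms");

    // The token service would carry only the logical name; a client would
    // resolve the concrete KMS endpoints from the config when using the token.
    Text tokenService = new Text("kms-ns1");
    for (String provider
        : conf.getStrings("hadoop.kms.nameservice.kms-ns1.providers")) {
      System.out.println(tokenService + " -> " + provider);
    }
  }
}
{code}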


> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation, kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14445-branch-2.8.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider 
> does not share delegation tokens (a client uses the KMS address/port as the 
> key for the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
> InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
> url.getPort());
> Text service = SecurityUtil.buildTokenService(serviceAddr);
> dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2017-06-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035159#comment-16035159
 ] 

Daryn Sharp commented on HADOOP-14445:
--

bq. It seems unavoidable if we want to implement kms-nameservice.
I'm not sure what this means, but anything config-based is not scalable.  
Updating the configs of tens of thousands of nodes, launchers, Oozie, Storm, 
Spark, etc. and restarting the services is just not logistically possible.  
This is largely why we added the ability for the NN to tell the client the KMS 
URI, plus it added much-needed multi-KMS support.

bq. If a user adds a new KMS or replaces one, the clients need to be restarted 
with the new config.
Another illustration of why a config-based approach is a bad idea.  The 
cleanest way to manage a KMS cluster is transparently, via a CNAME or multi-A 
record.

I've consulted with Rushabh on the initial design; I'll review the actual patch 
today.

> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation, kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14445-branch-2.8.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider 
> does not share delegation tokens (a client uses the KMS address/port as the 
> key for the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
> InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
> url.getPort());
> Text service = SecurityUtil.buildTokenService(serviceAddr);
> dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14481) Print stack trace when native bzip2 library does not load

2017-06-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-14481:
-
Attachment: HADOOP-14481.001.patch

Attaching a very simple fix. With this fix, I get the following stack trace:

17/06/02 10:42:04 WARN bzip2.Bzip2Factory: Failed to load/initialize 
native-bzip2 library system-native, will use pure-Java version
java.lang.UnsatisfiedLinkError: 
org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initIDs(Ljava/lang/String;)V
at org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initIDs(Native 
Method)
at 
org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initSymbols(Bzip2Compressor.java:284)
at 
org.apache.hadoop.io.compress.bzip2.Bzip2Factory.isNativeBzip2Loaded(Bzip2Factory.java:58)
at 
org.apache.hadoop.util.NativeLibraryChecker.main(NativeLibraryChecker.java:74)
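
The attached patch is authoritative; as a hedged sketch, the kind of one-line 
change that produces a trace like the above is to pass the caught Throwable to 
the logger in the catch block quoted below:

{code:java}
} catch (Throwable t) {
  // Passing 't' as the last argument makes the logger emit the stack trace
  // along with the warning; LOG.debug(..., t) would do so at debug level only.
  LOG.warn("Failed to load/initialize native-bzip2 library " +
           libname + ", will use pure-Java version", t);
}
{code}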

> Print stack trace when native bzip2 library does not load
> -
>
> Key: HADOOP-14481
> URL: https://issues.apache.org/jira/browse/HADOOP-14481
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Attachments: HADOOP-14481.001.patch
>
>
> When I ran hadoop checknative on my machine, it was not able to load the 
> system bzip2 library and printed the following message.
> 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
> native-bzip2 library system-native, will use pure-Java version
> Reviewing the relevant code, it fails because of an exception. However, that 
> exception is not logged. We should print the stacktrace, at least at debug 
> log level.
> {code:title=Bzip2Factory#isNativeBzip2Loaded()}
> try {
>   // Initialize the native library.
>   Bzip2Compressor.initSymbols(libname);
>   Bzip2Decompressor.initSymbols(libname);
>   nativeBzip2Loaded = true;
>   LOG.info("Successfully loaded & initialized native-bzip2 library " +
>libname);
> } catch (Throwable t) {
>   LOG.warn("Failed to load/initialize native-bzip2 library " + 
>libname + ", will use pure-Java version");
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently

2017-06-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-14472:
--

Assignee: Mingliang Liu

> Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently
> -
>
> Key: HADOOP-14472
> URL: https://issues.apache.org/jira/browse/HADOOP-14472
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure, test
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14472.000.patch
>
>
> Reported by [HADOOP-14461]
> {code}
> testManySmallWritesWithHFlush(org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite)
>   Time elapsed: 1.051 sec  <<< FAILURE!
> java.lang.AssertionError: hflush duration of 13, less than minimum expected 
> of 20
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.writeAndReadOneFile(TestReadAndSeekPageBlobAfterWrite.java:286)
>   at 
> org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.testManySmallWritesWithHFlush(TestReadAndSeekPageBlobAfterWrite.java:247)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently

2017-06-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035094#comment-16035094
 ] 

Mingliang Liu commented on HADOOP-14472:


Tested against US WEST region. Now all unit/live tests pass.

{code}
$ mcb; and cd hadoop-tools/hadoop-azure; and mvn test -q
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)

---
 T E S T S
---

---
 T E S T S
---
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractAppend
Tests run: 5, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 2.225 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractAppend
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractCreate
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.039 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractCreate
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDelete
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.821 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDelete
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDistCp
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.131 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDistCp
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractGetFileStatus
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.578 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractGetFileStatus
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractMkdir
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.513 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractMkdir
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractOpen
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.079 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractOpen
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractRename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.097 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractRename
Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractSeek
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.192 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractSeek
Running org.apache.hadoop.fs.azure.metrics.TestAzureFileSystemInstrumentation
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.561 sec - in 
org.apache.hadoop.fs.azure.metrics.TestAzureFileSystemInstrumentation
Running org.apache.hadoop.fs.azure.metrics.TestBandwidthGaugeUpdater
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.324 sec - in 
org.apache.hadoop.fs.azure.metrics.TestBandwidthGaugeUpdater
Running 
org.apache.hadoop.fs.azure.metrics.TestNativeAzureFileSystemMetricsSystem
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.694 sec - in 
org.apache.hadoop.fs.azure.metrics.TestNativeAzureFileSystemMetricsSystem
Running org.apache.hadoop.fs.azure.metrics.TestRollingWindowAverage
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.213 sec - in 
org.apache.hadoop.fs.azure.metrics.TestRollingWindowAverage
Running org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.62 sec - in 
org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo
Running org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.603 sec - in 
org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions
Running org.apache.hadoop.fs.azure.TestBlobDataValidation
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.444 sec - in 
org.apache.hadoop.fs.azure.TestBlobDataValidation
Running org.apache.hadoop.fs.azure.TestBlobMetadata
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.747 sec - in 
org.apache.hadoop.fs.azure.TestBlobMetadata
Running org.apache.hadoop.fs.azure.TestBlobTypeSpeedDifference
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.091 sec - in 
org.apache.hadoop.fs.azure.TestBlobTypeSpeedDifference
Running org.apache.hadoop.fs.azure.TestContainerChecks
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.787 sec - in 
org.apache.hadoop.fs.azure.TestContainerChecks
Running org.apache.hadoop.fs.azure.TestFileSystemOperationExceptionHandling
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.164 sec - in 

[jira] [Commented] (HADOOP-13998) initial s3guard preview

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035077#comment-16035077
 ] 

Steve Loughran commented on HADOOP-13998:
-

bq. We've run our standard downstream Hive, Spark, MR, Impala, scale, and 
performance tests

If these tests were working *before* you turned s3guard on, then they weren't 
catching inconsistencies & so were lucky (as mine were). I'm running my spark 
committer tests with the inconsistent client turned on, and it is repeatedly 
failing the classic & magic committers without s3guard enabled: both depend on 
consistent listing. I also found a brittleness in path cleanup for the magic 
committer; cleanup code *must* handle an FNFE when a file is returned in the 
listing but isn't there on the GET. This is why I'd like the factory for the 
inconsistent client to be in src/main: it lets anyone turn on inconsistency for 
their test runs.
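
As a hedged illustration (the method and paths here are invented, not the 
committer code), the FNFE-tolerant cleanup this implies looks roughly like:

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Under an inconsistent listing, an entry can show up in listStatus() yet be
// gone by the time it is individually deleted; treat that as success.
void cleanup(FileSystem fs, Path pendingDir) throws IOException {
  for (FileStatus status : fs.listStatus(pendingDir)) {
    try {
      fs.delete(status.getPath(), false);
    } catch (FileNotFoundException e) {
      // Stale listing: the file is already gone, which is the desired state.
    }
  }
}
{code}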

bq. This is a good point. Do you prefer timing-based microbenchmarks, or S3 
request statistics (counts)?

The instrumentation-based ones are far less brittle; Ming has been fixing a 
nanotimer assertion in WASB which was failing intermittently. I have some tests 
somewhere which call listFiles(recursive) against the Amazon landsat store: 
that's the reference example of a deep and wide directory tree.



> initial s3guard preview
> ---
>
> Key: HADOOP-13998
> URL: https://issues.apache.org/jira/browse/HADOOP-13998
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Steve Loughran
>
> JIRA to link in all the things we think are needed for a preview/merge into 
> trunk



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14468) S3Guard: make short-circuit getFileStatus() configurable

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035064#comment-16035064
 ] 

Steve Loughran commented on HADOOP-14468:
-

What's the reason for this? To pick up changes to files which aren't going to 
s3guard even when auth=true?

> S3Guard: make short-circuit getFileStatus() configurable
> 
>
> Key: HADOOP-14468
> URL: https://issues.apache.org/jira/browse/HADOOP-14468
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
>
> Currently, when S3Guard is enabled, getFileStatus() will skip S3 if it gets a 
> result from the MetadataStore (e.g. dynamodb) first.
> I would like to add a new parameter 
> {{fs.s3a.metadatastore.getfilestatus.authoritative}} which, when true, keeps 
> the current behavior.  When false, S3AFileSystem will check both S3 and the 
> MetadataStore.
> I'm not sure yet if we want to have this behavior the same for all callers of 
> getFileStatus(), or if we only want to check both S3 and MetadataStore for 
> some internal callers such as open().
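
A hedged sketch of the proposed control flow; the helper names here are 
illustrative only, not from any patch:

{code:java}
// When the proposed flag is false, a MetadataStore hit would no longer
// short-circuit the S3 call; both views get consulted.
boolean authoritative = getConf().getBoolean(
    "fs.s3a.metadatastore.getfilestatus.authoritative", true);
PathMetadata meta = metadataStore.get(path);
if (meta != null && authoritative) {
  return toFileStatus(meta);           // current behavior: skip S3 entirely
}
FileStatus fromS3 = fetchFromS3(path); // hypothetical helper
return reconcile(meta, fromS3);        // hypothetical merge of the two views
{code}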



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035059#comment-16035059
 ] 

Steve Loughran commented on HADOOP-14457:
-

You know we do something in BlockOutputStream when finalizing a write? Or at 
least I do in the committer branch.

> create() does not notify metadataStore of parent directories or ensure 
> they're not existing files
> -
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch, 
> HADOOP-14457-HADOOP-13345.002.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. 
> LocalMetadataStore will sometimes erroneously report that a directory is 
> empty with isAuthoritative = true when it *definitely* has children the 
> metadatastore should know about. It doesn't appear to happen if the children 
> are just directories. The fact that it's returning an empty listing is 
> concerning, but the fact that it says it's authoritative *might* be a second 
> bug.
> {code}
> diff --git 
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
>  
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- 
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ 
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>}
>  
>@VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>  return metadataStore;
>}
>  
> diff --git 
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
>  
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- 
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ 
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws 
> Throwable {
>  boolean rename = fs.rename(srcDir, destDir);
>  assertFalse("s3a doesn't support rename to non-empty directory", rename);
>}
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +final FileSystem fs = getFileSystem();
> +final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +try {
> +  fs.mkdirs(parent);
> +  final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +  byte[] srcDataset = dataset(256, 'a', 'z');
> +  writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +  1024, false);
> +
> +  DirListingMetadata list = ms.listChildren(parent);
> +  assertTrue("MetadataStore falsely reports authoritative empty list",
> +  list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +} finally {
> +  fs.delete(parent, true);
> +}
> +  }
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14481) Print stack trace when native bzip2 library does not load

2017-06-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HADOOP-14481:


 Summary: Print stack trace when native bzip2 library does not load
 Key: HADOOP-14481
 URL: https://issues.apache.org/jira/browse/HADOOP-14481
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang
Priority: Minor


When I run hadoop checknative on my machine, it was not able to load the 
system bzip2 library and printed the following message.

17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
native-bzip2 library system-native, will use pure-Java version

Reviewing the relevant code, it fails because of an exception. However, that 
exception is not logged. We should print the stacktrace, at least at debug log 
level.

{code:title=Bzip2Factory#isNativeBzip2Loaded()}
try {
  // Initialize the native library.
  Bzip2Compressor.initSymbols(libname);
  Bzip2Decompressor.initSymbols(libname);
  nativeBzip2Loaded = true;
  LOG.info("Successfully loaded & initialized native-bzip2 library " +
   libname);
} catch (Throwable t) {
  LOG.warn("Failed to load/initialize native-bzip2 library " + 
   libname + ", will use pure-Java version");
}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14481) Print stack trace when native bzip2 library does not load

2017-06-02 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-14481:
-
Description: 
When I ran hadoop checknative on my machine, it was not able to load the 
system bzip2 library and printed the following message.

17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
native-bzip2 library system-native, will use pure-Java version

Reviewing the relevant code, it fails because of an exception. However, that 
exception is not logged. We should print the stacktrace, at least at debug log 
level.

{code:title=Bzip2Factory#isNativeBzip2Loaded()}
try {
  // Initialize the native library.
  Bzip2Compressor.initSymbols(libname);
  Bzip2Decompressor.initSymbols(libname);
  nativeBzip2Loaded = true;
  LOG.info("Successfully loaded & initialized native-bzip2 library " +
   libname);
} catch (Throwable t) {
  LOG.warn("Failed to load/initialize native-bzip2 library " + 
   libname + ", will use pure-Java version");
}
{code}

  was:
When I run hadoop checknative on my machine, it was not able to load the 
system bzip2 library and printed the following message.

17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
native-bzip2 library system-native, will use pure-Java version

Reviewing the relevant code, it fails because of an exception. However, that 
exception is not logged. We should print the stacktrace, at least at debug log 
level.

{code:title=Bzip2Factory#isNativeBzip2Loaded()}
try {
  // Initialize the native library.
  Bzip2Compressor.initSymbols(libname);
  Bzip2Decompressor.initSymbols(libname);
  nativeBzip2Loaded = true;
  LOG.info("Successfully loaded & initialized native-bzip2 library " +
   libname);
} catch (Throwable t) {
  LOG.warn("Failed to load/initialize native-bzip2 library " + 
   libname + ", will use pure-Java version");
}
{code}


> Print stack trace when native bzip2 library does not load
> -
>
> Key: HADOOP-14481
> URL: https://issues.apache.org/jira/browse/HADOOP-14481
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>
> When I ran hadoop checknative on my machine, it was not able to load the 
> system bzip2 library and printed the following message.
> 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize 
> native-bzip2 library system-native, will use pure-Java version
> Reviewing the relevant code, it fails because of an exception. However, that 
> exception is not logged. We should print the stacktrace, at least at debug 
> log level.
> {code:title=Bzip2Factory#isNativeBzip2Loaded()}
> try {
>   // Initialize the native library.
>   Bzip2Compressor.initSymbols(libname);
>   Bzip2Decompressor.initSymbols(libname);
>   nativeBzip2Loaded = true;
>   LOG.info("Successfully loaded & initialized native-bzip2 library " +
>libname);
> } catch (Throwable t) {
>   LOG.warn("Failed to load/initialize native-bzip2 library " + 
>libname + ", will use pure-Java version");
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035041#comment-16035041
 ] 

Xiao Chen commented on HADOOP-14474:


Thanks Allen, created HADOOP-14480 for the long-term solution.

> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Fix For: 2.9.0, 2.7.4, 2.6.6, 2.8.2
>
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently Oracle has changed the download link for Oracle JDK7, and that's why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14480) Remove Oracle JDK usage in Dockerfile

2017-06-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-14480:
---
Description: Further to the discussions in HADOOP-14474, we should look for 
a long-term solution that doesn't use Oracle JDKs.

> Remove Oracle JDK usage in Dockerfile
> -
>
> Key: HADOOP-14480
> URL: https://issues.apache.org/jira/browse/HADOOP-14480
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Xiao Chen
>
> Further to the discussions in HADOOP-14474, we should look for a long-term 
> solution that doesn't use Oracle JDKs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14480) Remove Oracle JDK usage in Dockerfile

2017-06-02 Thread Xiao Chen (JIRA)
Xiao Chen created HADOOP-14480:
--

 Summary: Remove Oracle JDK usage in Dockerfile
 Key: HADOOP-14480
 URL: https://issues.apache.org/jira/browse/HADOOP-14480
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Xiao Chen






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035034#comment-16035034
 ] 

Allen Wittenauer commented on HADOOP-14474:
---

As a side note, I hope people are aware this is likely "the first shot".  I 
wouldn't be surprised to see Oracle JDK 8 eventually also require an Oracle 
account.  We should probably consider moving off of Oracle JDKs in the 
Dockerfile completely.

> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Fix For: 2.9.0, 2.7.4, 2.6.6, 2.8.2
>
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently Oracle has changed the download link for Oracle JDK7, and that's why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-14474:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.2
   2.6.6
   2.7.4
   2.9.0
   Status: Resolved  (was: Patch Available)

Pushed to branch-2 and all the way down to branch-2.6; this should make 
pre-commit happy.

Thanks Akira for the fix, Allen for review, and everyone for discussion!

> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Fix For: 2.9.0, 2.7.4, 2.6.6, 2.8.2
>
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently Oracle has changed the download link for Oracle JDK7, and that's why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2017-06-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035008#comment-16035008
 ] 

Yongjun Zhang commented on HADOOP-14445:


Thanks [~daryn].

{quote}
No more config-based solutions
{quote}
It seems unavoidable if we want to implement kms-nameservice.

For this jira, maybe we just go with what [~shahrs87] has: use a key with the 
concatenated hosts, and if no match is found, fall back to the original key 
format. If a user adds a new KMS or replaces one, the clients need to be 
restarted with the new config.

What do you think [~shahrs87], [~asuresh] and [~daryn]?

Thanks.


> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation, kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14445-branch-2.8.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider 
> does not share delegation tokens (a client uses the KMS address/port as the 
> key for the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
> InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
> url.getPort());
> Text service = SecurityUtil.buildTokenService(serviceAddr);
> dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034999#comment-16034999
 ] 

Xiao Chen commented on HADOOP-14474:


Thanks [~aw] for kicking off a Jenkins job and reviewing. 
bq. This should probably work. +1
+1... will commit shortly.

> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently Oracle has changed the download link for Oracle JDK7, and that's why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14429) getFsAction method of FTPFileSystem always returned FsAction.NONE

2017-06-02 Thread Hongyuan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongyuan Li updated HADOOP-14429:
-
Summary: getFsAction method of FTPFileSystem  always returned FsAction.NONE 
 (was: getFsAction method of FTPFileSystem  always return FsAction.NONE)

> getFsAction method of FTPFileSystem  always returned FsAction.NONE
> --
>
> Key: HADOOP-14429
> URL: https://issues.apache.org/jira/browse/HADOOP-14429
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0-alpha2
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
>Priority: Trivial
> Attachments: HADOOP-14429-001.patch, HADOOP-14429-002.patch, 
> HADOOP-14429-003.patch
>
>
>   
> {code}
> private FsAction getFsAction(int accessGroup, FTPFile ftpFile) {
>   FsAction action = FsAction.NONE;
>   if (ftpFile.hasPermission(accessGroup, FTPFile.READ_PERMISSION)) {
>   action.or(FsAction.READ);
>   }
> if (ftpFile.hasPermission(accessGroup, FTPFile.WRITE_PERMISSION)) {
>   action.or(FsAction.WRITE);
> }
> if (ftpFile.hasPermission(accessGroup, FTPFile.EXECUTE_PERMISSION)) {
>   action.or(FsAction.EXECUTE);
> }
> return action;
>   }
> {code}
> From the code above, we can see that the getFsAction method does not modify 
> the action generated by FsAction action = FsAction.NONE, which means it 
> returns FsAction.NONE all the time.
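
A hedged sketch of the likely fix: FsAction is an enum, so or() returns a new 
value rather than mutating action, and the result must be assigned back:

{code:java}
private FsAction getFsAction(int accessGroup, FTPFile ftpFile) {
  FsAction action = FsAction.NONE;
  if (ftpFile.hasPermission(accessGroup, FTPFile.READ_PERMISSION)) {
    action = action.or(FsAction.READ);   // assign the result back
  }
  if (ftpFile.hasPermission(accessGroup, FTPFile.WRITE_PERMISSION)) {
    action = action.or(FsAction.WRITE);
  }
  if (ftpFile.hasPermission(accessGroup, FTPFile.EXECUTE_PERMISSION)) {
    action = action.or(FsAction.EXECUTE);
  }
  return action;
}
{code}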



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14429) getFsAction method of FTPFileSystem always return FsAction.NONE

2017-06-02 Thread Hongyuan Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032531#comment-16032531
 ] 

Hongyuan Li edited comment on HADOOP-14429 at 6/2/17 4:16 PM:
--

[~yzhangal] patch-002 does what you commented.
[~ste...@apache.org] Would you mind giving me a code review?


was (Author: hongyuan li):
[~yzhangal] patch-002 does what you commented.

> getFsAction method of FTPFileSystem  always return FsAction.NONE
> 
>
> Key: HADOOP-14429
> URL: https://issues.apache.org/jira/browse/HADOOP-14429
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0-alpha2
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
>Priority: Trivial
> Attachments: HADOOP-14429-001.patch, HADOOP-14429-002.patch, 
> HADOOP-14429-003.patch
>
>
>   
> {code}
> private FsAction getFsAction(int accessGroup, FTPFile ftpFile) {
>   FsAction action = FsAction.NONE;
>   if (ftpFile.hasPermission(accessGroup, FTPFile.READ_PERMISSION)) {
>   action.or(FsAction.READ);
>   }
> if (ftpFile.hasPermission(accessGroup, FTPFile.WRITE_PERMISSION)) {
>   action.or(FsAction.WRITE);
> }
> if (ftpFile.hasPermission(accessGroup, FTPFile.EXECUTE_PERMISSION)) {
>   action.or(FsAction.EXECUTE);
> }
> return action;
>   }
> {code}
> From the code above, we can see that the getFsAction method does not modify 
> the action generated by FsAction action = FsAction.NONE, which means it 
> returns FsAction.NONE all the time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14469) the listStatus method of FTPFileSystem should filter the path "." and ".."

2017-06-02 Thread Hongyuan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongyuan Li updated HADOOP-14469:
-
Summary: the listStatus method of FTPFileSystem should filter the path "."  
and ".."  (was: the listStatus method of FTPFileSystem should ignore the path 
"."  and "..")

> the listStatus method of FTPFileSystem should filter the path "."  and ".."
> ---
>
> Key: HADOOP-14469
> URL: https://issues.apache.org/jira/browse/HADOOP-14469
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
> Attachments: HADOOP-14469-001.patch, HADOOP-14469-002.patch, 
> HADOOP-14469-003.patch
>
>
> For some FTP systems, the listStatus method will return new Path(".") and new 
> Path(".."), thus causing the list operation to loop; for example, Serv-U.
> We can see the logic in the code below:
> {code}
>   private FileStatus[] listStatus(FTPClient client, Path file)
>   throws IOException {
> ā€¦ā€¦
> FileStatus[] fileStats = new FileStatus[ftpFiles.length];
> for (int i = 0; i < ftpFiles.length; i++) {
>   fileStats[i] = getFileStatus(ftpFiles[i], absolute);
> }
> return fileStats;
>   }
> {code}
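
A hedged sketch (not necessarily the attached patch) of the filtering this 
implies, written as a fragment of the quoted method:

{code:java}
// Skip the "." and ".." entries some servers (e.g. Serv-U) return, so the
// listing cannot recurse into itself.
List<FileStatus> fileStats = new ArrayList<>();
for (FTPFile ftpFile : ftpFiles) {
  String name = ftpFile.getName();
  if (".".equals(name) || "..".equals(name)) {
    continue; // would otherwise cause the looping described above
  }
  fileStats.add(getFileStatus(ftpFile, absolute));
}
return fileStats.toArray(new FileStatus[0]);
{code}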



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant

2017-06-02 Thread Hongyuan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongyuan Li updated HADOOP-14470:
-
Description: 
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so 
{{lazyPersist ? 1 : getDefaultReplication(item.path)}} is redundant.

The related code is below, from 
{{org.apache.hadoop.fs.shell.CommandWithDestination}}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}

  was:
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant.

The related code is below, from 
{{org.apache.hadoop.fs.shell.CommandWithDestination }}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}


> the ternary operator  in create method in class CommandWithDestination is 
> redundant
> ---
>
> Key: HADOOP-14470
> URL: https://issues.apache.org/jira/browse/HADOOP-14470
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
>Priority: Trivial
> Attachments: HADOOP-14470-001.patch
>
>
> In the if statement, lazyPersist is always true, thus the ternary operator is 
> redundant: {{lazyPersist == true}} holds inside the if statement, so 
> {{lazyPersist ? 1 : getDefaultReplication(item.path)}} is redundant.
> The related code is below, from 
> {{org.apache.hadoop.fs.shell.CommandWithDestination}}, line 504:
> {code:java}
>FSDataOutputStream create(PathData item, boolean lazyPersist,
> boolean direct)
> throws IOException {
>   try {
> if (lazyPersist) { // in the if statement, lazyPersist is always true
>   ...
>   return create(item.path,
> FsPermission.getFileDefault().applyUMask(
> FsPermission.getUMask(getConf())),
> createFlags,
> getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
> IO_FILE_BUFFER_SIZE_DEFAULT),
> lazyPersist ? 1 : getDefaultReplication(item.path), 
> // *this is redundant*
> getDefaultBlockSize(),
> null,
> null);
> } else {
>   return create(item.path, true);
> }
>   } finally { // might have been created but stream was interrupted
> if (!direct) {
>   deleteOnExit(item.path);
> }
>   }
> }
> {code}
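
Since the quoted call sits inside the if (lazyPersist) branch, a hedged 
simplification sketch (illustrative only, not the attached patch) hard-codes 
the replication argument:

{code:java}
// Inside the if (lazyPersist) branch the ternary can only yield 1, so the
// replication argument reduces to a constant (cast to short, matching the
// type the original conditional expression had by constant promotion):
return create(item.path,
    FsPermission.getFileDefault().applyUMask(FsPermission.getUMask(getConf())),
    createFlags,
    getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT),
    (short) 1, // was: lazyPersist ? 1 : getDefaultReplication(item.path)
    getDefaultBlockSize(),
    null,
    null);
{code}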



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant

2017-06-02 Thread Hongyuan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongyuan Li updated HADOOP-14470:
-
Description: 
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant.

The related code is below, from 
{{org.apache.hadoop.fs.shell.CommandWithDestination }}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}

  was:
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant.

The related code is below, from 
{{ org.apache.hadoop.fs.shell.CommandWithDestination }}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}


> the ternary operator  in create method in class CommandWithDestination is 
> redundant
> ---
>
> Key: HADOOP-14470
> URL: https://issues.apache.org/jira/browse/HADOOP-14470
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
>Priority: Trivial
> Attachments: HADOOP-14470-001.patch
>
>
> In the if statement, lazyPersist is always true, thus the ternary operator is 
> redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
> lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant.
> The related code is below, from 
> {{org.apache.hadoop.fs.shell.CommandWithDestination }}, line 504:
> {code:java}
>FSDataOutputStream create(PathData item, boolean lazyPersist,
> boolean direct)
> throws IOException {
>   try {
> if (lazyPersist) { // in the if statement, lazyPersist is always true
>   ...
>   return create(item.path,
> FsPermission.getFileDefault().applyUMask(
> FsPermission.getUMask(getConf())),
> createFlags,
> getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
> IO_FILE_BUFFER_SIZE_DEFAULT),
> lazyPersist ? 1 : getDefaultReplication(item.path), 
> // *this is redundant*
> getDefaultBlockSize(),
> null,
> null);
> } else {
>   return create(item.path, true);
> }
>   } finally { // might have been created but stream was interrupted
> if (!direct) {
>   deleteOnExit(item.path);
> }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant

2017-06-02 Thread Hongyuan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongyuan Li updated HADOOP-14470:
-
Description: 
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant.

The related code is below, from 
{{ org.apache.hadoop.fs.shell.CommandWithDestination }}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}

  was:
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
lazyPersist ? 1 : getDefaultReplication(item.path), }} is redundant.

The related code is below, from 
{{ org.apache.hadoop.fs.shell.CommandWithDestination}}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}


> the ternary operator  in create method in class CommandWithDestination is 
> redundant
> ---
>
> Key: HADOOP-14470
> URL: https://issues.apache.org/jira/browse/HADOOP-14470
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
>Priority: Trivial
> Attachments: HADOOP-14470-001.patch
>
>
> In the if statement, lazyPersist is always true, thus the ternary operator is 
> redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
> lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant.
> The related code is below, from 
> {{ org.apache.hadoop.fs.shell.CommandWithDestination }}, line 504:
> {code:java}
>FSDataOutputStream create(PathData item, boolean lazyPersist,
> boolean direct)
> throws IOException {
>   try {
> if (lazyPersist) { // in the if statement, lazyPersist is always true
>   ...
>   return create(item.path,
> FsPermission.getFileDefault().applyUMask(
> FsPermission.getUMask(getConf())),
> createFlags,
> getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
> IO_FILE_BUFFER_SIZE_DEFAULT),
> lazyPersist ? 1 : getDefaultReplication(item.path), 
> // *this is redundant*
> getDefaultBlockSize(),
> null,
> null);
> } else {
>   return create(item.path, true);
> }
>   } finally { // might have been created but stream was interrupted
> if (!direct) {
>   deleteOnExit(item.path);
> }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant

2017-06-02 Thread Hongyuan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongyuan Li updated HADOOP-14470:
-
Description: 
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
lazyPersist ? 1 : getDefaultReplication(item.path), }} is redundant.

The related code is below, from 
{{ org.apache.hadoop.fs.shell.CommandWithDestination}}, line 504:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}

  was:
In the if statement, lazyPersist is always true, thus the ternary operator is 
redundant; the related code is below:
{code:java}
   FSDataOutputStream create(PathData item, boolean lazyPersist,
boolean direct)
throws IOException {
  try {
if (lazyPersist) { // in the if statement, lazyPersist is always true
  ...
  return create(item.path,
FsPermission.getFileDefault().applyUMask(
FsPermission.getUMask(getConf())),
createFlags,
getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT),
lazyPersist ? 1 : getDefaultReplication(item.path), // 
*this is redundant*
getDefaultBlockSize(),
null,
null);
} else {
  return create(item.path, true);
}
  } finally { // might have been created but stream was interrupted
if (!direct) {
  deleteOnExit(item.path);
}
  }
}

{code}


> the ternary operator  in create method in class CommandWithDestination is 
> redundant
> ---
>
> Key: HADOOP-14470
> URL: https://issues.apache.org/jira/browse/HADOOP-14470
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Hongyuan Li
>Assignee: Hongyuan Li
>Priority: Trivial
> Attachments: HADOOP-14470-001.patch
>
>
> In the if statement, lazyPersist is always true, thus the ternary operator is 
> redundant: {{lazyPersist == true}} holds inside the if statement, so {{ 
> lazyPersist ? 1 : getDefaultReplication(item.path), }} is redundant.
> The related code is below, from 
> {{ org.apache.hadoop.fs.shell.CommandWithDestination}}, line 504:
> {code:java}
>FSDataOutputStream create(PathData item, boolean lazyPersist,
> boolean direct)
> throws IOException {
>   try {
> if (lazyPersist) { // in the if statement, lazyPersist is always true
>   ...
>   return create(item.path,
> FsPermission.getFileDefault().applyUMask(
> FsPermission.getUMask(getConf())),
> createFlags,
> getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
> IO_FILE_BUFFER_SIZE_DEFAULT),
> lazyPersist ? 1 : getDefaultReplication(item.path), 
> // *this is redundant*
> getDefaultBlockSize(),
> null,
> null);
> } else {
>   return create(item.path, true);
> }
>   } finally { // might have been created but stream was interrupted
> if (!direct) {
>   deleteOnExit(item.path);
> }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034925#comment-16034925
 ] 

Allen Wittenauer commented on HADOOP-14474:
---

This should probably work. +1

> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently, Oracle changed the download link for Oracle JDK7, which is why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this.






[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034842#comment-16034842
 ] 

Hadoop QA commented on HADOOP-14474:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m  
1s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m  
8s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:8515d35 |
| JIRA Issue | HADOOP-14474 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870747/HADOOP-14474-branch-2.01.patch
 |
| Optional Tests |  asflicense  shellcheck  shelldocs  |
| uname | Linux a9eee9084235 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / 8e119f1 |
| shellcheck | v0.4.6 |
| modules | C:  U:  |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12434/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently, Oracle changed the download link for Oracle JDK7, which is why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this.






[jira] [Commented] (HADOOP-14436) Remove the redundant colon in ViewFs.md

2017-06-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034832#comment-16034832
 ] 

Hudson commented on HADOOP-14436:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11818 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11818/])
HADOOP-14436. Remove the redundant colon in ViewFs.md. Contributed by (brahma: 
rev 056cc72885471d6952ff182670e4b4a38421603d)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ViewFs.md


> Remove the redundant colon in ViewFs.md
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Commented] (HADOOP-12360) Create StatsD metrics2 sink

2017-06-02 Thread Michael Moss (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034826#comment-16034826
 ] 

Michael Moss commented on HADOOP-12360:
---

Hi, I'm curious what this section of code is trying to achieve:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/sink/StatsDSink.java#L103

It seems that in some cases, for some metrics (JVM metrics for example), the sn 
(serviceName) variable is overridden, which breaks the configured prefix: 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/sink/StatsDSink.java#L91

Wondering if this was intended?
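
For anyone reproducing this, a minimal hadoop-metrics2.properties fragment wiring a daemon to the StatsD sink might look like the following (property names as documented in the {{StatsDSink}} javadoc; host, port, and service name values are placeholders):

{noformat}
*.sink.statsd.class=org.apache.hadoop.metrics2.sink.StatsDSink
namenode.sink.statsd.server.host=127.0.0.1
namenode.sink.statsd.server.port=8125
namenode.sink.statsd.skip.hostname=true
namenode.sink.statsd.service.name=NameNode
{noformat}

The question above is whether the configured service.name prefix survives for all metric records, or gets overridden for some (e.g. JVM metrics).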

> Create StatsD metrics2 sink
> ---
>
> Key: HADOOP-12360
> URL: https://issues.apache.org/jira/browse/HADOOP-12360
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 2.7.1
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, 
> HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, 
> HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, 
> HADOOP-12360.009.patch, HADOOP-12360.010.patch
>
>
> Create a metrics sink that pushes to a StatsD daemon.






[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034819#comment-16034819
 ] 

Hadoop QA commented on HADOOP-14474:


(!) A patch to the testing environment has been detected. 
Re-executing against the patched versions to perform further tests. 
The console is at 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12434/console in case of 
problems.


> Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
> --
>
> Key: HADOOP-14474
> URL: https://issues.apache.org/jira/browse/HADOOP-14474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: HADOOP-14474-branch-2.01.patch
>
>
> Recently, Oracle changed the download link for Oracle JDK7, which is why 
> oracle-java7-installer fails. Precommit jobs for branch-2* are failing 
> because of this.






[jira] [Updated] (HADOOP-14436) Remove the redundant colon in ViewFs.md

2017-06-02 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HADOOP-14436:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha4
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed to {{trunk}} and {{branch-2}}. [~maobaolong] thanks for your 
contribution.

> Remove the redundant colon in ViewFs.md
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Created] (HADOOP-14479) Erasurecode testcase failures with ISA-L

2017-06-02 Thread Ayappan (JIRA)
Ayappan created HADOOP-14479:


 Summary: Erasurecode testcase failures with ISA-L 
 Key: HADOOP-14479
 URL: https://issues.apache.org/jira/browse/HADOOP-14479
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: 3.0.0-alpha3
 Environment: x86_64 Ubuntu 16.04.02 LTS
Reporter: Ayappan


I built hadoop with ISA-L support. I took the ISA-L code from 
https://github.com/01org/isa-l (tag v2.18.0) and built it. While running the 
UTs, the following three testcases are failing:

1)TestHHXORErasureCoder

Tests run: 7, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 1.106 sec <<< 
FAILURE! - in org.apache.hadoop.io.erasurecode.coder.TestHHXORErasureCoder
testCodingDirectBuffer_10x4_erasing_p1(org.apache.hadoop.io.erasurecode.coder.TestHHXORErasureCoder)
  Time elapsed: 0.029 sec  <<< FAILURE!
java.lang.AssertionError: Decoding and comparing failed.
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.io.erasurecode.TestCoderBase.compareAndVerify(TestCoderBase.java:170)
at 
org.apache.hadoop.io.erasurecode.coder.TestErasureCoderBase.compareAndVerify(TestErasureCoderBase.java:141)
at 
org.apache.hadoop.io.erasurecode.coder.TestErasureCoderBase.performTestCoding(TestErasureCoderBase.java:98)
at 
org.apache.hadoop.io.erasurecode.coder.TestErasureCoderBase.testCoding(TestErasureCoderBase.java:69)
at 
org.apache.hadoop.io.erasurecode.coder.TestHHXORErasureCoder.testCodingDirectBuffer_10x4_erasing_p1(TestHHXORErasureCoder.java:64)


2)TestRSErasureCoder

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.591 sec - in 
org.apache.hadoop.io.erasurecode.coder.TestXORCoder
Running org.apache.hadoop.io.erasurecode.coder.TestRSErasureCoder
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f486a28a6e4, pid=8970, tid=0x7f4850927700
#
# JRE version: OpenJDK Runtime Environment (8.0_121-b13) (build 
1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13)
# Java VM: OpenJDK 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# C  [libc.so.6+0x8e6e4]
#
# Failed to write core dump. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/ayappan/hadoop/hadoop-common-project/hadoop-common/hs_err_pid8970.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

3)TestCodecRawCoderMapping

Running org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping
Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.559 sec <<< 
FAILURE! - in org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping
testRSDefaultRawCoder(org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping)
  Time elapsed: 0.015 sec  <<< FAILURE!
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping.testRSDefaultRawCoder(TestCodecRawCoderMapping.java:58)







[jira] [Updated] (HADOOP-14436) Remove the redundant colon in ViewFs.md

2017-06-02 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HADOOP-14436:
--
Summary: Remove the redundant colon in ViewFs.md  (was: The ViewFs.md's 
minor error about a redundant colon)

> Remove the redundant colon in ViewFs.md
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Commented] (HADOOP-14473) Optimize NativeAzureFileSystem::seek for forward seeks

2017-06-02 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034683#comment-16034683
 ] 

Rajesh Balamohan commented on HADOOP-14473:
---

Since it was easier to combine this patch with HADOOP-14478, I have merged it 
and posted the revised patch there.

In the revised patch, I have fixed an issue in seek() and shared the test 
results there as well. Tests were run against the "Japan West" region endpoint.

{{BlobInputStream::skip()}} is essentially a no-op call. The issue was that 
closing the stream and opening it again via {{store.retrieve()}} ends up 
creating a new {{BlobInputStream}}, which internally needs an additional HTTP 
call to download the blob attributes. This has been avoided in the patch.

I completely agree that it would be good to get instrumentation similar to 
s3a's; it was very useful there. Please let me know if this could be done in 
incremental tickets.
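
To make the forward-seek idea concrete, a rough sketch (not the committed patch; {{in}} is the wrapped BlobInputStream, and {{getPos()}}/{{reopenAt()}} are hypothetical placeholders for the stream's position bookkeeping):

{code:java}
@Override
public synchronized void seek(long pos) throws IOException {
  long current = getPos();
  if (pos >= current) {
    // Forward seek: skip within the existing BlobInputStream. skip() is
    // cheap, whereas close() + store.retrieve() creates a new
    // BlobInputStream and costs an extra HTTP call for blob attributes.
    long remaining = pos - current;
    while (remaining > 0) {
      long skipped = in.skip(remaining);
      if (skipped <= 0) {
        throw new EOFException("Unable to seek to " + pos);
      }
      remaining -= skipped;
    }
  } else {
    // Backward seek: fall back to closing and re-opening at the new offset.
    reopenAt(pos);
  }
}
{code}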

> Optimize NativeAzureFileSystem::seek for forward seeks
> --
>
> Key: HADOOP-14473
> URL: https://issues.apache.org/jira/browse/HADOOP-14473
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14473-001.patch
>
>
> {{NativeAzureFileSystem::seek()}} closes and re-opens the inputstream 
> irrespective of forward/backward seek. It would be beneficial to re-open the 
> stream on backward seek.
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L889






[jira] [Assigned] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HADOOP-14478:
-

Assignee: Rajesh Balamohan

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read some bytes, seek back 
> to the original location) it may not be, and might even download a lot more 
> data than is ever used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's BlobInputStream.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> BlobInputStream can consider this as a hint later to determine the amount of 
> data to be read ahead. Changes to BlobInputStream would not be addressed in 
> this JIRA.






[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14478:
--
Attachment: HADOOP-14478.002.patch

Attaching .2 version with fixes in seek().  Also attaching test results from 
hadoop-azure module.

My Azure machine and endpoints are hosted in the "Japan West" region.

{noformat}

hdiuser@hn0:~/hadoop/hadoop-tools/hadoop-azure⟫ mvn test

...
..
Tests run: 16, Failures: 0, Errors: 0, Skipped: 16, Time elapsed: 0.421 sec - 
in org.apache.hadoop.fs.azure.TestFileSystemOperationExceptionHandling
Running org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo
Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.361 sec - in 
org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo
Running org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions
Tests run: 6, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.939 sec - in 
org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions

Results :

Tests run: 703, Failures: 0, Errors: 0, Skipped: 436

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 01:50 min
[INFO] Finished at: 2017-06-02T13:08:42+00:00
[INFO] Final Memory: 29M/1574M
[INFO] 
{noformat}

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read some bytes, seek back 
> to the original location) it may not be, and might even download a lot more 
> data than is ever used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's BlobInputStream.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> BlobInputStream can consider this as a hint later to determine the amount of 
> data to be read ahead. Changes to BlobInputStream would not be addressed in 
> this JIRA.






[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon

2017-06-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034628#comment-16034628
 ] 

Hadoop QA commented on HADOOP-14436:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14436 |
| GITHUB PR | https://github.com/apache/hadoop/pull/223 |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 8e9a4f056b23 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8d9084e |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/12433/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> The ViewFs.md's minor error about a redundant colon
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon

2017-06-02 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034613#comment-16034613
 ] 

Brahma Reddy Battula commented on HADOOP-14436:
---

[~maobaolong] thanks for uploading the patch. Patch LGTM, pending jenkins.

Go through the following link for more details on contributing:
https://cwiki.apache.org/confluence/display/HADOOP/HowToContribute

> The ViewFs.md's minor error about a redundant colon
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Updated] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon

2017-06-02 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HADOOP-14436:
--
Status: Patch Available  (was: Open)

> The ViewFs.md's minor error about a redundant colon
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0-alpha2, 2.7.1
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034545#comment-16034545
 ] 

ASF GitHub Bot commented on HADOOP-14436:
-

Github user maobaolong commented on the issue:

https://github.com/apache/hadoop/pull/223
  
@brahmareddybattula Thank you in advance. I have now uploaded the patch file 
here; please take a look, any review comments will help me.


> The ViewFs.md's minor error about a redundant colon
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon

2017-06-02 Thread maobaolong (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034541#comment-16034541
 ] 

maobaolong commented on HADOOP-14436:
-

[~brahma] Thank you in advance. I'm glad to join you. I have now uploaded the 
patch file here; please take a look, any review comments will help me.

> The ViewFs.md's minor error about a redundant colon
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Updated] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon

2017-06-02 Thread maobaolong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HADOOP-14436:

Attachment: HADOOP-14436.patch

> The ViewFs.md's minor error about a redundant colon
> ---
>
> Key: HADOOP-14436
> URL: https://issues.apache.org/jira/browse/HADOOP-14436
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 3.0.0-alpha2
>Reporter: maobaolong
>Assignee: maobaolong
> Attachments: HADOOP-14436.patch
>
>
> A minor mistake can lead the beginner the wrong way and drive them far away 
> from us.






[jira] [Commented] (HADOOP-14475) Metrics of S3A don't print out when enable it in Hadoop metrics property file

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034518#comment-16034518
 ] 

Steve Loughran commented on HADOOP-14475:
-

You are probably the first person to play with this. I've been primarily using 
the metrics for testing and, in code, using the new stats API, 
{{getStorageStatistics()}}, to pick them up and log/store them (in 
HADOOP-13786, saving into the _SUCCESS file for later retrieval).

If you could help work out what I've done wrong here, that'd be great. Probably 
some registration issue.
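
As a concrete example of the stats-API route mentioned above, pulling the counters straight off the filesystem instance (a sketch; the s3a URI is a placeholder):

{code:java}
import java.net.URI;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.StorageStatistics;
import org.apache.hadoop.fs.StorageStatistics.LongStatistic;

public class DumpS3AStatistics {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"),
        new Configuration());
    // ... perform some reads/writes against fs here ...
    StorageStatistics stats = fs.getStorageStatistics();
    Iterator<LongStatistic> it = stats.getLongStatistics();
    while (it.hasNext()) {
      LongStatistic s = it.next();
      System.out.println(s.getName() + " = " + s.getValue());
    }
  }
}
{code}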

> Metrics of S3A don't print out  when enable it in Hadoop metrics property file
> --
>
> Key: HADOOP-14475
> URL: https://issues.apache.org/jira/browse/HADOOP-14475
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.8.0
> Environment: uname -a
> Linux client01 4.4.0-74-generic #95-Ubuntu SMP Wed Apr 12 09:50:34 UTC 2017 
> x86_64 x86_64 x86_64 GNU/Linux
>  cat /etc/issue
> Ubuntu 16.04.2 LTS \n \l
>Reporter: Yonger
>
> *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
> #*.sink.file.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> #*.sink.influxdb.url=http:/xx
> #*.sink.influxdb.influxdb_port=8086
> #*.sink.influxdb.database=hadoop
> #*.sink.influxdb.influxdb_username=hadoop
> #*.sink.influxdb.influxdb_password=hadoop
> #*.sink.ingluxdb.cluster=c1
> *.period=10
> #namenode.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> #S3AFileSystem.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink
> S3AFileSystem.sink.file.filename=s3afilesystem-metrics.out
> I can't find the output file even when I run an MR job that should use s3.






[jira] [Commented] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034493#comment-16034493
 ] 

Steve Loughran commented on HADOOP-14472:
-

Seems reasonable. Which endpoint did you test against?

> Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently
> -
>
> Key: HADOOP-14472
> URL: https://issues.apache.org/jira/browse/HADOOP-14472
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure, test
>Reporter: Mingliang Liu
> Attachments: HADOOP-14472.000.patch
>
>
> Reported by [HADOOP-14461]
> {code}
> testManySmallWritesWithHFlush(org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite)
>   Time elapsed: 1.051 sec  <<< FAILURE!
> java.lang.AssertionError: hflush duration of 13, less than minimum expected 
> of 20
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.writeAndReadOneFile(TestReadAndSeekPageBlobAfterWrite.java:286)
>   at 
> org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.testManySmallWritesWithHFlush(TestReadAndSeekPageBlobAfterWrite.java:247)
> {code}






[jira] [Commented] (HADOOP-14473) Optimize NativeAzureFileSystem::seek for forward seeks

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034491#comment-16034491
 ] 

Steve Loughran commented on HADOOP-14473:
-

This is going to read forward no matter how big the file is, even if you are 
going to the last MB of a 20 GB file. Is this really the most optimal?

Rajesh, you are pulling over the s3a input stream work again, aren't you? Maybe 
it's best here to group them into one patch. That s3a work also added stream 
instrumentation 
({{org.apache.hadoop.fs.s3a.S3AInstrumentation.InputStreamStatistics}}), so we 
could actually measure what is going on, *and use it in tests*. This seek work 
and related changes are the opportunity to do the same for Azure, which will 
benefit production monitoring too. In particular, I'd like to track the number 
of bytes skipped in forward seeks and the number of close/open pairs, so we can 
detect when there's a lot of skipping going on, plus make better tests. Ideally 
I'd like something like {{ITestS3AInputStreamPerformance}}, so as to catch any 
performance regressions in various read sequences (whole file vs skip forwards 
vs full random).

> Optimize NativeAzureFileSystem::seek for forward seeks
> --
>
> Key: HADOOP-14473
> URL: https://issues.apache.org/jira/browse/HADOOP-14473
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14473-001.patch
>
>
> {{NativeAzureFileSystem::seek()}} closes and re-opens the inputstream 
> irrespective of forward/backward seek. It would be beneficial to re-open the 
> stream on backward seek.
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L889






[jira] [Commented] (HADOOP-14473) Optimize NativeAzureFileSystem::seek for forward seeks

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034485#comment-16034485
 ] 

Steve Loughran commented on HADOOP-14473:
-

Which endpoint did you test against? -1 until that's declared.

> Optimize NativeAzureFileSystem::seek for forward seeks
> --
>
> Key: HADOOP-14473
> URL: https://issues.apache.org/jira/browse/HADOOP-14473
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HADOOP-14473-001.patch
>
>
> {{NativeAzureFileSystem::seek()}} closes and re-opens the inputstream 
> irrespective of forward/backward seek. It would be beneficial to re-open the 
> stream on backward seek.
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L889






[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034484#comment-16034484
 ] 

Steve Loughran commented on HADOOP-14478:
-

Usual rule: which endpoint have you tested this with?

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch
>
>
> Azure's {{BlobbInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested for. This would be beneficial for sequential reads. 
> However, for positional reads (seek to specific location, read x number of 
> bytes, seek back to original location) this may not be beneficial and might 
> even download lot more data which are not used later.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's BlobInputStream.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> BlobInputStream can consider this as a hint later to determine the amount of 
> data to be read ahead. Changes to BlobInputStream would not be addressed in 
> this JIRA.






[jira] [Commented] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034480#comment-16034480
 ] 

Steve Loughran commented on HADOOP-14477:
-

Test failure is unrelated; the checkstyle warnings are new.

As this goes into FS behaviours, could you create an HDFS JIRA for the same 
patch and submit it there too (& link it to this)? That will force all the 
HDFS tests to use this new routine as well.

> FileSystem Simplify / Optimize listStatus Method
> 
>
> Key: HADOOP-14477
> URL: https://issues.apache.org/jira/browse/HADOOP-14477
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.3, 3.0.0-alpha3
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch
>
>
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, Path, PathFilter)}
>   /*
>* Filter files/directories in the given path using the user-supplied path
>* filter. Results are added to the given array results.
>*/
>   private void listStatus(ArrayList<FileStatus> results, Path f,
>   PathFilter filter) throws FileNotFoundException, IOException {
> FileStatus listing[] = listStatus(f);
> if (listing == null) {
>   throw new IOException("Error accessing " + f);
> }
> for (int i = 0; i < listing.length; i++) {
>   if (filter.accept(listing[i].getPath())) {
> results.add(listing[i]);
>   }
> }
>   }
> {code}
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)}
>   public FileStatus[] listStatus(Path f, PathFilter filter) 
>throws FileNotFoundException, IOException {
> ArrayList<FileStatus> results = new ArrayList<FileStatus>();
> listStatus(results, f, filter);
> return results.toArray(new FileStatus[results.size()]);
>   }
> {code}
> We can be smarter about this:
> # Use enhanced for-loops
> # Optimize for the case where there are zero files in a directory, save on 
> object instantiation
> # More encapsulated design
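
A sketch of what the consolidated method might look like with those three points applied (illustrative only, not the attached patch; assumes the usual {{java.util.List}}/{{ArrayList}} imports):

{code:java}
public FileStatus[] listStatus(Path f, PathFilter filter)
    throws FileNotFoundException, IOException {
  FileStatus[] listing = listStatus(f);
  if (listing == null) {
    throw new IOException("Error accessing " + f);
  }
  if (listing.length == 0) {
    return listing; // empty directory: skip the list allocation entirely
  }
  List<FileStatus> results = new ArrayList<>(listing.length);
  for (FileStatus status : listing) { // enhanced for-loop
    if (filter.accept(status.getPath())) {
      results.add(status);
    }
  }
  return results.toArray(new FileStatus[results.size()]);
}
{code}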






[jira] [Commented] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method

2017-06-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034481#comment-16034481
 ] 

Steve Loughran commented on HADOOP-14477:
-

Test failure is unrelated; the checkstyle warnings are new.

As this goes into FS behaviours, could you create an HDFS JIRA for the same 
patch and submit it there too (& link it to this)? That will force all the 
HDFS tests to use this new routine as well.

> FileSystem Simplify / Optimize listStatus Method
> 
>
> Key: HADOOP-14477
> URL: https://issues.apache.org/jira/browse/HADOOP-14477
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.3, 3.0.0-alpha3
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch
>
>
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, Path, PathFilter)}
>   /*
>* Filter files/directories in the given path using the user-supplied path
>* filter. Results are added to the given array results.
>*/
>   private void listStatus(ArrayList<FileStatus> results, Path f,
>   PathFilter filter) throws FileNotFoundException, IOException {
> FileStatus listing[] = listStatus(f);
> if (listing == null) {
>   throw new IOException("Error accessing " + f);
> }
> for (int i = 0; i < listing.length; i++) {
>   if (filter.accept(listing[i].getPath())) {
> results.add(listing[i]);
>   }
> }
>   }
> {code}
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)}
>   public FileStatus[] listStatus(Path f, PathFilter filter) 
>throws FileNotFoundException, IOException {
> ArrayList<FileStatus> results = new ArrayList<FileStatus>();
> listStatus(results, f, filter);
> return results.toArray(new FileStatus[results.size()]);
>   }
> {code}
> We can be smarter about this:
> # Use enhanced for-loops
> # Optimize for the case where there are zero files in a directory, save on 
> object instantiation
> # More encapsulated design






[jira] [Assigned] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method

2017-06-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-14477:
---

Assignee: BELUGA BEHR

> FileSystem Simplify / Optimize listStatus Method
> 
>
> Key: HADOOP-14477
> URL: https://issues.apache.org/jira/browse/HADOOP-14477
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.3, 3.0.0-alpha3
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch
>
>
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, Path, PathFilter)}
>   /*
>* Filter files/directories in the given path using the user-supplied path
>* filter. Results are added to the given array results.
>*/
>   private void listStatus(ArrayList<FileStatus> results, Path f,
>   PathFilter filter) throws FileNotFoundException, IOException {
> FileStatus listing[] = listStatus(f);
> if (listing == null) {
>   throw new IOException("Error accessing " + f);
> }
> for (int i = 0; i < listing.length; i++) {
>   if (filter.accept(listing[i].getPath())) {
> results.add(listing[i]);
>   }
> }
>   }
> {code}
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)}
>   public FileStatus[] listStatus(Path f, PathFilter filter) 
>throws FileNotFoundException, IOException {
> ArrayList<FileStatus> results = new ArrayList<FileStatus>();
> listStatus(results, f, filter);
> return results.toArray(new FileStatus[results.size()]);
>   }
> {code}
> We can be smarter about this:
> # Use enhanced for-loops
> # Optimize for the case where there are zero files in a directory, save on 
> object instantiation
> # More encapsulated design






[jira] [Updated] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method

2017-06-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14477:

Component/s: fs

> FileSystem Simplify / Optimize listStatus Method
> 
>
> Key: HADOOP-14477
> URL: https://issues.apache.org/jira/browse/HADOOP-14477
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.3, 3.0.0-alpha3
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch
>
>
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, Path, PathFilter)}
>   /*
>* Filter files/directories in the given path using the user-supplied path
>* filter. Results are added to the given array results.
>*/
>   private void listStatus(ArrayList<FileStatus> results, Path f,
>   PathFilter filter) throws FileNotFoundException, IOException {
> FileStatus listing[] = listStatus(f);
> if (listing == null) {
>   throw new IOException("Error accessing " + f);
> }
> for (int i = 0; i < listing.length; i++) {
>   if (filter.accept(listing[i].getPath())) {
> results.add(listing[i]);
>   }
> }
>   }
> {code}
> {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)}
>   public FileStatus[] listStatus(Path f, PathFilter filter) 
>throws FileNotFoundException, IOException {
> ArrayList<FileStatus> results = new ArrayList<FileStatus>();
> listStatus(results, f, filter);
> return results.toArray(new FileStatus[results.size()]);
>   }
> {code}
> We can be smarter about this:
> # Use enhanced for-loops
> # Optimize for the case where there are zero files in a directory, save on 
> object instantiation
> # More encapsulated design






[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2017-06-02 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034306#comment-16034306
 ] 

Akira Ajisaka commented on HADOOP-14163:


Cool. Now that Apache Hadoop 3.0.0-alpha3 is released, would you update the 
document as well? Once you do, I'll push this to the asf-site branch.

> Refactor existing hadoop site to use more usable static website generator
> -
>
> Key: HADOOP-14163
> URL: https://issues.apache.org/jira/browse/HADOOP-14163
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: site
>Reporter: Elek, Marton
>Assignee: Elek, Marton
> Attachments: HADOOP-14163-001.zip, HADOOP-14163-002.zip, 
> HADOOP-14163-003.zip, hadoop-site.tar.gz, hadop-site-rendered.tar.gz
>
>
> From the dev mailing list:
> "Publishing can be attacked via a mix of scripting and revamping the darned 
> website. Forrest is pretty bad compared to the newer static site generators 
> out there (e.g. need to write XML instead of markdown, it's hard to review a 
> staging site because of all the absolute links, hard to customize, did I 
> mention XML?), and the look and feel of the site is from the 00s. We don't 
> actually have that much site content, so it should be possible to migrate to 
> a new system."
> This issue is to find a solution for migrating the old site to a new modern 
> static site generator using a more contemporary theme.
> Goals: 
>  * existing links should work (or at least be redirected)
>  * It should be easy to add the content required by a release automatically 
> (most probably by creating separate markdown files)






[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14478:
--
Description: 
Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
the data length requested. This is beneficial for sequential reads. However, 
for positional reads (seek to a specific location, read some bytes, seek back 
to the original location) it may not be, and might even download a lot more 
data than is ever used.

It would be good to override {{readFully(long position, byte[] buffer, int 
offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
{{mark(readLimit)}} as a hint to Azure's BlobInputStream.

BlobInputStream reference: 
https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448

BlobInputStream can consider this as a hint later to determine the amount of 
data to be read ahead. Changes to BlobInputStream would not be addressed in 
this JIRA.




  was:
Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
the data length requested. This is beneficial for sequential reads. However, 
for positional reads (seek to a specific location, read some bytes, seek back 
to the original location) it may not be, and might even download a lot more 
data than is ever used.

It would be good to override {{readFully(long position, byte[] buffer, int 
offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
{{mark(readLimit)}} as a hint to Azure's BlobInputStream.

BlobInputStream reference: 
https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448

BlobInputStream can consider this as a hint later to determine the amount of 
data to be read ahead. Changes to BlobInputStream would not be a part of this 
JIRA.





> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read some bytes, seek back 
> to the original location) it may not be, and might even download a lot more 
> data than is ever used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's BlobInputStream.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> BlobInputStream can consider this as a hint later to determine the amount of 
> data to be read ahead. Changes to BlobInputStream would not be addressed in 
> this JIRA.






[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14478:
--
Attachment: HADOOP-14478.001.patch

Attaching .1 patch for review. This includes changes related to HADOOP-14473 as 
well.

> Optimize NativeAzureFsInputStream for positional reads
> --
>
> Key: HADOOP-14478
> URL: https://issues.apache.org/jira/browse/HADOOP-14478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Rajesh Balamohan
> Attachments: HADOOP-14478.001.patch
>
>
> Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
> the data length requested. This is beneficial for sequential reads. However, 
> for positional reads (seek to a specific location, read some bytes, seek back 
> to the original location) it may not be, and might even download a lot more 
> data than is ever used.
> It would be good to override {{readFully(long position, byte[] buffer, int 
> offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
> {{mark(readLimit)}} as a hint to Azure's BlobInputStream.
> BlobInputStream reference: 
> https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448
> BlobInputStream can consider this as a hint later to determine the amount of 
> data to be read ahead. Changes to BlobInputStream would not be a part of this 
> JIRA.






[jira] [Created] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads

2017-06-02 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HADOOP-14478:
-

 Summary: Optimize NativeAzureFsInputStream for positional reads
 Key: HADOOP-14478
 URL: https://issues.apache.org/jira/browse/HADOOP-14478
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/azure
Reporter: Rajesh Balamohan


Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of 
the data length requested. This is beneficial for sequential reads. However, 
for positional reads (seek to a specific location, read some bytes, seek back 
to the original location) it may not be, and might even download a lot more 
data than is ever used.

It would be good to override {{readFully(long position, byte[] buffer, int 
offset, int length)}} for {{NativeAzureFsInputStream}} and make use of 
{{mark(readLimit)}} as a hint to Azure's BlobInputStream.

BlobInputStream reference: 
https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448

BlobInputStream can consider this as a hint later to determine the amount of 
data to be read ahead. Changes to BlobInputStream would not be a part of this 
JIRA.
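
The override being proposed might look roughly like this (a sketch only, not a patch; {{in}} is the wrapped BlobInputStream, and {{getPos()}}/{{seek()}} are the stream's existing position methods):

{code:java}
@Override
public void readFully(long position, byte[] buffer, int offset, int length)
    throws IOException {
  synchronized (this) {
    long oldPos = getPos();
    try {
      seek(position);
      in.mark(length); // hint: only 'length' bytes are needed from here
      int nread = 0;
      while (nread < length) {
        int n = read(buffer, offset + nread, length - nread);
        if (n <= 0) {
          throw new EOFException("End of file reached before reading fully.");
        }
        nread += n;
      }
    } finally {
      // Restore the old position so sequential reads continue unaffected.
      seek(oldPos);
    }
  }
}
{code}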





