[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-12 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743497#comment-15743497
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Thanks [~steve_l].  Those tests that make assertions about internal operation 
counts have been useful.  I have disabled many of them with a check against 
{{S3AFileSystem#isMetadataStoreConfigured()}} or 
{{S3ATestUtils#isMetadataStoreAuthoritative()}}.  They are a bit brittle to 
change but do end up catching issues, so I think they are ultimately useful.
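
A minimal sketch of that kind of guard, assuming a hypothetical helper class 
{{OperationCountGuard}} (not Hadoop code): the operation-count assertion only 
runs when no metadata store is configured, since a metadata store changes how 
many calls reach S3. The boolean argument stands in for the result of 
{{S3AFileSystem#isMetadataStoreConfigured()}}.

```java
import java.util.function.IntSupplier;

/** Hypothetical stand-in for the test plumbing; not part of Hadoop. */
final class OperationCountGuard {

    /** Possible outcomes of a guarded operation-count assertion. */
    enum Outcome { SKIPPED, PASSED, FAILED }

    private OperationCountGuard() {
    }

    /**
     * Checks an operation count only when no metadata store is configured.
     *
     * @param metadataStoreConfigured assumed to mirror
     *        S3AFileSystem#isMetadataStoreConfigured()
     * @param expected the operation count expected against raw S3
     * @param actual supplies the measured operation count
     */
    static Outcome checkOperationCount(boolean metadataStoreConfigured,
                                       int expected,
                                       IntSupplier actual) {
        if (metadataStoreConfigured) {
            // Counts differ once a metadata store answers some of the calls,
            // so the assertion is skipped rather than failed.
            return Outcome.SKIPPED;
        }
        return actual.getAsInt() == expected ? Outcome.PASSED : Outcome.FAILED;
    }
}
```

In a real JUnit test the same effect would more likely come from an assumption 
at the top of the test method; the enum here only makes the three outcomes 
visible.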

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch, 
> HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch, 
> HADOOP-13449-HADOOP-13345.011.patch, HADOOP-13449-HADOOP-13345.012.patch, 
> HADOOP-13449-HADOOP-13345.013.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15741936#comment-15741936
 ] 

Steve Loughran commented on HADOOP-13449:
-

I've commented on HADOOP-13886; I'm not sure that this is a bug in the dynamo 
work, more a surfacing of the fact that those internal metrics, while they 
helped test what was going on, turn out to be brittle to change. That is what 
[~aw] warned me about, so he can be happy that the first people to experience 
it are ourselves.

We can cut the test.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-09 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736924#comment-15736924
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Thanks a lot for the great work here, [~liuml07] and [~fabbri]. It is great 
that I can integrate this patch into HADOOP-13650 now. I will start to work on 
it today and keep you guys updated.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-09 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736711#comment-15736711
 ] 

Mingliang Liu commented on HADOOP-13449:


Yes, I just edited the comment above. Lei helped us a lot in reviewing patches. 
I think we can get [HADOOP-13650] in soon after review.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-09 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736704#comment-15736704
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

You are welcome!  Thanks also to [~eddyxu] for his help getting to this point.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-09 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736694#comment-15736694
 ] 

Mingliang Liu commented on HADOOP-13449:


Thank you [~fabbri] for your great help, discussion, review, testing, and bug 
fixing! Thanks [~ste...@apache.org] and [~cnauroth] for the discussion and 
initial patch. Let's move on to other tasks and get this feature branch merged 
to trunk early.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-09 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736682#comment-15736682
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

I committed this to the HADOOP-13345 feature branch.  Thank you for all your 
hard work on this [~liuml07].

I ran all integration tests against s3-us-west-2.amazonaws.com endpoint.

Failures were as expected:
  ITestS3AFileOperationCost.testFakeDirectoryDeletion:254->Assert.assertEquals:555
  ITestJets3tNativeS3FileSystemContract>NativeS3FileSystemContractBaseTest.testListStatusForRoot:66 Root directory is not empty;  expected:<0> but was:<3>
  ITestS3AAWSCredentialsProvider.testAnonymousProvider:133 » AWSServiceIO initia...
  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO initializin...
  ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:669->FileSystemContractBaseTest.rename:525 » AWSServiceIO

Which are covered by HADOOP-13876 and HADOOP-13886.





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734222#comment-15734222
 ] 

Hadoop QA commented on HADOOP-13449:


| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 2m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 55s | HADOOP-13345 passed |
| +1 | compile | 9m 37s | HADOOP-13345 passed |
| +1 | checkstyle | 1m 32s | HADOOP-13345 passed |
| +1 | mvnsite | 1m 47s | HADOOP-13345 passed |
| +1 | mvneclipse | 0m 55s | HADOOP-13345 passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project |
| +1 | findbugs | 1m 57s | HADOOP-13345 passed |
| +1 | javadoc | 1m 28s | HADOOP-13345 passed |
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 10s | the patch passed |
| +1 | compile | 9m 12s | the patch passed |
| +1 | javac | 9m 12s | the patch passed |
| -0 | checkstyle | 1m 36s | root: The patch generated 1 new + 8 unchanged - 0 fixed = 9 total (was 8) |
| +1 | mvnsite | 1m 57s | the patch passed |
| +1 | mvneclipse | 1m 4s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 4s | The patch has no ill-formed XML file. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project |
| +1 | findbugs | 2m 20s | the patch passed |
| +1 | javadoc | 1m 37s | the patch passed |
| +1 | unit | 0m 20s | hadoop-project in the patch passed. |
| +1 | unit | 8m 29s | hadoop-common in the patch passed. |
| +1 | unit | 0m 42s | hadoop-aws in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 80m 4s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12842477/HADOOP-13449-HADOOP-13345.013.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle |
| uname | Linux d255e124893c 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-08 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734138#comment-15734138
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Sounds good.  Thanks for the analysis of that failure. I will do some review + 
testing with the v13 patch tonight and make sure we have updated JIRAs for any 
issues, including the createFakeDirectoryIfNecessary() thing.  I can commit the 
v13 patch in the morning if everyone is in favor and I don't find any new 
issues.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-08 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733756#comment-15733756
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

I'm in the process of testing the v12 patch.  [~liuml07] and [~steve_l] I'd 
like to propose we get this patch in so we can split up the remaining work on 
DynamoDB.  I'm thinking the steps are:

1. I will finish testing and do a quick review on the v12 patch here.
2. I will open JIRAs for outstanding issues related to DynamoDB.
3. If you guys are +1 on this, I will commit the v12 (or latest) patch.

One possible concern is if we wanted to try and merge the HADOOP-13345 branch 
to trunk before DynamoDB support is finished (we'd talked about that to deal 
with code churn and allow things like working on parallel rename without 
needing to redo all the s3guard rename code, etc).  If we still want to 
attempt this, let me know.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727577#comment-15727577
 ] 

Hadoop QA commented on HADOOP-13449:


| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
| 0 | mvndep | 1m 47s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 54s | HADOOP-13345 passed |
| +1 | compile | 10m 47s | HADOOP-13345 passed |
| +1 | checkstyle | 1m 42s | HADOOP-13345 passed |
| +1 | mvnsite | 1m 49s | HADOOP-13345 passed |
| +1 | mvneclipse | 0m 56s | HADOOP-13345 passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project |
| +1 | findbugs | 2m 11s | HADOOP-13345 passed |
| +1 | javadoc | 1m 33s | HADOOP-13345 passed |
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 20s | the patch passed |
| +1 | compile | 9m 54s | the patch passed |
| +1 | javac | 9m 54s | the patch passed |
| -0 | checkstyle | 1m 41s | root: The patch generated 1 new + 8 unchanged - 0 fixed = 9 total (was 8) |
| +1 | mvnsite | 2m 10s | the patch passed |
| +1 | mvneclipse | 1m 4s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 4s | The patch has no ill-formed XML file. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project |
| +1 | findbugs | 2m 21s | the patch passed |
| +1 | javadoc | 1m 36s | the patch passed |
| +1 | unit | 0m 20s | hadoop-project in the patch passed. |
| +1 | unit | 8m 45s | hadoop-common in the patch passed. |
| +1 | unit | 0m 42s | hadoop-aws in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 82m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12842071/HADOOP-13449-HADOOP-13345.012.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle |
| uname | Linux 5ea041425062 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727382#comment-15727382
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

If it is the {{assertEmptyDirs()}} call at MetadataStoreTestBase line 154, shown here:

{code}
ms.put(new PathMetadata(makeFileStatus("/da1/db1/fc1", 100)));

assertEmptyDirs("/da1", "/da2", "/da3");
assertDirectorySize("/da1/db1", 1);
{code}

I think we can change that to be

{code}
assertEmptyDirs("/da2", "/da3");
{code}

Why?  Because it is not unreasonable to allow a MetadataStore to infer the 
existence of /da1/db1 from a call to put(/da1/db1/fc1), in which case /da1 is 
no longer empty.

In fact, as we see here, it can be a helpful implementation technique when we 
don't have a cheap way to prefix scan everything in the MetadataStore.
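
To illustrate the idea, here is a minimal sketch, assuming a hypothetical 
in-memory class {{AncestorInferringStore}} (it is not the MetadataStore 
interface or the DynamoDB implementation). A put of a file records every 
ancestor directory at write time, so a later emptiness check on /da1 sees 
/da1/db1 without anyone having put it explicitly. The child listing below is a 
plain scan for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory store; illustrates ancestor inference only. */
final class AncestorInferringStore {

    /** Maps an absolute path to true for a directory, false for a file. */
    private final Map<String, Boolean> entries = new TreeMap<>();

    /** Records a file and infers that all of its ancestors are directories. */
    void putFile(String path) {
        entries.put(path, false);
        String current = path;
        int slash = current.lastIndexOf('/');
        // Stop before the root: index 0 is the leading slash.
        while (slash > 0) {
            current = current.substring(0, slash);
            entries.putIfAbsent(current, true);
            slash = current.lastIndexOf('/');
        }
    }

    /** Records a directory explicitly. */
    void putDirectory(String path) {
        entries.put(path, true);
    }

    boolean exists(String path) {
        return entries.containsKey(path);
    }

    /** Returns the direct children of a directory. */
    List<String> listChildren(String dir) {
        String prefix = dir + "/";
        List<String> children = new ArrayList<>();
        for (String candidate : entries.keySet()) {
            boolean under = candidate.startsWith(prefix);
            // A direct child has no further slash after the prefix.
            if (under && candidate.indexOf('/', prefix.length()) < 0) {
                children.add(candidate);
            }
        }
        return children;
    }

    boolean isEmptyDirectory(String dir) {
        return listChildren(dir).isEmpty();
    }
}
```

With /da1, /da2 and /da3 put as directories followed by 
{{putFile("/da1/db1/fc1")}}, this sketch reports /da2 and /da3 as empty, /da1 
as non-empty, and /da1/db1 as having one child.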






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727385#comment-15727385
 ] 

Mingliang Liu commented on HADOOP-13449:


Sorry, I marked it as {{@Ignore}} in the patch...
{code:title=TestDynamoDBMetadataStore.java}
  // TODO: update the test base class instead of ignoring this one
  @Ignore
  @Override
  public void testPutNew() throws Exception {
  }
{code}




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727368#comment-15727368
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

What line does testPutNew() fail on for you?  {{TestDynamoDBMetadataStore}} is 
passing for me.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727354#comment-15727354
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks for your insightful comments, [~fabbri]. Your summary looks very good to 
me. Filing follow-up JIRAs looks like a good step, as most of them can be 
addressed separately, that is, in parallel. 

# For point 1, I'll change this in the new patch. I think by induction it is 
good enough to include this optimization, which should be a minor change.
# For the example of pre-existing items, I was thinking the import phase may 
have already loaded the S3 directory tree to the DDB table, e.g. by command 
line tool. For better implementing {{isEmpty}}, I think working with 
S3AFileStatus is a good idea. Steve also suggested we should query DDB lazily 
only when we need the isEmpty information. Obviously we can't enable this 
within DDBMetadataStore itself.
# The integration tests you shared are very helpful. I'll have a look at the 
{{ITestS3AFileOperationCost#testFakeDirectoryDeletion}} first. Will get back to 
this the day after tomorrow.
# For the invalid creds in URI, the DDBClientFactory itself uses the same 
{{createAWSCredentialProviderSet}} as S3ClientFactory does so it should honor 
the creds in URI name. But after FS#initialization, S3AFS has stripped the 
creds and returns the scheme://host only URI to create a MetadataStore. We can 
refuse to support this case as it's very unsafe and deprecated. One approach 
though is to pass the {{name}} URI which contains the creds to 
{{S3Guard#getMetadataStore()}}.
{code:title=S3AFileSystem#initialize()}
-  metadataStore = S3Guard.getMetadataStore(this);
+  metadataStore = S3Guard.getMetadataStore(this, name);
{code}
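
To make the credential-stripping step concrete, here is a minimal sketch using only {{java.net.URI}}. The class name, the helper {{stripCredentials}}, and the example bucket and keys are hypothetical and not taken from the patch; the sketch only shows why a store created from the stripped URI can no longer see credentials that were embedded in the original {{name}} URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustration only: shows what is lost when a URI is reduced to scheme://host. */
final class UriCredentialsSketch {

  /** Hypothetical helper: drops user-info and path, keeping only scheme://host. */
  static URI stripCredentials(URI name) {
    try {
      return new URI(name.getScheme(), name.getHost(), null, null);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("cannot strip " + name, e);
    }
  }

  public static void main(String[] args) {
    // Made-up keys and bucket; embedding secrets in URIs is unsafe and deprecated.
    URI name = URI.create("s3a://ACCESSKEY:SECRETKEY@example-bucket/dir/file");
    URI stripped = stripCredentials(name);
    System.out.println("user-info in name:     " + name.getUserInfo());
    System.out.println("stripped URI:          " + stripped);
    System.out.println("user-info in stripped: " + stripped.getUserInfo());
  }
}
```

Anything that is handed only the stripped URI has to obtain credentials some other way, which is why passing {{name}} through is one of the options mentioned above.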

By the way, do you have suggestions about the failing {{testPutNew}} unit test? 
My current idea is to override this test method for DDB and local so they can 
have different behaviors; alternatively, we can tighten the contract of the 
metadata store's put API so that both DDB and local guarantee that, for a newly 
put item, all its ancestors exist.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727226#comment-15727226
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

This looks pretty good [~liuml07], thank you.  A couple of comments:

1. For put(), in your loop which walks the path to the root, should we 
break as soon as we find a non-null ancestor?  I see you have a comment 
suggesting that.

This should be correct, by induction, since the invariant for the DDB 
MetadataStore (MS)  is:

- For any path P in MS, all ancestors p_i are in MS.

Thus, when we encounter an ancestor p_i on the way to the root, we know that 
all its ancestors must already exist.

2. The isEmpty bit on directory FileStatus's.  

{code}
 @Override
  public PathMetadata get(Path path) throws IOException {
  ... (snip) ...
// for directory, we query its direct children to determine isEmpty bit
{code}

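I think this misses the case where we have not seen all the children yet.  For 
example, we have an existing bucket with some directory {{/a/b/}} and two 
existing files {{/a/b/file1}}, {{/a/b/file2}}.  We create a new MetadataStore 
and put(/a/b/), then get(/a/b).  The store has no children recorded for 
{{/a/b}}, so the query would report the directory as empty even though S3 
holds two files under it.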
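
The early exit asked about in point 1 can be sketched as follows. This is a minimal in-memory model, not the code from the patch: a {{HashMap}} stands in for the DynamoDB table, paths are assumed to be normalized absolute strings, the root is stored like any other directory, and the class name and {{lookups}} counter are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a metadata store; not the real DynamoDBMetadataStore. */
final class AncestorPutSketch {
  // path -> isDirectory; a HashMap stands in for the DynamoDB table
  private final Map<String, Boolean> table = new HashMap<>();
  // number of ancestor existence checks issued so far
  int lookups = 0;

  /** Parent of a normalized absolute path, or null for the root "/". */
  static String parent(String path) {
    if (path.equals("/")) {
      return null;
    }
    int slash = path.lastIndexOf('/');
    return slash == 0 ? "/" : path.substring(0, slash);
  }

  boolean contains(String path) {
    return table.containsKey(path);
  }

  /** Stores the entry, then adds missing ancestors, stopping at the first one found. */
  void put(String path, boolean isDirectory) {
    table.put(path, isDirectory);
    for (String p = parent(path); p != null; p = parent(p)) {
      lookups++;
      if (table.containsKey(p)) {
        break; // invariant: every ancestor of p is already present
      }
      table.put(p, true);
    }
  }

  public static void main(String[] args) {
    AncestorPutSketch ms = new AncestorPutSketch();
    ms.put("/a/b/c/file1", false);
    System.out.println("lookups after first put:  " + ms.lookups);
    ms.put("/a/b/c/file2", false);
    System.out.println("lookups after second put: " + ms.lookups);
  }
}
```

The second put touches only the immediate parent, because finding {{/a/b/c}} is enough to conclude that everything above it is already recorded.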

That said, we should consider merging this patch and creating follow-up JIRAs 
since it is good to split up the work and keep the code integrated.

Also, we should consider reworking the isEmptyDirectory logic around 
S3AFileStatus as we discussed before.  Meanwhile, I'd be happy to implement an 
algorithm similar to the one I used for LocalMetadataStore if that would be 
helpful.

Testing I did on this patch:
mvn clean verify -Dit.test="ITestS3A*"
Failed tests:
  ITestS3ACredentialsInURL.testInvalidCredentialsFail:130->Assert.fail:88 Expected an AccessDeniedException, got S3AFileStatus{path=s3a://fabbri-dev/; isDirectory=true; modification_time=0; access_time=0; owner=fabbri; group=fabbri; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=false
  ITestS3AFileOperationCost.testFakeDirectoryDeletion:254->Assert.assertEquals:555->Assert.assertEquals:118->Assert.failNotEquals:743->Assert.fail:88 after rename(srcFilePath, destFilePath): directories_created expected:<1> but was:<0>

Tests in error:
  ITestS3AAWSCredentialsProvider.testAnonymousProvider:133 » AWSServiceIO initia...
  ITestS3AAWSCredentialsProvider.testBadCredentials:102->createFailingFS:76 » AWSServiceIO
  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » AWSClientIO initializing ...
  ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:656->FileSystemContractBaseTest.rename:512 » AWSServiceIO

Tests run: 321, Failures: 2, Errors: 4, Skipped: 42

To summarize:
- isEmptyDirectory logic for DynamoDBMetadataStore (needs JIRA--unless I'm 
missing something)
- Anonymous credentials issue (needs JIRA)
- Issue with invalid creds in URI.  Creds in URI in general may not be honored 
by DDB MetadataStore? (needs JIRA & investigation)
- Stats not showing directory creation from rename():   
ITestS3AFileOperationCost.testFakeDirectoryDeletion (needs investigation)





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724940#comment-15724940
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 25s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 33s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 43s{color} | {color:orange} root: The patch generated 1 new + 8 unchanged - 0 fixed = 9 total (was 8) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 26s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 17s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 24s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12841900/HADOOP-13449-HADOOP-13345.011.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle |
| uname | Linux ea3d4d4c2f23 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717170#comment-15717170
 ] 

Mingliang Liu commented on HADOOP-13449:


Yes, I did consider putting all the ancestors into the metadata store when 
putting a single path. Another benefit is that {{isEmpty}} becomes much easier: 
simply issue a query request (limit return size 1) whose hash key ("parent" 
field) is the specific directory; if any data is returned, the directory is 
non-empty, otherwise it is empty. Then the case where the table holds 
{{/a, /a/b/c, /a/b/d}} but not {{/a/b}}, so that {{/a}} looks empty when it is 
not, cannot arise. Plus we don't have to store/maintain the {{isEmpty}} 
field any longer.
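
A minimal sketch of the limit-1 emptiness check, under stated assumptions: the table is modelled in memory as a map from the "parent" hash key to child names, and {{putItem}}, {{query}} and {{isEmptyDirectory}} are hypothetical helpers rather than DynamoDB SDK calls. It also shows the false "empty" answer when an intermediate directory was never stored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** In-memory model of a table whose hash key is the parent path. */
final class IsEmptyQuerySketch {
  // hash key "parent" -> sorted child names
  private final Map<String, TreeSet<String>> byParent = new HashMap<>();

  void putItem(String parent, String child) {
    byParent.computeIfAbsent(parent, k -> new TreeSet<>()).add(child);
  }

  /** Models a query on the hash key that returns at most 'limit' items. */
  List<String> query(String parent, int limit) {
    TreeSet<String> children = byParent.get(parent);
    if (children == null) {
      return Collections.emptyList();
    }
    List<String> result = new ArrayList<>();
    for (String child : children) {
      if (result.size() == limit) {
        break;
      }
      result.add(child);
    }
    return result;
  }

  /** A directory is empty exactly when a limit-1 query on it returns nothing. */
  boolean isEmptyDirectory(String dir) {
    return query(dir, 1).isEmpty();
  }

  public static void main(String[] args) {
    IsEmptyQuerySketch complete = new IsEmptyQuerySketch();
    complete.putItem("/", "a");
    complete.putItem("/a", "b");
    complete.putItem("/a/b", "c");
    complete.putItem("/a/b", "d");
    System.out.println("all ancestors stored, /a empty? " + complete.isEmptyDirectory("/a"));

    IsEmptyQuerySketch gap = new IsEmptyQuerySketch();
    gap.putItem("/", "a");
    gap.putItem("/a/b", "c"); // /a/b itself was never stored under /a
    gap.putItem("/a/b", "d");
    System.out.println("/a/b missing, /a empty?         " + gap.isEmptyDirectory("/a"));
  }
}
```

The second case answers "empty" for {{/a}} although files exist below it, which is why the check is only sound when every ancestor of a stored path is stored as well.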

I gave up this constraint when implementing DDB and let the file system 
enforce it, for the sake of performance. Consider a simple case: to 
{{put(PathMetadata meta)}} 1K files in a deep directory (say 10 layers), every 
put operation will check whether all the ancestors exist, so 1K operations 
become 10K operations against DDB. {{put(DirListingMetadata meta)}}, by 
contrast, will be efficient, so we can blame users for not using that one instead.
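
The arithmetic can be checked with a small simulation. This is a sketch under assumptions: a {{HashSet}} stands in for the table, only ancestor existence checks are counted (not the put itself), the root is not checked, and the {{stopAtFirstHit}} flag models the early exit suggested elsewhere in this thread. The class and method names are made up.

```java
import java.util.HashSet;
import java.util.Set;

/** Counts ancestor existence checks for a batch of single-path puts. */
final class PutCostSketch {

  /**
   * Puts 'files' files into one directory 'depth' levels below the root and
   * returns how many ancestor checks were issued in total.
   */
  static long countChecks(int files, int depth, boolean stopAtFirstHit) {
    Set<String> table = new HashSet<>();
    StringBuilder dir = new StringBuilder();
    for (int i = 1; i <= depth; i++) {
      dir.append("/d").append(i);
    }
    long checks = 0;
    for (int f = 0; f < files; f++) {
      String path = dir + "/file" + f;
      table.add(path);
      // walk from the parent directory up to, but not including, the root
      for (int slash = path.lastIndexOf('/'); slash > 0;
           slash = path.lastIndexOf('/', slash - 1)) {
        String ancestor = path.substring(0, slash);
        checks++;
        boolean alreadyPresent = !table.add(ancestor);
        if (alreadyPresent && stopAtFirstHit) {
          break;
        }
      }
    }
    return checks;
  }

  public static void main(String[] args) {
    System.out.println("check every ancestor: " + countChecks(1000, 10, false));
    System.out.println("stop at first hit:    " + countChecks(1000, 10, true));
  }
}
```

With 1,000 files at depth 10, checking every ancestor costs 10,000 checks; stopping at the first ancestor already present costs 10 for the first file and 1 for each of the remaining 999.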

So overall, it is possible to avoid changing MetadataStore; we can make this 
change in the {{DynamoDBMetadataStore}} implementation. I'll post a patch 
(maybe a WIP one) soon.

So we did find real bugs/problems/limitations via integration tests, and 
they're helpful. Thanks,




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717021#comment-15717021
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

I did a little research on #3. It looks like you cannot do a prefix scan on a 
partition key for DynamoDB.  This seems to imply that, considering an operation 
{{deleteSubtree(delete_path)}}, a simple search by prefix to find all entries 
with paths that begin with {{delete_path}} would actually be a full table scan. 
 If I'm right, that is unfortunate.

The problem with the existing deleteSubtree(delete_path) implementation is that 
all the children under delete_path might not be reachable from delete_path by 
doing a simple tree walk over the state in the MetadataStore.  The algorithm 
would work, however, if, when we created a file, we also created all its 
ancestor directories up to the root.  This would establish an invariant that

{quote}
For any path p in DDB MetadataStore
For each ancestor a_i from p to the root
a_i is in DDB MetadataStore
{quote}

This actually sounds reasonable.  Can we do it without changing the 
{{MetadataStore}} interface?  I think we can: when we create(path), we always 
have the full absolute 'path', so we know the names of the ancestors all the 
way to the root.
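
A minimal sketch of why the invariant makes a tree-walk {{deleteSubtree}} sufficient, assuming an in-memory map from parent path to child paths as a stand-in for the table. {{create}}, {{putOnly}} and the class name are hypothetical; {{create}} derives every ancestor from the absolute path alone, so no interface change is needed in this model.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory model: an entry for path X is stored under its parent's key. */
final class DeleteSubtreeSketch {
  private final Map<String, Set<String>> children = new HashMap<>();

  /** Parent of a normalized absolute path, or null for the root "/". */
  static String parent(String path) {
    if (path.equals("/")) {
      return null;
    }
    int slash = path.lastIndexOf('/');
    return slash == 0 ? "/" : path.substring(0, slash);
  }

  /** Records the path and every ancestor up to the root (the invariant). */
  void create(String path) {
    for (String p = parent(path); p != null; path = p, p = parent(p)) {
      children.computeIfAbsent(p, k -> new TreeSet<>()).add(path);
    }
  }

  /** Records only the path itself, with no ancestor bookkeeping. */
  void putOnly(String path) {
    children.computeIfAbsent(parent(path), k -> new TreeSet<>()).add(path);
  }

  boolean contains(String path) {
    Set<String> siblings = children.get(parent(path));
    return siblings != null && siblings.contains(path);
  }

  /** Walks the recorded tree below 'path' and returns the number of entries removed. */
  int deleteSubtree(String path) {
    int deleted = 0;
    Set<String> siblings = children.get(parent(path));
    if (siblings != null && siblings.remove(path)) {
      deleted++;
    }
    Deque<String> pending = new ArrayDeque<>();
    pending.push(path);
    while (!pending.isEmpty()) {
      Set<String> kids = children.remove(pending.pop());
      if (kids == null) {
        continue;
      }
      deleted += kids.size();
      for (String kid : kids) {
        pending.push(kid);
      }
    }
    return deleted;
  }

  public static void main(String[] args) {
    DeleteSubtreeSketch withInvariant = new DeleteSubtreeSketch();
    withInvariant.create("/a/b/c");
    withInvariant.create("/a/b/d");
    System.out.println("deleted with invariant:    " + withInvariant.deleteSubtree("/a"));

    DeleteSubtreeSketch withoutInvariant = new DeleteSubtreeSketch();
    withoutInvariant.putOnly("/a");
    withoutInvariant.putOnly("/a/b/c"); // /a/b is never recorded
    System.out.println("deleted without invariant: " + withoutInvariant.deleteSubtree("/a"));
    System.out.println("orphan /a/b/c remains:     " + withoutInvariant.contains("/a/b/c"));
  }
}
```

In the second store the walk stops at {{/a}} because nothing links it to {{/a/b/c}}, which is the unreachable-children problem described above.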

Thoughts?




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15716725#comment-15716725
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks [~fabbri]. Quick reply (I'm working on this as well, will keep posted):

# For point 1, let's track elsewhere.
# For point 2, the explanation makes sense. My current in-progress change is to 
remove the "isEmpty" field from the DynamoDB (DDB) table for directories, and to 
issue a DDB query request whose "parent" field is the current directory. Then I 
realized that there may be items in the table whose ancestor (parent of 
parent, say) is the given directory, but whose parent directories are missing, 
e.g. for {{/a, /a/b/c, /a/b/d}}, {{/a}} is not empty. This is a similar 
problem to point 3. A simple query does not seem to be enough.
# For point 3, yes, we may have to use a scan, as the hash key is not known. 
Let's figure out the best solution.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15716696#comment-15716696
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

I think this is the list of outstanding items to get integration tests passing.

1. Dealing with anonymous / reduced privilege bucket credentials ([~steve_l]'s 
previous comment).  We should discuss separately... maybe separate JIRA?  I 
have some other related requirements around table <-> bucket mappings.
2. Updating {{S3AFileStatus#isEmptyDirectory()}}.  move(), put(), delete(), and 
deleteSubtree() will need to maintain the parent dir's empty bit and/or 
invalidate its state.  I think the basic logic used in LocalMetadataStore 
should work fine for now.
3. deleteSubtree(path) assumes that any deleted subtree is fully recorded in 
the MetadataStore.  Best solution, IMO, is to query for all entries that have 
'path' as an ancestor.  Hoping we can use a prefix scan to keep that efficient. 
 [~liuml07], I would love to hear your DynamoDB expertise on that idea.

I'm working on #2 at the moment.  I wrote a new integration test 
{{ITestS3AEmptyDirectory}} that exercises a directory going from 
empty->non-empty and vice-versa. Much easier to debug that case in isolation!  
It passes for LocalMetadataStore but, as expected, fails for DDB still.







[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-01 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713241#comment-15713241
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks for the tip on disabling s3n integration tests. I find the command {{mvn 
-Dparallel-tests -DtestsThreadCount=8 -Dit.test='ITestS3A*' -Dtest=none clean 
verify}} is also helpful.

I'll review and/or commit [HADOOP-13793] today. You're right that we have to 
ensure the DDB metadatastore is disabled for unit tests even if it's configured. 
For the {{DynamoDBClientFactory}}, it sounds good to mock both the S3 client and 
the DDB client, except in {{TestDynamoDBMetadataStore}}, which will create one 
itself against DynamoDBLocal.

I'm working on fixing other failing tests. Per offline discussion with Steve, 
he suggested that for now we ignore the failing anonymous auth tests for this 
patch. We do have to support that though.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-01 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713155#comment-15713155
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

That use case for Mock S3 Client + Real DDB client (for DynamoDBLocal) makes 
sense.  We also need to be able to ensure that DDB metadatastore is disabled 
for unit tests, even if it is configured in the Hadoop configuration.  That 
could be solved as part of HADOOP-13589.

In my working tree I have a patch on top of your v10 here that separates out 
the DynamoDB Client Factory into a separate class {{DynamoDBClientFactory}}.  
That would allow us to use a Mock S3 client without a real DDB client.  It is 
an easy change but depends on (or conflicts with) my outstanding patch for 
HADOOP-13793 (which we should get in soon).

As for disabling s3n integration tests, you should be able to add a couple of 
lines to your pom to exclude those. Google the Maven Failsafe options for 
details.  I personally run just integration tests like so:

{{mvn clean test-compile failsafe:integration-test}}

and then find one failure that I want to debug, and run that one alone like 
this:

{{mvn clean test-compile failsafe:integration-test 
-Dit.test=ITestS3AFileSystemContract}}





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-12-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711793#comment-15711793
 ] 

Steve Loughran commented on HADOOP-13449:
-

Some of these tests are failing because DynamoDB is trying to initialize, even 
when anonymously authenticating with an external read-only store.

We need to consider how to support deployments where some object stores are 
read-only with no DynamoDB, and perhaps fall back to no DB if auth fails. There 
is also the situation where one store has an authoritative DB and another has 
none... it'll have to be on a per-object-store basis.

{code}

testAnonymousProvider(org.apache.hadoop.fs.s3a.ITestS3AAWSCredentialsProvider)  Time elapsed: 0.91 sec  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSServiceIOException: initializing  on s3a://landsat-pds/scene_list.gz: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Request is missing Authentication Token (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: MissingAuthenticationTokenException; Request ID: NS80UK0G6OKHI6IR7KCIV1VRONVV4KQNSO5AEMVJF66Q9ASUAAJG): Request is missing Authentication Token (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: MissingAuthenticationTokenException; Request ID: NS80UK0G6OKHI6IR7KCIV1VRONVV4KQNSO5AEMVJF66Q9ASUAAJG)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1529)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1167)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:948)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:661)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:635)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:618)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:586)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:573)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:445)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:1722)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:1698)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.createTable(AmazonDynamoDBClient.java:743)
at com.amazonaws.services.dynamodbv2.document.DynamoDB.createTable(DynamoDB.java:96)
at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.createTable(DynamoDBMetadataStore.java:413)
at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:187)
at org.apache.hadoop.fs.s3a.s3guard.S3Guard.getMetadataStore(S3Guard.java:85)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:252)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3246)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3295)
at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3269)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:529)
at org.apache.hadoop.fs.s3a.ITestS3AAWSCredentialsProvider.testAnonymousProvider(ITestS3AAWSCredenti
{code}
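
One way the "fall back to no DB if auth fails" idea could look is sketched below. Everything here is hypothetical: the trimmed-down {{Store}} interface, {{NoOpStore}}, {{UnauthorizedDynamoStore}} and {{initializeWithFallback}} are invented for illustration and are not S3Guard classes. Whether silently degrading is acceptable is a policy question the sketch does not settle.

```java
import java.io.IOException;

/** Sketch of per-bucket fallback when the preferred metadata store cannot initialize. */
final class MetadataStoreFallbackSketch {

  /** Hypothetical, trimmed-down store interface for this sketch only. */
  interface Store {
    void initialize(String bucket) throws IOException;

    String describe();
  }

  /** Store that records nothing, so every lookup would go to the object store. */
  static final class NoOpStore implements Store {
    @Override
    public void initialize(String bucket) {
      // nothing to set up
    }

    @Override
    public String describe() {
      return "no-op";
    }
  }

  /** Stand-in for a DynamoDB-backed store whose table setup is rejected. */
  static final class UnauthorizedDynamoStore implements Store {
    @Override
    public void initialize(String bucket) throws IOException {
      throw new IOException("Request is missing Authentication Token");
    }

    @Override
    public String describe() {
      return "dynamodb";
    }
  }

  /** Tries the preferred store for this bucket and degrades to a no-op store on failure. */
  static Store initializeWithFallback(Store preferred, String bucket) {
    try {
      preferred.initialize(bucket);
      return preferred;
    } catch (IOException e) {
      System.err.println("No metadata store for " + bucket + ": " + e.getMessage());
      NoOpStore fallback = new NoOpStore();
      fallback.initialize(bucket);
      return fallback;
    }
  }

  public static void main(String[] args) {
    Store store = initializeWithFallback(new UnauthorizedDynamoStore(), "landsat-pds");
    System.out.println("store in use: " + store.describe());
  }
}
```

Because the decision is made per call, each bucket can end up with a different store, which matches the per-object-store requirement raised above.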




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-30 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710947#comment-15710947
 ] 

Mingliang Liu commented on HADOOP-13449:


Sorry for the late reply. Thank you very much, [~fabbri], for running the 
integration tests and analyzing the failures. I can reproduce the unit test 
failure {{TestS3AGetFileStatus#testNotFound}}. I can also reproduce the 
integration failures in the US-standard region. I'll work on them tomorrow. 
Thanks for taking care of {{ITestS3AFileSystemContract}}.
{code}
---
 T E S T S
---

---
 T E S T S
---
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractMkdir
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractSeek
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractRename
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractDelete
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractOpen
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractCreate
Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractDistCp
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 63.946 sec - in 
org.apache.hadoop.fs.contract.s3a.ITestS3AContractMkdir
Running org.apache.hadoop.fs.contract.s3n.ITestS3NContractCreate
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 64.332 sec - in 
org.apache.hadoop.fs.contract.s3a.ITestS3AContractOpen
Tests run: 10, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 0.372 sec - 
in org.apache.hadoop.fs.contract.s3n.ITestS3NContractCreate
Running org.apache.hadoop.fs.contract.s3n.ITestS3NContractDelete
Running org.apache.hadoop.fs.contract.s3n.ITestS3NContractMkdir
Tests run: 8, Failures: 0, Errors: 0, Skipped: 8, Time elapsed: 0.455 sec - in 
org.apache.hadoop.fs.contract.s3n.ITestS3NContractDelete
Tests run: 5, Failures: 0, Errors: 0, Skipped: 5, Time elapsed: 0.375 sec - in 
org.apache.hadoop.fs.contract.s3n.ITestS3NContractMkdir
Running org.apache.hadoop.fs.contract.s3n.ITestS3NContractOpen
Running org.apache.hadoop.fs.contract.s3n.ITestS3NContractRename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 6, Time elapsed: 0.406 sec - in 
org.apache.hadoop.fs.contract.s3n.ITestS3NContractRename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 6, Time elapsed: 0.478 sec - in 
org.apache.hadoop.fs.contract.s3n.ITestS3NContractOpen
Running org.apache.hadoop.fs.contract.s3n.ITestS3NContractSeek
Running org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContext
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.313 sec - in 
org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContext
Tests run: 18, Failures: 0, Errors: 0, Skipped: 18, Time elapsed: 0.655 sec - 
in org.apache.hadoop.fs.contract.s3n.ITestS3NContractSeek
Running org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextCreateMkdir
Running org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 72.987 sec - in 
org.apache.hadoop.fs.contract.s3a.ITestS3AContractRename
Running org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI
Tests run: 10, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 73.829 sec - 
in org.apache.hadoop.fs.contract.s3a.ITestS3AContractCreate
Running org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextUtil
Tests run: 8, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 75.878 sec <<< 
FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractDelete
testDeleteNonEmptyDirNonRecursive(org.apache.hadoop.fs.contract.s3a.ITestS3AContractDelete)
  Time elapsed: 28.759 sec  <<< FAILURE!
java.lang.AssertionError: non recursive delete should have raised an exception, 
but completed with exit code true
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hadoop.fs.contract.AbstractContractDeleteTest.testDeleteNonEmptyDirNonRecursive(AbstractContractDeleteTest.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-30 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710879#comment-15710879
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

FYI so we don't duplicate effort: I'm looking at the ITestS3AFileSystemContract 
failure right now.  It looks like it may be a failure to delete from the DDB 
metadata store.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch, 
> HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-30 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709115#comment-15709115
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Finished a run of the S3A integration tests.  I see that fixing the 
MockS3ClientFactory is not as simple as my last comment suggested, since you 
use it for the DynamoDBMetadataStore unit test.  We can revisit this here or on 
HADOOP-13589.

Here are the integration test failures I see when I configure the 
DynamoDBMetadataStore via core-site.xml:

{code}
Failed tests: 
  
ITestS3AContractDelete>AbstractContractDeleteTest.testDeleteNonEmptyDirNonRecursive:78->Assert.fail:88
 non recursive delete should have raised an exception, but completed with exit 
code true
  
ITestS3AContractDelete>AbstractContractDeleteTest.testDeleteNonEmptyDirRecursive:94->AbstractFSContractTestBase.assertDeleted:349->Assert.fail:88
 Deleted file: unexpectedly found 
s3a://fabbri-dev/test/testDeleteNonEmptyDirNonRecursive as  
S3AFileStatus{path=s3a://fabbri-dev/test/testDeleteNonEmptyDirNonRecursive; 
isDirectory=true; modification_time=0; access_time=0; owner=fabbri; 
group=fabbri; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=false
  ITestS3AConfiguration.testUsernameFromUGI:481 owner in 
S3AFileStatus{path=s3a://fabbri-dev/; isDirectory=true; modification_time=0; 
access_time=0; owner=fabbri; group=fabbri; permission=rwxrwxrwx; 
isSymlink=false} isEmptyDirectory=false expected:<[alice]> but was:<[fabbri]>
  
ITestS3AFileOperationCost.testFakeDirectoryDeletion:254->Assert.assertEquals:555->Assert.assertEquals:118->Assert.failNotEquals:743->Assert.fail:88
 after rename(srcFilePath, destFilePath): directories_created expected:<1> but 
was:<0>
  
ITestS3AFileOperationCost.testCostOfGetFileStatusOnNonEmptyDir:139->Assert.fail:88
 FileStatus says directory isempty: 
S3AFileStatus{path=s3a://fabbri-dev/test/empty; isDirectory=true; 
modification_time=0; access_time=0; owner=fabbri; group=fabbri; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true
ls s3a://fabbri-dev/test/empty [00] 
S3AFileStatus{path=s3a://fabbri-dev/test/empty/simple.txt; isDirectory=false; 
length=0; replication=1; blocksize=33554432; modification_time=1480497225005; 
access_time=0; owner=fabbri; group=fabbri; permission=rw-rw-rw-; 
isSymlink=false} isEmptyDirectory=false

Tests in error: 
  
ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive:116
 » PathIO
  
ITestS3AFileContextMainOperations>FileContextMainOperationsBaseTest.testRenameDirectoryAsNonExistentDirectory:1038->FileContextMainOperationsBaseTest.testRenameDirectoryAsNonExistentDirectory:1052->FileContextMainOperationsBaseTest.rename:1197
 » IO
  ITestS3AAWSCredentialsProvider.testAnonymousProvider:133 » AWSServiceIO 
initia...
  ITestS3AAWSCredentialsProvider.testBadCredentials:102->createFailingFS:76 » 
AWSServiceIO
  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » AWSClientIO initializing 
...
  
ITestS3AFileSystemContract>FileSystemContractBaseTest.testWriteReadAndDeleteOneBlock:266->FileSystemContractBaseTest.writeReadAndDelete:285->FileSystemContractBaseTest.writeAndRead:815
 » FileAlreadyExists
  
ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:656->FileSystemContractBaseTest.rename:512
 » AWSServiceIO
{code}


> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch, 
> HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-29 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707310#comment-15707310
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

This fixes that failed unit test for me:

{code}
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/MockS3ClientFactory.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/MockS3ClientFactory.java
@@ -41,9 +41,8 @@
   @Override
   public AmazonDynamoDBClient createDynamoDBClient(
   URI uri, com.amazonaws.regions.Region region) throws IOException {
-final DefaultS3ClientFactory factory = new DefaultS3ClientFactory();
-factory.setConf(getConf());
-return factory.createDynamoDBClient(uri, region);
+throw new IOException("Purposely failing to create DynamoDB client"
+  + " for unit test.");
   }
{code}

I also noticed a spot where we need to fix the exception thrown (it is 
supposed to be an IOException):

{code}
@Override
public AmazonDynamoDBClient createDynamoDBClient(URI fsUri, Region region)
throws IOException {
{code}
...
{code}
  String msg = "Incorrect DynamoDB endpoint: "  + endPoint;
  LOG.error(msg, e);
  throw new IllegalArgumentException(msg, e);
}
  }
{code}
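
To make the suggested change concrete, here is a minimal sketch of translating the unchecked failure into the IOException that the method signature already declares. It is not the patch code: {{EndpointCheck}}, {{resolveRegion}} and the endpoint suffix test are hypothetical stand-ins for whatever lookup rejects the endpoint; only the message text comes from the snippet above.

```java
import java.io.IOException;

final class EndpointCheck {
  /** Hypothetical stand-in for the lookup that rejects a bad endpoint. */
  static String resolveRegion(String endPoint) {
    if (endPoint == null || !endPoint.endsWith(".amazonaws.com")) {
      throw new IllegalArgumentException("unknown endpoint " + endPoint);
    }
    return endPoint;
  }

  /** Wraps the unchecked failure so callers only ever see an IOException. */
  static String checkEndpoint(String endPoint) throws IOException {
    try {
      return resolveRegion(endPoint);
    } catch (IllegalArgumentException e) {
      String msg = "Incorrect DynamoDB endpoint: " + endPoint;
      // Keep the original exception as the cause so the stack trace survives.
      throw new IOException(msg, e);
    }
  }
}
```

Callers that already handle IOException from {{createDynamoDBClient}} would then need no extra catch clause for a misconfigured endpoint.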

I have a number of integration test failures I'll be working through next.  
BTW, I'm happy to submit a follow-up (v11) patch with these things if that 
would help, just shout.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch, 
> HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-29 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707232#comment-15707232
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Congrats on the clean Jenkins run [~liuml07].

I'm running some tests with this configured as the MetadataStore today.  First 
thing I'm noticing is a failure in {{TestS3AGetFileStatus}}.  This led me to 
this part of the v10 patch:

{code}
+public class MockS3ClientFactory extends Configured implements S3ClientFactory {
+
+  @Override
+  public AmazonDynamoDBClient createDynamoDBClient(
+  URI uri, com.amazonaws.regions.Region region) throws IOException {
+final DefaultS3ClientFactory factory = new DefaultS3ClientFactory();
+factory.setConf(getConf());
+return factory.createDynamoDBClient(uri, region);
+  }
{code}

I believe the goal of the mock S3 client is to be able to run non-integration 
(unit) tests without S3 configured.  It looks like you are creating an actual 
S3 client from the mock client.  Doesn't this break the ability of unit tests 
to run without S3?

It seems like all the unit tests should only use MetadataStores which can run 
locally (Null or LocalMetadataStore).  So, maybe we do not need this code at 
all.  Maybe MockS3ClientFactory#createDynamoDBClient() just throws a runtime 
exception "Failing to create DynamoDB for unit test", and then we fall back to 
the NullMetadataStore automatically in S3Guard#getMetadataStore()?
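
A rough sketch of the fallback I have in mind, under stated assumptions: the interface and class names below ({{NullStore}}, {{FailingStore}}, {{StoreSelector}}) are simplified stand-ins, not the real S3Guard types, and the actual logic in S3Guard#getMetadataStore() may differ.

```java
import java.io.IOException;

/** Simplified stand-in for the metadata store interface. */
interface MetadataStore {
  void initialize() throws IOException;
}

/** No-op store, standing in for NullMetadataStore. */
final class NullStore implements MetadataStore {
  @Override public void initialize() { }
}

/** Store that cannot start in a unit-test environment. */
final class FailingStore implements MetadataStore {
  @Override public void initialize() throws IOException {
    throw new IOException("Failing to create DynamoDB client for unit test");
  }
}

final class StoreSelector {
  /** Returns the preferred store if it initializes, else a no-op store. */
  static MetadataStore getMetadataStore(MetadataStore preferred) {
    try {
      preferred.initialize();
      return preferred;
    } catch (IOException | RuntimeException e) {
      // Any failure to bring up the configured store degrades to the no-op store.
      return new NullStore();
    }
  }
}
```

With this shape, a unit test that never configures DynamoDB still gets a working (no-op) store instead of a hard failure.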

I'm also wondering if, instead of having S3ClientFactory expose a 
createDynamoDBClient() method, we should just add getters to S3ClientFactory 
(getAwsConfig() and maybe getCredentials()), and then move 
createDynamoDBClient() to inside the DynamoDBMetadataStore? The 
DynamoDBMetadataStore can then call the getters on the client to get what it 
needs to construct a DynamoDB client.  The goal here would be to keep DynamoDB 
specifics encapsulated in DynamoDBMetadataStore.  This would allow, for 
example, removing the Dynamo dependency from s3a if we ever want to create a 
separate submodule for the DynamoDBMetadataStore.
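
Roughly, the shape could look like the sketch below. Everything here is illustrative: {{AwsConfig}}, {{S3ClientSettings}} and {{DynamoStoreSketch}} are hypothetical placeholder types, the getters return simplified values rather than real AWS SDK objects, and the string stands in for the client that would actually be built.

```java
/** Placeholder for the AWS client configuration the factory already holds. */
interface AwsConfig {
  int getMaxConnections();
}

/** Hypothetical narrow view of the S3 client factory: getters only. */
interface S3ClientSettings {
  AwsConfig getAwsConfig();
  String getCredentials();
}

/** All DynamoDB-specific construction lives inside the store itself. */
final class DynamoStoreSketch {
  private final String clientDescription;

  DynamoStoreSketch(S3ClientSettings settings, String region) {
    // A real store would build a DynamoDB client here from these values.
    this.clientDescription = "dynamodb[region=" + region
        + ", maxConnections=" + settings.getAwsConfig().getMaxConnections()
        + ", credentials=" + settings.getCredentials() + "]";
  }

  String describeClient() {
    return clientDescription;
  }
}
```

The point of the design is that the factory interface no longer mentions DynamoDB at all, so the store could move to its own module without changing s3a.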




> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch, 
> HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-29 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707088#comment-15707088
 ] 

Mingliang Liu commented on HADOOP-13449:


I merged from {{trunk}} again locally and resolved minor conflicts with 
[HADOOP-13823]. If you guys support the merge, I'll push it to the remote 
feature branch.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch, 
> HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688831#comment-15688831
 ] 

Hadoop QA commented on HADOOP-13449:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 41s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  3s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 36s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 53s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m  1s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  7s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 28s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 19s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 45s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 47s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840184/HADOOP-13449-HADOOP-13345.010.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux 1d67e385a192 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688595#comment-15688595
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 49s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 58s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 22s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 18s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 29s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 54s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 40s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 26s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 17s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 54s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  7s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 45s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 59s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840161/HADOOP-13449-HADOOP-13345.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  xml  findbugs  checkstyle  |
| uname | Linux 99976286ab13 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-22 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688223#comment-15688223
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Yes, I believe that is what Steve L was suggesting.  Go ahead, I'm +1 on
the merge.




> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.







[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-22 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688201#comment-15688201
 ] 

Mingliang Liu commented on HADOOP-13449:


I guess a {{git checkout HADOOP-13345}} followed by {{git merge trunk}} will 
work? I tested it here and it was a clean merge, with the commit message "Merge 
branch 'trunk' into HADOOP-13345". Then we simply {{git push}} to the remote 
repo? If I have this wrong, please take care of it; thank you very much.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-22 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688188#comment-15688188
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Sounds good [~liuml07].  Last time I did a rebase, but [~steve_l] suggested we 
just do a merge commit to avoid the force-push and associated issues.  I'm fine 
either way.  I can do this if you like.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-22 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688160#comment-15688160
 ] 

Mingliang Liu commented on HADOOP-13449:


The patch that bumps the AWS SDK version has been committed to {{trunk}}. To 
help us here, I suggest we merge from trunk again; [~fabbri], thoughts? Last 
time we did a rebase, IIRC?

After that, I'll upload the patch against the changes there; I don't expect 
major conflicts (if any) for this patch.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch, HADOOP-13449-HADOOP-13345.008.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15685462#comment-15685462
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 53s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 56s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 32s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 38s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 41s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 54s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 14s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  7m 48s{color} | {color:red} root generated 6 new + 695 unchanged - 0 fixed = 701 total (was 695) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
15s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 35s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 38s{color} 
| {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestGroupsCaching |
|   | hadoop.fs.s3a.s3guard.TestNullMetadataStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839928/HADOOP-13449-HADOOP-13345.008.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15681442#comment-15681442
 ] 

Steve Loughran commented on HADOOP-13449:
-

Correction: I can't talk Monday; other commitments.

W.r.t. the SDK and Jackson, we're going to have to upgrade Jackson and the SDK 
on branch-2, using a Jackson version that is compatible at the API level with 
existing code. This is going to cause problems downstream, but we don't really 
have a choice any more.

The Jackson update should go in as HADOOP-12705.

I'm also going to have to update the AWS SDK in 2.7 and 2.8, which is 
troublesome. We need to get rid of the org.json artifacts embedded in the 
existing AWS SDK, which means updating the SDK, which means a Jackson update, 
unless I can do one of: (a) shade everything, or (b) swap in tdunning's 
org.json replacement (more specifically: cut the org.json lib out of the AWS 
JAR and explicitly add Ted's replacement).

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-19 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15679191#comment-15679191
 ] 

Steve Loughran commented on HADOOP-13449:
-

I can talk in the morning, Palo Alto time, from 09:00 to 11:00. It's good to 
open the calls to all. I have a WebEx conference number, or we can use Google+ 
Hangouts (which I prefer, though they will block anyone from China trying to 
dial in).

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, 
> HADOOP-13449-HADOOP-13345.007.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15678636#comment-15678636
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
48s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
54s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
41s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
18s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  6m 
37s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  6m 37s{color} 
| {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 29s{color} | {color:orange} root: The patch generated 7 new + 5 unchanged - 
0 fixed = 12 total (was 5) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
15s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
15s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 24s{color} 
| {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839677/HADOOP-13449-HADOOP-13345.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 07b328f0b616 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-18 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677910#comment-15677910
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Hey [~liuml07]. I am available early next week to help work on this.  I could 
help with:

- The test refactoring needed to keep the existing code but extend it for 
S3A/DDB-specific behavior.
- Running the S3A integration tests with your MetadataStore and fixing bugs.
- Working on the SDK revision bump for trunk.
- Anything else you think of (TODOs etc).

If you are interested in chatting on Monday about this JIRA, please email me 
times that work for you and I can host a public call.  Monday afternoon 
(Pacific time) is good for me.

Also in response to your last comment, I feel like the patch is good enough to 
start integration testing and get list consistency (without authoritative / 
high-performance caching) for a V1 of the feature.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677752#comment-15677752
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
6s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
43s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
41s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  8m 14s{color} 
| {color:red} root generated 5 new + 695 unchanged - 0 fixed = 700 total (was 
695) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 37s{color} | {color:orange} root: The patch generated 1 new + 4 unchanged - 
0 fixed = 5 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
15s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
14s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839604/HADOOP-13449-HADOOP-13345.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 4948552e7f11 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-18 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677349#comment-15677349
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


+1 for the current patch.  LGTM.

Thanks [~liuml07]

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675673#comment-15675673
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
58s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
53s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
40s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  6m 55s{color} 
| {color:red} root generated 5 new + 695 unchanged - 0 fixed = 700 total (was 
695) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 30s{color} | {color:orange} root: The patch generated 10 new + 4 unchanged - 
0 fixed = 14 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
14s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
7s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839489/HADOOP-13449-HADOOP-13345.005.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 9c2561317d10 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669245#comment-15669245
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

{quote}
I see the difference that we have now. After import, I was treating the DDB as 
the consistent store and authoritative by default. 
{quote}

An import process *could* set all directory entries as authoritative. Or, for 
v1, you could always return authoritative = false.  The ideal third option is 
to set directories as authoritative (fully cached) only when the client does so 
via put(DirListingMetadata). The interface allows any of these behaviors.  None 
of them should require changing the API semantics or the MS Contract tests, 
unless there is a bug in my tests  :-)

For the ideal option #3, where would you store the isAuthoritative bit without 
extra queries?  Maybe that is future work. I have some ideas.

{quote}
Suppose we have /a, /a/b, /a/c, for delete operations, we delete /a/ first; if 
other thread/process comes to list /a, it returns null (because /a is not 
there) indicating the subtree does not exist, though /a/b/ and /a/c/ are there. 
{quote}

That sounds like valid behavior.  There are many race conditions for concurrent 
directory updates, both with, and without, S3Guard.

(Slightly related: the filesystem is responsible for ensuring that the delete 
of /a is recursive, since /a is not empty.  MetadataStore explicitly does 
not do that.)

{quote}
Another question is that, if another process bypasses S3Guard and puts a new 
entry to /a/d/, do we have to make sure /a/d be added to the store by checking 
S3? I was thinking not; S3Guard guards those who buy it.
{quote}

Correct.  Consistency only works for those that use S3Guard.  If 
fs.s3a.metadatastore.authoritative config is false (default) the client will 
ignore the isAuthoritative return value from listChildren(), and will *always* 
check S3 in addition to the MetadataStore.  In this configuration, clients will 
discover files added outside of S3Guard.  Those will be subject to eventual 
consistency, of course.
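
To make the listing behavior above concrete, here is a minimal sketch of how a 
client might combine the two sources. It is an illustration only: the class, 
method, and field names (ListingSketch, StoreListing, list, s3Lister) are 
hypothetical and are not taken from the S3A code; only the config semantics 
come from the description above.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical sketch; these names are not from the S3A code base. */
final class ListingSketch {

  /** What a metadata store might return for one directory (illustrative). */
  static final class StoreListing {
    final Set<String> children;
    final boolean isAuthoritative;

    StoreListing(Set<String> children, boolean isAuthoritative) {
      this.children = children;
      this.isAuthoritative = isAuthoritative;
    }
  }

  /**
   * @param fromStore listing from the metadata store, or null if the store
   *                  has no state for the directory
   * @param s3Lister  assumed callback that lists the same directory in S3
   * @param authoritativeConfig stands in for fs.s3a.metadatastore.authoritative
   */
  static Set<String> list(StoreListing fromStore,
                          Supplier<Set<String>> s3Lister,
                          boolean authoritativeConfig) {
    // S3 is skipped only when the config allows it AND the store reports
    // the directory as fully cached.
    if (authoritativeConfig && fromStore != null && fromStore.isAuthoritative) {
      return new TreeSet<>(fromStore.children);
    }
    // Otherwise S3 is always consulted and the results are unioned, so files
    // written outside S3Guard are discovered (with eventual consistency).
    Set<String> merged = new TreeSet<>(s3Lister.get());
    if (fromStore != null) {
      merged.addAll(fromStore.children);
    }
    return merged;
  }
}
```

With the config set to false, the isAuthoritative flag is never read, which 
matches the "ignore the return value" behavior described above.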

{quote}
 As to the concern of not strictly having parent directories pre-created, is 
importing the only one?
{quote}
You either have to (A) pay money to store an extra copy of your metadata 
forever, or (B) spend money and time hydrating the MetadataStore each time you 
start a cluster.

Also, if there are failures (DynamoDB is famous for throttling users and being 
difficult to provision properly), and we don't assume everything is always in 
DynamoDB, it makes recovery *much* easier.  In general, you can invalidate 
state in the MetadataStore by just removing paths or subtrees, and the client 
will adaptively reload those parts from S3.
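
As a rough illustration of "invalidate by removing a subtree", the sketch 
below uses an in-memory map as a stand-in for the MetadataStore. The class and 
method names (InvalidateSketch, invalidateSubtree) are hypothetical, and a real 
DynamoDB-backed store would issue deletes against the table instead.

```java
import java.util.Iterator;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a metadata store. */
final class InvalidateSketch {
  // path -> opaque metadata; a String is enough for the illustration
  private final TreeMap<String, String> entries = new TreeMap<>();

  void put(String path, String meta) {
    entries.put(path, meta);
  }

  boolean contains(String path) {
    return entries.containsKey(path);
  }

  /**
   * Drops the entry for root and everything beneath it. Nothing else is
   * touched, so the client simply finds no state for those paths and falls
   * back to S3 for them.
   *
   * @return number of entries removed
   */
  int invalidateSubtree(String root) {
    // Match on "root/" so that a sibling such as /ab is not removed
    // when invalidating /a.
    String prefix = root.endsWith("/") ? root : root + "/";
    int removed = 0;
    Iterator<String> it = entries.keySet().iterator();
    while (it.hasNext()) {
      String key = it.next();
      if (key.equals(root) || key.startsWith(prefix)) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }
}
```

This only works as a recovery tool because the client never assumes the store 
is complete: a missing entry means "unknown", not "does not exist".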

The other concern is that I just don't understand why you would want to do the 
preloading.

Can you elaborate a bit on why you want to have parent directories 
pre-created? Which operation does it help with?

JIRA communication is a bit difficult.  I'm happy to host a public conference 
call if that would help.




> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669124#comment-15669124
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks for the ideas. Now we're getting into the details.

{quote}
The client has to check S3.
{quote}
I see the difference that we have now. After import, I was treating the DDB as 
the consistent store and authoritative by default. Suppose we have /a, /a/b, 
/a/c; for delete operations, we delete /a/ first. If another thread/process 
comes to list /a, it returns null (because /a is not there), indicating the 
subtree does not exist, though /a/b/ and /a/c/ are there. We may need to cover 
corner cases. Another question is that, if another process bypasses S3Guard and 
puts a new entry to /a/d/, do we have to make sure /a/d is added to the store 
by checking S3? I was thinking not; S3Guard guards those who buy it.

My previous thinking: a scan was not acceptable. If we do have to use scan, we 
have to redesign the DDB schema. Creating the parent path while creating the 
child is not efficient; when putting X files in the same directory, we don't 
want to check the ancestors X times, which brings heavy overhead. For empty 
directories, I didn't have to go to S3, as we may need time to get the latest 
state. There must be feasible solutions/workarounds for the DDBMetadataStore 
implementation if we agree on the list contract; I'll post my ideas later.
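
To put a number on the ancestor-check overhead, the sketch below enumerates 
the ancestors of a path and counts how many lookups a naive "check every 
ancestor on every put" approach would perform. The names (AncestorSketch, 
ancestors, naiveAncestorChecks) are made up for this illustration and do not 
come from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for counting ancestor lookups. */
final class AncestorSketch {

  /** Ancestors of an absolute path, nearest first, excluding the root. */
  static List<String> ancestors(String path) {
    List<String> result = new ArrayList<>();
    int idx = path.lastIndexOf('/');
    while (idx > 0) {
      path = path.substring(0, idx);
      result.add(path);
      idx = path.lastIndexOf('/');
    }
    return result;
  }

  /** Lookups needed if every put re-checks all of its ancestors. */
  static int naiveAncestorChecks(List<String> files) {
    int checks = 0;
    for (String file : files) {
      checks += ancestors(file).size();
    }
    return checks;
  }
}
```

For example, putting three files under /a/b costs 3 x 2 = 6 ancestor lookups 
with this naive approach, even though only two distinct ancestors exist.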

Now I have to think about the difference in the "contract" for list. As to the 
concern of not strictly having parent directories pre-created, is importing the 
only one? The cluster not being started is OK; DDB persists the data. We can 
import data via tools (e.g. command line) first on Day 1.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669036#comment-15669036
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

One more answer.. Sorry for spamming but I have a hard time parsing the 
questions sometimes (need a whiteboard).

{quote}
But for listStatus(), my understanding is that we assume the entry per se is 
there. Or else, if we query the DDB and no entries having parent as this path, 
is the directory nonexistent, or the directory is empty? 
{quote}

I think in this case you return null.  This means "I do not have any state for 
that directory".  The client has to check S3.

Once we have delete tracking, we might look for an entry for the directory 
itself with isDeleted=true, and return file not found, but we are not doing 
delete tracking initially (just focus on list consistency).
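
To make the null-versus-listing distinction concrete, here is a minimal sketch of how a client might combine the two views. {{ListingMerge}} and {{merge}} are hypothetical names, not code from the patch, and plain path strings stand in for the real status objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper: merges a metadata-store listing with an S3 listing. */
final class ListingMerge {
  private ListingMerge() { }

  /**
   * @param storeChildren children known to the metadata store, or null when
   *                      the store has no state for the directory
   * @param s3Children    children returned by listing S3 directly
   * @return sorted union of both views
   */
  static List<String> merge(List<String> storeChildren,
      List<String> s3Children) {
    // A TreeSet removes duplicates and gives a stable, sorted result.
    TreeSet<String> merged = new TreeSet<>(s3Children);
    if (storeChildren != null) {
      // Entries that S3 has not made visible yet come from the store.
      merged.addAll(storeChildren);
    }
    return new ArrayList<>(merged);
  }
}
```

With a null store result the S3 listing is returned unchanged; with a non-null result, entries that are not yet visible in S3 are added to it.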





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669011#comment-15669011
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Sorry, I was a little hasty to comment without the details of your DynamoDB 
schema.

Since you use the parent dir as the key for each entry, listChildren(/a/b) can 
just return all items with key=/a/b, right?

You shouldn't have to create any separate entry for parent (unless client 
specifically does {{put(PathMetadata(/a/b))}})

Still curious if you can use {{begins_with}} query to implement recursive 
delete.  If so, that could be a future optimization.

Next time you run the existing test code, and you have a failure related to 
this, feel free to email/JIRA mention me and I'll work with you on it.  I'll 
take another look at the MetadataStoreTestBase now.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668965#comment-15668965
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

{quote}
But for listStatus(), my understanding is that we assume the entry per se is 
there. 
{quote}

The only thing that matters is the semantics or API contract.  As long as you 
have the behavior I outlined, it is correct.  

We cannot require the client to put(parent) before put(child), since we may run 
on an existing bucket where the directory was already created before we started 
our cluster.  

{quote}Or else, if we query the DDB and no entries having parent as this path, 
is the directory nonexistent, or the directory is empty? DDBMetadataStore 
should return DirListingMetadata accordingly. Thanks,
{quote}

Ok, I think this is an implementation detail for DynamoDB.  Two ideas.  #1 
seems pretty good:

1. Do a prefix scan.. I thought DynamoDB had built-in support for looking up 
values by key prefix.  I.e. {{begins_with}}.  When you do a 
listChildren(parent), you can just query for {{key begins_with parent}}?

2. Create the parent path when you create the child so you can implement 
listChildren() properly (what I did for initial LocalMetadataStore).

Also, can't you use prefix queries to eliminate the whole 
{{DescendantsIterator}} thing for recursive delete?
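
As a rough illustration of the prefix idea, the sketch below uses a sorted in-memory map in place of DynamoDB; {{PrefixDeleteSketch}} and its methods are hypothetical. In DynamoDB itself, {{begins_with}} in a key condition applies to the sort key only, so whether this maps onto the actual schema is left open here. The sketch assumes paths never contain the character U+FFFF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sorted-key store used only to illustrate prefix deletion. */
final class PrefixDeleteSketch {
  // Full path -> isDirectory; a TreeMap keeps the keys in sorted order.
  private final NavigableMap<String, Boolean> entries = new TreeMap<>();

  void put(String path, boolean isDirectory) {
    entries.put(path, isDirectory);
  }

  boolean contains(String path) {
    return entries.containsKey(path);
  }

  /** Removes dir and everything beneath it; returns the removed paths. */
  List<String> deleteSubtree(String dir) {
    // Append "/" so that deleting /a/b does not also match /a/bc.
    String prefix = dir.endsWith("/") ? dir : dir + "/";
    // Keys starting with prefix sort between prefix and prefix + U+FFFF.
    NavigableMap<String, Boolean> range =
        entries.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    List<String> removed = new ArrayList<>(range.keySet());
    range.clear(); // the view is backed by entries, so this removes them
    if (entries.remove(dir) != null) {
      removed.add(dir);
    }
    return removed;
  }
}
```

One range operation covers the whole subtree, so no iterator over descendants is needed in this model.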

Thanks for the discussion [~liuml07].





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668909#comment-15668909
 ] 

Mingliang Liu commented on HADOOP-13449:


When we put {{/a/b/file}}, {{/a/b/}} does not need to be there. MetadataStore 
will not enforce this. Either the caller has to pre-create the parent, or she 
knows about this and does not make any assumption about it in the application. 
I agree that the put is a simple request for one exact entry to the metadata 
store.

But for {{listStatus()}}, my understanding is that we assume the entry per se 
is there. Or else, if we query the DDB and no entries having parent as this 
path, is the directory nonexistent, or the directory is empty? DDBMetadataStore 
should return DirListingMetadata accordingly. Thanks,




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668739#comment-15668739
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Thanks [~liuml07], sounds good.  Can't wait to try this stuff out.

{quote}
My final concern is that, the MetadataStore assumes all ancestor directories 
(including direct parent directory) have been pre-created by the caller/user. I 
have to change the base test MetadataStoreTestBase to make all the tests pass. 
We have to change LocalMetadataStore accordingly. See my above comments.
{quote}

I looked in the test code and did not see the bug you mention.  Feel free to 
call out a particular line of code: I may have missed it.

This is the intention:  Say you start by doing a 
MetadataStore.put(PathMetadata(/a/b/file)).

That file should appear when listing /a/b.

That is, MetadataStore.listChildren(/a/b) should return {[/a/b/file], 
isAuthoritative=false}.  This is used by S3AFileSystem#listStatus() to get list 
consistency (e.g. if /a/b/file is not visible in s3 yet, we will add that 
entry).

However, there is no requirement that MetadataStore.get(/a/b) should return 
anything.  So the parent directory does not need to be created, per se, but the 
new file *does* need to appear when listing the parent.

If the MetadataStore contract tests do not reflect this, I am very happy to 
change them.  Also you might have missed some other test code (helper 
functions) that explicitly creates parent dirs.  We don't want to assume 
creation of ancestors at all, just for the file to show up in the parent 
listing.

The LocalMetadataStore ends up creating a parent dir internally, but that 
should be implementation detail, and not required as above.

Does that help clarify?
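
For illustration only, here is a minimal in-memory sketch of that contract. {{ParentKeyedStoreSketch}} and {{DirListingSketch}} are hypothetical stand-ins, not the real MetadataStore or DirListingMetadata classes, and paths are assumed to be absolute strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical listing result: child paths plus the authoritative flag. */
final class DirListingSketch {
  final List<String> children;
  final boolean isAuthoritative;

  DirListingSketch(List<String> children, boolean isAuthoritative) {
    this.children = children;
    this.isAuthoritative = isAuthoritative;
  }
}

/** Hypothetical in-memory store that indexes every entry by its parent. */
final class ParentKeyedStoreSketch {
  // Paths that were put explicitly; get() answers from this set only.
  private final Set<String> entries = new HashSet<>();
  // Parent path -> sorted child paths; listChildren() answers from this map.
  private final Map<String, TreeSet<String>> childrenByParent =
      new HashMap<>();

  /** Records one entry; assumes an absolute path such as /a/b/file. */
  void put(String path) {
    entries.add(path);
    if (path.equals("/")) {
      return; // the root has no parent to index under
    }
    int slash = path.lastIndexOf('/');
    String parent = slash == 0 ? "/" : path.substring(0, slash);
    childrenByParent.computeIfAbsent(parent, k -> new TreeSet<>()).add(path);
  }

  /** Returns the path if it was put explicitly, otherwise null. */
  String get(String path) {
    return entries.contains(path) ? path : null;
  }

  /** Returns null when the store has no state for the directory. */
  DirListingSketch listChildren(String dir) {
    TreeSet<String> children = childrenByParent.get(dir);
    if (children == null) {
      return null;
    }
    // Never authoritative here: the store only knows what it was told.
    return new DirListingSketch(new ArrayList<>(children), false);
  }
}
```

After {{put("/a/b/file")}}, {{listChildren("/a/b")}} returns the file while {{get("/a/b")}} still returns null, so no ancestor entry is ever required.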




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668686#comment-15668686
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks for your review [~fabbri]. Quick reply (will consider more carefully 
before I submit a new patch this week):

# Yes, the AWS SDK version bump should be separated out. However, I found that 
DynamoDBLocal is not included in versions >1.11.0.1. Steve does have a patch in 
[HADOOP-13050], which bumps the version up to 1.11.45. I guess we may have to 
use a different DynamoDBLocal version from the AWS SDK (and thus DynamoDB) 
version. By the way, regarding the {{jackson2}} version headache, I got 2.5.5 
working. With the next patch it will be easier to separate the version bump 
code out.
# I think moving {{testDescendantsIterator}} to {{MetadataStoreTestBase}} is a 
very good idea. I'll try to consolidate this in next patch. It should work.
# As to the changes in the test base and the local metadata store, I'm sorry 
about that. It was selfish of me to make only my own tests pass. I was indeed 
trying to make the test methods in the base class general enough that both the 
DDB and local MS tests can use them. Plus, the local MS test can override them:
{code:title=from the patch}
  @Override
  protected void verifyFileStatus(FileStatus status, long size) {
    super.verifyFileStatus(status, size);

    assertEquals("Replication value", REPLICATION, status.getReplication());
    assertEquals("Access time", getAccessTime(), status.getAccessTime());
    assertEquals("Owner", OWNER, status.getOwner());
    assertEquals("Group", GROUP, status.getGroup());
    assertEquals("Permission", PERMISSION, status.getPermission());
  }

  @Override
  protected void verifyDirStatus(FileStatus status) {
    super.verifyDirStatus(status);

    assertEquals("Mod time", getModTime(), status.getModificationTime());
    assertEquals("Replication value", REPLICATION, status.getReplication());
    assertEquals("Access time", getAccessTime(), status.getAccessTime());
    assertEquals("Owner", OWNER, status.getOwner());
    assertEquals("Group", GROUP, status.getGroup());
    assertEquals("Permission", PERMISSION, status.getPermission());
  }
{code}
Do you want me to leave the base test methods untouched (as well as the local 
ones), and make the DDB test override the super methods? That is also feasible, 
but the base class will not be general/common enough; plus we may have 
duplicate code.
# {{public abstract MetadataStore getMs();}} in {{AbstractMSContract}} was 
unexpected. I must have been fooled by my IDE.
# No, I'll finish my review (sorry for the delay) of [HADOOP-13651] before I 
run any integration/contract tests. We may have a few follow-up JIRAs to make 
the end-to-end tests work.

The comments are driving this patch in the right direction. But did I 
promise too much about the next version patch? I may post a mid-way patch for 
quick feedback.

My final concern is that, the MetadataStore assumes all ancestor directories 
(including direct parent directory) have been pre-created by the caller/user. I 
have to change the base test MetadataStoreTestBase to make all the tests pass. 
We have to change LocalMetadataStore accordingly. See my above comments.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668591#comment-15668591
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Thanks for your hard work on this [~liuml07]!

{code}
-1.10.6
+1.11.0
{code}
We should apply this to trunk first, separately, and merge back to feature 
branch, as [~steve_l] suggested.  Let me know if you want help w/ that.

{code}
+  @Test
+  public void testDescendantsIterator() throws IOException {
{code}
 
Thanks for the extra test code!  Should we move this to MetadataStoreTestBase?  
That is, should this test run for any MetadataStore?

{code}
  
-  private void verifyBasicFileStatus(PathMetadata meta) {
-    FileStatus status = meta.getFileStatus();
+  void verifyFileStatus(FileStatus status, long size) {
     assertFalse("Not a dir", status.isDirectory());
-    assertEquals("Replication value", REPLICATION, status.getReplication());
-    assertEquals("Access time", accessTime, status.getAccessTime());
     assertEquals("Mod time", modTime, status.getModificationTime());
+    assertEquals("File size", size, status.getLen());
     assertEquals("Block size", BLOCK_SIZE, status.getBlockSize());
-    assertEquals("Owner", OWNER, status.getOwner());
-    assertEquals("Group", GROUP, status.getGroup());
-    assertEquals("Permission", PERMISSION, status.getPermission());
   }
 
-  private FileStatus makeDirStatus(String pathStr) {
+  private FileStatus makeDirStatus(String pathStr)
+      throws IOException {
     return basicFileStatus(new Path(pathStr), 0, true);
   }
 
-  private void verifyDirStatus(PathMetadata meta) {
-    FileStatus status = meta.getFileStatus();
+  /**
+   * Verify the directory file status. Subclass may verify additional fields.
+   */
+  void verifyDirStatus(FileStatus status) {
     assertTrue("Is a dir", status.isDirectory());
     assertEquals("zero length", 0, status.getLen());
-    assertEquals("Replication value", REPLICATION, status.getReplication());
-    assertEquals("Access time", accessTime, status.getAccessTime());
-    assertEquals("Mod time", modTime, status.getModificationTime());
-    assertEquals("Owner", OWNER, status.getOwner());
-    assertEquals("Group", GROUP, status.getGroup());
-    assertEquals("Permission", PERMISSION, status.getPermission());
+  }
+
{code}

Glad you got this working.  I'd like to keep this test code though.  Can we 
extend the MetadataStoreTestBase instead of changing it?  I made some 
suggestions 
[here|https://issues.apache.org/jira/browse/HADOOP-13650?focusedCommentId=15637956=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15637956].

Similarly, the changes in {{TestLocalMetadataStore}} should not be needed.  
Also I think that code has changed in the latest version posted in 
HADOOP-13651, so those changes would conflict on merge.

{code}
-public MetadataStore getMetadataStore() throws IOException {
-  LocalMetadataStore lms = new LocalMetadataStore();
-  return lms;
+public MetadataStore getMs() {
{code}

This function rename will probably cause merge headaches for both of us as 
well.  

Other than that, looks good.  Have you run any of the integration / contract 
tests with it yet?  I suppose not since you'll need HADOOP-13651 first.





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668393#comment-15668393
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Hey I've been travelling but I'm back and will try to get you a review on this 
shortly.  I had a couple minor comments so far, that could be addressed in 
subsequent JIRA if needed.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665550#comment-15665550
 ] 

Mingliang Liu commented on HADOOP-13449:


We can provision the capacity of an existing table in DDB, and 
{{DynamoDBMetadataStore}} also supports this via 
{{DynamoDBMetadataStore#provisionTable}}. This may be useful to the command 
line tools.

In {{initialization}}, it also works for me to consider the config keys only 
when creating the table.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-14 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665532#comment-15665532
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


bq.  If we're using an existing table, should we provision it according to the 
config keys? Or we only use this one for newly created tables?

From what I understand, we can only specify read/write capacity when creating 
the table? I think it is OK to use these only for newly created tables.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665511#comment-15665511
 ] 

Mingliang Liu commented on HADOOP-13449:


If the user is going to destroy a non-existent table, that will be interesting. 
We don't want to create a new table, wait for it to become active, then delete 
the table and wait for it to be deleted... that's not good.

Let me add an internal (not for users) config key CREATE_IF_NOT_EXISTS, whose 
default value will be true; in the command tool for deletion/destroy work, the 
config will be set false. How about this?

P.S. I just noticed this, but yes, the write/read capacity units can be 
configured. If we're using an existing table, should we provision it according 
to the config keys? Or we only use this one for newly created tables?

Thanks.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-14 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665501#comment-15665501
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


[~liuml07] If you are going to make a new patch, would you also please add two 
configuration keys for the DynamoDB write / read capacity units?

The CLI tool needs a way to customize these.

Thanks.




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665489#comment-15665489
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks [~eddyxu] for your review and insightful comments.

# Yes, we can define a new method for destroying in the MS interface. At least, 
that makes sense for DDB/MySQL stores. I'll upload a refined patch that 
addresses this along with {{isAuthoritative}}.
# As to not creating the table in {{initialization}}, or adding a flag such as 
CREATE_IF_NOT_EXISTS: I did consider this, but was not sure about it.
#- My first concern is about the mapping. I propose to map one MetadataStore to 
one S3 bucket. A MetadataStore has no utility or general functions for 
operating on metadata anywhere except its specific S3 bucket (and thus its 
MetadataStore, say, a Table). After initialization, all operations run against 
the same S3 bucket; we don't need to specify the associated Table for 
put/get/list etc. operations, and we don't need to specify the name for 
creation and destroy either.
#- The second point is about the {{initialization()}} semantics. 
{{S3AFileSystem#initialize}} checks whether the bucket exists, and associates 
the file system object with the bucket explicitly. If we initialize the 
DDBMetadataStore from an S3AFileSystem, we associate the Table with the bucket 
as well. Once initialization succeeds, we know that the table is there, and 
incoming operations are free to go. That's why I assumed we always create the 
table if it does not exist. If the table exists, {{createTable()}} will use the 
existing one instead of creating a new one. For destroy/deleteTable operations, 
having the target table created/obtained at initialization time is even more 
meaningful. The overhead for now is considered small; we can of course optimize 
this part later. Does this make sense?




[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-14 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665346#comment-15665346
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Hi, [~liuml07] 

The patch LGTM overall. +1.  I am OK to commit the last patch in the feature 
branch.

Some nits that need not be addressed in this patch:

* {{void deleteTable()}} should be an interface method for {{MetadataStore}}, 
and maybe rename it to {{destroy()}} / {{tearDown()}}? Each MS should have this 
function, and not all of them implement a "table" concept.

* Can we move {{createTable()}} out of {{initialization()}}? Or add a flag like 
{{CREATE_IF_NOT_EXISTS}} in configuration? Because for the {{s3guard destroy -m 
URI}} command, it needs to initialize the DynamoDBMetadataStore instance first 
and then call {{destroy()}} / {{deleteTable()}}. 
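
One possible shape for the first nit is sketched below. {{StoreLifecycle}}, {{TableStoreSketch}} and {{LocalStoreSketch}} are hypothetical names, and the real {{MetadataStore}} interface has more methods than this lifecycle slice; the point is only that {{destroy()}} is meaningful for stores with and without a table.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lifecycle slice of a metadata store interface. */
interface StoreLifecycle {
  /** Prepares backing storage for the given bucket. */
  void initialize(String bucket);

  /** Releases all persistent state; named neutrally, not deleteTable(). */
  void destroy();

  /** True while backing storage exists. */
  boolean isInitialized();
}

/** Table-backed store: destroy() drops its table. */
final class TableStoreSketch implements StoreLifecycle {
  private String tableName; // null when no table exists

  @Override
  public void initialize(String bucket) {
    tableName = "table-for-" + bucket;
  }

  @Override
  public void destroy() {
    tableName = null;
  }

  @Override
  public boolean isInitialized() {
    return tableName != null;
  }
}

/** In-memory store: no table concept, destroy() discards the map. */
final class LocalStoreSketch implements StoreLifecycle {
  private Map<String, String> entries; // null until initialized

  @Override
  public void initialize(String bucket) {
    entries = new HashMap<>();
  }

  @Override
  public void destroy() {
    entries = null;
  }

  @Override
  public boolean isInitialized() {
    return entries != null;
  }
}
```

A command-line tool written against this interface can call {{destroy()}} without knowing which kind of store it holds.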





[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658660#comment-15658660
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 57s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 28s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 33s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 41s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 21s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 20s{color} | {color:red} hadoop-tools_hadoop-aws generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 15s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 14s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 37s{color} 
| {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.s3a.s3guard.TestLocalMetadataStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13449 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838628/HADOOP-13449-HADOOP-13345.004.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux 94ee94c329d4 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658303#comment-15658303
 ] 

Hadoop QA commented on HADOOP-13449:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 54s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 53s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 50s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 36s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 22s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 3s{color} | {color:red} root generated 2 new + 695 unchanged - 0 fixed = 697 total (was 695) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 34s{color} | {color:orange} root: The patch generated 24 new + 4 unchanged - 0 fixed = 28 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 49s{color} | {color:red} hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s{color} | {color:red} hadoop-tools_hadoop-aws generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 48s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 38s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 51s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-tools/hadoop-aws |
|  |  Arguments in wrong order for invocation of checkNotNull in 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(Configuration)
  At DynamoDBMetadataStore.java:invocation of checkNotNull in 

[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-31 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622752#comment-15622752
 ] 

Steve Loughran commented on HADOOP-13449:
-

Regarding create() on a path: S3A isn't strict enough (HADOOP-13321). It 
checks whether the dest path exists and is not a directory, but doesn't go up 
the tree to check the ancestors. It should; we know it should, but we also 
know how much slower things would be. I'm hoping to make this something done 
asynchronously between creating the file and actually committing the write in 
the final close(). That is, fail later rather than sooner, but do fail before 
anything is materialized (HADOOP-13654).
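
To make the "fail later, but before commit" idea concrete, here is a minimal sketch in plain Java. It is not the HADOOP-13654 design: the class name, the {{isFile}} probe and the string paths are all hypothetical stand-ins, and a real version would use the S3A thread pool and {{Path}} objects.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Predicate;

/** Hypothetical sketch: validate ancestors in the background, fail in close(). */
class DeferredAncestorCheck {
  private final CompletableFuture<Void> check;

  /**
   * @param path destination of the new file, e.g. "/a/b/c"
   * @param isFile hypothetical probe: true if the given path is an existing file
   */
  DeferredAncestorCheck(String path, Predicate<String> isFile) {
    // Start the (potentially slow) walk up the tree without blocking create().
    this.check = CompletableFuture.runAsync(() -> {
      int slash = path.lastIndexOf('/');
      while (slash > 0) {
        String ancestor = path.substring(0, slash);
        if (isFile.test(ancestor)) {
          throw new CompletionException(
              new IOException("Ancestor is a file: " + ancestor));
        }
        slash = ancestor.lastIndexOf('/');
      }
    });
  }

  /** Called from close(), before the write is committed. */
  void awaitBeforeCommit() throws IOException {
    try {
      check.join();
    } catch (CompletionException e) {
      if (e.getCause() instanceof IOException) {
        throw (IOException) e.getCause();
      }
      throw e;
    }
  }
}
```

The caller would construct this in create() and call {{awaitBeforeCommit()}} at the start of close(), so the cost of the ancestor walk overlaps with the data upload.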

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-31 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622747#comment-15622747
 ] 

Steve Loughran commented on HADOOP-13449:
-

I'm just catching up with this, apologies if I say things that are clearly 
wrong to anyone who knows the code or its history: I don't, yet.

h2. Build

# [~cnauroth] I see your point about declaring the dependency; you are correct. 
It does need to be something published for downstream users.
# I do still want the AWS update to be a standalone patch, and with a matching 
Jackson update. Those can perhaps be done to trunk/ itself, and merged in here, 
so that any/all other trunk work will be with the upgraded artifacts.

h2. Code

Anything in source marked TODO scares me. There's a lot here. Presumably the 
plan is to have them addressed by the time the patch goes in? Or at least 
pulled out into explicit followup JIRAs?

h3. {{DynamoDBMetadataStore}}

* just use .* on the static imports of the s3a constant, util, 
PathMetadataDynamoDBTranslation entries
* L108: no need to mix {{@code}} and {{}} tags. For multiline, {{}} 
should suffice ... check with the generated javadocs to see
* L218: that endpoint map/convert logic should be pulled into a static s3a util 
method, with tests.
* L465: what if close() is called twice? If a re-entrant call is made?
* L492, 531: Throw {{InterruptedIOException}}, or set the thread's interrupted 
bit again. We don't want that thread interrupt to be lost if at all possible.
* Most of those info-level per-operation logs should be at Debug
* Operation param to calls of {{translateException}} could be more informative. 
Consider: what info would you need there in order to debug this from the logs.
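
For the close() and interrupt points above, a minimal sketch of what I have in mind, in plain Java. The class and method names are hypothetical, the sleep stands in for whatever the store actually waits on, and this version both rethrows and restores the interrupt flag, which is one option among the two.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical store illustrating idempotent close and interrupt handling. */
class ClosableStoreSketch implements Closeable {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private int shutdowns = 0;

  /** Safe to call more than once: only the first call releases resources. */
  @Override
  public void close() {
    if (!closed.compareAndSet(false, true)) {
      return;
    }
    shutdowns++; // stands in for shutting down the real client
  }

  int shutdownCount() {
    return shutdowns;
  }

  /** Waits for some condition, preserving the interrupt if one arrives. */
  static void waitForTable(long millis) throws IOException {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      // Restore the flag so callers further up can still see the interrupt.
      Thread.currentThread().interrupt();
      InterruptedIOException ioe =
          new InterruptedIOException("Interrupted while waiting for table");
      ioe.initCause(e);
      throw ioe;
    }
  }
}
```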


h2. Tests

As well as the unit tests, I need to be able to run the entire existing suite 
with s3guard enabled. This could be done with a new maven profile which would 
enable it, or simply a property passed down through the build. That's what's 
done in the scale tests in trunk, using methods in {{S3ATestUtils}} to allow a 
maven-defined property to override one in the core-site.xml, allowing you to 
enable it permanently in your {{test/resources/auth-keys.xml}} reference, or 
via maven.
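
A rough sketch of that override pattern, using {{java.util.Properties}} in place of a Hadoop {{Configuration}} so it stands alone. The helper name is hypothetical (it is not the actual {{S3ATestUtils}} code), and it assumes the build forwards the maven property to the test JVM as a system property.

```java
import java.util.Properties;

/** Hypothetical helper: a build-supplied property overrides the config file. */
final class TestPropertySketch {
  private TestPropertySketch() {
  }

  static String getTestProperty(Properties conf, String key, String defVal) {
    // Value from the configuration file, or the default if unset.
    String fromConf = conf.getProperty(key, defVal);
    // A non-empty value passed down from the build wins over the file.
    String fromBuild = System.getProperty(key);
    return (fromBuild == null || fromBuild.trim().isEmpty())
        ? fromConf : fromBuild.trim();
  }
}
```

With this shape, the setting can live permanently in the XML file and still be flipped for a single run from the command line.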


I see that the tests are using Java 8 language features. That is going to make 
backporting to branch-2 harder in future. Is everyone happy with that (i.e. 
willing to make the effort to downgrade the code if the need arises)? 


h3. {{MetadataStoreTestBase}}

* L234: I know it's not in this patch, but I think the path name should be 
changed to something else.


h3. {{TestDynamoDBMetadataStore}}

I'll need to spend some time looking/playing with this.

* There's an inevitable risk the native libs aren't around/going to work with 
the native OS running the build. What policy is good there? Fail? or downgrade 
to skip? It's probably easiest to leave it as it is now (fail) and see what 
needs to change as/when failures surface.
* Add {{S3AFS.close()}} call in  {{tearDownAfterClass}} just to make sure 
threads all get cleaned up.



> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-30 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620639#comment-15620639
 ] 

Chris Nauroth commented on HADOOP-13449:


bq. Also: that dynamo DB dependency MUST be at {{}} scope. We don't 
want to force it on people.

[~ste...@apache.org], the DynamoDB client adds no transitive dependencies that 
hadoop-aws is not already picking up through the AWS SDK.  Here we can see the 
only dependencies are on aws-java-sdk-s3 and aws-java-sdk-core:

https://github.com/aws/aws-sdk-java/blob/1.10.6/aws-java-sdk-dynamodb/pom.xml#L19-L45

Everything else is a test-only dependency.

In this case, I wonder if the right trade-off is for us to allow the 
dependency, so that downstream projects can pick up S3Guard functionality 
without needing to add the aws-java-sdk-dynamodb dependency explicitly.  Those 
projects would likely need to keep synchronized with whatever AWS SDK version 
number we're using in Hadoop, so as to avoid internal version conflicts around 
things like aws-java-sdk-core.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-29 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15618096#comment-15618096
 ] 

Steve Loughran commented on HADOOP-13449:
-

I see this patch bumps up the AWS version. Could that change be self-contained 
in HADOOP-13050? That way the change is more visible and easier to cherry-pick. 
This also implies HADOOP-12705.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-27 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613426#comment-15613426
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Good discussion, [~liuml07] and [~fabbri]

bq. The contract assumes we create the direct parent directory (other ancestors 
should be taken care of by the clients/callers) when putting a new file item. I 
checked the in-memory local metadata store and it implements this idea. This 
may be not efficient to DDB. Basically for putting X items, we have to issue 
2X~3X DDB requests (X for putting file, X for checking its parent directories, 
and possible X for updating its parent directories). I'm wondering if we can 
also let the client/caller pre-create the direct parent directory as other 
ancestors.

I suggest considering this in two cases: 
* Checking parent directories in normal {{S3AFileSystem}} operations (i.e., 
create / mkdirs). In this case, {{S3AFileSystem}} should already ensure the 
invariant of the contract (the parent directories exist before S3AFileSystem 
starts to create files on S3). 
* Loading files and directories outside of normal {{S3AFileSystem}} operations, 
e.g., loading a *non-cached* directory or loading from the CLI tool. In such 
cases, would a small local "dentry_cache" type of data structure be sufficient 
for a batch operation? These operations can assume that the namespace 
structure already exists on S3. 

As a last resort, if {{S3AFileSystem}} considers it safe to {{create / mkdir}} 
on a path, you can always create all its parent directories in a single batch 
to DynamoDB. In short, I'd suggest letting {{S3AFileSystem}} ensure the 
contract. 
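
As a rough illustration of the "single batch" option, the ancestors of a path can be collected up front and then written together. This is only a sketch: the class name is hypothetical, paths are plain strings rather than {{Path}} objects, and the actual batch write to DynamoDB is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the directories to put in one batch. */
final class AncestorBatchSketch {
  private AncestorBatchSketch() {
  }

  /** Returns all ancestors of an absolute path, nearest first, root excluded. */
  static List<String> ancestorsOf(String path) {
    List<String> ancestors = new ArrayList<>();
    int slash = path.lastIndexOf('/');
    while (slash > 0) {
      String parent = path.substring(0, slash);
      ancestors.add(parent);
      slash = parent.lastIndexOf('/');
    }
    return ancestors;
  }
}
```

For {{/a/b/c/file}} this yields {{/a/b/c}}, {{/a/b}} and {{/a}}. Note that a DynamoDB BatchWriteItem call accepts at most 25 put/delete requests, so very deep paths would need the list split into chunks.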

bq. We store the is_empty for directory in the DynamoDB (DDB) metadata store 
now. We have to update this information in a consistent and efficient way. We 
don't want to check the parent directory every time we delete/put a file item. 
At least we can optimize this when deleting a subtree.

Another way to do it is to compute the {{isEmpty()}} flag by issuing a 
small _additional_ query on the directory with a 
[Limit=1|http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#ScanQueryLimit].
 If it returns any result, the {{isEmpty}} flag is false; otherwise, the flag 
is true. This value can be cached for the lifetime of the {{S3AFileStatus}}, 
as it cannot reliably reflect the changes in S3 anyway. So the query cost only 
occurs when you call {{isEmpty()}} for the first time, and you don't need to 
update this flag for any S3 writes. 
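
A minimal sketch of that lazy, cached check. The {{ChildQuery}} interface is a hypothetical stand-in for a DynamoDB query on the directory's children with a limit; it is not an AWS SDK type, and the class name is made up for illustration.

```java
import java.util.List;

/** Hypothetical status object that resolves isEmpty() on first use. */
class LazyIsEmptySketch {
  /** Stand-in for a store query returning at most {@code limit} children. */
  interface ChildQuery {
    List<String> children(String dir, int limit);
  }

  private final ChildQuery query;
  private final String dir;
  private Boolean cached; // null until the first isEmpty() call

  LazyIsEmptySketch(ChildQuery query, String dir) {
    this.query = query;
    this.dir = dir;
  }

  boolean isEmpty() {
    if (cached == null) {
      // One row is enough to prove the directory is non-empty.
      cached = query.children(dir, 1).isEmpty();
    }
    return cached;
  }
}
```

The query runs at most once per status object, and nothing on the write path has to maintain the flag.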

Hope that works.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-27 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613313#comment-15613313
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Exciting stuff, thanks for the update.

{quote}
I changed the base unit test as the owner, group and permission etc are not 
part of the metadata we're interested in by now.
{quote}

Good. We could have a helper function that all tests could use, e.g. 
doesMetadataStorePersistOwnerGroupPermission() which returns false if 
MetadataStore instanceof DynamoDBMetadataStore.  This is also another spot it 
might be nice to add a function {{getProperty()}} for MetadataStore, so we 
could {{getProperty(PERSISTS_PERMISSIONS)}} etc.  We could do that later on.

{quote}
We store the is_empty for directory in the DynamoDB (DDB) metadata store now. 
We have to update this information in a consistent and efficient way. We don't 
want to check the parent directory every time we delete/put a file item. At 
least we can optimize this when deleting a subtree.
{quote}
This part is a pain.  We should revisit the whole 
{{S3AFileStatus#isEmptyDirectory}} idea in the future. 

In case it helps, my algorithm is here:

In put(PathMetadata meta):
{code}
  if we have PathMetadata for meta's parent path:
    parentMeta.setIsEmpty(false)
{code}

The harder case, when we are removing an entry:

{code}
  // If we have cached a FileStatus for the parent...
  DirListingMetadata dir = dirHash.get(parent);
  if (dir != null) {
    LOG.debug("removing parent's entry for {} ", path);

    // Remove our path from the parent dir
    dir.remove(path);

    // S3A-specific logic dealing with S3AFileStatus#isEmptyDirectory()
    if (isS3A) {
      if (dir.isAuthoritative() && dir.numEntries() == 0) {
        setS3AIsEmpty(parent, true);
      } else if (dir.numEntries() == 0) {
        // We do not know of any remaining entries in parent directory.
        // However, we do not have authoritative listing, so there may
        // still be some entries in the dir.  Since we cannot know the
        // proper state of the parent S3AFileStatus#isEmptyDirectory, we
        // will invalidate our entries for it.
        // Better than deleting entries would be marking them as "missing
        // metadata".  Deleting them means we lose consistent listing and
        // ability to retry for eventual consistency for the parent path.

        // TODO implement missing metadata feature
        invalidateFileStatus(parent);
      }
      // else parent directory still has entries in it, isEmptyDirectory
      // does not change
    }
  }
{code}

Fixing the loss of consistency on the parent could be achieved by leaving an 
empty PathMetadata for the parent that does not contain a FileStatus in it.  
That "missing metadata" PathMetadata would indicate to future getFileStatus() 
or listStatus() calls that the file does exist (so retry if S3 is eventually 
consistent), but the FileStatus needs to be fetched from S3, since we cannot 
know the value of its isEmptyDirectory().

I added a TODO because we can tackle this later if we want.
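
To sketch what the "missing metadata" marker could look like: the store keeps the path but drops the status, so a lookup has three outcomes instead of two. Everything here is hypothetical (class, enum and method names, and the use of a String as a stand-in for a FileStatus); it is only meant to show the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical store distinguishing "unknown" from "exists, status stale". */
class MissingMetadataSketch {
  enum Lookup { UNKNOWN, EXISTS_FETCH_STATUS, EXISTS_WITH_STATUS }

  // Optional.empty() is the "missing metadata" marker for a known path.
  private final Map<String, Optional<String>> entries = new HashMap<>();

  void put(String path, String status) {
    entries.put(path, Optional.of(status));
  }

  /** Keep the path known, but drop the status we can no longer trust. */
  void invalidateStatus(String path) {
    entries.put(path, Optional.empty());
  }

  Lookup lookup(String path) {
    Optional<String> entry = entries.get(path);
    if (entry == null) {
      return Lookup.UNKNOWN;
    }
    return entry.isPresent()
        ? Lookup.EXISTS_WITH_STATUS : Lookup.EXISTS_FETCH_STATUS;
  }
}
```

On {{EXISTS_FETCH_STATUS}} the caller would go to S3 for the status and could retry if S3 has not caught up yet, instead of treating the path as absent.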

{quote}The contract assumes we create the direct parent directory (other 
ancestors should be taken care of by the clients/callers) when putting a new 
file item{quote}

Yeah this is for consistent listing on the parent after the child is created.  
I'm wondering if we can relax this or make it configurable?  When 
{{fs.s3a.metadatastore.authoritative}} is true, the performance hit on create 
could be offset by a performance gain on subsequent listing of the parent 
directory. 

Looks like good progress! Please shout if I can help at all.


> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-27 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612642#comment-15612642
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Ping [~liuml07].  Just wondering, is there any update on this?

Thanks!

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-24 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603396#comment-15603396
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks! I'll review that patch this week.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-24 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603380#comment-15603380
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Heads-up I just posted my latest patch to HADOOP-13651.  Integration with 
S3AFileSystem is looking pretty good.  I did make some changes to the way Paths 
are handled.  I have two patches outstanding that could go in: HADOOP-13631 
(move implementation) and HADOOP-13651 (S3AFileSystem integration).  We may 
want to start reviewing and committing these to avoid merge hell when this one 
gets done.

The work in my latest patch should make your life easier on this JIRA when you 
get to running all the S3A integration tests.  I'm available to help with that, 
i.e. if you want some test refactoring to make the MetadataStore contract tests 
easier to apply to S3AFileStatus (where not all FileSystem fields need to be 
persisted), just let me know.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-24 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603231#comment-15603231
 ] 

Mingliang Liu commented on HADOOP-13449:


Thanks for the review, [~eddyxu]. Will post new patches addressing these.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-24 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603229#comment-15603229
 ] 

Mingliang Liu commented on HADOOP-13449:


{quote}
I feel that we should have a consistent contract for all the stores.
{quote}
Yes we should fix this.

{quote}
Regarding isEmptyDirectory. should we store it in dynamodb as well?
{quote}
I plan to store this in the v1 patch. We can optimize this in the future if it 
causes too many requests to DynamoDB.

{quote}
DynamoDBMetadataStore should have a getTableName() or getTable().
{quote}
That makes perfect sense as well.

{quote}
For authoritative(), do you think storing it as a flag in the DynamoDB is a 
good idea?
{quote}
That's a good suggestion. I can try this.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-24 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603207#comment-15603207
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Hi, [~liuml07]

The patch looks good to me overall.  Looking forward to filling the gaps in the tests. 

{code:title=PathMetadataToDynamoDBTranslation.java}
final FileStatus fileStatus = isDir ? new S3AFileStatus(true, false, path) : 
new S3AFileStatus(0, 0, path, 0);
{code}

Here, it seems that only the path is correctly populated. This assumes 
that {{S3AFileSystem}} only checks the existence of the file in {{MS}}, which 
differs from {{InMemoryMetadataStore}}. I feel that we should have a consistent 
contract for all the stores.

* Regarding {{isEmptyDirectory}}: should we store it in DynamoDB as well? The 
drawback is that we would have to update this field in DynamoDB in 
{{S3AFileStatus#finishWrite}} for every file. 

* {{DynamoDBMetadataStore}} should have a {{getTableName()}} or {{getTable()}}. 
The table name is parsed within {{initialize()}}, so from the caller (i.e., CLI 
tool) point of view, it is difficult to get the table name to call 
{{deleteTable(String tableName);}}. 


* For {{authoritative()}}, do you think storing it as a flag in the DynamoDB is 
a good idea?


> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-24 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602746#comment-15602746
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


bq. I'm wondering whether we should also save access time, owner, group, 
permission, size etc in the metadata,

Yes, in theory, we can save them. However, for S3AFileStatus, these fields are 
not set, and the tests do not test the special cases of S3AFileStatus (i.e., 
whether {{isEmptyDirectory() == true && DirMetadata.isEmpty()}}).  IMO, we 
should modify the tests for these invariants.


I am working on reviewing the rest of the patch. Will post a review soon. 
Thanks for the good work.




> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-20 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593462#comment-15593462
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Thanks for the feedback [~eddyxu]. It turns out a user can provide URIs for any 
number of buckets.  They will each have their own S3AFileSystem, of course.

I actually ran into this running integration tests (this is the last failure I 
have for my S3AFileSystem integration work). Some of the S3A Scale integration 
tests hit s3a://landsat-pds in addition to the test filesystem.  This caused a 
failure when I had paths from both buckets in my MetadataStore, without the 
leading scheme+host (bucket) to differentiate between the two.

I'm proposing that any FS client that supplies a Path without a host component, 
when it should have one (i.e. S3AFileSystem should, RawLocalFileSystem should 
not), will get an exception.  I'll call it out when I post my next patch on 
HADOOP-13651.
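
A minimal sketch of the proposed check for the S3A case, using {{java.net.URI}} 
in place of Hadoop's {{Path}} so that it stands alone. The class and method 
names are hypothetical and are not taken from any patch; the URI authority 
stands in for the bucket.

```java
import java.net.URI;

/** Hypothetical helper (not from the patch): rejects paths that lack a bucket. */
final class QualifiedPathCheck {
  private QualifiedPathCheck() {}

  /**
   * Returns the URI unchanged when it carries both a scheme and an authority
   * (the bucket, for s3a:// URIs); throws otherwise.
   */
  static URI requireQualified(URI path) {
    if (path.getScheme() == null || path.getAuthority() == null) {
      throw new IllegalArgumentException(
          "Path must include scheme and bucket: " + path);
    }
    return path;
  }
}
```

A store that is shared by several buckets could run such a check on every 
incoming path; a store used by a local file system would skip it.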

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593298#comment-15593298
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Hi, [~fabbri]

One {{S3AFileSystem}} only works for one S3 bucket.  I think we can safely 
assume that even with multiple file system instances (I assume what you mean is 
that they are running on different clusters for different jobs (i.e., ETL 
steps)), they are running against the same bucket.  And the scheme is always 
{{s3a://}}.  So {{Path}} should work well.

Thanks.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-20 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593268#comment-15593268
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

Chatted with a friend about this some, and we are both leaning towards requiring 
callers of MetadataStore and DirListingMetadata to pass in fully-qualified 
paths (with scheme+bucket).  Considering that DirListingMetadata is, 
conceptually, a sort of simple POJO that folks will serialize/deserialize, it 
makes sense to have it consume complete paths instead of "calling back" to the 
FS client for clarification on what a path means.

Hope this discussion helps with your efforts on this JIRA [~eddyxu] and 
[~liuml07].  Let me know if you disagree at all.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-20 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593227#comment-15593227
 ] 

Aaron Fabbri commented on HADOOP-13449:
---

I'm curious what you guys are doing about path normalization. For 
LocalMetadataStore, I'm using Path as the hashtable key.  We already agreed to 
require all incoming paths to be absolute, but this is not quite enough to 
ensure consistency.  Some callers may hand in an absolute path without scheme 
and bucket (a.k.a. URI host) qualification, and others may hand in a path to the 
same object with both of those things.

I first tried stripping scheme and bucket away (via 
{{Path#getPathWithoutSchemeAndAuthority()}}).  This causes a problem, though, if 
multiple FileSystem instances share a metadata store singleton.  In this case, 
we actually need the host portion of the URI to distinguish between the 
different buckets that may be sharing a single MetadataStore.

So, now I'm thinking that MetadataStore implementations, and DirListingMetadata 
API, need to do something like fs.qualify(path) for incoming paths.
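
To illustrate the collision described above, here is a small sketch that uses 
{{java.net.URI}} as a stand-in for Hadoop's {{Path}}. The class and its 
methods are hypothetical; in Hadoop itself, {{FileSystem#makeQualified(Path)}} 
is the call that fills in a missing scheme and authority.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical demo: compares stripped and fully-qualified lookup keys. */
final class PathKeyDemo {
  private PathKeyDemo() {}

  /** Key with scheme and bucket stripped: only the path component survives. */
  static String strippedKey(URI uri) {
    return uri.getPath();
  }

  /** Key that keeps scheme and bucket, so entries from different buckets differ. */
  static String qualifiedKey(URI uri) {
    return uri.getScheme() + "://" + uri.getAuthority() + uri.getPath();
  }

  /** Counts how many distinct keys the given URIs map to under one scheme. */
  static int distinctKeys(boolean qualified, URI... uris) {
    Set<String> keys = new HashSet<>();
    for (URI uri : uris) {
      keys.add(qualified ? qualifiedKey(uri) : strippedKey(uri));
    }
    return keys.size();
  }
}
```

With the same object path in two buckets, the stripped keys collapse into one 
entry while the qualified keys stay separate, which is the behaviour a shared 
store needs.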

Questions
1. Does this sound right (always using qualified paths as lookup keys)?
2. Is there something less heavyweight we should do besides fs.qualify(path)?  
I'd like common case to be fast.
3. Should MetadataStore impls. and DirListingMetadata be responsible for this 
path conversion, or should we assert that the caller already did it?
3.b. If the former, are we ok with DirListingMetadata keeping a reference to an 
"owning" FileSystem instance via its constructor?

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-20 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592718#comment-15592718
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Ping, [~liuml07]: in case you did not see HADOOP-13736, could you take a look 
at whether that approach can make your patch easier to implement?


> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-20 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592543#comment-15592543
 ] 

Mingliang Liu commented on HADOOP-13449:


I noticed that the work on [HADOOP-13650] is blocked. Sorry about that. The v1 
patch will be ready by the end of this week.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-18 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586969#comment-15586969
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Ping [~liuml07]. Do you have any updates on this?  

As I started to work on HADOOP-13650, I just realized that no 
{{MetadataStore}} implementation has actually been checked in yet... I'd much 
appreciate it if we can have a patch soon, so we can move forward on 
HADOOP-13650.


> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-13 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573616#comment-15573616
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Thanks!

For the MS contract test, I am also playing with that: I think that it should 
not verify the fields that S3AFileStatus does not support, i.e., {{atime}}.  We 
should also adjust the contract test for that.
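
A rough sketch of what such an adjusted comparison could look like. 
{{StatusSnapshot}} and {{ContractChecks}} are hypothetical stand-ins rather 
than classes from the patch, and the choice of path, directory flag and length 
as the "supported" fields is an assumption made for illustration; only 
{{atime}} is named above as unsupported.

```java
/** Hypothetical stand-in for the status fields a contract test might inspect. */
final class StatusSnapshot {
  final String path;
  final boolean isDirectory;
  final long length;
  final long accessTime;

  StatusSnapshot(String path, boolean isDirectory, long length, long accessTime) {
    this.path = path;
    this.isDirectory = isDirectory;
    this.length = length;
    this.accessTime = accessTime;
  }
}

/** Hypothetical contract-test helper. */
final class ContractChecks {
  private ContractChecks() {}

  /**
   * Compares only the fields the store is assumed to preserve.
   * accessTime is deliberately left out of the comparison.
   */
  static boolean matchesOnSupportedFields(StatusSnapshot expected,
      StatusSnapshot actual) {
    return expected.path.equals(actual.path)
        && expected.isDirectory == actual.isDirectory
        && expected.length == actual.length;
  }
}
```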

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-13 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573607#comment-15573607
 ] 

Mingliang Liu commented on HADOOP-13449:


Thank you [~eddyxu] for your review! I'll address the comments in the v0 patch. 
I'm on call this week, so hopefully I can post a working patch early next week.

The unit test cannot pass yet, as the UT assumes read (e.g. {{get}}) 
operations return a fully valid {{S3AFileStatus}}, which DynamoDBMetadataStore 
doesn't provide yet. I have to read related patches in parallel to address this.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.






[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-13 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573579#comment-15573579
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Thanks for posting the patch, [~liuml07].

The concept looks very reasonable in general, and I like the schema. I 
understand that this is a WIP patch, so I'll list the following suggestions for 
reference.

* Using DynamoDB local mode in tests is really nice.
* We should store more metadata besides {{is_directory}}, so that we can 
reconstruct {{S3AFileStatus}} in {{itemToPathMetadata()}}.  In particular, do 
you think it is worth storing {{S3AFileStatus#isEmptyDirectory}} as well?
* To this end, I think {{PathMetadata}} should take {{S3AFileStatus}} instead 
of {{FileStatus}}, so that S3-specific attributes are easier to obtain.
* {{DynamoDBMetadataStore}} should have a {{deleteTable}} on par with 
{{initTable}}. Both {{initTable}} and {{deleteTable}} should be visible 
package-wide so that they can be used from CLI tools.
* 
{code}
try {
   table.waitForActive();
} catch (InterruptedException e) {
   LOG.warn("Interrupted while waiting for DynamoDB table {} active",
  tableName, e);
}
{code}

Should it throw {{IOE}} to indicate the failure?

* When calling {{table.query()}} in {{listChildren()}}, the query might return 
partial results because the returned dataset is large. You can use 
{{QueryResult#getLastEvaluatedKey()}} for the following calls.

* The DynamoDB {{tableName}} should be configurable, i.e., considering that 
multiple ETL jobs might be running against the same dataset with different 
purposes and different lifetimes, using different tables would allow such jobs 
to manage the lifetime of their DynamoDB tables by themselves.
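
As a sketch of the pagination point above: the loop below keeps querying until 
no continuation key comes back. {{Page}} and {{PagedQuery}} are hypothetical 
stand-ins so the example is self-contained; with the low-level AWS SDK for 
Java client, the corresponding pieces would be 
{{QueryResult#getLastEvaluatedKey()}} and the request's exclusive start key.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one page of a query response. */
final class Page {
  final List<String> items;
  final String lastEvaluatedKey; // null when no further pages remain

  Page(List<String> items, String lastEvaluatedKey) {
    this.items = items;
    this.lastEvaluatedKey = lastEvaluatedKey;
  }
}

/** Hypothetical stand-in for the query call; startKey is null for the first page. */
interface PagedQuery {
  Page query(String startKey);
}

/** Hypothetical sketch of collecting every child across all pages. */
final class ListChildrenSketch {
  private ListChildrenSketch() {}

  static List<String> listAll(PagedQuery source) {
    List<String> all = new ArrayList<>();
    String startKey = null;
    do {
      // Resume from where the previous page stopped.
      Page page = source.query(startKey);
      all.addAll(page.items);
      startKey = page.lastEvaluatedKey;
    } while (startKey != null);
    return all;
  }
}
```

Without the loop, a caller would see only the first page and could treat a 
large directory as smaller than it is.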

Thanks for the nice work!

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.wip.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-10-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15552730#comment-15552730
 ] 

Mingliang Liu commented on HADOOP-13449:


I just came back from a short vacation. Will post the patch early next week. Thanks.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-09-27 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527444#comment-15527444
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Great. Thanks a lot!

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-09-27 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527442#comment-15527442
 ] 

Mingliang Liu commented on HADOOP-13449:


Will post a WIP patch in one week.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-09-27 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527383#comment-15527383
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Hi, [~liuml07] 

Would you mind giving some insight into the progress, given that HADOOP-13448 
has been committed?

Much appreciated. 


> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-09-06 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468105#comment-15468105
 ] 

Mingliang Liu commented on HADOOP-13449:


Hi [~eddyxu], I expect progress once [HADOOP-13448] is committed and/or almost 
done. We can first link the dependent JIRAs here so related efforts can be 
easily tracked. Thanks.

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-09-06 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468051#comment-15468051
 ] 

Lei (Eddy) Xu commented on HADOOP-13449:


Hi [~liuml07].  Is there any progress here?  We'd love to start work on the 
following implementations, which depend on this JIRA.



> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.

2016-08-01 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403079#comment-15403079
 ] 

Mingliang Liu commented on HADOOP-13449:


Feel free to assign it to me if your working queue is too long. Thanks 
[~cnauroth].

> S3Guard: Implement DynamoDBMetadataStore.
> -
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Chris Nauroth
>Priority: Minor
>
> Provide an implementation of the metadata store backed by DynamoDB.


