[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-27 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527640#comment-15527640
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Hey [~cnauroth] thanks for the feedback

{quote}
Regarding option (c), what are your thoughts on it moving the complexity 
to the caller
{quote}

I see the complexity as unavoidable if we want (1) on-demand storage of 
metadata (i.e. don't have to pre-load the entire s3a bucket's metadata before 
turning on MetadataStore, it just incrementally picks up state), and (2) the 
ability to do atomic move.  

I feel like #1 is very valuable here, and ultimately more robust.

{quote}
how about we just do the move(srcKey, dstKey) together with copyFile() in the 
loop
{quote}
I'd be fine with this if we don't mind relaxing #2.  That is, there is no 
atomic {{MetadataStore#move()}}.  We are still able to explicitly track path 
deletions.

We can always add back this "batch move" interface I propose as (c), if the 
DynamoDB implementation wants to implement atomic move, or wants to batch this 
stuff to make it more efficient.

That said, it is probably not too hard for s3a to accumulate lists of paths and 
pass them in as a batch towards the end.

Some options:

1. Keep the (c) batch interface (I have a patch for this I can post on 
HADOOP-13631)
2. Implement a non-batch, non-recursive move(srcPath, destPath) as you mention.
3. Implement both single move() and batch move(), with default batch 
implementation just looping over input collections and calling the single 
variant.

I'm good with any of these.  We should feel comfortable punting complexity to 
future patches until we have more context with the dynamo db implementation and 
the s3a integration.
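
Option 3 above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{MoveSketchStore}} and {{RecordingMoveStore}} are hypothetical names, paths are plain strings, the two input lists are assumed to be paired index by index, and a Java 8 default method stands in for whatever "default implementation" mechanism (for example an abstract base class) the project would actually use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the MetadataStore interface under
// discussion; paths are plain strings to keep the sketch self-contained.
interface MoveSketchStore {
  /** Non-recursive move of a single path's metadata. */
  void move(String srcPath, String dstPath);

  /**
   * Batch move. The default just loops over the paired inputs and calls the
   * single variant; a store that can do better (for example atomically or
   * with fewer round trips) may override it.
   */
  default void move(List<String> srcPaths, List<String> dstPaths) {
    if (srcPaths.size() != dstPaths.size()) {
      throw new IllegalArgumentException(
          "source and destination lists must have equal length");
    }
    for (int i = 0; i < srcPaths.size(); i++) {
      move(srcPaths.get(i), dstPaths.get(i));
    }
  }
}

// Toy implementation that only records the single-move calls it receives.
class RecordingMoveStore implements MoveSketchStore {
  final List<String> log = new ArrayList<>();

  @Override
  public void move(String srcPath, String dstPath) {
    log.add(srcPath + " -> " + dstPath);
  }
}
```

With this shape, an implementation only has to write the single-path variant, and a backend that later wants atomic or batched moves can override the batch method without changing callers.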



> S3Guard: Define MetadataStore interface.
> 
>
> Key: HADOOP-13448
> URL: https://issues.apache.org/jira/browse/HADOOP-13448
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-13448-HADOOP-13345.001.patch, 
> HADOOP-13448-HADOOP-13345.002.patch, HADOOP-13448-HADOOP-13345.003.patch, 
> HADOOP-13448-HADOOP-13345.004.patch, HADOOP-13448-HADOOP-13345.005.patch
>
>
> Define the common interface for metadata store operations.  This is the 
> interface that any metadata back-end must implement in order to integrate 
> with S3Guard.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-27 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527531#comment-15527531
 ] 

Lei (Eddy) Xu commented on HADOOP-13448:


Hi, [~fabbri] Thanks for the proposals.

Regarding option (c), what are your thoughts on it moving the complexity 
to the caller (i.e., the s3a filesystem)? For instance, s3a needs to determine 
where to get a full list of S3 files under "{{/src}}".  The other concern, as 
you mentioned, is that the interface is not very intuitive to the API consumers. 

Since moving a directory recursively is ultimately implemented by copying the 
actual S3 files (objects) one by one, how about we just do the {{move(srcKey, 
dstKey)}} together with {{copyFile()}} in the loop in 
{{S3AFileSystem#innerRename()}}, so that {{move()}} does not need to validate 
{{srcPath}} and {{dstPath}}?

{code}
  while (true) {
    for (S3ObjectSummary summary : objects.getObjectSummaries()) {
      keysToDelete.add(
          new DeleteObjectsRequest.KeyVersion(summary.getKey()));
      String newDstKey =
          dstKey + summary.getKey().substring(srcKey.length());
      copyFile(summary.getKey(), newDstKey, summary.getSize());

      if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
        removeKeys(keysToDelete, true);
      }
    }
    // (rest of the loop, which continues the listing and breaks out, omitted)
  }
{code}
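
A rough, self-contained sketch of what calling a per-file metadata move inside that loop might look like. Everything here is a hypothetical stand-in ({{RenameLoopSketch}}, {{SingleMoveStore}}, string keys, a {{copyFile()}} that only records the call); it does not reproduce the real {{S3AFileSystem#innerRename()}} or any S3 client calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the rename loop copies each object and, in the same
// iteration, tells the metadata store about that one file.
class RenameLoopSketch {
  /** Assumed single-path, non-recursive move on the metadata store. */
  interface SingleMoveStore {
    void move(String srcKey, String dstKey);
  }

  /** Records the copies instead of talking to S3. */
  final List<String> copies = new ArrayList<>();

  void copyFile(String srcKey, String dstKey) {
    copies.add(srcKey + " => " + dstKey);
  }

  void renameDirectory(List<String> listedKeys, String srcKey, String dstKey,
      SingleMoveStore store) {
    for (String key : listedKeys) {
      // Same key arithmetic as in the loop quoted above.
      String newDstKey = dstKey + key.substring(srcKey.length());
      copyFile(key, newDstKey);
      // The caller already knows both keys, so the store has nothing to
      // traverse or validate for this call.
      store.move(key, newDstKey);
    }
  }
}
```

The trade-off is the one raised in the thread: the store sees one small update per file, so the overall directory move is not atomic from the store's point of view.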





[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-23 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517772#comment-15517772
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Thinking about this a bit more, I'm favoring option (C).  pathsToCreate would 
probably be a Collection though.

LocalMetadataStore is the easiest to implement.  It could basically do this:

{code}
acquireLock(this)
for deleteMe in pathsToDelete :
    metadata = get(deleteMe)
    metadata.setDeleted(true)

for createMe in pathsToCreate :
    put(createMe)
releaseLock(this)
{code}
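
In Java, the pseudocode above might look roughly like the sketch below. The class names ({{SketchPathMetadata}}, {{SketchLocalMetadataStore}}) are hypothetical, paths are plain strings, and {{synchronized}} stands in for the explicit lock. Recording a tombstone for a path the store has never seen is an assumption of this sketch, not something the pseudocode specifies.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal metadata record; the real PathMetadata is not shown
// in this thread.
class SketchPathMetadata {
  final String path;
  boolean deleted;

  SketchPathMetadata(String path) {
    this.path = path;
  }
}

// Hypothetical in-memory store illustrating the batch move.
class SketchLocalMetadataStore {
  private final Map<String, SketchPathMetadata> entries = new HashMap<>();

  synchronized void put(SketchPathMetadata meta) {
    entries.put(meta.path, meta);
  }

  synchronized SketchPathMetadata get(String path) {
    return entries.get(path);
  }

  /**
   * Batch move. Holding the store's monitor for both loops makes the whole
   * batch one atomic step with respect to the other synchronized methods.
   */
  synchronized void move(Collection<String> pathsToDelete,
      Collection<SketchPathMetadata> pathsToCreate) {
    for (String deleteMe : pathsToDelete) {
      SketchPathMetadata meta = entries.get(deleteMe);
      if (meta == null) {
        // Assumption: the store may be sparse, so an unknown path gets a
        // tombstone entry rather than causing a NullPointerException.
        meta = new SketchPathMetadata(deleteMe);
        entries.put(deleteMe, meta);
      }
      meta.deleted = true;
    }
    for (SketchPathMetadata createMe : pathsToCreate) {
      entries.put(createMe.path, createMe);
    }
  }
}
```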

We should also consider, in the future, exposing a getProperties() method on 
MetadataStore where clients (and tests) can query properties of the 
MetadataStore, e.g. metadataStore.getProperties(IS_MOVE_ATOMIC).  This allows, 
say, DynamoDBMetadataStore to choose not to implement atomic move.
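
A small sketch of how such a capability query could be used. Only the name {{getProperties(IS_MOVE_ATOMIC)}} comes from the comment above; the enum, the boolean return type, and all class names below are assumptions made for illustration.

```java
// Hypothetical property key; only IS_MOVE_ATOMIC is mentioned in the thread.
enum StoreProperty { IS_MOVE_ATOMIC }

// Assumed shape of the query method: one boolean answer per property.
interface PropertySketchStore {
  boolean getProperties(StoreProperty property);
}

// Stand-in for a store whose move is atomic (for example an in-memory one).
class AtomicLocalSketch implements PropertySketchStore {
  @Override
  public boolean getProperties(StoreProperty property) {
    return property == StoreProperty.IS_MOVE_ATOMIC;
  }
}

// Stand-in for a store that opts out of atomic move.
class NonAtomicRemoteSketch implements PropertySketchStore {
  @Override
  public boolean getProperties(StoreProperty property) {
    return false;
  }
}

// A client or test branches on the advertised capability instead of on the
// concrete store class.
class MovePlanner {
  static String describeRename(PropertySketchStore store) {
    return store.getProperties(StoreProperty.IS_MOVE_ATOMIC)
        ? "atomic metadata move"
        : "best-effort metadata move";
  }
}
```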




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-21 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510928#comment-15510928
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Another interface thing to discuss is move().  I'm working on HADOOP-13631 and 
there are a couple of negatives to the current interface:

{code}
  /**
   * Moves metadata from {@code src} to {@code dst}, including all descendants
   * recursively.
   *
   * @param src the source path
   * @param dst the new path of the file/dir after the move
   * @throws IOException if there is an error
   */
  void move(Path src, Path dst) throws IOException;
{code}

1. This does not support sparsely-stored metadata well.
2.  The client (FileSystem) likely is already traversing the source and 
destination tree paths, and that work will be repeated in the MetadataStore to 
affect all the paths that need to change. (not too bad, depending on how a 
particular MetadataStore is implemented.)

On #1, consider the fact that we do not require the entire filesystem directory 
structure to be in the MetadataStore.  For example, a user could have some 
bucket s3a://big-bucket with many files and (emulated) directories in it.  They 
then fire up a Hadoop cluster for the first time, and the metadata store tracks 
changes they make from that point forward.

Imagine the big-bucket contains this subtree:
{code}
.
├── a1
│   └── b1
│       ├── file1
│       ├── file2
│       ├── file3
│       ├── file4
│       └── file5
└── a2
{code}

Since this is existing data, and the user has just started a new Hadoop 
cluster, the MetadataStore is currently empty.

Now, the user does a {{move(/a1/b1, /a2/b1)}}

The MetadataStore will still be empty.  It received a move request but it did 
not match any stored paths.

Next, another node does a {{listStatus(/a2)}}.  Assuming the underlying blob 
store is eventually consistent, the user may see an empty listing, as both the 
underlying store and the MetadataStore have no entries in that directory.
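
The scenario above can be reproduced with a toy store. This is a sketch under assumptions: {{SparseMoveDemo}} is a hypothetical class, paths are plain strings, and its {{move()}} only rewrites paths it already holds, which is how the recursive {{move(src, dst)}} behaves when the store is sparse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy store whose move() only rewrites the paths it already knows about.
class SparseMoveDemo {
  private final TreeMap<String, String> entries = new TreeMap<>(); // path -> metadata

  void put(String path) {
    entries.put(path, "file");
  }

  /** Recursive move: re-keys src and every stored descendant of src. */
  void move(String src, String dst) {
    List<String> matched = new ArrayList<>();
    for (String path : entries.keySet()) {
      if (path.equals(src) || path.startsWith(src + "/")) {
        matched.add(path);
      }
    }
    for (String path : matched) {
      entries.put(dst + path.substring(src.length()), entries.remove(path));
    }
  }

  /** Returns every stored path underneath dir. */
  List<String> listUnder(String dir) {
    List<String> found = new ArrayList<>();
    for (String path : entries.keySet()) {
      if (path.startsWith(dir + "/")) {
        found.add(path);
      }
    }
    return found;
  }
}
```

On an empty store, {{move("/a1/b1", "/a2/b1")}} matches nothing, so a later {{listUnder("/a2")}} is empty. If {{/a1/b1/file1}} had been {{put()}} beforehand, the same move would leave {{/a2/b1/file1}} in the store.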

Some possible solutions:

A. Clients must call both {{MetadataStore#move()}} on src, dst paths, and also 
{{put()}} on all destination paths affected (recursively).  

B. We could remove move() from MetadataStore, and tell clients to use put()s 
for each moved path, and deleteSubtree(src).  This does not allow atomic move 
to be implemented, however.

C. We could expose a batch version of B, i.e. move(Collection 
pathsToDelete, Collection pathsToCreate).

Any thoughts?  Option C does solve both issues I brought up.  It would have 
good usability for s3a, but could be a little awkward for other filesystems 
perhaps.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508089#comment-15508089
 ] 

Chris Nauroth commented on HADOOP-13448:


+1 for a {{MetadataStore#initialize}} method accepting the {{FileSystem}} base 
class, but allowing subclasses to demand and downcast to something more 
specific.  In my prototype, tightly coupling the DynamoDB integration to 
{{S3AFileSystem}} was helpful, because it allowed reuse of the S3A configured 
bucket, {{AWSCredentialsProvider}} and {{ClientConfiguration}}, which involves 
some fairly complex initialization logic.

I think passing {{FileSystem}} to {{initialize}} also allows us to remove the 
{{Configuration}} parameter.  Any {{FileSystem}} is a {{Configured}}, so we can 
get a {{Configuration}} out of it.

+1 also for requiring absolute paths in the arguments, and leaving 
responsibility for absolute path resolution to the {{FileSystem}}.
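
A compact sketch of the {{initialize}} shape described above. The {{Stub*}} classes are minimal stand-ins written for this example so that it compiles on its own; they are not the real Hadoop types, and the {{sketch.bucket}} key and {{getBucket()}} accessor are hypothetical.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Stand-in for Hadoop's Configuration (NOT the real class).
class StubConfiguration {
  private final Map<String, String> values = new HashMap<>();
  void set(String key, String value) { values.put(key, value); }
  String get(String key) { return values.get(key); }
}

// Stand-in for FileSystem: like the real one, it carries its configuration.
class StubFileSystem {
  private final StubConfiguration conf;
  StubFileSystem(StubConfiguration conf) { this.conf = conf; }
  StubConfiguration getConf() { return conf; }
}

// Stand-in for S3AFileSystem with one S3A-specific accessor.
class StubS3AFileSystem extends StubFileSystem {
  StubS3AFileSystem(StubConfiguration conf) { super(conf); }
  String getBucket() { return getConf().get("sketch.bucket"); }
}

// The interface only knows about the base FileSystem type.
interface InitSketchStore {
  void initialize(StubFileSystem fs) throws IOException;
}

// An implementation may demand a more specific filesystem and downcast.
class StubDynamoDBMetadataStore implements InitSketchStore {
  private String bucket;

  @Override
  public void initialize(StubFileSystem fs) throws IOException {
    if (!(fs instanceof StubS3AFileSystem)) {
      throw new IOException("This store only supports the S3A filesystem.");
    }
    bucket = ((StubS3AFileSystem) fs).getBucket();
  }

  String getBucketName() { return bucket; }
}

// An implementation with no special needs just reads the configuration from
// the filesystem, so no separate Configuration parameter is required.
class StubLocalMetadataStore implements InitSketchStore {
  private StubConfiguration conf;

  @Override
  public void initialize(StubFileSystem fs) {
    conf = fs.getConf();
  }

  StubConfiguration getConf() { return conf; }
}
```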




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-15 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494750#comment-15494750
 ] 

Mingliang Liu commented on HADOOP-13448:


+1 for the proposal. Thanks.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-15 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494659#comment-15494659
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Another design question.  I'd like to propose we require all clients to supply 
absolute paths to the MetadataStore.

Why?

I just tried to support a combination of relative and absolute paths in the 
LocalMetadataStore.  I wrote some new tests and found that:

- This requires both MetadataStore and DirListingMetadata to be able to resolve 
relative and absolute paths.  I feel better separation is to leave this to the 
FileSystem clients: they already have to deal with this.  DirListingMetadata 
should be a stupid container, not something that knows how to resolve relative 
paths against working dirs.  If DirListingMetadata#put() can take a relative 
path, then it needs to resolve it against the working dir so it can ensure it 
is a proper child of the dir path. The working dir may differ between clients 
sharing a metadata store.  This implies that implementations need to track an 
absolute path in addition to the relative one allowed in the 
DirListingMetadata.  This is extra complexity, garbage, etc.

- In general the logic becomes more error-prone to handle correctly when both 
types of paths are allowed.

- FileSystem clients (i.e. s3a) already have to do this logic.  

In short, I propose that methods in MetadataStore and DirListingMetadata which 
take a path will use Precondition checks to enforce that those paths are 
absolute.  Thoughts [~cnauroth]?
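
A minimal sketch of such a precondition check. {{AbsolutePathListing}} is a hypothetical stand-in for a directory-listing container, paths are plain strings, and a hand-written check that throws {{IllegalArgumentException}} stands in for whichever Preconditions utility the project would actually use.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical "dumb container" for one directory's children. It never
// resolves relative paths; it only rejects them.
class AbsolutePathListing {
  private final String dirPath;
  private final Set<String> children = new LinkedHashSet<>();

  AbsolutePathListing(String dirPath) {
    this.dirPath = checkAbsolute(dirPath);
  }

  /** Precondition: callers must resolve paths before handing them over. */
  static String checkAbsolute(String path) {
    if (path == null || !path.startsWith("/")) {
      throw new IllegalArgumentException("Path must be absolute: " + path);
    }
    return path;
  }

  /** Adds a child after checking it is absolute and directly under dirPath. */
  void put(String childPath) {
    checkAbsolute(childPath);
    String prefix = dirPath.endsWith("/") ? dirPath : dirPath + "/";
    String name = childPath.startsWith(prefix)
        ? childPath.substring(prefix.length())
        : "";
    if (name.isEmpty() || name.contains("/")) {
      throw new IllegalArgumentException(
          childPath + " is not a direct child of " + dirPath);
    }
    children.add(childPath);
  }

  int size() {
    return children.size();
  }
}
```

Because every path is already absolute, the child check is a plain prefix comparison and needs no knowledge of any client's working directory.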




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491760#comment-15491760
 ] 

Mingliang Liu commented on HADOOP-13448:


{quote}
I can make the change as part of the v2 patch for HADOOP-13452
{quote}
That looks good. I'm not blocked anyway. I will bring you guys into the 
discussion for sure. Basically I like what Chris designed in his prototype, 
with minor changes along the way. The major effort now is starting up a 
DynamoDBLocal in the unit test. Sadly DynamoDBLocal is not well maintained 
(after I left Amazon :P).




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491745#comment-15491745
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

I can make the change as part of the v2 patch for HADOOP-13452.  I'll try to 
get that out soon so we don't disrupt your ongoing DynamoDB work too much.  
Does that work for you?

Excited to see your DynamoDB implementation, by the way.  Shout if [~eddyxu] 
or I can help.

Yes, for the LocalMetadataStore unit test I will probably use a stubbed 
FileSystem or RawLocalFileSystem.  I only need it to get current working dir so 
far (for making paths qualified).  It will be nice to have some basic test 
coverage that does not require S3A integration test setup.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491728#comment-15491728
 ] 

Mingliang Liu commented on HADOOP-13448:


Sure it's more flexible, though we're not exploring RawLocalFileSystem yet in 
the current S3A test I guess?

Should we file a JIRA for this? Or I can change this in [HADOOP-13449], or you 
change this in [HADOOP-13452] (or both). Thanks.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491708#comment-15491708
 ] 

Mingliang Liu commented on HADOOP-13448:


Precisely. See my above comment. Ping [~cnauroth] for 3rd opinion.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491710#comment-15491710
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Ok, thanks.  Another thing to consider is, I shouldn't have to create or mock 
an S3AFileSystem to test LocalMetadataStore.  I should be able to just use 
something like RawLocalFileSystem, right?  It can be a unit test instead of an 
integration test.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491701#comment-15491701
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

i.e. 

{code}

public class DynamoDBMetadataStore implements MetadataStore {
  ...
  public void initialize(FileSystem fs) throws IOException {
    // Parentheses are required: "!" binds tighter than "instanceof".
    if (!(fs instanceof S3AFileSystem)) {
      throw new IOException(
          "DynamoDBMetadataStore only supports S3A filesystem.");
    }
    ...
{code}




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491699#comment-15491699
 ] 

Mingliang Liu commented on HADOOP-13448:


Sorry, I never thought outside of the S3A box. It looks a bit over-designed to 
consolidate the efforts of improving consistency for Swift/WASB here, but I 
think the proposal is acceptable. One approach is for DynamoDBMetadataStore to 
always assume the FileSystem is actually an S3AFileSystem; at least we can 
cast the type.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491679#comment-15491679
 ] 

Mingliang Liu commented on HADOOP-13448:


Are we moving S3Guard out of the S3A module someday? If not, the circular 
class dependency seems OK to me.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491681#comment-15491681
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Just saw your followup comment.

How about this idea:  We keep MetadataStore independent of particular 
implementations' dependencies, but add extra parameters as needed for 
implementations.

e.g. MetadataStore could eventually be a separate submodule in Hadoop Common, 
with DynamoDBMetadataStore living in hadoop-aws and taking additional init 
parameters for the S3 client, etc.

Then, for example, Swift/WASB/ADLS could also have implementations that require 
additional dependencies (e.g. AzureDBMetadataStore). 

Another way of thinking of this is, MetadataStore interface only includes 
things that all implementations have in common, instead of providing a union of 
everything all the implementations use.





[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491663#comment-15491663
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

[~liuml07] can you elaborate on why you prefer S3AFileSystem?

Yes, it is feasible to extract this work because there are no dependencies on 
S3A as the code stands today, besides the fact that it happens to live in that 
project.

Also, from a layering perspective, I'd prefer S3A --depends-on--> MetadataStore 
instead of a circular dependency. 






[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491653#comment-15491653
 ] 

Chris Nauroth commented on HADOOP-13448:


Yes, I forgot about this point, which I encountered myself in the prototype 
patch.  Thank you, Mingliang.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491649#comment-15491649
 ] 

Mingliang Liu commented on HADOOP-13448:


Another point: DynamoDBMetadataStore would have a much easier time if it were 
given the S3AFileSystem, since it could reuse, say, the S3 client configuration, 
region, etc.  In fact, my in-progress patch for HADOOP-13449 already assumes an 
S3AFileSystem object.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491636#comment-15491636
 ] 

Mingliang Liu commented on HADOOP-13448:


I'd still prefer S3AFileSystem.

{quote}
envisioning the directory entry caching features potentially useful to other 
clients
{quote}
Is it feasible to extract the common code out of S3Guard? Thanks.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491609#comment-15491609
 ] 

Chris Nauroth commented on HADOOP-13448:


[~fabbri], yes, that sounds like a good idea.  If you pass a {{FileSystem}} 
parameter, then I think you could drop the {{Configuration}} parameter, because 
every {{FileSystem}} is a {{Configured}}, and therefore it carries the conf 
with it.
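
A minimal sketch of the signature change discussed here. It does not use the real Hadoop classes: `Conf`, `Fs`, and `Store` are hypothetical stand-ins for `Configuration`, `FileSystem`, and `MetadataStore`, and the configuration key is made up. The only fact relied on is the one stated above, that a `FileSystem` is a `Configured` and so exposes its conf through `getConf()`.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.hadoop.conf.Configuration (hypothetical, illustration only).
class Conf {
  private final Map<String, String> props = new HashMap<>();

  void set(String key, String value) { props.put(key, value); }

  String get(String key, String defaultValue) {
    String value = props.get(key);
    return value != null ? value : defaultValue;
  }
}

// Stand-in for FileSystem: like a Hadoop Configured, it carries its conf with it.
class Fs {
  private final Conf conf;
  private final String workingDir;

  Fs(Conf conf, String workingDir) {
    this.conf = conf;
    this.workingDir = workingDir;
  }

  Conf getConf() { return conf; }
  String getWorkingDirectory() { return workingDir; }
}

// Stand-in for a MetadataStore whose initialize() takes only the file system.
class Store {
  private Fs fs;
  private int maxEntries;

  // No separate Conf parameter: settings are read through fs.getConf().
  void initialize(Fs fs) {
    this.fs = fs;
    // "store.max.entries" is a made-up key for this sketch.
    this.maxEntries = Integer.parseInt(fs.getConf().get("store.max.entries", "1024"));
  }

  int getMaxEntries() { return maxEntries; }
  String getWorkingDirectory() { return fs.getWorkingDirectory(); }
}
```

With this shape, a store that needs both the working directory and configuration values gets them from the single parameter.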




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491603#comment-15491603
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

(Posting here since it is related, but I'd probably add the change as part of 
HADOOP-13452 or a subsequent patch.)

[~cnauroth] how do you feel about adding a FileSystem parameter to 
MetadataStore#initialize()?  The motivation is that I'm adding path 
qualification / normalization to the LocalMetadataStore, and finding I could 
use the associated FileSystem's working directory to make sure it is an 
absolute path.

I'm still trying not to add S3A dependencies unless strictly needed 
(envisioning the directory entry caching features potentially useful to other 
clients).
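
To illustrate the path qualification mentioned above, here is a rough sketch that resolves a possibly relative path against a file system's working directory and normalizes it. `PathQualifier` is a hypothetical helper written with `java.net.URI` only; it is not the LocalMetadataStore code and does not use Hadoop's `Path`. It also assumes paths contain only characters that are legal in a URI.

```java
import java.net.URI;

// Hypothetical helper for this sketch; not part of Hadoop or of any patch on this issue.
class PathQualifier {
  private final URI fsUri;       // e.g. s3a://bucket
  private final String workDir;  // absolute working directory, stored with a trailing slash

  PathQualifier(URI fsUri, String workDir) {
    if (!workDir.startsWith("/")) {
      throw new IllegalArgumentException("working directory must be absolute: " + workDir);
    }
    this.fsUri = fsUri;
    this.workDir = workDir.endsWith("/") ? workDir : workDir + "/";
  }

  // Returns scheme://authority/absolute/normalized/path for any input path.
  String qualify(String path) {
    // Relative paths are interpreted against the working directory.
    String absolute = path.startsWith("/") ? path : workDir + path;
    // URI.normalize() collapses "." and ".." segments.
    String normalized = URI.create(absolute).normalize().getPath();
    // Drop a trailing slash so "/a/b/" and "/a/b" map to the same key.
    if (normalized.length() > 1 && normalized.endsWith("/")) {
      normalized = normalized.substring(0, normalized.length() - 1);
    }
    return fsUri.getScheme() + "://" + fsUri.getAuthority() + normalized;
  }
}
```

Storing only qualified, normalized paths means that two spellings of the same location end up under one key in the store.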





[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-09 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477543#comment-15477543
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

[~cnauroth] would love to, thanks.  Will try to commit this after a meeting.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477451#comment-15477451
 ] 

Chris Nauroth commented on HADOOP-13448:


[~fabbri], thank you for the code review.  Since this is targeted to the 
HADOOP-13345 branch, would you like to try out your shiny new commit bit?  :-)




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-09 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477444#comment-15477444
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

+1 on patch 005 (I'm only a feature branch committer).

Thanks for posting this [~cnauroth]. Agreed we can continue to tweak this as 
needed. 




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477439#comment-15477439
 ] 

Hadoop QA commented on HADOOP-13448:


| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 42s | HADOOP-13345 passed |
| +1 | compile | 0m 18s | HADOOP-13345 passed |
| +1 | checkstyle | 0m 13s | HADOOP-13345 passed |
| +1 | mvnsite | 0m 23s | HADOOP-13345 passed |
| +1 | mvneclipse | 0m 15s | HADOOP-13345 passed |
| +1 | findbugs | 0m 26s | HADOOP-13345 passed |
| +1 | javadoc | 0m 13s | HADOOP-13345 passed |
| +1 | mvninstall | 0m 16s | the patch passed |
| +1 | compile | 0m 15s | the patch passed |
| +1 | javac | 0m 15s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 31s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed |
| +1 | unit | 0m 19s | hadoop-aws in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 12m 32s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13448 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827771/HADOOP-13448-HADOOP-13345.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 57ad4950e0b3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / d152557 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10473/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10473/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475411#comment-15475411
 ] 

Hadoop QA commented on HADOOP-13448:


| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 9m 29s | HADOOP-13345 passed |
| +1 | compile | 0m 18s | HADOOP-13345 passed |
| +1 | checkstyle | 0m 14s | HADOOP-13345 passed |
| +1 | mvnsite | 0m 24s | HADOOP-13345 passed |
| +1 | mvneclipse | 1m 41s | HADOOP-13345 passed |
| +1 | findbugs | 0m 39s | HADOOP-13345 passed |
| +1 | javadoc | 0m 15s | HADOOP-13345 passed |
| +1 | mvninstall | 0m 16s | the patch passed |
| +1 | compile | 0m 16s | the patch passed |
| +1 | javac | 0m 16s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | mvnsite | 0m 20s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 31s | the patch passed |
| +1 | javadoc | 0m 12s | the patch passed |
| +1 | unit | 0m 20s | hadoop-aws in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 17m 9s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13448 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827676/HADOOP-13448-HADOOP-13345.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 6cc145668803 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / d152557 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10467/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10467/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-08 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475388#comment-15475388
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

bq. Yes, that makes sense. Would both get and put change to work with 
PathMetadata instead of a raw FileStatus? I think so. I can update the patch 
one more time if you agree.

It feels like both should use PathMetadata for symmetry.  Then again, 
put(PathMetadata isDeleted=true) should be disallowed 
(IllegalArgumentException?), so I could go either way on this.

Yep, I'm cool with you updating patch.

I'm +1 on this.  After you post the above updates, I'll rebase my MetadataStore 
unit test and in-memory implementation and will try to get patches out for 
those tomorrow.





[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-08 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475290#comment-15475290
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Thanks for the patch [~cnauroth].  Looks good.  Couple of comments below.

{code}
+  /**
+   * @return entries in the directory
+   */
+  public Collection<FileStatus> getFileStatuses() {
+    return listMap.values();
{code}
Should this be "return Collections.unmodifiableCollection(listMap.values())"?

I think the main callers will be clients, which shouldn't be updating directories 
this way.  It looks like modifications to the 
[values()|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html#values()]
 collection are reflected in the Map.
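
The concern can be reproduced with a small stand-alone sketch. `ToyListing` is a hypothetical class, with plain strings standing in for `FileStatus`; it only demonstrates the documented behavior of `ConcurrentHashMap#values()` and `Collections#unmodifiableCollection`.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy listing keyed by child name; String values stand in for FileStatus.
class ToyListing {
  private final Map<String, String> listMap = new ConcurrentHashMap<>();

  void put(String name, String status) { listMap.put(name, status); }

  int size() { return listMap.size(); }

  // As in the quoted patch: hands out the live values() view of the map.
  Collection<String> getStatusesLive() {
    return listMap.values();
  }

  // Suggested alternative: still a live view, but read-only for callers.
  Collection<String> getStatusesReadOnly() {
    return Collections.unmodifiableCollection(listMap.values());
  }
}
```

Removing an element through `getStatusesLive()` removes the mapping from the underlying map, while the same call on `getStatusesReadOnly()` throws `UnsupportedOperationException`. The wrapper does not copy anything, so later `put` calls are still visible through it.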

{code}
+  /**
+   * Lookup entry within this directory listing.  This may return null if the
+   * {@code MetadataStore} only tracks a partial set of the directory entries.
+   * In the case where {@link #isAuthoritative()} is true, however, this
+   * function returns null iff the directory is known not to contain the listing
+   * at given path (within the scope of the {@code MetadataStore} that returned
+   * it)
+   *
+   * @param childPath path of entry to look for.
+   * @return entry, or null if it is not present or not being tracked.
+   */
+  public FileStatus get(Path childPath) {
{code}

FYI, I've since changed DirListingMetadata to deal with PathMetadata instead of 
FileStatus.  This is to implement consistent file deletions.  (i.e. you can 
get() a PathMetadata that contains a flag isDeleted = true to record a file 
deletion, even if you're not tracking the full contents of the directory).

I think this function will change to expose the same PathMetadata return value. 
 Thoughts?

I'm happy to submit a follow-up patch that adds this stuff.
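
As a sketch of the idea (not the code from any patch here), a listing that stores a deleted flag per child lets a caller tell "known deleted" apart from "not tracked". `ChildMeta`, `ToyDirListing`, `markDeleted`, and the `Lookup` enum are hypothetical names; only the `isDeleted` flag and the authoritative notion come from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for PathMetadata (illustrative only).
class ChildMeta {
  final String path;
  final boolean isDeleted;

  ChildMeta(String path, boolean isDeleted) {
    this.path = path;
    this.isDeleted = isDeleted;
  }
}

// Simplified stand-in for DirListingMetadata holding ChildMeta entries.
class ToyDirListing {
  enum Lookup { EXISTS, DELETED, ABSENT, UNKNOWN }

  private final Map<String, ChildMeta> children = new HashMap<>();
  private final boolean authoritative;

  ToyDirListing(boolean authoritative) { this.authoritative = authoritative; }

  void put(ChildMeta meta) { children.put(meta.path, meta); }

  // Record a deletion as a tombstone entry instead of dropping the entry.
  void markDeleted(String path) { children.put(path, new ChildMeta(path, true)); }

  // Returns the entry, or null if the child is not present or not tracked.
  ChildMeta get(String path) { return children.get(path); }

  // One way a caller might interpret get(): only UNKNOWN requires asking S3.
  Lookup lookup(String path) {
    ChildMeta meta = get(path);
    if (meta != null) {
      return meta.isDeleted ? Lookup.DELETED : Lookup.EXISTS;
    }
    return authoritative ? Lookup.ABSENT : Lookup.UNKNOWN;
  }
}
```

The tombstone is what makes the deletion visible even when the listing is partial: without it, a deleted child and an untracked child would both come back as null.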

{code}
+checkChildPath(childPath);
{code}

Cool, thanks for adding this.

{code}
+  /**
+   * Lists metadata for all direct children of a path.
+   *
+   * @param path the path to list
+   * @return metadata for all direct children of {@code path}, not {@code null},
+   * but possibly empty
+   * @throws IOException if there is an error
+   */
+  DirListingMetadata listChildren(Path path) throws IOException;
+
{code}

I neglected to update this javadoc when I changed the return type.  I now
have this comment as:

{code}
  /**
   * Lists metadata for all direct children of a path.
   *
   * @param path the path to list
   * @return metadata for all direct children of {@code path} which are
   * being tracked by the MetadataStore, or null if the path was not
   * found in the MetadataStore.
   * @throws IOException if there is an error
   */
{code}

So, if the path exists but the directory is empty, we get a DirListingMetadata 
with an empty list of children.

If the path does not exist in the MetadataStore, we return null.  (We could 
instead throw FileNotFoundException, but I don't think this qualifies as 
"exceptional", and it is not really File Not Found but File Not Tracked by this 
MetadataStore.  It could be confusing if propagated.)
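
The null-versus-empty contract described above can be sketched as follows. `ToyListingStore` and `ListingClient` are hypothetical, with a list of child names standing in for `DirListingMetadata`; the fallback wording in `describe` is an assumption about how a caller might react, not something stated in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy store mapping a directory path to the child names it tracks.
class ToyListingStore {
  private final Map<String, List<String>> dirs = new HashMap<>();

  void trackDirectory(String dir) {
    if (!dirs.containsKey(dir)) {
      dirs.put(dir, new ArrayList<String>());
    }
  }

  void addChild(String dir, String child) {
    trackDirectory(dir);
    dirs.get(dir).add(child);
  }

  // Returns null when the directory is not tracked, otherwise a possibly empty listing.
  List<String> listChildren(String dir) {
    List<String> children = dirs.get(dir);
    return children == null ? null : Collections.unmodifiableList(children);
  }
}

class ListingClient {
  // Distinguishes the three outcomes a caller has to handle.
  static String describe(ToyListingStore store, String dir) {
    List<String> children = store.listChildren(dir);
    if (children == null) {
      return "not tracked: fall back to listing the underlying store";
    }
    if (children.isEmpty()) {
      return "tracked, no known children";
    }
    return "tracked, " + children.size() + " known children";
  }
}
```

Returning null keeps "not tracked" out of the exception path, which matches the argument that an untracked path is an expected case rather than an error.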








[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-06 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469109#comment-15469109
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

{quote}Are you interested in doing another revision, or should I?{quote}

Welcome back.  You can do the next revision.  I'm here to help, though, so shout 
if you'd like me to do the follow-up.
...
{quote}
This is nice, but I expect in practice we'll instantiate through 
ReflectionUtils#newInstance and lose the type bounds unless we do an unsafe 
cast.
{quote}
Ah, OK.  We can drop the type parameter and just use {{FileStatus}}, I guess.

{quote}
The presence of the DirListingMetadata#put method implies mutable usage. Should 
this class provide thread safety, or is there an expectation that the caller 
will enforce thread-safe access? Either way, it would be good to add a comment 
about thread safety to the class-level JavaDocs.
{quote}
Good point.  The intention is that put() is only called inside MetadataStore 
implementations.  We should probably enforce that somehow (package-private?).
 
{quote}I suppose a higher-level decision for all of the above is mutable vs. 
immutable objects. {quote}

I agree with your points here.  Both mutable and immutable DirListingMetadata 
ideas seem to have significant downsides, as you say, depending on application.

I feel like we can make DirListingMetadata expose a thread-safe API independent 
of whether or not the contents can change during a reference's lifetime.  The 
lifetime behavior may be important to clients, however.  This reminds me of the 
GCS link you shared before:  They [expose a 
flag|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/1447da82f2bded2ac8493b07797a5c2483b70497/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/DirectoryListCache.java#L128]
 that seems to inform clients about the lifetime of the data.  We could do 
something similar, with the LocalMetadataStore (in-memory) returning true, and 
something like the DynamoDB MetadataStore returning false.

{quote}
Does DirListingMetadata need a method to return all children in the listing? In 
your prototype on HADOOP-13345, there was CachedDirectory#getFileStatuses.
Similarly, does DirListingMetadata need mutator methods for isAuthoritative?
{quote}

Yes, and yes.  We'll also need a remove() on DirListingMetadata (I added one 
for the LocalMetadataStore I'm working on).







[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468963#comment-15468963
 ] 

Chris Nauroth commented on HADOOP-13448:


[~fabbri], thanks for continuing work on this while I was out.  The changes 
look good to me.  Here are a few comments.  Are you interested in doing another 
revision, or should I?

bq. For PathMetadata#toString(), it'd be nice to have isDirectory flag in the 
output.

This is an old comment referring to revision 001, but I figured I'd respond 
anyway.  The {{toString}} output did indicate whether it was referring to a 
directory or a file due to inclusion of the specific subclass name: 
{{DirectoryPathMetadata}} or {{FilePathMetadata}}.

bq. Could also use a private boolean isDirectory instead introducing separate 
private class to return true/false.

The private class separation I was trying earlier was an attempt to set us up 
to reduce memory footprint for metadata instances by only storing certain 
members in relevant subclasses that need them.  This may be premature of me, so 
please feel free to put a boolean member variable in there instead.
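
For comparison, the two shapes under discussion might look roughly like this. The class bodies are a simplified re-creation for illustration, not the code from revision 001; only the subclass names `DirectoryPathMetadata` and `FilePathMetadata` are taken from the comment above, and `PathMetadataA`, `PathMetadataB`, and the `length` field are assumptions.

```java
// Variant A: one subclass per kind. toString() reveals the kind through the
// class name, and only the file subclass stores a length field.
abstract class PathMetadataA {
  private final String path;

  PathMetadataA(String path) { this.path = path; }

  String getPath() { return path; }

  abstract boolean isDirectory();

  @Override
  public String toString() {
    return getClass().getSimpleName() + "{path=" + path + "}";
  }
}

class DirectoryPathMetadata extends PathMetadataA {
  DirectoryPathMetadata(String path) { super(path); }

  @Override
  boolean isDirectory() { return true; }
}

class FilePathMetadata extends PathMetadataA {
  private final long length;

  FilePathMetadata(String path, long length) {
    super(path);
    this.length = length;
  }

  long getLength() { return length; }

  @Override
  boolean isDirectory() { return false; }
}

// Variant B: a single class with a boolean flag. Simpler, but every instance
// carries every field, including ones that only make sense for files.
class PathMetadataB {
  private final String path;
  private final boolean isDirectory;
  private final long length;

  PathMetadataB(String path, boolean isDirectory, long length) {
    this.path = path;
    this.isDirectory = isDirectory;
    this.length = length;
  }

  boolean isDirectory() { return isDirectory; }

  @Override
  public String toString() {
    return "PathMetadataB{path=" + path + ", isDirectory=" + isDirectory + "}";
  }
}
```

Variant A trades an extra pair of classes for a smaller per-instance footprint; Variant B is the simpler option if that saving turns out not to matter.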

bq. I use a type bound to allow them to hold any subclass of FileStatus, e.g. 
S3AFileStatus.

This is nice, but I expect in practice we'll instantiate through 
{{ReflectionUtils#newInstance}} and lose the type bounds unless we do an unsafe 
cast.  (Thanks, Java!)
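
The type-bound problem can be shown without Hadoop at all. The sketch below uses plain `java.lang.reflect` in place of `ReflectionUtils#newInstance`, and `Status`, `S3Status`, `BoundedStore`, and `ReflectiveFactory` are hypothetical stand-ins. The point is only that a class loaded by name yields an untyped instance, so the generic parameter has to be reintroduced with an unchecked cast.

```java
import java.util.ArrayList;
import java.util.List;

class Status { }                    // stand-in for FileStatus
class S3Status extends Status { }   // stand-in for S3AFileStatus

// A store with a bounded type parameter, similar in spirit to the idea discussed.
class BoundedStore<T extends Status> {
  private final List<T> entries = new ArrayList<>();

  public BoundedStore() { }

  void put(T status) { entries.add(status); }

  T first() { return entries.get(0); }
}

class ReflectiveFactory {
  // The class name would typically come from configuration, so the compiler
  // knows nothing about the type argument of the instance being created.
  @SuppressWarnings("unchecked")
  static <T extends Status> BoundedStore<T> create(String className) {
    try {
      Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
      // Unchecked: because of erasure, the runtime can only verify that this is
      // a BoundedStore, not that it is a BoundedStore<T>.
      return (BoundedStore<T>) instance;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("cannot instantiate " + className, e);
    }
  }
}
```

A caller can write `BoundedStore<S3Status> store = ReflectiveFactory.create(name);` and it compiles, but nothing checks that choice at run time, which is why dropping the type parameter in favor of plain `FileStatus` loses little in practice.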

bq. Should MetadataStore be an interface?

I had set it up as an abstract base class just for convenience of 
implementation inheritance getting a {{Logger}} instance.  At this point, 
interface does look like the more appropriate choice.

bq. How do folks feel about this API consuming/producing FileStatus'?

I think this is fine.


The presence of the {{DirListingMetadata#put}} method implies mutable usage.  
Should this class provide thread safety, or is there an expectation that the 
caller will enforce thread-safe access?  Either way, it would be good to add a 
comment about thread safety to the class-level JavaDocs.

Does {{DirListingMetadata}} need a method to return all children in the 
listing?  In your prototype on HADOOP-13345, there was 
{{CachedDirectory#getFileStatuses}}.

Similarly, does {{DirListingMetadata}} need mutator methods for 
{{isAuthoritative}}?

I suppose a higher-level decision for all of the above is mutable vs. immutable 
objects.  For a {{MetadataStore}} implemented by a persistent back-end, the 
objects probably need to be reconstituted after each query, so immutable would 
work well.  OTOH, for the in-memory implementation, it's possibly more 
efficient to allow mutability.

Jenkins seems to be rebooting at the moment, so I couldn't review the 
Checkstyle and JavaDoc warnings.





[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-09-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456742#comment-15456742
 ] 

Hadoop QA commented on HADOOP-13448:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 36s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 29s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 10s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s{color} | {color:red} hadoop-aws in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 30s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-13448 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12826710/HADOOP-13448-HADOOP-13345.002.patch |
| Optional Tests | asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle |
| uname | Linux ef3bb0ce5d78 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / 763f049 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/10441/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt |
| javadoc | https://builds.apache.org/job/PreCommit-HADOOP-Build/10441/artifact/patchprocess/patch-javadoc-hadoop-tools_hadoop-aws.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10441/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10441/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-08-23 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433626#comment-15433626
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Thanks for the patch! Looks pretty good, especially since we will evolve it as 
needed.

- I really like having initialize() / Closeable as well.
- put() / putNew(): it is fine to eliminate putNew(); I wasn't convinced the 
distinction was necessary.  We'll need to go through the same exercise of 
defining precise semantics, and writing test cases, that we've done for the FS 
Contract test stuff.  We can always add putNew() if the distinction is useful.

On PathMetadata:
- Could also use a private boolean isDirectory instead of introducing separate 
private classes to return true/false.  Besides saving a couple of lines of code, 
it may make the next suggestion clearer in the code... (Not a biggie though)

On listChildren() return type:
- I think we need more than a List if we are to express 
fully-cached directories, so callers--e.g. listStatus()--know if they may avoid 
a round trip to blobstore.  How about an additional type, say, 
DirectoryListMetadata (I'd called it CachedDirectory).  It is essentially a 
struct of List + extra state.  The extra state I want, so far, is 
just the "fully cached" boolean flag.
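
A rough sketch of how a caller might use such a type. The names ({{DirectoryListSketch}}, {{ListStatusSketch}}), the use of plain strings for entries, and the merge policy for partial listings are assumptions for illustration only; the {{Supplier}} stands in for the round trip to the blobstore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical "struct of List + extra state" returned by listChildren().
class DirectoryListSketch {
  final List<String> children;
  final boolean fullyCached;

  DirectoryListSketch(List<String> children, boolean fullyCached) {
    this.children = children;
    this.fullyCached = fullyCached;
  }
}

class ListStatusSketch {
  // blobStoreListing is only invoked when the cached listing is missing
  // or is not known to be complete.
  static List<String> listStatus(DirectoryListSketch cached,
                                 Supplier<List<String>> blobStoreListing) {
    if (cached != null && cached.fullyCached) {
      return cached.children; // no round trip needed
    }
    List<String> merged = new ArrayList<>(blobStoreListing.get());
    if (cached != null) {
      // Assumed policy: append cached entries the blobstore did not return.
      for (String child : cached.children) {
        if (!merged.contains(child)) {
          merged.add(child);
        }
      }
    }
    return merged;
  }
}
```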

Please shout if any of that is not clear... I'm a little sleep-deprived today. 





[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-08-23 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433301#comment-15433301
 ] 

Lei (Eddy) Xu commented on HADOOP-13448:


Hi, [~cnauroth]. The interface looks very nice. Just one small comment: for 
{{PathMetadata#toString()}}, it'd be nice to have the {{isDirectory}} flag in 
the output. 
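
Something along these lines, as a minimal sketch. The class name {{PathMetadataSketch}}, its fields, and the exact output format are hypothetical and not taken from the patch.

```java
// Hypothetical shape; field names and output format are illustrative only.
class PathMetadataSketch {
  private final String path;
  private final boolean isDirectory;

  PathMetadataSketch(String path, boolean isDirectory) {
    this.path = path;
    this.isDirectory = isDirectory;
  }

  boolean isDirectory() {
    return isDirectory;
  }

  // Includes the directory flag so log lines distinguish files from dirs.
  @Override
  public String toString() {
    return "PathMetadata{path=" + path + ", isDirectory=" + isDirectory + "}";
  }
}
```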

Btw, will {{DirectoryPathMetadata/FilePathMetadata}} be implemented in place 
later? 

Thanks.






[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-08-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432212#comment-15432212
 ] 

Hadoop QA commented on HADOOP-13448:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 33s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 46s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 13s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 53s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12824993/HADOOP-13448-HADOOP-13345.001.patch |
| JIRA Issue | HADOOP-13448 |
| Optional Tests | asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle |
| uname | Linux 7624106b85a1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HADOOP-13345 / 763f049 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10341/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10341/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-08-10 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416334#comment-15416334
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

Ok, cool.  Sounds like we are on the same page.  As I said before, you did a 
great job keeping the new stuff separate.  Interesting point about the save on 
close.  The fun of emulating directories on flat keys continues...



[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-08-10 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416136#comment-15416136
 ] 

Chris Nauroth commented on HADOOP-13448:


bq. Why does your {{DynamoDBConsistentStore#save()}} implementation walk the 
path to the root and save all ancestor paths as well?

That's a good observation.  I think this is a weakness of my prototype, not a 
desirable choice intended to carry through to the full implementation.

More specifically, I approached my prototype by developing a separate 
hadoop-s3guard module with a new {{ConsistentS3AFileSystem}} class defined as a 
subclass of the existing {{S3AFileSystem}} class.  The benefit of this approach 
was that I didn't need to make a lot of code changes directly in hadoop-aws, so 
I could develop the prototype isolated from the churn of merge conflicts on 
upstream hadoop-aws patches.  (There was a lot of optimization and bug fixing 
happening concurrently at the time.)  The drawback of this approach was that it 
constrained my implementation.  For {{mkdirs}}, I could only call the 
superclass and then pass the path to {{ConsistentStore#save}}, so the 
consistent store code needed a complete implementation using solely that path 
argument.  There was no way for me to preserve the information discovered in 
{{S3AFileSystem#innerMkdirs}} about which intermediate directories were 
missing, as was done in your prototype.

I came to the conclusion that the subclassing approach wouldn't be ideal for 
reasons like this.  We can get better results by hooking into implementation 
details more deeply, and that led me to the refactoring proposed on 
HADOOP-13447.  Between {{S3Store}}, {{AbstractS3AccessPolicy}} and the 
{{MetadataStore}} interface, we should feel free to evolve those interfaces 
however it best suits requirements.  They are internal interfaces, so they 
don't need to be constrained by the Hadoop compatibility guidelines, as long as 
{{S3AFileSystem}} can translate back to the public {{FileSystem}} interface at 
the end.  In the example you gave here, maybe that means something like 
{{S3Store#mkdirs}} returning a result object that lists which directories in 
the ancestry were not pre-existing.
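
To make that idea concrete, here is a minimal sketch of what such a result object could look like. Everything in it ({{MkdirsResultSketch}}, {{StoreSketch}}, the in-memory set of existing directories, and string paths) is hypothetical and exists only to show the shape of the return value; it assumes absolute, normalized paths with no trailing slash.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical result object: the directories that mkdirs had to create.
class MkdirsResultSketch {
  final List<String> createdDirs;

  MkdirsResultSketch(List<String> createdDirs) {
    this.createdDirs = createdDirs;
  }
}

// Hypothetical in-memory stand-in for the store; not the real S3Store.
class StoreSketch {
  private final Set<String> existing = new HashSet<>();

  StoreSketch() {
    existing.add("/");
  }

  MkdirsResultSketch mkdirs(String path) {
    Deque<String> missing = new ArrayDeque<>();
    String current = path;
    // Walk up the ancestry until an existing directory is found.
    while (current != null && !existing.contains(current)) {
      missing.push(current);
      current = parentOf(current);
    }
    // push() adds at the head, so iteration yields shallowest first.
    List<String> created = new ArrayList<>(missing);
    existing.addAll(created);
    return new MkdirsResultSketch(created);
  }

  // Assumes an absolute, normalized path; returns null for the root.
  private static String parentOf(String path) {
    if (path.equals("/")) {
      return null;
    }
    int idx = path.lastIndexOf('/');
    return idx == 0 ? "/" : path.substring(0, idx);
  }
}
```

The caller could then pass {{createdDirs}} straight to the {{MetadataStore}} instead of having the store rediscover the ancestry from a single path.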

Another smaller reason my prototype worked that way is that it was also easy to 
hook a call to {{ConsistentStore#save}} onto the close of the stream returned 
by {{FileSystem#create}}.  Unlike {{mkdirs}}, there is no such walk up the 
ancestry to check for pre-existing directories there, so I had to take care of 
it entirely within my code.  This is really more of a bug in the existing S3A 
code though that I was working around.  (See HADOOP-13221.)




[jira] [Commented] (HADOOP-13448) S3Guard: Define MetadataStore interface.

2016-08-10 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415658#comment-15415658
 ] 

Aaron Fabbri commented on HADOOP-13448:
---

To start the interface discussion:

I was looking at the two ConsistentStore/MetadataStore interfaces we posted to 
HADOOP-13345, and had a question:  Why does your 
{{DynamoDBConsistentStore#save()}} implementation walk the path to the root and 
save all ancestor paths as well?  I understand that this is mirroring what the 
S3A client does.  I guess I preferred leaving the enumeration of ancestor 
directories that need to be created to the client.  The case I'm thinking of is 
a workload that does this:

{code}
path = "/dirs"
for x in range(0,10) :
  path += "/dir" + str(x)
  s3aFS.mkdirs(path)
{code}

In this case, we're making N directories that get deeper and deeper with the 
same ancestors.  This seems like a theta(N^2) cost with your prototype, but 
only a theta(N) cost with ours.  The downside to our approach is that the client code 
has to explicitly track the missing ancestors that need to be passed to the 
MetadataStore.  In our patch it looked like this at the end of 
{{innerMkdirs()}}:

{code}
      } catch (FileNotFoundException fnfe) {
        instrumentation.errorIgnored();
        if (dirsToCreate != null) {
          dirsToCreate.add(fPart);
        }
      }
      fPart = fPart.getParent();
    } while (fPart != null);

    String key = pathToKey(f);
    createFakeDirectory(key);
    metadataStoreMkdirs(dirsToCreate, permission);
    return true;
  }
}
{code}

I might be missing something here, let me know what you think.
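
For what it's worth, here is a small counting sketch of the two strategies for the nested-mkdirs workload above. It only counts metadata writes under simplifying assumptions: the root is never written, each saved path costs one write, and the class and method names are made up for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

class MkdirsCostSketch {
  // Strategy 1: every save() rewrites the path and all of its ancestors.
  static int writesWhenSavingAllAncestors(int n) {
    int writes = 0;
    for (int x = 0; x < n; x++) {
      // "/dirs" plus x + 1 nested "dirN" components.
      int depth = x + 2;
      writes += depth;
    }
    return writes;
  }

  // Strategy 2: the client passes only the directories that were missing.
  static int writesWhenSavingMissingOnly(int n) {
    Set<String> known = new HashSet<>();
    int writes = 0;
    String path = "/dirs";
    for (int x = 0; x < n; x++) {
      path += "/dir" + x;
      String current = path;
      // Walk up until we hit a directory that was already recorded.
      while (!current.isEmpty() && known.add(current)) {
        writes++;
        current = current.substring(0, current.lastIndexOf('/'));
      }
    }
    return writes;
  }
}
```

Under these assumptions the first strategy performs N(N+3)/2 writes and the second performs N+1, so for N = 10 that is 65 versus 11.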
