[ 
https://issues.apache.org/jira/browse/HADOOP-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527640#comment-15527640
 ] 

Aaron Fabbri commented on HADOOP-13448:
---------------------------------------

Hey [~cnauroth], thanks for the feedback.

{quote}
Regarding option (c), what is your thoughts about that it moves the complexity 
to the caller
{quote}

I see the complexity as unavoidable if we want (1) on-demand storage of 
metadata (i.e. don't have to pre-load the entire s3a bucket's metadata before 
turning on MetadataStore, it just incrementally picks up state), and (2) the 
ability to do atomic move.  

I feel like (1) is very valuable here, and ultimately more robust.

{quote}
what about we just do the move(srcKey, dstKey) together with copyFile() in the 
loop
{quote}
I'd be fine with this if we don't mind relaxing (2).  That is, there would be 
no atomic {{MetadataStore#move()}}.  We would still be able to explicitly 
track path deletions.

We can always add back this "batch move" interface I propose as (c), if the 
DynamoDB implementation wants to implement atomic move, or wants to batch this 
stuff to make it more efficient.

That said, it is probably not too hard for s3a to accumulate lists of paths and 
pass them in as a batch towards the end.

Some options:

1. Keep the (c) batch interface (I have a patch for this I can post on 
HADOOP-13631)
2. Implement a non-batch, non-recursive move(srcPath, destPath) as you mention.
3. Implement both single move() and batch move(), with default batch 
implementation just looping over input collections and calling the single 
variant.
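
To make option 3 concrete, here is a rough sketch of what "default batch 
implementation just looping over input collections" could look like.  This is 
not taken from any patch: the interface below is a simplified, hypothetical 
stand-in for MetadataStore, paths are plain strings, the two-collection 
signature is an assumption, and it relies on Java 8 default methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

public class MoveSketch {

  /** Hypothetical, simplified stand-in for the interface under discussion. */
  interface MetadataStore {

    /** Non-batch, non-recursive move of a single path entry. */
    void move(String srcPath, String dstPath);

    /**
     * Batch move. The default just loops and calls the single variant,
     * so it is NOT atomic. A store that can do better (assumption: e.g. a
     * DynamoDB-backed one) would override this method.
     */
    default void move(Collection<String> srcPaths,
        Collection<String> dstPaths) {
      if (srcPaths.size() != dstPaths.size()) {
        throw new IllegalArgumentException(
            "source and destination counts differ");
      }
      Iterator<String> dst = dstPaths.iterator();
      for (String src : srcPaths) {
        move(src, dst.next());
      }
    }
  }

  /** Toy in-memory store that only records the moves it was asked to do. */
  static class RecordingStore implements MetadataStore {
    final List<String> log = new ArrayList<>();

    @Override
    public void move(String srcPath, String dstPath) {
      log.add(srcPath + " -> " + dstPath);
    }
  }

  public static void main(String[] args) {
    RecordingStore store = new RecordingStore();
    // The caller accumulates paths during rename and passes them as a batch.
    store.move(Arrays.asList("/a/1", "/a/2"), Arrays.asList("/b/1", "/b/2"));
    System.out.println(store.log);
  }
}
```

With this shape, a simple store only has to implement the single-path 
variant, and the batch entry point stays available for implementations that 
want atomicity or fewer round trips.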

I'm good with any of these.  We should feel comfortable punting complexity to 
future patches until we have more context with the DynamoDB implementation and 
the s3a integration.



> S3Guard: Define MetadataStore interface.
> ----------------------------------------
>
>                 Key: HADOOP-13448
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13448
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>             Fix For: HADOOP-13345
>
>         Attachments: HADOOP-13448-HADOOP-13345.001.patch, 
> HADOOP-13448-HADOOP-13345.002.patch, HADOOP-13448-HADOOP-13345.003.patch, 
> HADOOP-13448-HADOOP-13345.004.patch, HADOOP-13448-HADOOP-13345.005.patch
>
>
> Define the common interface for metadata store operations.  This is the 
> interface that any metadata back-end must implement in order to integrate 
> with S3Guard.


