steveloughran commented on issue #1655: HADOOP-16629: support copyFile in 
s3afilesystem
URL: https://github.com/apache/hadoop/pull/1655#issuecomment-546934788
 
 
   See #1679 for my proposal for a Filesystem API for multipart uploads. That 
is a draft implementation right now, which lacks:
   
   1. FileContext support.
   1. updated specification.
   1. parent directory checks.
   
   I must highlight the `BulkOperationState` issue. A big part of speeding up 
rename/delete/commit in S3A was eliminating needless duplicate checks of 
S3Guard state during the bulk operations; we did this by caching the ongoing 
state. If you plan to copy many files in the same operation, you will need this.
   
   If you look at how RenameOperation uses copyFile(), it updates its 
RenameTracker to keep that operation state consistent; for the CommitOperations, 
the state also gets passed around by way of a CommitContext. Copy (like the 
multipart upload) is going to have to do the same.
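
   A minimal sketch of the idea, not the S3A implementation: every name here (`BulkCopySketch`, `CountingStore`, `OperationState`, and this `copyFile` signature) is hypothetical and stands in for the real classes. It only illustrates why a state object shared across one bulk operation avoids repeating the same metadata check for every file.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; none of these types are the real S3A classes. */
class BulkCopySketch {

  /** Stand-in for a metadata store where every lookup is expensive. */
  static class CountingStore {
    int probes = 0;
    private final Set<String> dirs = new HashSet<>();

    boolean directoryExists(String path) {
      probes++; // count each (notionally remote) check
      return dirs.contains(path);
    }

    void addDirectory(String path) {
      dirs.add(path);
    }
  }

  /** Per-operation cache of directories already checked in this operation. */
  static class OperationState {
    private final Set<String> verifiedDirs = new HashSet<>();

    boolean isVerified(String dir) {
      return verifiedDirs.contains(dir);
    }

    void markVerified(String dir) {
      verifiedDirs.add(dir);
    }
  }

  /** Copies one file, skipping the parent check if the state has seen it. */
  static void copyFile(CountingStore store, OperationState state,
      String src, String dest) {
    String parent = parentOf(dest);
    if (!state.isVerified(parent)) {
      if (!store.directoryExists(parent)) {
        store.addDirectory(parent);
      }
      state.markVerified(parent);
    }
    // The actual data copy is omitted; only the state handling matters here.
  }

  static String parentOf(String path) {
    int i = path.lastIndexOf('/');
    return i <= 0 ? "/" : path.substring(0, i);
  }

  public static void main(String[] args) {
    CountingStore store = new CountingStore();
    OperationState state = new OperationState();
    for (String name : new String[] {"a", "b", "c"}) {
      copyFile(store, state, "/src/" + name, "/dest/" + name);
    }
    System.out.println("probes with shared state: " + store.probes);
  }
}
```

   Passing the same `OperationState` to every call means the parent directory is probed once per operation; creating a fresh state per call would probe it once per file.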
   
   I can help by rounding out the #1679 proposal with the tracking of that 
state to show you what to do, but
   
   * yes, you are going to have to design something which works across 
different stores
   * and is stable over time,
   * with spec and tests.
   
   I know HDFS is not above adding some private API to help out HBase and then, 
while nobody notices, pulling that up into hadoop-common, but I don't like 
that; see 
[HDFS-8631](https://issues.apache.org/jira/browse/HDFS-8631?focusedCommentId=16961004&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16961004).
 I do expect a fair amount of rigour here because we are going to be 
maintaining these APIs for decades. 
   
   

