yangagile commented on pull request #2235:
URL: https://github.com/apache/hadoop/pull/2235#issuecomment-678966359


   > I've discussed some of what I'd like in 
https://issues.apache.org/jira/browse/HDFS-15484?focusedCommentId=17162752&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17162752
   > 
   > * matching hadoop common JIRA to make clear this is going near the FS APIs
   > * something in the filesystem spec to cover the new interface, define its 
semantics, etc. In particular, we need all things which are "obvious" to be 
written down, because often it turns out they aren't obvious at all.
   > * tests which really set out to break things. Writing the spec will help 
you think of them
   > 
   > Some ideas for tests
   > 
   > * renaming root
   > * rename to root
   > * rename to self
   > * path under self
   > * path above self
   > * two sources to same dest
   > * chained rename
   > * swapping paths
   > 
   > API-wise, this could be our chance to fix rename properly, that is: I 
should be able to use this for a single rename((src, dest), opts) and have it 
do what I want. And as discussed, I want something which works well with object 
stores.
   > 
   > * use a builder to let apps specify options (see openFile()) and use the 
same base builder classes
   > * Return a future of the outcome. If we can get the HADOOP-16830 
IOStatistics patch in first, the outcome returned can be declared as 
something which implements IOStatisticSource. This matters to me, as I want to 
know the costs of rename operations.
   > 
   > I think we should also add a rename option about atomicity; for a single 
rename() this would be that the rename itself is atomic. For a batch of more 
than one rename, this means "the entire set of renames is atomic".
   > 
   > FileContext and ViewFS will also need to pass this through. Sorry.
   > 
   > One thing we could do here is actually provide a base implementation which 
iterates through the list/array of (src, dest) paths. This would let us add a 
non-atomic implementation to all filesystems/filecontexts. That would be very 
nice as it really would let me switch to using this API wherever we used 
rename(), such as distcp and MR committers.
   > 
   > rename() is the trickiest of all FS API calls to get right. I don't 
think we fully understand what "right" is. Certainly, if I were asked about the 
nuances (src = file, dest = dir) and (src = dir, dest = dir), I'm not confident I 
could give an answer which is consistent for both POSIX and HDFS. This is our 
opportunity to make some progress here!
   > 
   > I know, this is going to add more work. But it is time.
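
The base implementation suggested above, one that simply iterates through the (src, dest) pairs and hands back a future of the outcome, could be sketched roughly as follows. This is only an illustration under stated assumptions: `SingleRename`, `RenamePair`, `Outcome` and `renameBatch` are hypothetical names that do not exist in Hadoop, the builder and options are left out, and the batch is explicitly non-atomic (a failure leaves the earlier renames in place).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BatchRenameSketch {

    /** Hypothetical stand-in for a filesystem's existing single rename call. */
    public interface SingleRename {
        boolean rename(String src, String dest) throws IOException;
    }

    /** Hypothetical (src, dest) pair. */
    public static final class RenamePair {
        public final String src;
        public final String dest;

        public RenamePair(String src, String dest) {
            this.src = src;
            this.dest = dest;
        }
    }

    /** Hypothetical outcome: number of renames done and the time the batch took. */
    public static final class Outcome {
        public final int renamed;
        public final long elapsedNanos;

        Outcome(int renamed, long elapsedNanos) {
            this.renamed = renamed;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /**
     * Non-atomic fallback: renames the pairs one by one, in list order.
     * The first failure completes the future exceptionally; renames that
     * already succeeded are NOT rolled back.
     */
    public static CompletableFuture<Outcome> renameBatch(
            SingleRename fs, List<RenamePair> pairs) {
        return CompletableFuture.supplyAsync(() -> {
            long start = System.nanoTime();
            int renamed = 0;
            for (RenamePair pair : pairs) {
                try {
                    if (!fs.rename(pair.src, pair.dest)) {
                        throw new IOException(
                            "rename failed: " + pair.src + " -> " + pair.dest);
                    }
                } catch (IOException e) {
                    // Lambdas passed to supplyAsync cannot throw checked exceptions.
                    throw new UncheckedIOException(e);
                }
                renamed++;
            }
            return new Outcome(renamed, System.nanoTime() - start);
        });
    }
}
```

A caller would block with `renameBatch(fs, pairs).join()` or chain further work on the future. The elapsed time in `Outcome` is only a placeholder for the richer statistics discussed above.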
   
   Thanks @steveloughran for the detailed introduction.
   Yes, we should do this the right way, and implement it step by step.
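
As a possible first step, several of the test ideas listed in the quoted comment are properties of the batch itself and can be detected before any filesystem is touched. The sketch below only labels those patterns; whether each one should be rejected, reordered, or allowed is exactly what the spec would have to define. `RenameBatchChecks` and `classify` are hypothetical names, and paths are assumed to be absolute, normalized strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RenameBatchChecks {

    /** True if child is strictly below parent (both absolute and normalized). */
    private static boolean isUnder(String child, String parent) {
        if (child.equals(parent)) {
            return false;
        }
        String prefix = parent.equals("/") ? "/" : parent + "/";
        return child.startsWith(prefix);
    }

    /**
     * Labels the suspicious patterns in a batch. Each element of pairs is
     * a two-element array {src, dest}; labels are prefixed with the pair index.
     */
    public static List<String> classify(List<String[]> pairs) {
        List<String> found = new ArrayList<>();
        Map<String, Integer> firstUseOfDest = new HashMap<>();
        for (int i = 0; i < pairs.size(); i++) {
            String src = pairs.get(i)[0];
            String dest = pairs.get(i)[1];
            if (src.equals("/")) found.add(i + ": renaming root");
            if (dest.equals("/")) found.add(i + ": rename to root");
            if (src.equals(dest)) found.add(i + ": rename to self");
            if (isUnder(dest, src)) found.add(i + ": dest under src");
            if (isUnder(src, dest)) found.add(i + ": dest above src");

            // Two sources mapped to the same destination.
            Integer earlier = firstUseOfDest.putIfAbsent(dest, i);
            if (earlier != null) found.add(i + ": same dest as " + earlier);

            // Order-dependent relationships with earlier pairs.
            for (int j = 0; j < i; j++) {
                String otherSrc = pairs.get(j)[0];
                String otherDest = pairs.get(j)[1];
                if (src.equals(otherDest) && dest.equals(otherSrc)) {
                    found.add(i + ": swaps with " + j);
                } else if (src.equals(otherDest) || dest.equals(otherSrc)) {
                    found.add(i + ": chained with " + j);
                }
            }
        }
        return found;
    }
}
```

Each label could seed one contract test: build the batch, run it against a filesystem, and assert whatever behaviour the spec ends up requiring for that case.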

