[ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162752#comment-17162752
 ] 

Steve Loughran commented on HDFS-15484:
---------------------------------------

First: as this goes near the Hadoop FS APIs, it needs a matching hadoop-common 
JIRA for broader visibility, along with updates to the relevant FS spec and 
new contract tests. The HDFS team often forgets that expectation, which creates 
needless tension (HADOOP-16898).


I can see the benefits of this for HDFS and that this is straightforward to 
implement there, especially if you define its semantics and path element 
validity as "what HDFS does".

I worry about object storage though: what if they don't implement it, and if 
they do, what exactly are they expected to do?


At a quick glance, there is no immediate benefit for the object stores, and 
viewFS won't be able to handle renames across mounts. If any of the object 
stores did implement it without contract tests or a specification, there is a 
risk that they would implement it "wrong". And we all know how much trouble 
rename() is already.

Meanwhile, people coding against this need to know how to use it, whether it 
can fail halfway through and what that means, and how to identify and cope 
with filesystems which don't implement it.


If you really do want to do this, you are going to have to make sure the API 
works, which means addressing the following:

# define *batch* in this world
# Semantics, including atomicity, partial failures, ordering of renames
# what would rename-(/a:/b:/c, /b:/c:/a:) do? 
# rename(/a:/a/subdir,  /b/c:/b) ?
# rename(/a:/nonexistent, /b:/c)
# rename(/a:/b, /c:/c are)
# rename dir onto file, file onto file, parent dir doesn't exist, parent is a 
file,...
# what does an FS which doesn't support the operation do?
# how do current applications discover whether or not this option is 
supported? PathCapabilities?
# how do all the current filesystems other than HDFS react to a batch call? (at 
a glance: badly)
# what to do on a filesystem with multiple mounts, e.g. viewFS?
# what about filesystems where ":" is a valid element of a path? recurrent 
issue: 
https://issues.apache.org/jira/issues/?jql=project%20%3D%20HADOOP%20AND%20text%20~%20%22path%20colon%22
# where are the contract tests, especially for the troublespots?
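
To make the ":" question in the list above concrete, here is a minimal sketch. It assumes, purely for illustration, that a batch argument is decoded by splitting the string on ":"; how the attached patch actually parses its argument is not shown in this thread, and {{ColonSplitSketch}} and {{splitBatch}} are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class ColonSplitSketch {
    // Hypothetical decoder: assumes a batch is encoded as paths joined by ':'.
    static List<String> splitBatch(String encoded) {
        return Arrays.asList(encoded.split(":"));
    }

    public static void main(String[] args) {
        // Two plain paths decode into two entries.
        System.out.println(splitBatch("/a:/b"));
        // A single path whose last element contains ':' also decodes into
        // two entries, so the caller cannot tell the two cases apart.
        System.out.println(splitBatch("/logs/2020-07-22T10:30"));
    }
}
```

On a store where ":" is a legal character in a path element, any separator-in-a-string encoding has this ambiguity unless the spec also defines an escaping rule.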

Yes, a lot of these things seem "obvious", but the semantics of rename() make 
it the hardest API call to implement, and the fact that HDFS behaves 
differently from posix is part of the problem: 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md#boolean-renamepath-src-path-d


FWIW, I've been thinking recently about doing an async rename in the form of a 
builder API like openFile; one where clients can block waiting for a response 
(one which can even include the statistics of HADOOP-16839). All those stores 
with O(files) or O(data) rename semantics would then let the caller do other 
things during the rename. Something like:


{code}
Future<RenameFileOutcome> f = renameFile(sourceFile)
  // dest must be a filename whose parent path exists, with no file at the end
  .withDestFile(dest)
  .withDestPermissions(0777)
  .build();

f.get();

Future<RenameDirOutcome> d = renameDir(srcDir)
  // dest must be a nonexistent dir whose parent exists
  .withDestDir(destDir)
  // request atomic dir rename; stores which don't support it MUST fail
  .must("fs.operation.rename.atomic", true)
  .build();
{code}
Why the separate entries: caller has to know what they are doing. Which 
generally they do, we just lose that fact in the API call and end up having to 
look at the remote FS to work out what to do.

I've not done any work on this. If we did start it, a batch sequence would be 
similar

{code}
// passing in the parent of all renames allows viewfs to forward
Future<RenameBatchOutcome> f = renameBatch(parent of all renames)
    .withDir(src, dest)
    .withFile(src, dest)
    ...
    .build();
{code}

+ add all the definitions about ordering etc., ideally with safety checks for 
obvious conflicts (same src, same dest, chain) so that they fail consistently 
there.
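
As a rough illustration of such up-front checks, the sketch below flags the three conflicts named above using plain string comparison. It is only a sketch under stated assumptions: {{BatchRenameChecks}}, {{Pair}} and {{findConflicts}} are hypothetical names rather than part of any Hadoop API, and it deliberately ignores parent/child overlaps such as /a and /a/subdir, which a real specification would also have to cover.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BatchRenameChecks {
    /** One (source, destination) entry; a hypothetical type for this sketch. */
    static final class Pair {
        final String src;
        final String dest;
        Pair(String src, String dest) { this.src = src; this.dest = dest; }
    }

    /** Returns a description of each conflict found; empty if none of the three apply. */
    static List<String> findConflicts(List<Pair> batch) {
        List<String> problems = new ArrayList<>();
        Set<String> sources = new HashSet<>();
        Set<String> dests = new HashSet<>();
        for (Pair p : batch) {
            // Set.add returns false when the element was already present.
            if (!sources.add(p.src)) {
                problems.add("duplicate source: " + p.src);
            }
            if (!dests.add(p.dest)) {
                problems.add("duplicate destination: " + p.dest);
            }
        }
        // Chain: some path is used both as a source and as a destination,
        // so the result would depend on the order of the renames.
        for (Pair p : batch) {
            if (sources.contains(p.dest)) {
                problems.add("chain: " + p.dest + " is both a source and a destination");
            }
        }
        return problems;
    }
}
```

Running checks like these on the client before any rename is attempted would let every filesystem reject the same malformed batches in the same way, rather than leaving the outcome to each store.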

Reporting the state after a partial failure would be tricky though. 

If changes to rename are being made, it would be time to do a cloud-native 
rename operation. I have no plans to write this, so if someone were to start 
the new API for both FS and FC, I'm happy to let them...

(Sorry for causing trouble, but rename is the FS API which causes more problems 
in cloud land than any other)

> Add option in enum Rename to suport batch rename
> ------------------------------------------------
>
>                 Key: HDFS-15484
>                 URL: https://issues.apache.org/jira/browse/HDFS-15484
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfsclient, namenode, performance
>            Reporter: Yang Yun
>            Assignee: Yang Yun
>            Priority: Minor
>         Attachments: HDFS-15484.001.patch
>
>
> Sometime we need rename many files after a task,  add a new option in enum 
> Rename to support batch rename, which only need one RPC and one lock. For 
> example,
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::dir4/f4"), 
> Rename.BATCH)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
