Also, we'll have to use StandardMoveOptions.IGNORE_MISSING_FILES so that the rename step can be retried after a failure (on a retry, some of the source files will already have been moved). I think this change is worth making if it significantly improves performance for some FileSystems. Note that some FileSystems, for example GCS, implement rename as copy+delete, so there will be no significant performance improvement for those.
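To make the retry point concrete, here is a rough sketch of the "ignore missing source" behaviour, written against plain java.nio.file rather than Beam's FileSystems API. The class RenameSketch and the method renameIgnoringMissing are made-up names for illustration only, and local-disk semantics are just an assumption standing in for whatever a given FileSystem does. Note that Files.move itself may fall back to copying when source and destination are on different file stores, which is similar in spirit to the GCS case.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not part of Beam.
final class RenameSketch {

  /**
   * Moves src to dst, creating the destination directory if needed.
   * Returns true if a file was moved, false if src was already gone
   * (for example because an earlier attempt moved it).
   */
  static boolean renameIgnoringMissing(Path src, Path dst) {
    try {
      Path parent = dst.toAbsolutePath().getParent();
      if (parent != null) {
        // The destination may live in a different directory than the source.
        Files.createDirectories(parent);
      }
      Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
      return true;
    } catch (NoSuchFileException e) {
      if (Files.exists(src)) {
        // The source is still there, so something else is missing: report it.
        throw new UncheckedIOException(e);
      }
      return false; // Source already moved: treat the retry as a no-op.
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("rename-sketch");
    Path src = root.resolve("tmp").resolve("part-0");
    Path dst = root.resolve("out").resolve("part-0");
    Files.createDirectories(src.getParent());
    Files.write(src, "data".getBytes(StandardCharsets.UTF_8));

    System.out.println(renameIgnoringMissing(src, dst)); // first attempt
    System.out.println(renameIgnoringMissing(src, dst)); // simulated retry
  }
}
```

The second call stands in for a retried finalize step: the temporary file is gone, the output file is in place, and the move is simply skipped instead of failing.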

-Cham

On Thu, Jul 26, 2018 at 10:14 AM Reuven Lax <[email protected]> wrote:

> We might be able to replace this with Filesystem.rename(). One thing to
> keep in mind - the destination files might be in a different directory, so
> we would need to make sure that all Filesystems support cross-directory
> rename.
>
> On Thu, Jul 26, 2018 at 9:58 AM Lukasz Cwik <[email protected]> wrote:
>
>> +dev
>>
>> On Thu, Jul 26, 2018 at 2:40 AM Jozef Vilcek <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> just came across FileBasedSink.WriteOperation class which does have
>>> moveToOutput() method. Implementation does a Filesystem.copy() instead of
>>> "move". With large files I find it quite inefficient if underlying FS
>>> supports more efficient ways, so I wonder what is the story behind it? Must
>>> it be a copy?
>>>
>>>
>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java#L761
>>>
>>
