David Janicek created BEAM-10295:
------------------------------------

             Summary: FileBasedSink: allow setting temp directory provider per 
dynamic destination
                 Key: BEAM-10295
                 URL: https://issues.apache.org/jira/browse/BEAM-10295
             Project: Beam
          Issue Type: Improvement
          Components: io-java-hadoop-file-system, sdk-java-core
            Reporter: David Janicek


Dynamic file destinations allow value-dependent writes in FileBasedSink. With 
the Hadoop file system, this means a user can write some values to a destination 
on *cluster-A* and other values to a destination on *cluster-B*.

Since BEAM-7613 was fixed, this works fine until the *moveToOutputFiles* method 
is called. This method internally calls *FileSystems.rename*, which requires 
that the source files (temporary files) and the target files (resolved by the 
dynamic destination's function) are on the same cluster. But only one temp 
directory provider can be set per file sink.

This could be fixed by adding some kind of *getTempDirectoryProvider* method 
to dynamic destinations (e.g. to *DefaultFilenamePolicy.Params*).
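
A minimal sketch of the idea in plain Java, with no Beam dependency. The class 
*PerDestinationTempDir*, the helpers *sameFileSystem* and *tempDirFor*, and the 
*.temp-beam/* directory name are all hypothetical and used only for 
illustration; none of them is an existing Beam API. The sketch assumes that 
"same cluster" can be approximated by comparing the scheme and authority of 
the two URIs.

```java
import java.net.URI;
import java.util.Objects;

/** Illustration only: these names are hypothetical and not part of Beam. */
final class PerDestinationTempDir {

  /** Assumption: two paths are on the same cluster if scheme and authority match. */
  static boolean sameFileSystem(URI a, URI b) {
    return Objects.equals(a.getScheme(), b.getScheme())
        && Objects.equals(a.getAuthority(), b.getAuthority());
  }

  /**
   * Hypothetical per-destination hook: place the temp directory under the
   * destination's own output directory (which must end with '/').
   */
  static URI tempDirFor(URI outputDir) {
    return outputDir.resolve(".temp-beam/");
  }

  public static void main(String[] args) {
    URI sinkWideTemp = URI.create("hdfs://cluster-A/tmp/beam/");
    URI destA = URI.create("hdfs://cluster-A/data/out/");
    URI destB = URI.create("hdfs://cluster-B/data/out/");

    // One temp directory per sink: a rename to cluster-B would cross clusters.
    System.out.println(sameFileSystem(sinkWideTemp, destA)); // true
    System.out.println(sameFileSystem(sinkWideTemp, destB)); // false

    // Temp directory chosen per destination: source and target stay together.
    System.out.println(sameFileSystem(tempDirFor(destB), destB)); // true
  }
}
```

With a hook of this kind, each destination's temporary files would be written 
on the cluster that also holds its final files, so the rename step would 
never have to cross clusters.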

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
