David Janicek created BEAM-10295:
------------------------------------
Summary: FileBasedSink: allow setting temp directory provider per
dynamic destination
Key: BEAM-10295
URL: https://issues.apache.org/jira/browse/BEAM-10295
Project: Beam
Issue Type: Improvement
Components: io-java-hadoop-file-system, sdk-java-core
Reporter: David Janicek
Dynamic file destinations allow value-dependent writes in FileBasedSink. With
the Hadoop file system, this means a user can write some values to a destination
on *cluster-A* and other values to a destination on *cluster-B*.
Since BEAM-7613 was fixed, this works fine until the *moveToOutputFiles* method
is called. This method internally calls *FileSystems.rename*, which requires
that the source files (temporary files) and the target files (resolved by the
dynamic destination's function) be on the same cluster. However, the temp
directory provider can be set only once per file sink, so all temporary files
end up on a single cluster regardless of their final destination.
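
To illustrate the constraint, here is a minimal sketch that does not use any
Beam API. The class *SameFileSystemCheck* and its method are hypothetical names,
and treating "same cluster" as "same URI scheme and authority" is an assumption
made for this example, not a statement of how FileSystems.rename decides it.

```java
import java.net.URI;

// Hypothetical helper, not part of Beam: approximates "same cluster" as
// "same URI scheme and same authority" (an assumption for illustration).
final class SameFileSystemCheck {

  private SameFileSystemCheck() {}

  // Returns true when both URIs point at the same scheme and authority,
  // e.g. hdfs://cluster-A/... and hdfs://cluster-A/...
  static boolean sameFileSystem(URI source, URI target) {
    return equalsIgnoreCase(source.getScheme(), target.getScheme())
        && equalsIgnoreCase(source.getAuthority(), target.getAuthority());
  }

  // Null-safe, case-insensitive comparison (scheme and host are
  // case-insensitive in URIs).
  private static boolean equalsIgnoreCase(String a, String b) {
    return a == null ? b == null : a.equalsIgnoreCase(b);
  }
}
```

With a single temp directory on cluster-A, a temporary file such as
hdfs://cluster-A/tmp/part-0 passes this check for a cluster-A destination but
fails it for a destination on cluster-B, which is the case where the rename
breaks.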
This could be fixed by adding some kind of *getTempDirectoryProvider* method
into dynamic destinations (e.g. into *DefaultFilenamePolicy.Params*).
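
A rough sketch of the idea, again without any Beam API: derive the temp
directory from each destination so that it always lives on the same file system
as the final output. The class *PerDestinationTempDirectory*, the method
*tempDirectoryFor* and the directory name ".temp-beam" are hypothetical and only
illustrate the proposal; the actual change would have to fit the existing
DynamicDestinations and FileBasedSink classes.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch, not part of Beam: computes a temp directory that sits
// under the destination's own output directory, and therefore on the same
// cluster as the final files.
final class PerDestinationTempDirectory {

  // Illustrative directory name (an assumption for this example).
  private static final String TEMP_DIR_NAME = ".temp-beam/";

  private PerDestinationTempDirectory() {}

  // Keeps the scheme and authority of the output directory and appends the
  // temp directory name to its path.
  static URI tempDirectoryFor(URI outputDirectory) {
    String path = outputDirectory.getPath() == null ? "" : outputDirectory.getPath();
    String base = path.endsWith("/") ? path : path + "/";
    try {
      return new URI(
          outputDirectory.getScheme(),
          outputDirectory.getAuthority(),
          base + TEMP_DIR_NAME,
          null,
          null);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Invalid output directory: " + outputDirectory, e);
    }
  }
}
```

With this approach, values routed to hdfs://cluster-B/output/events would be
staged under that directory on cluster-B, so the final rename never has to cross
clusters.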
--
This message was sent by Atlassian Jira
(v8.3.4#803005)