It is quite complicated. See
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java
(in particular the expand() method). At a high level, it assigns a shard index
to every element and then groups by destination and shard index (implicitly
also b
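
To make the shard-then-group idea above concrete, here is a minimal sketch in plain Java. It is not Beam code and does not reproduce WriteFiles.expand(): the class and method names are hypothetical, the round-robin shard assignment is an assumption chosen for simplicity, and windows and panes are not modelled at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Illustration only: plain Java, not Beam code. All names here are hypothetical.
final class ShardGroupingSketch {

    // Assign each element a shard index (round-robin, which is an assumption)
    // and group the elements by the composite key "destination/shard".
    static <T> Map<String, List<T>> groupByDestinationAndShard(
            List<T> elements, Function<T, String> destinationOf, int numShards) {
        Map<String, List<T>> groups = new TreeMap<>();
        int next = 0;
        for (T element : elements) {
            int shard = next++ % numShards;
            String key = destinationOf.apply(element) + "/" + shard;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(element);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a:1", "b:2", "a:3", "a:4", "b:5");
        // In this toy example the destination is the text before ':'.
        Function<String, String> destinationOf = s -> s.substring(0, s.indexOf(':'));
        Map<String, List<String>> groups =
                groupByDestinationAndShard(lines, destinationOf, 2);
        groups.forEach((key, group) -> System.out.println(key + " -> " + group));
    }
}
```

In this simplified picture, each resulting group would correspond to one output shard for one destination.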
Cool thanks!
How does it work internally? Are all the elements routed to the same path
grouped and processed within the same bundle?
Thanks!
On Tue, Feb 13, 2018 at 9:03 PM Eugene Kirpichov
wrote:
It will do its best to throw an exception if duplicate names are produced
within one pane. Otherwise, it will overwrite.
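
As a rough illustration of the behaviour described above, the sketch below builds a file name from every component that must distinguish it, and rejects a duplicate name within one pane instead of overwriting. It is plain Java, not the Beam API: the class, the method names, and the name format are all hypothetical and only meant to show the idea.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustration only: plain Java, not the Beam API. All names are hypothetical.
final class FileNamingSketch {

    // Include every component in the name, so that distinct
    // (destination, window, pane, shard) combinations yield distinct names.
    static String fileName(String destination, String window, long paneIndex,
                           int shard, int numShards) {
        return String.format(Locale.ROOT, "%s/%s-pane-%d-%05d-of-%05d.txt",
                destination, window, paneIndex, shard, numShards);
    }

    // Record a name for the current pane; fail loudly on a duplicate
    // rather than silently overwriting an earlier file.
    static void register(Set<String> namesInPane, String name) {
        if (!namesInPane.add(name)) {
            throw new IllegalStateException("Duplicate output file name: " + name);
        }
    }

    public static void main(String[] args) {
        Set<String> namesInPane = new HashSet<>();
        register(namesInPane, fileName("events/typeA", "2018-02-13T21:00", 0, 0, 2));
        register(namesInPane, fileName("events/typeA", "2018-02-13T21:00", 0, 1, 2));
        System.out.println(namesInPane.size() + " distinct names registered");
        try {
            // Same components again: this is the collision a broken policy would cause.
            register(namesInPane, fileName("events/typeA", "2018-02-13T21:00", 0, 1, 2));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A naming function that dropped one of the components, for example the pane index, would return the same name for two different panes; that is the case where an overwrite could happen without being detected by a per-pane check like this one.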
On Tue, Feb 13, 2018 at 11:58 AM Carlos Alonso wrote:
Cool, thanks.
What if the destination is not properly coded and the file naming policy
then produces a duplicate path? Will it throw an exception or overwrite?
Thanks!
On Tue, Feb 13, 2018 at 6:23 PM Eugene Kirpichov
wrote:
Dynamic file writes generate 1 set of files (shards) for every pane firing
of every window of every destination. File naming policy is required to
produce different names for every combination of (destination, shard index,
window, pane) so you never have to append or overwrite. A new element
arrivi
Hi everyone!!
I'm wondering how a TextIO write with dynamic routing decides when to
finalise a file, and what happens if another element routed to the same
file arrives after it has been finalised.
Thanks!