What do you mean by shard the output file? Can it be split at any byte location, or only at specific points?

On Mon, Dec 2, 2019 at 2:05 PM Christopher Larsen <christopher.lar...@quantiphi.com> wrote:

> Hi Reuven,
>
> We would like to write each element to one file but still allow the runner
> to shard the output file, which could yield more than one output file per
> element.
>
> On Mon, Dec 2, 2019 at 11:55 AM Reuven Lax <re...@google.com> wrote:
>
>> I'm not sure I completely understand the question. Are you saying that
>> you want each element to write to only one file, guaranteeing that two
>> elements are never written to the same file?
>>
>> On Mon, Dec 2, 2019 at 11:53 AM Christopher Larsen <christopher.lar...@quantiphi.com> wrote:
>>
>>> Hi All,
>>>
>>> TL/DR: can you extend FileIO.sink<T> to write one or more files per
>>> element instead of one or more elements per file?
>>>
>>> In working with Thrift files we have found that, since a .thrift file
>>> needs to be compiled to generate code, the order of the contents of the
>>> file is important (i.e., the namespace and include elements need to come
>>> before the definitions).
>>>
>>> The issue that we are facing is that by implementing
>>> FileIO.sink<Document> we cannot determine how many Document objects are
>>> written to a file, since this is determined by the runner. This can result
>>> in more than one Document being written to a file, which will cause
>>> compilation errors.
>>>
>>> We know that this can be controlled by writeDynamic, but since we believe
>>> the default behavior for the connector should be to output a Document to
>>> one or more files (depending on sharding), we were wondering how best to
>>> accomplish this.
>>>
>>> Best,
>>> Chris
>>>
>>> *This message contains information that may be privileged or
>>> confidential and is the property of the Quantiphi Inc and/or its
>>> affiliates. It is intended only for the person to whom it is addressed.
>>> If you are not the intended recipient, any review, dissemination,
>>> distribution, copying, storage or other use of all or any portion of this
>>> message is strictly prohibited. If you received this message in error,
>>> please immediately notify the sender by reply e-mail and delete this
>>> message in its entirety*
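
To make the guarantee discussed in the thread above concrete ("two elements are never written to the same file"), here is a minimal sketch in plain Java with no Beam dependency. It is only an illustration of the intended semantics, not the FileIO implementation: the class name `OneFilePerElement`, the method `writeEach`, and the `doc-<index>.thrift` naming scheme are all hypothetical, and each document is treated as an opaque string that is never split.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OneFilePerElement {

    /**
     * Writes each document to its own file under outputDir and returns the
     * paths in input order. No two documents ever share a file.
     */
    static List<Path> writeEach(List<String> documents, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            // Hypothetical naming scheme: one unique file name per element index.
            Path target = outputDir.resolve("doc-" + i + ".thrift");
            Files.write(target, documents.get(i).getBytes(StandardCharsets.UTF_8));
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("thrift-out");
        List<String> docs = Arrays.asList(
            "namespace java a\nstruct A {}\n",
            "namespace java b\nstruct B {}\n");
        // Each document lands in a separate file, so each file keeps its
        // namespace line ahead of its own definitions.
        for (Path p : writeEach(docs, dir)) {
            System.out.println(p.getFileName());
        }
    }
}
```

Because every file holds exactly one document, the header-before-definitions ordering of each .thrift file is preserved. Whether a single document could additionally be split across several files depends on the answer to the question at the top of this message.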