What do you mean by "shard the output file"? Can it be split at any byte
location, or only at specific points?

On Mon, Dec 2, 2019 at 2:05 PM Christopher Larsen <
christopher.lar...@quantiphi.com> wrote:

> Hi Reuven,
>
> We would like to write each element to one file but still allow the runner
> to shard the output file, which could yield more than one output file per
> element.
>
> On Mon, Dec 2, 2019 at 11:55 AM Reuven Lax <re...@google.com> wrote:
>
>> I'm not sure I completely understand the question. Are you saying that
>> you want each element to write to only one file, guaranteeing that two
>> elements are never written to the same file?
>>
>> On Mon, Dec 2, 2019 at 11:53 AM Christopher Larsen <
>> christopher.lar...@quantiphi.com> wrote:
>>
>>> Hi All,
>>>
>>> TL;DR: can you extend FileIO.sink<T> to write one or more files per
>>> element instead of one or more elements per file?
>>>
>>> In working with Thrift files we have found that, since a .thrift file
>>> needs to be compiled to generate code, the order of the contents of the
>>> file is important (i.e., the namespace and include elements need to come
>>> before any definitions).
>>>
>>> The issue that we are facing is that, by implementing
>>> FileIO.sink<Document>, we cannot determine how many Document objects are
>>> written to a file, since this is determined by the runner. This can result
>>> in more than one Document being written to a file, which will cause
>>> compilation errors.
>>>
>>> We know that this can be controlled by writeDynamic, but since we believe
>>> the default behavior for the connector should be to output a Document to
>>> one or more files (depending on sharding), we were wondering how best to
>>> accomplish this.
>>>
>>> Best,
>>> Chris
>>>
>>> This message contains information that may be privileged or
>>> confidential and is the property of the Quantiphi Inc and/or its
>>> affiliates. It is intended only for the person to whom it is addressed.
>>> If you are not the intended recipient, any review, dissemination,
>>> distribution, copying, storage or other use of all or any portion of this
>>> message is strictly prohibited. If you received this message in error,
>>> please immediately notify the sender by reply e-mail and delete this
>>> message in its entirety.
>>>
>>
>
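For illustration, here is a minimal sketch of the ordering constraint
described in the original message, and of why two Document objects landing in
one file would break it. It uses plain Java only and does not call Beam or the
Thrift compiler. The Document class, its fields, and the render and
headersComeFirst helpers are hypothetical stand-ins, not the connector's
actual types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ThriftOrderSketch {

    /** Hypothetical stand-in for the Document type mentioned in the thread. */
    static final class Document {
        final String namespace;
        final List<String> includes;
        final List<String> definitions;

        Document(String namespace, List<String> includes, List<String> definitions) {
            this.namespace = namespace;
            this.includes = includes;
            this.definitions = definitions;
        }
    }

    /** Renders header elements first: namespace, then includes, then definitions. */
    static String render(Document doc) {
        StringBuilder sb = new StringBuilder();
        sb.append("namespace ").append(doc.namespace).append('\n');
        for (String inc : doc.includes) {
            sb.append("include \"").append(inc).append("\"\n");
        }
        for (String def : doc.definitions) {
            sb.append(def).append('\n');
        }
        return sb.toString();
    }

    /** Returns true if no namespace or include line appears after a definition line. */
    static boolean headersComeFirst(String fileContents) {
        boolean seenDefinition = false;
        for (String line : fileContents.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            boolean header = line.startsWith("namespace ") || line.startsWith("include ");
            if (header && seenDefinition) {
                return false;
            }
            if (!header) {
                seenDefinition = true;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Document a = new Document("java com.example.a",
                Arrays.asList("shared.thrift"),
                Arrays.asList("struct A { 1: i32 id }"));
        Document b = new Document("java com.example.b",
                Collections.<String>emptyList(),
                Arrays.asList("struct B { 1: string name }"));

        // One Document per file: headers precede definitions.
        System.out.println(headersComeFirst(render(a)));             // prints "true"
        // Two Documents in one file: the second namespace follows a definition.
        System.out.println(headersComeFirst(render(a) + render(b))); // prints "false"
    }
}
```

The second check fails for the reason given in the thread: a sink that appends
several Document objects to one file puts a namespace line after earlier
definitions. It also suggests that splitting a single Document across shards
is only safe if each shard is a valid file on its own.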
