Makes sense, I missed that part. That is why a generic "inlining" scheme
is problematic: it depends on how the runner encodes the elements on the
wire. And that is why TestStream's output needs to be encoded into raw
bytes, because the wire coder is unknown to the SDK when submitting the
job.
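For concreteness, a rough sketch of what I mean, using the Python SDK's
coder API (the concrete coder is just an example I picked):

from apache_beam.coders import VarIntCoder

# At submission time the SDK only knows the element coder it chose
# itself, not the wire coder the runner will later pick, so a TestStream
# payload has to carry coder-encoded raw bytes (plus that coder's id).
coder = VarIntCoder()
encoded_elements = [coder.encode(x) for x in [1, 2, 3]]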
Thanks for the clarification and for bearing with me!
Jan
On 9/7/21 7:55 PM, Robert Bradshaw wrote:
On Mon, Sep 6, 2021 at 1:29 AM Jan Lukavský <[email protected]> wrote:
It is currently the latter for runners using this code (which not all
do, e.g. the ULR and Dataflow runners). I don't think we want to
ossify this decision as part of the spec. (Note that even what's
"known" and "unknown" can change from runner to runner.)
This is interesting and unexpected for me. How do runners decide how
they encode elements between the SDK harness and the runner? How do they
inform the SDK harness about this decision? My impression was that this
is well-defined at the model level. If not, then that would explain the
misunderstanding in this conversation. :-)
The coder id to use for a channel is specified by the runner on the
channel operation (both the input and the output) when sending a process
bundle descriptor. Runners decide based on their own capabilities (e.g.
what coders they understand vs. what needs wrapping), the SDK's
capabilities (e.g. for optimizations like the param windowed value
coder), and their needs (e.g. to do a GBK, they need the key bytes and
the value bytes separately, not just the key-value bytes).
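To make this concrete, a rough sketch of where that coder id lives,
using the Python proto bindings (field names from beam_fn_api.proto and
beam_runner_api.proto; the surrounding wiring is heavily simplified):

from apache_beam.portability.api import beam_fn_api_pb2, beam_runner_api_pb2

# The runner picks the wire coder and records its id on the gRPC data
# port of the read (and, symmetrically, the write) operation.
port = beam_fn_api_pb2.RemoteGrpcPort(coder_id="my_windowed_kv_coder")
read = beam_runner_api_pb2.PTransform(
    spec=beam_runner_api_pb2.FunctionSpec(
        urn="beam:runner:source:v1",
        payload=port.SerializeToString()))
descriptor = beam_fn_api_pb2.ProcessBundleDescriptor(
    id="pbd-1", transforms={"read": read})

(A real descriptor would of course also carry the coder definition
itself in its coders map; the SDK harness looks the id up there when it
sets up the data channel.)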