Thanks for sharing your thoughts. They help me understand the design of
the FnAPI more deeply, and it makes more sense to me now.
Thanks a lot, Robert!
Best,
Jincheng
On Tue, Nov 12, 2019 at 2:10 AM Robert Bradshaw wrote:
> On Fri, Nov 8, 2019 at 10:04 PM jincheng sun
> wrote:
> >
> > > Let us first defin
On Fri, Nov 8, 2019 at 10:04 PM jincheng sun wrote:
>
> > Let us first define what "standard coders" are. Usually they should be the
> > coders defined in the Proto. However, personally I think the coders defined
> > in the Java ModelCoders [1] seem more appropriate. The reason is that for
> > a
The SDKs need to know each of the coders defined in the proto. Go and
Python can't use the Java coders. Making a standard definition for the
coder, adding it to the proto enum, and implementing that coder in each SDK
is what makes the coders standard.
In other words, the Java model coders are the
> Let us first define what "standard coders" are. Usually they should be the
coders defined in the Proto. However, personally I think the coders defined
in the Java ModelCoders [1] seem more appropriate. The reason is that for
a coder which has already appeared in Proto and still not added to the Ja
Hi Robert Bradshaw,
Thanks a lot for the explanation. Very interesting topic!
Let us first define what "standard coders" are. Usually they should be the
coders defined in the Proto. However, personally I think the coders defined
in the Java ModelCoders [1] seem more appropriate. The reason is that
And by "I wasn't clear" I meant "I misread the options".
On Fri, Nov 8, 2019, 4:14 PM Robert Burke wrote:
> Reading back, I wasn't clear: the Go SDK does Option (1), putting the LP
> explicitly during encoding [1] for the runner proto, and explicitly expects
> LPs to contain a custom coder URN o
Reading back, I wasn't clear: the Go SDK does Option (1), putting the LP
explicitly during encoding [1] for the runner proto, and explicitly expects
LPs to contain a custom coder URN on decode for execution [2]. (Modulo an
old bug in Dataflow where the URN was empty.)
[1]
https://github.com/apache
On Fri, Nov 8, 2019 at 2:09 AM jincheng sun wrote:
>
> Hi,
>
> Sorry for my late reply. It seems the conclusion has been reached. I just
> want to share my personal thoughts.
>
> Generally, both option 1 and 3 make sense to me.
>
> >> The key concept here is not "standard coder" but "coder that t
Thank you for your comments. Here is the updated PR according to option
(1): https://github.com/apache/beam/pull/9997
-Max
On 08.11.19 11:08, jincheng sun wrote:
Hi,
Sorry for my late reply. It seems the conclusion has been reached. I
just want to share my personal thoughts.
Generally, bot
Hi,
Sorry for my late reply. It seems the conclusion has been reached. I just
want to share my personal thoughts.
Generally, both option 1 and 3 make sense to me.
>> The key concept here is not "standard coder" but "coder that the
>> runner does not understand." This knowledge is only in the run
While the Go SDK doesn't yet support a State API, Option 3) is what the Go SDK
does for all non-standard coders (aka custom coders) anyway.
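For readers following the thread, here is a minimal sketch of what length-prefixing a custom coder's output amounts to. It is not Beam's or the Go SDK's actual implementation. The function names (encode_varint, length_prefix, split_length_prefixed) are hypothetical, and the base-128 varint length header is an assumption made for illustration. The point is only that a runner can find element boundaries without understanding the custom encoding inside.

```python
# Illustrative only: hypothetical helpers, not taken from any Beam SDK.

def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a base-128 varint (low 7 bits first)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def length_prefix(payload: bytes) -> bytes:
    """Wrap opaque bytes from a custom coder so they are self-delimiting."""
    return encode_varint(len(payload)) + payload


def split_length_prefixed(stream: bytes) -> list[bytes]:
    """Split a stream of length-prefixed elements without decoding them."""
    elements = []
    pos = 0
    while pos < len(stream):
        size, pos = decode_varint(stream, pos)
        elements.append(stream[pos:pos + size])
        pos += size
    return elements
```

The cost discussed in this thread is visible here: each element carries at least one extra byte for its length, in exchange for the receiver being able to treat the payload as opaque bytes.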
For wire transfer, the Java Runner also adds a LengthPrefixCoder for the
coder and its subcomponents. The problem is that this is an implicit
assumption
On Thu, Nov 7, 2019 at 8:22 AM Robert Bradshaw wrote:
> On Thu, Nov 7, 2019 at 6:26 AM Maximilian Michels wrote:
> >
> > Thanks for the feedback thus far. Some more comments:
> >
> > > Instead, the runner knows ahead of time that it
> > > will need to instantiate this coder, and should update th
On Thu, Nov 7, 2019 at 6:26 AM Maximilian Michels wrote:
>
> Thanks for the feedback thus far. Some more comments:
>
> > Instead, the runner knows ahead of time that it
> > will need to instantiate this coder, and should update the bundle
> > processor to specify KvCoder,
> > VarIntCoder> as the c
While the Go SDK doesn't yet support a State API, Option 3) is what the Go
SDK does for all non-standard coders (aka custom coders) anyway.
While this means that for certain custom encodings of user types there may
be the overhead of length prefixing it, it's not likely to be the most
significant
Thanks for the feedback thus far. Some more comments:
Instead, the runner knows ahead of time that it
will need to instantiate this coder, and should update the bundle
processor to specify KvCoder,
VarIntCoder> as the coder so both can pull it out in a consistent way.
By "update the bundle pro
On Wed, Nov 6, 2019 at 2:55 AM Maximilian Michels wrote:
>
> Let me try to clarify:
>
> > The Coder used for State/Timers in a StatefulDoFn is pulled out of the
> > input PCollection. If a Runner needs to partition by this coder, it
> > should ensure the coder of this PCollection matches with the
Let me try to clarify:
The Coder used for State/Timers in a StatefulDoFn is pulled out of the
input PCollection. If a Runner needs to partition by this coder, it
should ensure the coder of this PCollection matches with the Coder
used to create the serialized bytes that are used for partitioning
Specifically, "We have no way of telling from the Runner side, if a length
prefix has been used or not." seems false. The runner has all the
information since length prefix is a model coder. Didn't we agree that all
coders should be self-delimiting in runner/SDK interactions, requiring
length-prefi
+1 to what Robert said.
On Tue, Nov 5, 2019 at 2:36 PM Robert Bradshaw wrote:
> The Coder used for State/Timers in a StatefulDoFn is pulled out of the
> input PCollection. If a Runner needs to partition by this coder, it
> should ensure the coder of this PCollection matches with the Coder
> used
The Coder used for State/Timers in a StatefulDoFn is pulled out of the
input PCollection. If a Runner needs to partition by this coder, it
should ensure the coder of this PCollection matches with the Coder
used to create the serialized bytes that are used for partitioning
(whether or not this is le
Hi,
I wanted to get your opinion on something that I have been struggling
with. It is about the coders for state requests in portable pipelines.
In contrast to "classic" Beam, the Runner is not guaranteed to know
which coder is used by the SDK. If the SDK happens to use a standard
coder (als