It would still be a standard coder. The distinction I'm proposing is that
there are certain coders that _must_ be implemented by a new runner/SDK
(for example windowedvalue, varint, kv, ...) since they are required for
SDK-runner communication, whereas now we're also starting to standardize
coders that are useful for cross-language transforms and schemas.

On Fri, Sep 27, 2019 at 5:35 PM Chad Dombrova <chad...@gmail.com> wrote:

> Would BooleanCoder continue to fall into this category?  I was under the
> impression we might make it a full-fledged standard coder with this PR.
>
>
>
> On Fri, Sep 27, 2019 at 5:32 PM Brian Hulette <bhule...@google.com> wrote:
>
>> +1, thank you!
>>
>> Note: in my Row Coder PR I added a new section for "Additional Standard
>> Coders", i.e. coders that have a URN but aren't required for a new
>> runner/SDK to implement the Beam model:
>> https://github.com/apache/beam/pull/9188/files#diff-f0d64c2cfc4583bfe2a7e5ee59818ae2R646
>>
>> I think this would belong there as well, assuming that is a
>> distinction we want to make.
>>
>> On Fri, Sep 27, 2019 at 5:22 PM Thomas Weise <t...@apache.org> wrote:
>>
>>> +1 for adding the coder
>>>
>>> Please also add a test here:
>>> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml
>>>
>>>
>>> On Fri, Sep 27, 2019 at 5:17 PM Chad Dombrova <chad...@gmail.com> wrote:
>>>
>>>> Are there any dissenting votes to making a BooleanCoder a standard
>>>> (portable) coder?
>>>>
>>>> I'm happy to make a PR to implement a BooleanCoder in python (and to
>>>> add the Java BooleanCoder to the ModelCoderRegistrar) if everyone agrees
>>>> that this is useful.
>>>>
>>>> -chad
>>>>
>>>>
>>>> On Fri, Sep 27, 2019 at 3:32 PM Robert Bradshaw <rober...@google.com>
>>>> wrote:
>>>>
>>>>> I think boolean is useful to have. What I'm more skeptical of is
>>>>> adding standard types for variations like UnsignedInteger16, etc. that
>>>>> don't have natural representations in all languages.
>>>>>
>>>>> On Fri, Sep 27, 2019 at 2:46 PM Brian Hulette <bhule...@google.com>
>>>>> wrote:
>>>>> >
>>>>> > Some more context from an offline discussion I had with +Robert
>>>>> Bradshaw a while ago: We both agreed all of the coders listed in BEAM-7996
>>>>> should be implemented in Python, but didn't come to a conclusion on whether
>>>>> or not they should actually be _standard_ coders, versus just being
>>>>> implicitly standard as part of row coder.
>>>>> >
>>>>> > On Fri, Sep 27, 2019 at 2:29 PM Kenneth Knowles <k...@apache.org>
>>>>> wrote:
>>>>> >>
>>>>> >> Yes, noted here:
>>>>> https://github.com/apache/beam/pull/9188/files#diff-f0d64c2cfc4583bfe2a7e5ee59818ae2R678
>>>>> and that links to https://issues.apache.org/jira/browse/BEAM-7996
>>>>> >>
>>>>> >> Kenn
>>>>> >>
>>>>> >> On Fri, Sep 27, 2019 at 12:57 PM Reuven Lax <re...@google.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Java has one, implemented as a byte coder. My guess is that nobody
>>>>> has gotten around to implementing it yet for portability.
>>>>> >>>
>>>>> >>> On Fri, Sep 27, 2019 at 12:44 PM Chad Dombrova <chad...@gmail.com>
>>>>> wrote:
>>>>> >>>>
>>>>> >>>> Hi all,
>>>>> >>>> It seems a bit unfortunate that there isn’t a portable way to
>>>>> serialize a boolean value.
>>>>> >>>>
>>>>> >>>> I’m working on porting my external PubsubIO PR over to use the
>>>>> improved schema-based external transform API in python, but because of this
>>>>> limitation I can’t use boolean values. For example, this fails:
>>>>> >>>>
>>>>> >>>> ReadFromPubsubSchema = typing.NamedTuple(
>>>>> >>>>     'ReadFromPubsubSchema',
>>>>> >>>>     [
>>>>> >>>>         ('topic', typing.Optional[unicode]),
>>>>> >>>>         ('subscription', typing.Optional[unicode]),
>>>>> >>>>         ('id_label',  typing.Optional[unicode]),
>>>>> >>>>         ('with_attributes', bool),
>>>>> >>>>         ('timestamp_attribute',  typing.Optional[unicode]),
>>>>> >>>>     ]
>>>>> >>>> )
>>>>> >>>>
>>>>> >>>> It fails because coders.get_coder(bool) returns the non-portable
>>>>> pickle coder.
>>>>> >>>>
>>>>> >>>> In the short term I can hack something into the external
>>>>> transform API to use varint coder for bools, but this kind of hacky
>>>>> approach to portability won’t work in scenarios where round-tripping is
>>>>> required without user intervention. In other words, in python it is not
>>>>> uncommon to test if x is True, in which case the integer 1 would fail this
>>>>> test. All of that is to say that a BooleanCoder would be a convenient way
>>>>> to ensure the proper type is used everywhere.
>>>>> >>>>
>>>>> >>>> So, I was just wondering why it’s not there? Are there concerns
>>>>> over whether booleans are universal enough to make part of the portability
>>>>> standard?
>>>>> >>>>
>>>>> >>>> -chad
>>>>>
>>>>
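
To make the round-tripping concern from the original message concrete, here
is a minimal sketch in plain Python. It does not use the Beam API, and all
function names are hypothetical. The single-byte layout (0x01 for True, 0x00
for False) is an assumption based on the remark above that the Java coder is
implemented as a byte coder. It contrasts that layout with the varint
workaround, which hands back the integer 1 instead of True.

```python
def encode_bool(value):
    """Encode a bool as one byte (assumed layout: 0x01 True, 0x00 False)."""
    if not isinstance(value, bool):
        raise TypeError("expected bool, got %s" % type(value).__name__)
    return b"\x01" if value else b"\x00"


def decode_bool(data):
    """Decode one byte back into a real bool, rejecting anything else."""
    if data == b"\x01":
        return True
    if data == b"\x00":
        return False
    raise ValueError("invalid boolean encoding: %r" % (data,))


def encode_varint(value):
    """Encode a non-negative int as a base-128 varint (the workaround)."""
    out = bytearray()
    while True:
        bits = value & 0x7F
        value >>= 7
        if value:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)
            return bytes(out)


def decode_varint(data):
    """Decode a base-128 varint; the result is always an int, never a bool."""
    result = 0
    shift = 0
    for byte in bytearray(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return result


# A dedicated boolean coder preserves the type across a round trip.
via_bool = decode_bool(encode_bool(True))

# The varint workaround preserves the value but not the type.
via_varint = decode_varint(encode_varint(True))

print(via_bool is True)    # identity check passes
print(via_varint == True)  # equality still holds
print(via_varint is True)  # identity check fails: this is the int 1
```

Both encodings happen to produce the same byte for True here; the difference
is entirely on the decoding side, where only a coder that knows it is
handling a boolean can return the proper type.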
