Some more context from an offline discussion I had with +Robert Bradshaw
<rober...@google.com> a while ago: we both agreed that all of the coders
listed in BEAM-7996 should be implemented in Python, but we didn't reach a
conclusion on whether they should actually be _standard_ coders or just be
implicitly standard as part of the row coder.

On Fri, Sep 27, 2019 at 2:29 PM Kenneth Knowles <k...@apache.org> wrote:

> Yes, noted here:
> https://github.com/apache/beam/pull/9188/files#diff-f0d64c2cfc4583bfe2a7e5ee59818ae2R678
> and that links to https://issues.apache.org/jira/browse/BEAM-7996
>
> Kenn
>
> On Fri, Sep 27, 2019 at 12:57 PM Reuven Lax <re...@google.com> wrote:
>
>> Java has one, implemented as a byte coder. My guess is that nobody has
>> gotten around to implementing it yet for portability.
>>
>> On Fri, Sep 27, 2019 at 12:44 PM Chad Dombrova <chad...@gmail.com> wrote:
>>
>>> Hi all,
>>> It seems a bit unfortunate that there isn’t a portable way to serialize
>>> a boolean value.
>>>
>>> I’m working on porting my external PubsubIO PR over to use the improved
>>> schema-based external transform API in Python, but because of this
>>> limitation I can’t use boolean values. For example, this fails:
>>>
>>> ReadFromPubsubSchema = typing.NamedTuple(
>>>     'ReadFromPubsubSchema',
>>>     [
>>>         ('topic', typing.Optional[unicode]),
>>>         ('subscription', typing.Optional[unicode]),
>>>         ('id_label', typing.Optional[unicode]),
>>>         ('with_attributes', bool),
>>>         ('timestamp_attribute', typing.Optional[unicode]),
>>>     ]
>>> )
>>>
>>> It fails because coders.get_coder(bool) returns the non-portable pickle
>>> coder.
>>>
>>> In the short term I can hack something into the external transform API
>>> to use the varint coder for bools, but this kind of workaround won’t
>>> hold up in scenarios where values must round-trip without user
>>> intervention. In Python it is not uncommon to test if x is True, and the
>>> integer 1 that a varint coder decodes to would fail that test (1 == True
>>> holds, but 1 is True does not). All of that is to say that a
>>> BooleanCoder would be a convenient way to ensure the proper type is
>>> used everywhere.
>>>
>>> So, I was just wondering why it’s not there? Are there concerns over
>>> whether booleans are universal enough to make part of the portability
>>> standard?
>>>
>>> -chad
>>>
>>
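
For reference, the round-tripping concern raised in the original message can
be sketched in a few lines of plain Python. This is only an illustration under
stated assumptions: BooleanCoderSketch, encode_varint and decode_varint are
hypothetical stand-alone helpers written for this example, not Beam's actual
coder classes, and the single-byte encoding is assumed from the remark above
that the Java coder is implemented as a byte coder.

```python
def encode_varint(n):
    """Hypothetical helper: base-128 varint encoding of a non-negative int."""
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)
            return bytes(out)


def decode_varint(data):
    """Hypothetical helper: decode a single varint back to an int."""
    result = 0
    shift = 0
    for byte in bytearray(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return result


class BooleanCoderSketch(object):
    """Hypothetical coder: one byte per bool (assumed encoding, see lead-in)."""

    def encode(self, value):
        if not isinstance(value, bool):
            raise TypeError('expected bool, got %r' % type(value))
        return b'\x01' if value else b'\x00'

    def decode(self, encoded):
        if encoded == b'\x01':
            return True
        if encoded == b'\x00':
            return False
        raise ValueError('invalid boolean encoding: %r' % (encoded,))


# The varint workaround loses the type: True comes back as the int 1.
via_varint = decode_varint(encode_varint(True))
# A dedicated boolean coder returns a real bool.
via_bool_coder = BooleanCoderSketch().decode(BooleanCoderSketch().encode(True))
```

With the varint workaround, via_varint == True still holds but
via_varint is True does not, which is exactly the check that breaks; the
value decoded by the boolean sketch passes both.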
