Yeah, I believe they're all bugs/missing features in the Python
implementation.  The nullable BitSet one is arguably a bug in the Java
implementation, but since there's no low-level spec on how Rows are
actually encoded, it's hard to say who's right.  I think Go might have the
same bug there, in which case that's two languages doing it "wrong" and one
doing it "right". :P
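
To make the BitSet mismatch from the quoted message below concrete, here is a minimal sketch. It is not Beam's actual coder code: the function names are hypothetical, and the only behavior it relies on is that java.util.BitSet.toByteArray() packs bits little-endian and omits trailing all-zero bytes. The decoder shows one tolerant way to read such a bitmap, assuming the reader knows num_fields from the schema.

```python
def encode_null_bitmap(is_null):
    """Pack null flags like java.util.BitSet.toByteArray() (hypothetical helper).

    Bit i lives in byte i // 8 at position i % 8, and trailing
    all-zero bytes are dropped.
    """
    packed = bytearray((len(is_null) + 7) // 8)
    for i, flag in enumerate(is_null):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    # Mirror the BitSet behavior: drop bytes after the highest set bit.
    while packed and packed[-1] == 0:
        packed.pop()
    return bytes(packed)


def decode_null_bitmap(data, num_fields):
    """Read a possibly truncated bitmap by padding it with zero bytes."""
    expected = (num_fields + 7) // 8
    # A negative pad count yields b"", so over-long input is left alone.
    padded = data + b"\x00" * (expected - len(data))
    return [bool((padded[i // 8] >> (i % 8)) & 1) for i in range(num_fields)]


# 12 fields, only field 1 is null: the second byte is all zeros and is omitted.
flags = [False, True] + [False] * 10
encoded = encode_null_bitmap(flags)
decoded = decode_null_bitmap(encoded, len(flags))
```

A reader that insists on a fixed number of bitmap bytes would come up short on an input like `encoded` above, which is the failure mode described in the thread; padding on the read side is one possible fix, always writing the full width on the Java side would be another.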

On Tue, Oct 12, 2021 at 4:20 PM Reuven Lax <[email protected]> wrote:

> These are bugs in Python, correct?
>
> On Tue, Oct 12, 2021 at 1:18 PM Steve Niemitz <[email protected]> wrote:
>
>> It seems like there's a good amount of incompatibility between Java and
>> Python wrt Beam Rows.  For example, the following are unsupported in Python
>> (that I've noticed so far):
>> - BYTE
>> - INT16
>> - OneOf
>>
>> Additionally, it seems like nullable fields don't really work correctly:
>> the Java BitSetCoder won't encode trailing empty bytes in the BitSet, but
>> the Python side expects all num_fields / 8 bytes to be present. [1]
>>
>> Certainly these are bugs, but it seems to point to a lack of
>> integration testing for xlang interop in general.  I plan on submitting PRs
>> to fix the bugs above (or at least some of them).  Are there tests I can
>> change to better exercise these paths?
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L198
>>
>
