[DISCUSS] change the encoding scheme of Python StrUtf8Coder

Heejong Lee Wed, 03 Apr 2019 17:04:47 -0700

Hi all,

It looks like the UTF-8 string coders in the Java and Python SDKs use
different encoding schemes. StringUtf8Coder in the Java SDK writes the
varint-encoded length of the input string before the actual data bytes,
whereas StrUtf8Coder in the Python SDK encodes the input string directly to
bytes with no length prefix. For the last few weeks, I've been testing and
fixing cross-language IO transforms, and this discrepancy is a major blocker
for me. IMO, we should unify the encoding scheme for UTF-8 strings across
the SDKs and make it a standard coder. Any thoughts?
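
To make the difference concrete, here is a minimal sketch in plain Python of
the two schemes as described above. It does not call either SDK; the function
names are hypothetical, and it assumes the length prefix is the UTF-8 byte
count written as a base-128 varint.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a base-128 varint (low 7-bit group first)."""
    if n < 0:
        raise ValueError("varint sketch only handles non-negative values")
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            # More groups follow: set the continuation bit.
            out.append(bits | 0x80)
        else:
            out.append(bits)
            return bytes(out)


def encode_raw(s: str) -> bytes:
    """Scheme attributed to the Python coder above: UTF-8 bytes only."""
    return s.encode("utf-8")


def encode_length_prefixed(s: str) -> bytes:
    """Scheme attributed to the Java coder above: varint length, then UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_varint(len(data)) + data


if __name__ == "__main__":
    print(encode_raw("abc"))              # b'abc'
    print(encode_length_prefixed("abc"))  # b'\x03abc'
```

A reader expecting the prefixed form would consume the first byte of the raw
form as a length, which is why the two sides cannot exchange strings as is.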


Thanks,
