Heejong Lee created BEAM-7008:
---------------------------------
Summary: standardize UTF-8 string coder encodings
Key: BEAM-7008
URL: https://issues.apache.org/jira/browse/BEAM-7008
Project: Beam
Issue Type: Bug
Components: sdk-java-core, sdk-py-core
Reporter: Heejong Lee
Assignee: Heejong Lee
It looks like the UTF-8 string coders in the Java and Python SDKs use
different encoding schemes. StringUtf8Coder in the Java SDK writes the
varint-encoded length of the input string before the actual data bytes,
whereas StrUtf8Coder in the Python SDK encodes the input string directly to
bytes with no length prefix. We should unify the UTF-8 string encoding scheme
across the SDKs and make it a standard coder.
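
To make the mismatch concrete, here is a minimal sketch of the two schemes as described above. It is not the SDK code: the function names are hypothetical, and it assumes that the varint is the usual base-128 encoding (7 bits per byte, least significant group first) and that the length counts UTF-8 bytes rather than characters.

```python
def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)  # final byte
            return bytes(out)


def encode_length_prefixed(s: str) -> bytes:
    """Hypothetical stand-in for the Java-style scheme: varint length, then data."""
    data = s.encode("utf-8")
    # Assumption: the prefix is the byte length of the UTF-8 data.
    return encode_varint(len(data)) + data


def encode_raw(s: str) -> bytes:
    """Hypothetical stand-in for the Python-style scheme: data bytes only."""
    return s.encode("utf-8")


if __name__ == "__main__":
    print(encode_length_prefixed("abc"))
    print(encode_raw("abc"))
```

Under these assumptions the same string yields different byte sequences, so a value encoded by one SDK cannot be decoded by the other without knowing which scheme was used.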