[
https://issues.apache.org/jira/browse/BEAM-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490435#comment-17490435
]
Brian Hulette commented on BEAM-10529:
--------------------------------------
[~johnjcasey] reached out offline for some pointers on defining a standard
coder.
1. Assign a URN for the coder and document it in the model:
[beam_runner_api.proto|https://github.com/apache/beam/blob/0e057fd71f96a12d74307799adabf63da9003d11/model/pipeline/src/main/proto/beam_runner_api.proto#L790].
2. Each SDK (Java, Python, Go) has an approach for registering Coder
implementations for URNs (you can find these by searching for a coder URN in
the codebase, e.g. in Java it's
[ModelCoderRegistrar|https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ModelCoderRegistrar.java]).
You'll need to find all of these places and register the SDK's NullableCoder
implementation for the new URN.
3. Add test cases to
[standard_coders.yaml|https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml].
These will be picked up by
[CommonCoderTest.java|https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/wire/CommonCoderTest.java],
[standard_coders_test.py|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/standard_coders_test.py],
and
[fromyaml.go|https://github.com/apache/beam/blob/master/sdks/go/test/regression/coders/fromyaml/fromyaml.go]
to verify the coders in their respective SDKs.
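
To make the three steps concrete, here is a rough, self-contained sketch in plain Python. It does not use any Beam APIs: the URNs, the registry helpers, and the test-case list are all hypothetical stand-ins for the proto entry, the per-SDK registrars, and standard_coders.yaml. The wire format (0x00 for null, 0x01 followed by the component encoding) is an assumption for illustration.

```python
# Toy illustration only: none of these names are Beam APIs.
from typing import Callable, Dict

# Step 1 (hypothetical URNs): the real URN would be assigned and
# documented in beam_runner_api.proto.
BYTES_URN = "example:coder:bytes:v1"
NULLABLE_URN = "example:coder:nullable:v1"


class BytesCoder:
    """Stand-in component coder: passes bytes through unchanged."""

    def encode(self, value: bytes) -> bytes:
        return value

    def decode(self, data: bytes) -> bytes:
        return data


class NullableCoder:
    """Wraps a component coder so that None can be encoded.

    Assumed wire format: 0x00 for None, otherwise 0x01 followed by
    the component coder's encoding.
    """

    def __init__(self, component):
        self.component = component

    def encode(self, value):
        if value is None:
            return b"\x00"
        return b"\x01" + self.component.encode(value)

    def decode(self, data: bytes):
        marker = data[:1]
        if marker == b"\x00":
            return None
        if marker != b"\x01":
            raise ValueError("unexpected null marker: %r" % marker)
        return self.component.decode(data[1:])


# Step 2: each SDK keeps some mapping from URN to coder implementation.
CODER_REGISTRY: Dict[str, Callable] = {}


def register_coder(urn: str, factory: Callable) -> None:
    if urn in CODER_REGISTRY:
        raise ValueError("URN already registered: " + urn)
    CODER_REGISTRY[urn] = factory


def coder_from_urn(urn: str, *components):
    return CODER_REGISTRY[urn](*components)


register_coder(BYTES_URN, BytesCoder)
register_coder(NULLABLE_URN, NullableCoder)

# Step 3: shared (encoded, value) pairs that every SDK must agree on.
TEST_CASES = [
    (b"\x00", None),       # null value
    (b"\x01", b""),        # present but empty value
    (b"\x01abc", b"abc"),  # present, non-empty value
]


def run_test_cases() -> int:
    """Round-trips every test case and returns how many were checked."""
    coder = coder_from_urn(NULLABLE_URN, coder_from_urn(BYTES_URN))
    for encoded, value in TEST_CASES:
        assert coder.decode(encoded) == value
        assert coder.encode(value) == encoded
    return len(TEST_CASES)
```

Calling run_test_cases() round-trips each case through the registered coder. The first two cases are the interesting ones for this issue: a null value and an empty value must encode differently.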
> Kafka XLang fails for "empty" key/values
> ----------------------------------------
>
> Key: BEAM-10529
> URL: https://issues.apache.org/jira/browse/BEAM-10529
> Project: Beam
> Issue Type: Bug
> Components: cross-language, io-java-kafka
> Reporter: Luke Cwik
> Assignee: John Casey
> Priority: P1
>
> According to their Javadoc, ByteArrayDeserializer and StringDeserializer
> can return null [1, 2], but we aren't using
> NullableCoder.of(ByteArrayCoder.of()) in the expansion [3]. Note that KafkaIO
> does this correctly in its regular coder inference logic [4].
> 1:
> [https://kafka.apache.org/21/javadoc/org/apache/kafka/common/serialization/ByteArrayDeserializer.html#deserialize-java.lang.String-byte:A-]
> 2:
> [https://kafka.apache.org/21/javadoc/org/apache/kafka/common/serialization/StringDeserializer.html#deserialize-java.lang.String-byte:A-]
> 3:
> [https://github.com/apache/beam/blob/af2d6b0379d64b522ecb769d88e9e7e7b8900208/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L478]
> 4:
> [https://github.com/apache/beam/blob/af2d6b0379d64b522ecb769d88e9e7e7b8900208/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/LocalDeserializerProvider.java#L85]
--
This message was sent by Atlassian Jira
(v8.20.1#820001)