[
https://issues.apache.org/jira/browse/BEAM-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977470#comment-15977470
]
Kenneth Knowles commented on BEAM-2021:
---------------------------------------
Yeah, that's pretty much it. We'll have an added layer of "CoderEncoders"
registered as services.
For core things, it's somewhat easy:
- In SDK core: KvCoder with just basic accessors getKeyCoder() etc.
- In core construction:
  - KvCoderEncoder that knows the URN and puts the getKeyCoder() and
    getValueCoder() results into the component coders
  - If there were some payload that needed to be added, it can be built
    directly as a proto, so that it is cross-SDK (and cross-runner if they need to
    know about it). What the payload might be depends on the URN, so you can have a
    HeapCoder with explicit component coders but also a Java-serialized comparator.
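
A minimal sketch of the split described above, not Beam's actual code. The Coder interface and KV class are simplified stand-ins, EncodedCoder is a hypothetical holder for "URN plus component coders plus payload", and the URN string is a made-up placeholder. The payload is left as empty bytes where a proto would otherwise be built.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for the SDK's Coder type (hypothetical, greatly simplified).
interface Coder<T> {}

// Hypothetical leaf coder, used only as a component in this sketch.
final class StringCoder implements Coder<String> {}

// Stand-in for a key/value pair.
final class KV<K, V> {
  final K key;
  final V value;

  KV(K key, V value) {
    this.key = key;
    this.value = value;
  }
}

// SDK core: KvCoder exposes only its basic accessors.
final class KvCoder<K, V> implements Coder<KV<K, V>> {
  private final Coder<K> keyCoder;
  private final Coder<V> valueCoder;

  private KvCoder(Coder<K> keyCoder, Coder<V> valueCoder) {
    this.keyCoder = keyCoder;
    this.valueCoder = valueCoder;
  }

  static <K, V> KvCoder<K, V> of(Coder<K> keyCoder, Coder<V> valueCoder) {
    return new KvCoder<>(keyCoder, valueCoder);
  }

  Coder<K> getKeyCoder() {
    return keyCoder;
  }

  Coder<V> getValueCoder() {
    return valueCoder;
  }
}

// Hypothetical portable form: URN, component coders, opaque payload.
final class EncodedCoder {
  final String urn;
  final List<Coder<?>> components;
  final byte[] payload;

  EncodedCoder(String urn, List<Coder<?>> components, byte[] payload) {
    this.urn = urn;
    this.components = components;
    this.payload = payload;
  }
}

// Core construction: knows the URN and which accessors supply the components.
final class KvCoderEncoder {
  static final String URN = "urn:example:coder:kv"; // placeholder, not a real Beam URN

  EncodedCoder encode(KvCoder<?, ?> coder) {
    List<Coder<?>> components =
        Arrays.<Coder<?>>asList(coder.getKeyCoder(), coder.getValueCoder());
    return new EncodedCoder(URN, components, new byte[0]); // KV needs no payload here
  }
}
```

The point of the split is that KvCoder itself never mentions URNs or payloads; only the encoder does.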
For non-core stuff like IOs or library transforms, it should be similar. In the
extension library, include:
- The FooCoder with its natural interface
- A registered service for its FooCoderEncoder
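
One way an extension library could ship that pair, sketched under assumptions: FooCoder's accessor, the CoderEncoder interface shape, CoderEncoderRegistry, and the URN are all hypothetical. The registry is filled by hand to keep the sketch self-contained; a real setup might discover encoders through java.util.ServiceLoader instead.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the SDK's Coder type (hypothetical, simplified).
interface Coder<T> {}

// Extension library: FooCoder with only its natural interface.
final class FooCoder implements Coder<String> {
  private final int width;

  FooCoder(int width) {
    this.width = width;
  }

  int getWidth() {
    return width;
  }
}

// Hypothetical encoder contract: ties one coder class to a URN.
interface CoderEncoder<C extends Coder<?>> {
  Class<C> getCoderClass();

  String getUrn(C coder);
}

// Extension library: the encoder that would be registered as a service.
final class FooCoderEncoder implements CoderEncoder<FooCoder> {
  @Override
  public Class<FooCoder> getCoderClass() {
    return FooCoder.class;
  }

  @Override
  public String getUrn(FooCoder coder) {
    return "urn:example:coder:foo"; // placeholder URN
  }
}

// Hypothetical registry that looks up the encoder for a coder instance.
final class CoderEncoderRegistry {
  private final Map<Class<?>, CoderEncoder<?>> encoders = new HashMap<>();

  <C extends Coder<?>> void register(CoderEncoder<C> encoder) {
    encoders.put(encoder.getCoderClass(), encoder);
  }

  @SuppressWarnings("unchecked")
  <C extends Coder<?>> String urnFor(C coder) {
    // Safe by construction: register() keys each encoder by its own coder class.
    CoderEncoder<C> encoder = (CoderEncoder<C>) encoders.get(coder.getClass());
    if (encoder == null) {
      throw new IllegalArgumentException(
          "No encoder registered for " + coder.getClass().getName());
    }
    return encoder.getUrn(coder);
  }
}
```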
So what I mean by the tricky bit is the design decision between:
- CoderEncoder interface lives in the SDK and does not have proto on its API
surface (or we figure out a way for this to be safe)
- CoderEncoder interface lives in core construction (or another module...) and
IOs that want to have cross-language/grokkable/compact encodings take a
dependency
- Other option?
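
To make the first option concrete, here is a rough sketch of an encoder interface whose signatures use only JDK types, so the SDK would not need a proto dependency. Everything in it is an assumption for illustration: the method names, FixedWidthCoder, and the URN are hypothetical, and the byte[] payload stands in for a proto that the implementation could build and serialize internally.

```java
import java.util.Collections;
import java.util.List;

// Stand-in for the SDK's Coder type (hypothetical, simplified).
interface Coder<T> {}

// Option 1: lives in the SDK; no proto type appears in any signature.
interface CoderEncoder<C extends Coder<?>> {
  String getUrn();

  List<? extends Coder<?>> getComponents(C coder);

  // Opaque bytes; an implementation may serialize a proto to produce them.
  byte[] getPayload(C coder);
}

// Hypothetical coder with no components and a small configuration value.
final class FixedWidthCoder implements Coder<byte[]> {
  final int width;

  FixedWidthCoder(int width) {
    this.width = width;
  }
}

final class FixedWidthCoderEncoder implements CoderEncoder<FixedWidthCoder> {
  @Override
  public String getUrn() {
    return "urn:example:coder:fixed_width"; // placeholder URN
  }

  @Override
  public List<? extends Coder<?>> getComponents(FixedWidthCoder coder) {
    return Collections.emptyList();
  }

  @Override
  public byte[] getPayload(FixedWidthCoder coder) {
    // Toy encoding: a single byte holding the width (assumes width < 128).
    return new byte[] {(byte) coder.width};
  }
}
```

The cost of this shape is that the payload loses its type at the interface boundary; the second option would keep the proto type visible but push the dependency onto the IOs.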
> Fix Java's Coder class hierarchy
> --------------------------------
>
> Key: BEAM-2021
> URL: https://issues.apache.org/jira/browse/BEAM-2021
> Project: Beam
> Issue Type: Improvement
> Components: beam-model-runner-api, sdk-java-core
> Affects Versions: First stable release
> Reporter: Kenneth Knowles
> Assignee: Thomas Groh
>
> This is thoroughly out of hand. In the runner API world, there are two paths:
> 1. URN plus component coders plus custom payload (in the form of component
> coders alongside an SdkFunctionSpec)
> 2. Custom coder (a single URN) and payload is serialized Java. I think this
> never has component coders.
> The other base classes have now been shown to be extraneous: they favor
> saving ~3 lines of boilerplate for rarely written code at the cost of
> readability. Instead they should just be dropped.
> The custom payload is an Any proto in the runner API. But tying the Coder
> interface to proto would be unfortunate from a design perspective and cannot
> be done anyhow due to dependency hell.