[ https://issues.apache.org/jira/browse/BEAM-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247371#comment-17247371 ]
Beam JIRA Bot commented on BEAM-11154:
--------------------------------------
This issue was marked "stale-assigned" and has not received a public comment in
7 days. It is now automatically unassigned. If you are still working on it, you
can assign it to yourself again. Please also give an update about the status of
the work.
> Missing coder in pipeline components with dataflow runner v2
> ------------------------------------------------------------
>
> Key: BEAM-11154
> URL: https://issues.apache.org/jira/browse/BEAM-11154
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Yichi Zhang
> Priority: P2
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> When running pipelines that use the Top combine function on Dataflow Runner
> v2, the backend complains about a missing coder id, for example a missing
> BoundedHeapCoder1.
> After some troubleshooting, this problem seems more generic:
> The step context translation phase does not recognize an already registered
> Coder whose hashCode() function is incorrect, and tries to assign it a new
> uniqified name as its pipeline_proto_coder_id.
> Code pointer:
> https://github.com/apache/beam/blob/5675108933de6eb601ca2e4f21870d2ababe0ec7/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/SdkComponents.java#L268
> In this case, since the comparator field in BoundedHeapCoder often does not
> implement hashCode() and equals(), the BoundedHeapCoder will also have a
> different hashCode() each time a new instance is created. The duplicated
> coder does not exist in the already translated pipeline proto, which leads to
> the aforementioned missing coder id issue.
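
The mechanism described above can be reproduced without Beam. The sketch below
is a simplified stand-in, not the actual SdkComponents or BoundedHeapCoder
code: the names CoderIdSketch, HeapCoderSketch, CoderRegistrySketch,
IdentityComparator and ValueComparator are all hypothetical, and the registry
is assumed to look up already registered coders through equals() and
hashCode(), as a hash map would. It shows that two logically identical coders
receive two different ids when their comparator falls back to Object identity,
and the same id once the comparator defines value-based equality.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class CoderIdSketch {

  /** Hypothetical stand-in for a coder that holds a comparator field. */
  static final class HeapCoderSketch {
    private final int maxSize;
    private final Comparator<Integer> comparator;

    HeapCoderSketch(int maxSize, Comparator<Integer> comparator) {
      this.maxSize = maxSize;
      this.comparator = comparator;
    }

    // Equality is delegated to the comparator, so it is only as good as
    // the comparator's own equals()/hashCode().
    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof HeapCoderSketch)) return false;
      HeapCoderSketch other = (HeapCoderSketch) o;
      return maxSize == other.maxSize && comparator.equals(other.comparator);
    }

    @Override
    public int hashCode() {
      return Objects.hash(maxSize, comparator);
    }
  }

  /** Comparator that inherits identity-based equals()/hashCode() from Object. */
  static final class IdentityComparator implements Comparator<Integer> {
    @Override
    public int compare(Integer a, Integer b) {
      return Integer.compare(a, b);
    }
  }

  /** Same ordering, but every instance is equal to every other instance. */
  static final class ValueComparator implements Comparator<Integer> {
    @Override
    public int compare(Integer a, Integer b) {
      return Integer.compare(a, b);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof ValueComparator;
    }

    @Override
    public int hashCode() {
      return ValueComparator.class.hashCode();
    }
  }

  /** Hypothetical registry: returns the existing id or assigns a new one. */
  static final class CoderRegistrySketch {
    private final Map<HeapCoderSketch, String> ids = new HashMap<>();

    String register(HeapCoderSketch coder) {
      String existing = ids.get(coder);
      if (existing != null) {
        return existing;
      }
      // Assumed naming scheme for illustration: base name plus a counter.
      String id = "HeapCoderSketch" + ids.size();
      ids.put(coder, id);
      return id;
    }
  }

  public static void main(String[] args) {
    CoderRegistrySketch registry = new CoderRegistrySketch();

    // Two fresh instances with identity-based comparators: never equal,
    // so the second registration is treated as a new coder.
    String a = registry.register(new HeapCoderSketch(10, new IdentityComparator()));
    String b = registry.register(new HeapCoderSketch(10, new IdentityComparator()));
    System.out.println("identity comparator, same id: " + a.equals(b));

    // Two fresh instances with value-based comparators: equal, so the
    // second registration finds the id assigned to the first.
    String c = registry.register(new HeapCoderSketch(10, new ValueComparator()));
    String d = registry.register(new HeapCoderSketch(10, new ValueComparator()));
    System.out.println("value comparator, same id: " + c.equals(d));
  }
}
```

In this sketch the first pair ends up with two distinct ids and the second
pair shares one, which mirrors the report: a coder re-created during a later
translation step is not matched to its earlier registration, so the id it
receives has no entry in the already translated pipeline proto.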
--
This message was sent by Atlassian Jira
(v8.3.4#803005)