[ 
https://issues.apache.org/jira/browse/BEAM-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247371#comment-17247371
 ] 

Beam JIRA Bot commented on BEAM-11154:
--------------------------------------

This issue was marked "stale-assigned" and has not received a public comment in 
7 days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.

> Missing coder in pipeline components with dataflow runner v2
> ------------------------------------------------------------
>
>                 Key: BEAM-11154
>                 URL: https://issues.apache.org/jira/browse/BEAM-11154
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Yichi Zhang
>            Priority: P2
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When running pipelines that use the Top combine function on Dataflow runner 
> v2, the backend complains about a missing coder id, for example a missing 
> BoundedHeapCoder1.
> After some troubleshooting, this problem appears to be more general:
> the step context translation phase does not recognize an already registered 
> Coder whose hashCode() is incorrect, and instead registers it again under a 
> new, uniquified pipeline_proto_coder_id.
> Code pointer:
> https://github.com/apache/beam/blob/5675108933de6eb601ca2e4f21870d2ababe0ec7/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/SdkComponents.java#L268
> In this case, because the comparator field in BoundedHeapCoder often does 
> not implement hashCode() and equals(), the BoundedHeapCoder itself also has 
> a different hashCode() each time a new instance is created. The duplicated 
> coder does not exist in the already translated pipeline proto, which leads 
> to the aforementioned missing coder id issue.
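
The failure mode described above can be reproduced outside of Beam. The following is only a minimal sketch, not Beam's actual code: HeapCoder, Registry, IdentityComparator and ValueComparator are hypothetical stand-ins, and Registry merely imitates a coder-to-id map keyed on equals()/hashCode(), which is assumed here to be how the linked SdkComponents code deduplicates coders.

```java
import java.io.Serializable;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class CoderIdSketch {

  /** Hypothetical stand-in for a coder that wraps a comparator, like BoundedHeapCoder. */
  static final class HeapCoder {
    private final int maxSize;
    private final Comparator<Integer> comparator;

    HeapCoder(int maxSize, Comparator<Integer> comparator) {
      this.maxSize = maxSize;
      this.comparator = comparator;
    }

    // Equality is delegated to the comparator, so it is only as good as
    // the comparator's own equals()/hashCode().
    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof HeapCoder)) {
        return false;
      }
      HeapCoder other = (HeapCoder) o;
      return maxSize == other.maxSize && comparator.equals(other.comparator);
    }

    @Override
    public int hashCode() {
      return Objects.hash(maxSize, comparator);
    }
  }

  /** Comparator without equals()/hashCode(): it inherits identity semantics from Object. */
  static final class IdentityComparator implements Comparator<Integer>, Serializable {
    @Override
    public int compare(Integer a, Integer b) {
      return Integer.compare(a, b);
    }
  }

  /** Same ordering, but every instance is equal to every other instance. */
  static final class ValueComparator implements Comparator<Integer>, Serializable {
    @Override
    public int compare(Integer a, Integer b) {
      return Integer.compare(a, b);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof ValueComparator;
    }

    @Override
    public int hashCode() {
      return ValueComparator.class.hashCode();
    }
  }

  /** Hypothetical registry: hands out a fresh id for every coder it has not seen before. */
  static final class Registry {
    private final Map<Object, String> ids = new HashMap<>();

    String register(Object coder) {
      String id = ids.get(coder);
      if (id == null) {
        id = "HeapCoder" + ids.size();
        ids.put(coder, id);
      }
      return id;
    }

    int size() {
      return ids.size();
    }
  }

  /** Registers two logically identical coders and returns how many ids were handed out. */
  static int distinctIds(Comparator<Integer> first, Comparator<Integer> second) {
    Registry registry = new Registry();
    registry.register(new HeapCoder(10, first));
    registry.register(new HeapCoder(10, second));
    return registry.size();
  }

  public static void main(String[] args) {
    // Identity-based comparators: the second coder is treated as new.
    System.out.println(distinctIds(new IdentityComparator(), new IdentityComparator()));
    // Value-based comparators: the second coder maps to the existing id.
    System.out.println(distinctIds(new ValueComparator(), new ValueComparator()));
  }
}
```

With the identity-based comparator the second registration produces a second id that a previously translated pipeline proto would not contain. With value-based equals() and hashCode() on the comparator both instances resolve to the same id.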



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
