[
https://issues.apache.org/jira/browse/BEAM-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486778#comment-17486778
]
Kyle Weaver commented on BEAM-13750:
------------------------------------
I was able to repro this using the following pipeline:
https://gist.github.com/ibzib/a103a487e1000d69289e6f104dbb4865
I looked at the resulting pipeline graph:
ref_PCollection_PCollection_1 is output by
'ref_AppliedPTransform_Read-from-Pub-Sub_2'.
Transform 'ref_AppliedPTransform_Read-from-Pub-Sub_2' has URN
'beam:transform:pickled_python:v1' and subtransform
'ref_AppliedPTransform_Read-from-Pub-Sub-Read_3' with URN
'beam:transform:pubsub_read:v1'.
Dataflow knows how to interpret these URNs, but the Flink runner does not. For
now, beam.io.external.gcp.pubsub.ReadFromPubSub should be used instead of
beam.io.gcp.pubsub.ReadFromPubSub. For a long-term fix we should either reject
the native version of ReadFromPubSub, or else expand it in the Flink runner.
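A minimal sketch of the workaround, in case it helps: it swaps in the external
(cross-language) ReadFromPubSub in an otherwise ordinary streaming pipeline.
The helper name build_pipeline and the topic string in the usage note are
hypothetical, and this assumes the installed Beam version can start or reach a
Java expansion service for the external transform, which I have not verified
for every release.

```python
def build_pipeline(topic, pipeline_args=None):
    """Build a streaming pipeline that reads Pub/Sub via the external transform.

    build_pipeline is a hypothetical helper for illustration only.
    """
    # Imports are deferred so this sketch can be loaded without Beam installed.
    import apache_beam as beam
    # The workaround: use the external version, not beam.io.gcp.pubsub.
    from apache_beam.io.external.gcp.pubsub import ReadFromPubSub
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.options.pipeline_options import StandardOptions

    options = PipelineOptions(pipeline_args)
    # Pub/Sub is an unbounded source, so the pipeline must run in streaming mode.
    options.view_as(StandardOptions).streaming = True

    pipeline = beam.Pipeline(options=options)
    _ = (
        pipeline
        | "Read from Pub/Sub" >> ReadFromPubSub(topic=topic)
        | "Print" >> beam.Map(print)
    )
    # The caller decides when to run, e.g. pipeline.run().wait_until_finish().
    return pipeline
```

Usage would look something like
build_pipeline("projects/my-project/topics/my-topic", ["--runner=FlinkRunner"]),
where the project and topic names are placeholders. Depending on the Beam
version, an expansion_service argument may need to be passed to ReadFromPubSub.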
> ReadFromPubSub fails on Flink Runner
> ------------------------------------
>
> Key: BEAM-13750
> URL: https://issues.apache.org/jira/browse/BEAM-13750
> Project: Beam
> Issue Type: Bug
> Components: runner-flink
> Reporter: Kyle Weaver
> Priority: P2
> Labels: portability-flink
>
> This was reported by a user on Slack. They are using Beam Python on Flink on
> Kubernetes.
> Caused by: java.lang.IllegalArgumentException: PCollectionNodes
> [PCollectionNode{id=ref_PCollection_PCollection_1, PCollection=unique_name:
> "22Read from Pub/Sub/Read.None"
> coder_id: "ref_Coder_BytesCoder_1"
> is_bounded: UNBOUNDED
> windowing_strategy_id: "ref_Windowing_Windowing_1"
> }] were consumed but never produced
> It looks like a bug in the Java fuser, similar to BEAM-6473.
> https://the-asf.slack.com/archives/C9H0YNP3P/p1642787607002700