[
https://issues.apache.org/jira/browse/BEAM-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883566#comment-16883566
]
Chad Dombrova commented on BEAM-3595:
-------------------------------------
I see that this is marked as fixed, but when using v2.13.0 of the Python SDK I
get a bad URN for pardo: "urn:beam:transform:pardo:v1" (also, there are at
least a dozen mentions of this ticket in the source code). In order to read my
pipeline back into Java I have to remove the urn: prefix. The URNs for other
transforms don't include the urn: prefix.
Here's a bad pardo message:
{noformat}
transforms {
  key: "ref_AppliedPTransform_Sleep_4"
  value {
    spec {
      urn: "urn:beam:transform:pardo:v1"
      payload: "\n\204\002\n\201\002\n
beam:dofn:pickled_python_info:v1\032\3Jj0F4"
    }
    inputs {
      key: "0"
      value: "ref_PCollection_PCollection_1"
    }
    outputs {
      key: "None"
      value: "ref_PCollection_PCollection_2"
    }
    unique_name: "Sleep"
  }
}
{noformat}
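The prefix-stripping workaround can be sketched roughly as follows. This is only an illustration on plain strings: `strip_urn_prefix` is a hypothetical helper, not part of the Beam SDK, and it assumes the only defect is a leading `urn:` in front of a `beam:` URN.

```python
def strip_urn_prefix(urn: str) -> str:
    """Drop a leading 'urn:' when it precedes a 'beam:' URN.

    Hypothetical helper for illustration; not a Beam API.
    """
    prefix = "urn:"
    # Only touch URNs of the form "urn:beam:..."; leave everything else alone.
    if urn.startswith(prefix + "beam:"):
        return urn[len(prefix):]
    return urn


fixed = strip_urn_prefix("urn:beam:transform:pardo:v1")
unchanged = strip_urn_prefix("beam:coder:varint:v1")
```

URNs that already lack the prefix pass through untouched, so the helper could be applied to every transform spec in a pipeline message without special-casing pardo.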
Additionally, the coders seem to be authored incorrectly, with a doubly nested
spec:
{noformat}
coders {
  key: "ref_Coder_VarIntCoder_1"
  value {
    spec {
      spec {
        urn: "beam:coder:varint:v1"
      }
    }
  }
}
{noformat}
I have to remove one level of the spec to get it to read properly in Java.
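As a rough sketch of that un-nesting step, here is the same idea applied to a plain dict that mirrors the coder message above. `flatten_coder_spec` is a hypothetical helper, not a Beam function; it assumes the outer `spec` carries nothing except the inner `spec`, so any other fields on the outer level would be dropped.

```python
def flatten_coder_spec(coder: dict) -> dict:
    """Collapse coder['spec']['spec'] into coder['spec'].

    Hypothetical helper operating on a dict stand-in for the coder
    message; not a Beam API. Returns a new dict and leaves the input
    unmodified.
    """
    outer = coder.get("spec")
    inner = outer.get("spec") if isinstance(outer, dict) else None
    if isinstance(inner, dict):
        flattened = dict(coder)    # shallow copy of the coder entry
        flattened["spec"] = inner  # promote the inner spec one level
        return flattened
    return coder                   # already single-level; nothing to do


nested = {"spec": {"spec": {"urn": "beam:coder:varint:v1"}}}
flat = flatten_coder_spec(nested)
```

Calling the helper on an already flat coder returns it unchanged, so it is safe to run over every entry in `coders`.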
To be clear, the issue that I'm describing occurs when I manually write and
read the pipeline via protobufs: everything works fine when I submit using
`pipe.run()`. I surmise that there is some function in the Python SDK that
fixes up the pipeline message before sending it, otherwise I assume it _would_
be broken on receipt. Is that correct?
> Normalize URNs across SDKs and runners.
> ---------------------------------------
>
> Key: BEAM-3595
> URL: https://issues.apache.org/jira/browse/BEAM-3595
> Project: Beam
> Issue Type: Bug
> Components: beam-model
> Reporter: Robert Bradshaw
> Assignee: Eugene Kirpichov
> Priority: Major
> Fix For: 2.5.0
>
>