Valentyn Tymofieiev created BEAM-6584:
-----------------------------------------
Summary: Python SDK creates job graphs with duplicated steps when
using fn_api execution mode.
Key: BEAM-6584
URL: https://issues.apache.org/jira/browse/BEAM-6584
Project: Beam
Issue Type: Bug
Components: sdk-py-harness
Reporter: Valentyn Tymofieiev
We observed this on apache_beam.examples.wordcount with the Dataflow runner.
The graph for this wordcount job contains two steps (ids s41 and s31) with
the same name, "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1", and the
same input and output collections.
...
{
  "kind": "PAR_DO_KIND",
  "id": "s41",
  "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1",
  "displayData": [
    {
      "key": "fn",
      "namespace": "apache_beam.transforms.core.ParDo",
      "strValue": "apache_beam.transforms.core.CallableWrapperDoFn",
      "shortStrValue": "CallableWrapperDoFn",
      "label": "Transform Function"
    },
    {
      "key": "fn",
      "namespace": "apache_beam.transforms.core.CallableWrapperDoFn",
      "strValue": "\u003clambda\u003e",
      "label": "Transform Function"
    }
  ],
  "outputCollectionName": [
    "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out0"
  ],
  "inputCollectionName": [
    "write/Write/WriteImpl/Extract.out0"
  ]
},
...
{
  "kind": "PAR_DO_KIND",
  "id": "s31",
  "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1",
  "displayData": [
    {
      "key": "fn",
      "namespace": "apache_beam.transforms.core.ParDo",
      "strValue": "apache_beam.transforms.core.CallableWrapperDoFn",
      "shortStrValue": "CallableWrapperDoFn",
      "label": "Transform Function"
    },
    {
      "key": "fn",
      "namespace": "apache_beam.transforms.core.CallableWrapperDoFn",
      "strValue": "\u003clambda\u003e",
      "label": "Transform Function"
    }
  ],
  "outputCollectionName": [
    "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out0"
  ],
  "inputCollectionName": [
    "write/Write/WriteImpl/Extract.out0"
  ]
},
...
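For anyone who wants to check another job graph for the same symptom, here is a minimal sketch. It assumes the step entries have already been loaded into a Python list of dicts shaped like the excerpt above; the helper name find_duplicate_step_names and the trimmed-down sample data are illustrative only and are not part of the Beam SDK or the Dataflow API.

```python
import json
from collections import defaultdict


def find_duplicate_step_names(steps):
    """Return {name: [step ids]} for every name shared by more than one step.

    Hypothetical helper for illustration; not part of apache_beam.
    """
    ids_by_name = defaultdict(list)
    for step in steps:
        ids_by_name[step["name"]].append(step["id"])
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Trimmed-down stand-in for the two steps quoted in this report
# (only the fields the check needs are kept).
sample_steps = json.loads("""
[
  {"kind": "PAR_DO_KIND", "id": "s41",
   "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"},
  {"kind": "PAR_DO_KIND", "id": "s31",
   "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"}
]
""")

duplicates = find_duplicate_step_names(sample_steps)
for name, ids in duplicates.items():
    print(f"{name}: {', '.join(ids)}")
```

An empty result means every step name in the list is unique.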
CC: [~foegler] [~altay] [~robertwb]