[ 
https://issues.apache.org/jira/browse/BEAM-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762315#comment-16762315
 ] 

Ahmet Altay commented on BEAM-6584:
-----------------------------------

This problem affects all iterable side inputs. Somehow every side input is 
translated into two chains:

Producer -> MapToVoidKeyX -> _DataflowIterableSideInput -> Consumer

Producer -> MapToVoidKeyX

MapToVoidKeyX appears in both cases.

I traced the graph up to the call to to_runner_api() and did not see any 
malformed data. The issue happens here:

self.proto_pipeline, self.proto_context = pipeline.to_runner_api(return_context=True)

(here = 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L345)
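
One way to confirm the duplication in a dumped job graph is to group steps by 
name. The following is only a minimal sketch, assuming the graph is available 
as a list of step dicts shaped like the entries quoted in the issue 
description. The helper name find_duplicate_step_names and the third sample 
step (including its id) are hypothetical and exist only for illustration.

```python
from collections import Counter


def find_duplicate_step_names(steps):
    """Map each step name that occurs more than once to the ids using it."""
    counts = Counter(step["name"] for step in steps)
    duplicates = {}
    for step in steps:
        name = step["name"]
        if counts[name] > 1:
            # Keep ids in the order the steps appear in the graph.
            duplicates.setdefault(name, []).append(step["id"])
    return duplicates


# The first two steps mirror the ids and names quoted in the issue.
# The third step is a made-up, non-duplicated entry for contrast.
steps = [
    {"kind": "PAR_DO_KIND", "id": "s41",
     "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"},
    {"kind": "PAR_DO_KIND", "id": "s31",
     "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"},
    {"kind": "PAR_DO_KIND", "id": "s30",  # hypothetical id
     "name": "write/Write/WriteImpl/Extract"},
]

for name, ids in find_duplicate_step_names(steps).items():
    print(name, ids)
```

Running a check like this on the graph before and after to_runner_api() could 
help narrow down where the second MapToVoidKeyX step is introduced.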

 

> Python SDK creates job graphs with duplicated states when using fn_api 
> execution mode. 
> ---------------------------------------------------------------------------------------
>
>                 Key: BEAM-6584
>                 URL: https://issues.apache.org/jira/browse/BEAM-6584
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Priority: Major
>
> We observed this on apache_beam.examples.wordcount with Dataflow runner.
> The graph for this wordcount job contains two steps with the same name 
> "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1".
> {noformat}
> ...
>  {
>         "kind": "PAR_DO_KIND",
>         "id": "s41",
>         "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1",
>         "displayData": [
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.ParDo",
>             "strValue": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "shortStrValue": "CallableWrapperDoFn",
>             "label": "Transform Function"
>           },
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "strValue": "\u003clambda\u003e",
>             "label": "Transform Function"
>           }
>         ],
>         "outputCollectionName": [
>           "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out0"
>         ],
>         "inputCollectionName": [
>           "write/Write/WriteImpl/Extract.out0"
>         ]
>       },
> ...
> {
>         "kind": "PAR_DO_KIND",
>         "id": "s31",
>         "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1",
>         "displayData": [
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.ParDo",
>             "strValue": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "shortStrValue": "CallableWrapperDoFn",
>             "label": "Transform Function"
>           },
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "strValue": "\u003clambda\u003e",
>             "label": "Transform Function"
>           }
>         ],
>         "outputCollectionName": [
>           "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out0"
>         ],
>         "inputCollectionName": [
>           "write/Write/WriteImpl/Extract.out0"
>         ]
>       },
> ...
> {noformat}
> CC: [~foegler] [~altay] [~robertwb]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
