[ 
https://issues.apache.org/jira/browse/BEAM-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760386#comment-16760386
 ] 

Valentyn Tymofieiev commented on BEAM-6584:
-------------------------------------------

Step names in the graph without the beam_fn_api experiment:
 
{noformat}
        "name": "count",
        "name": "format",
        "name": "group",
        "name": "pair_with_one",
        "name": "read/Read",
        "name": "split",
        "name": "write/Write/WriteImpl/DoOnce/Read",
        "name": "write/Write/WriteImpl/Extract",
        "name": "write/Write/WriteImpl/FinalizeWrite/FinalizeWrite",
        "name": "write/Write/WriteImpl/FinalizeWrite/_UnpickledSideInput(Extract.out.0)",
        "name": "write/Write/WriteImpl/FinalizeWrite/_UnpickledSideInput(InitializeWrite.out.0)",
        "name": "write/Write/WriteImpl/FinalizeWrite/_UnpickledSideInput(PreFinalize.out.0)",
        "name": "write/Write/WriteImpl/GroupByKey",
        "name": "write/Write/WriteImpl/InitializeWrite",
        "name": "write/Write/WriteImpl/Pair",
        "name": "write/Write/WriteImpl/PreFinalize/PreFinalize",
        "name": "write/Write/WriteImpl/PreFinalize/_UnpickledSideInput(Extract.out.0)",
        "name": "write/Write/WriteImpl/PreFinalize/_UnpickledSideInput(InitializeWrite.out.0)",
        "name": "write/Write/WriteImpl/WindowInto(WindowIntoFn)",
        "name": "write/Write/WriteImpl/WriteBundles/_UnpickledSideInput(InitializeWrite.out.0)",
        "name": "write/Write/WriteImpl/WriteBundles/WriteBundles",
{noformat}

Step names in the graph with the beam_fn_api experiment:

{noformat}
        "name": "count",
        "name": "format",
        "name": "group",
        "name": "pair_with_one",
        "name": "read/read/impulse",
        "name": "read/read/readsplits",
        "name": "read/read/reshuffle/addrandomkeys",
        "name": "read/read/reshuffle/removerandomkeys",
        "name": "read/read/reshuffle/reshuffleperkey/flatmap(restore_timestamps)",
        "name": "read/read/reshuffle/reshuffleperkey/groupbykey",
        "name": "read/read/reshuffle/reshuffleperkey/map(reify_timestamps)",
        "name": "read/read/split",
        "name": "split",
        "name": "write/write/writeimpl/doonce/flatmap(\u003clambda at core.py:2113\u003e)",
        "name": "write/write/writeimpl/doonce/impulse",
        "name": "write/write/writeimpl/doonce/map(decode)",
        "name": "write/write/writeimpl/extract",
        "name": "write/write/writeimpl/finalizewrite/_dataflowiterablesideinput(maptovoidkey0.out.0)",
        "name": "write/write/writeimpl/finalizewrite/_dataflowiterablesideinput(maptovoidkey1.out.0)",
        "name": "write/write/writeimpl/finalizewrite/_dataflowiterablesideinput(maptovoidkey2.out.0)",
        "name": "write/write/writeimpl/finalizewrite/finalizewrite",
        "name": "write/write/writeimpl/finalizewrite/maptovoidkey0",
        "name": "write/write/writeimpl/finalizewrite/maptovoidkey0",
        "name": "write/write/writeimpl/finalizewrite/maptovoidkey1",
        "name": "write/write/writeimpl/finalizewrite/maptovoidkey1",
        "name": "write/write/writeimpl/finalizewrite/maptovoidkey2",
        "name": "write/write/writeimpl/finalizewrite/maptovoidkey2",
        "name": "write/write/writeimpl/groupbykey",
        "name": "write/write/writeimpl/initializewrite",
        "name": "write/write/writeimpl/pair",
        "name": "write/write/writeimpl/prefinalize/_dataflowiterablesideinput(maptovoidkey0.out.0)",
        "name": "write/write/writeimpl/prefinalize/_dataflowiterablesideinput(maptovoidkey1.out.0)",
        "name": "write/write/writeimpl/prefinalize/maptovoidkey0",
        "name": "write/write/writeimpl/prefinalize/maptovoidkey0",
        "name": "write/write/writeimpl/prefinalize/maptovoidkey1",
        "name": "write/write/writeimpl/prefinalize/maptovoidkey1",
        "name": "write/write/writeimpl/prefinalize/prefinalize",
        "name": "write/write/writeimpl/windowinto(windowintofn)",
        "name": "write/write/writeimpl/writebundles/_dataflowiterablesideinput(maptovoidkey0.out.0)",
        "name": "write/write/writeimpl/writebundles/maptovoidkey0",
        "name": "write/write/writeimpl/writebundles/maptovoidkey0",
        "name": "write/write/writeimpl/writebundles/writebundles",
{noformat}
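
For anyone who wants to check a job graph for repeated step names programmatically, here is a minimal sketch. It assumes the graph has already been loaded as a list of step dictionaries, each with a "name" field as in the excerpt quoted in the issue description. The helper name find_duplicate_step_names and the sample data (including the step id "s1") are made up for illustration and are not part of the Beam SDK or the Dataflow API.

```python
import json
from collections import Counter


def find_duplicate_step_names(steps):
    """Return {step name: occurrence count} for names used by more than one step.

    `steps` is assumed to be a list of dicts that each carry a "name" key.
    This helper is hypothetical and only illustrates the check.
    """
    counts = Counter(step["name"] for step in steps)
    return {name: n for name, n in counts.items() if n > 1}


# Made-up sample modelled on the two steps quoted in the issue description.
sample_steps = json.loads("""
[
  {"kind": "PAR_DO_KIND", "id": "s1",
   "name": "write/Write/WriteImpl/Extract"},
  {"kind": "PAR_DO_KIND", "id": "s31",
   "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"},
  {"kind": "PAR_DO_KIND", "id": "s41",
   "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"}
]
""")

duplicates = find_duplicate_step_names(sample_steps)
for name, n in sorted(duplicates.items()):
    print(f"{name}: {n} steps")
```

The comparison is case-sensitive. Lowercasing the names first, as in the second listing above, would additionally merge names that differ only in case.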

 

> Python SDK creates job graphs with duplicated states when using fn_api 
> execution mode. 
> ---------------------------------------------------------------------------------------
>
>                 Key: BEAM-6584
>                 URL: https://issues.apache.org/jira/browse/BEAM-6584
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Priority: Major
>
> We observed this on apache_beam.examples.wordcount with the Dataflow runner.
> The graph for this wordcount job contains two steps with the same name 
> "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1".
> {noformat}
> ...
>  {
>         "kind": "PAR_DO_KIND",
>         "id": "s41",
>         "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1",
>         "displayData": [
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.ParDo",
>             "strValue": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "shortStrValue": "CallableWrapperDoFn",
>             "label": "Transform Function"
>           },
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "strValue": "\u003clambda\u003e",
>             "label": "Transform Function"
>           }
>         ],
>         "outputCollectionName": [
>           "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out0"
>         ],
>         "inputCollectionName": [
>           "write/Write/WriteImpl/Extract.out0"
>         ]
>       },
> ...
> {
>         "kind": "PAR_DO_KIND",
>         "id": "s31",
>         "name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1",
>         "displayData": [
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.ParDo",
>             "strValue": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "shortStrValue": "CallableWrapperDoFn",
>             "label": "Transform Function"
>           },
>           {
>             "key": "fn",
>             "namespace": "apache_beam.transforms.core.CallableWrapperDoFn",
>             "strValue": "\u003clambda\u003e",
>             "label": "Transform Function"
>           }
>         ],
>         "outputCollectionName": [
>           "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out0"
>         ],
>         "inputCollectionName": [
>           "write/Write/WriteImpl/Extract.out0"
>         ]
>       },
> ...
> {noformat}
> CC: [~foegler] [~altay] [~robertwb]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
