Luke Cwik created BEAM-4582:
-------------------------------

             Summary: Incorrectly translates 
apache_beam.runners.dataflow.native_io.streaming_create.DecodeAndEmitDoFn when 
creating the Dataflow pipeline json description
                 Key: BEAM-4582
                 URL: https://issues.apache.org/jira/browse/BEAM-4582
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
            Reporter: Luke Cwik
            Assignee: Charles Chen


When executing against Dataflow, the JSON pipeline description contains the 
following step, which doesn't appear in the pipeline proto:

 
{code:java}
    {
      "kind": "ParallelDo", 
      "name": "s2", 
      "properties": {
        "display_data": [
          {
            "key": "fn", 
            "label": "Transform Function", 
            "namespace": "apache_beam.transforms.core.ParDo", 
            "shortValue": "DecodeAndEmitDoFn", 
            "type": "STRING", 
            "value": "apache_beam.runners.dataflow.native_io.streaming_create.DecodeAndEmitDoFn"
          }
        ], 
        "non_parallel_inputs": {}, 
        "output_info": [
          {
            "encoding": {
              "@type": "kind:windowed_value", 
              "component_encodings": [
                {
                  "@type": "FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
                  "component_encodings": [
                    {
                      "@type": "FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
                      "component_encodings": []
                    }, 
                    {
                      "@type": "FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
                      "component_encodings": []
                    }
                  ], 
                  "is_pair_like": true
                }, 
                {
                  "@type": "kind:global_window"
                }
              ], 
              "is_wrapper": true
            }, 
            "output_name": "out", 
            "user_name": "Some Numbers/Decode Values.out"
          }
        ], 
        "parallel_input": {
          "@type": "OutputReference", 
          "output_name": "out", 
          "step_name": "s1"
        }, 
        "serialized_fn": "ref_AppliedPTransform_AppliedPTransform_45", 
        "user_name": "Some Numbers/Decode Values"
      }
    }, 
{code}
This causes the DataflowRunner to use a legacy code path and ask the Python SDK 
harness to execute a transform with a payload 
*ref_AppliedPTransform_AppliedPTransform_45* instead of sending the PTransform 
proto.
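
One way to spot this kind of mismatch is to compare each ParallelDo step's serialized_fn against the transform ids in the pipeline proto. The following is a minimal sketch and not part of the Beam codebase: the helper name find_dangling_steps, the trimmed step dict, and the set of proto transform ids are all hypothetical, and it assumes serialized_fn is expected to hold a transform id.

```python
def find_dangling_steps(steps, proto_transform_ids):
    """Return (step name, serialized_fn) pairs for ParallelDo steps whose
    serialized_fn is not among the given pipeline proto transform ids."""
    dangling = []
    for step in steps:
        if step.get("kind") != "ParallelDo":
            continue
        fn_ref = step.get("properties", {}).get("serialized_fn")
        if fn_ref is not None and fn_ref not in proto_transform_ids:
            dangling.append((step.get("name"), fn_ref))
    return dangling


# Trimmed copy of the step from the job description above.
steps = [
    {
        "kind": "ParallelDo",
        "name": "s2",
        "properties": {
            "serialized_fn": "ref_AppliedPTransform_AppliedPTransform_45",
            "user_name": "Some Numbers/Decode Values",
        },
    }
]

# Hypothetical set of transform ids present in the pipeline proto.
proto_transform_ids = {"ref_AppliedPTransform_AppliedPTransform_44"}

dangling = find_dangling_steps(steps, proto_transform_ids)
for name, fn_ref in dangling:
    print(f"step {name} references {fn_ref}, which is missing from the proto")
```

With the hypothetical id set above, step s2 is reported because its serialized_fn has no matching transform in the proto.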

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
