[jira] [Comment Edited] (BEAM-6692) Spark Translator - RESHUFFLE_URN

Kyle Weaver (JIRA) Mon, 13 May 2019 13:58:14 -0700


    [ https://issues.apache.org/jira/browse/BEAM-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838788#comment-16838788 ]


Kyle Weaver edited comment on BEAM-6692 at 5/13/19 8:57 PM:
------------------------------------------------------------

Python pipelines run with pre_optimize=all fail on the Spark runner [1]. 
This seems to be because the URNs used for optimization are hard-coded to 
Flink's, which include reshuffle [2].

EDIT: Not so; see https://issues.apache.org/jira/browse/BEAM-7282

[1] [https://gist.github.com/ibzib/c432b45b90f7ddb62eb39e1784b55ba8]

[2] [https://github.com/apache/beam/blob/c565881b3041730f4e1206ed8404e4b0317e5037/sdks/python/apache_beam/runners/portability/portable_runner.py#L204]
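
For readers unfamiliar with the flag: pre_optimize is passed as an experiment string of the form name=value. Below is a minimal sketch of how such a string could be looked up, in plain Python with no Beam dependency. The helper lookup_experiment and the default value 'lift_combiners' are assumptions made for illustration; they are not taken from this issue, and the sketch is not Beam's actual implementation.

```python
def lookup_experiment(experiments, key, default=None):
    """Return the value for `key` from 'name' or 'name=value' strings.

    Hypothetical stand-in for illustration only; not Beam's own API.
    """
    for experiment in experiments or []:
        name, sep, value = experiment.partition('=')
        if name == key:
            # A bare 'name' with no '=value' part is treated as a boolean flag.
            return value if sep else True
    return default


# Experiments as they might arrive from --experiments=pre_optimize=all
experiments = ['pre_optimize=all']

# 'lift_combiners' is an assumed default, used only when the flag is absent.
mode = lookup_experiment(experiments, 'pre_optimize', 'lift_combiners')
print(mode)
```

With the list above, mode is the string after the first '='. When the experiment is absent, the caller's default is returned.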


was (Author: ibzib):
Python pipelines run with pre_optimize=all fail when run on the Spark runner 
[1]. This seems to be because the URNs used for optimization are hard-coded to 
Flink's, which includes reshuffle [2].

[1] [https://gist.github.com/ibzib/c432b45b90f7ddb62eb39e1784b55ba8]

[2] [https://github.com/apache/beam/blob/c565881b3041730f4e1206ed8404e4b0317e5037/sdks/python/apache_beam/runners/portability/portable_runner.py#L204]

> Spark Translator - RESHUFFLE_URN
> --------------------------------
>
>                 Key: BEAM-6692
>                 URL: https://issues.apache.org/jira/browse/BEAM-6692
>             Project: Beam
>          Issue Type: Task
>          Components: runner-spark
>            Reporter: Ankur Goenka
>            Assignee: Kyle Weaver
>            Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
