[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn

2020-01-27 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024769#comment-17024769
 ] 

Kyle Weaver edited comment on BEAM-8970 at 1/27/20 11:54 PM:
-

Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark 
REST API along with YARN, because normally the Spark REST API is started along 
with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

[--runner=SparkRunner,
--output_executable_path=~/path/to/output.jar]

(Without using the spark_submit_uber_jar option.)

Also, note that this requires the YARN nodes to have the Beam worker code 
installed, or otherwise be able to access it. [~angoenka] might know more.
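
As a rough sketch of the flow described above (not taken from the Beam codebase): the first helper builds the pipeline options that write a portable jar, and the second builds a spark-submit command line for that jar. The helper names, the default "--master yarn" and "--deploy-mode cluster" values, and the assumption that the jar's manifest declares its main class (so no --class argument is needed) are illustrative assumptions. The tilde is expanded explicitly because a '~' inside a Python list never passes through a shell.

```python
import os
import shlex


def portable_jar_options(output_jar):
    """Pipeline options asking Beam to write a portable jar (hypothetical helper)."""
    # A '~' inside a Python list item is never seen by a shell, so expand it here.
    return [
        '--runner=SparkRunner',
        '--output_executable_path=' + os.path.expanduser(output_jar),
    ]


def spark_submit_command(output_jar, master='yarn', deploy_mode='cluster'):
    """Build (but do not run) a spark-submit invocation for the generated jar."""
    # Assumption: the jar's manifest names its main class, so --class is omitted.
    return [
        'spark-submit',
        '--master', master,
        '--deploy-mode', deploy_mode,
        os.path.expanduser(output_jar),
    ]


if __name__ == '__main__':
    jar = '~/path/to/output.jar'
    print(portable_jar_options(jar))
    # shlex.quote keeps the printed command safe to paste into a shell.
    print(' '.join(shlex.quote(arg) for arg in spark_submit_command(jar)))
```

The options list would be passed to the pipeline in place of the spark_submit_uber_jar options; the printed command is what one might then run by hand on a machine configured for the YARN cluster.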




was (Author: ibzib):
Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark 
REST API along with YARN, because normally the Spark REST API is started along 
with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

{{
['--runner=SparkRunner',
--output_executable_path "$OUTPUT_JAR"]
}}

(Without using the spark_submit_uber_jar option.)

Also, note that this will require YARN nodes to have installed or otherwise be 
able to access Beam worker code. [~angoenka] might know more.



> Spark portable runner supports Yarn
> ---
>
> Key: BEAM-8970
> URL: https://issues.apache.org/jira/browse/BEAM-8970
> Project: Beam
>  Issue Type: Wish
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn

2020-01-27 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024769#comment-17024769
 ] 

Kyle Weaver edited comment on BEAM-8970 at 1/27/20 11:55 PM:
-

Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark 
REST API along with YARN, because normally the Spark REST API is started along 
with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

['--runner=SparkRunner',
'--output_executable_path=~/path/to/output.jar']

(Without using the spark_submit_uber_jar option.)

Also, note that this requires the YARN nodes to have the Beam worker code 
installed, or otherwise be able to access it. [~angoenka] might know more.




was (Author: ibzib):
Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark 
REST API along with YARN, because normally the Spark REST API is started along 
with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

[--runner=SparkRunner,
--output_executable_path=~/path/to/output.jar]

(Without using the spark_submit_uber_jar option.)

Also, note that this will require YARN nodes to have installed or otherwise be 
able to access Beam worker code. [~angoenka] might know more.





[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn

2020-01-25 Thread Enis Nazif (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023657#comment-17023657
 ] 

Enis Nazif edited comment on BEAM-8970 at 1/25/20 10:12 PM:


Looking at this issue: to run a pipeline on a YARN-backed Spark cluster, a user 
should be able to specify the following runner options:
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, 'spark_master_url' isn't being passed into the request created in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN.

Failing this, an alternative may be to bypass the Spark REST API (which seems 
to be fairly hidden functionality) and instead spark-submit the portable jars 
that are created directly.
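
To illustrate where the master URL would have to go, here is a minimal sketch of a request body shaped like the CreateSubmissionRequest JSON that Spark standalone's REST submission server accepts. This is not the code in spark_uber_jar_job_server.py; the function name, its parameters, and the example values are hypothetical, and whether the REST server would honor a 'yarn' master at all is not established here.

```python
import json


def build_create_submission_request(jar_url, job_name, main_class,
                                    spark_version, master_url=None):
    """Return a dict shaped like Spark's CreateSubmissionRequest (hypothetical helper)."""
    spark_properties = {
        'spark.jars': jar_url,
        'spark.app.name': job_name,
        'spark.submit.deployMode': 'cluster',
    }
    # Only include spark.master when the caller supplied one.
    if master_url is not None:
        spark_properties['spark.master'] = master_url
    return {
        'action': 'CreateSubmissionRequest',
        'appResource': jar_url,
        'appArgs': [],
        'clientSparkVersion': spark_version,
        'environmentVariables': {},
        'mainClass': main_class,
        'sparkProperties': spark_properties,
    }


if __name__ == '__main__':
    # All values below are placeholders for illustration only.
    request = build_create_submission_request(
        jar_url='file:/tmp/job.jar',
        job_name='example-job',
        main_class='com.example.Main',
        spark_version='2.4.4',
        master_url='yarn',
    )
    print(json.dumps(request, indent=2, sort_keys=True))
```

The point of the sketch is only that the master would travel as a 'spark.master' entry under 'sparkProperties'; leaving master_url unset reproduces the situation described above, where no master is sent with the request.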

 

 

 

 


was (Author: enazif):
looking at this issue, to run a pipeline on YARN backed sparked, a user should 
be able to specify runner options of
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, the 'spark_master_url' isn't being passed into the request 
created in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which 
seems like fairly hidden functionality) and instead directly spark-submit the 
portable jars that are created. 

 

 

 

 



[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn

2020-01-25 Thread Enis Nazif (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023657#comment-17023657
 ] 

Enis Nazif edited comment on BEAM-8970 at 1/25/20 10:12 PM:


Looking at this issue: to run a pipeline on YARN-backed Spark, a user should 
be able to specify the following runner options:
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, 'spark_master_url' isn't being passed into the request created in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN.

Failing this, an alternative may be to bypass the Spark REST API (which seems 
to be fairly hidden functionality) and instead spark-submit the portable jars 
that are created directly.

 

 

 

 


was (Author: enazif):
looking at this issue, to run a pipeline on YARN backed sparked, a user should 
be able to specify runner options of
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, the 'spark_master_url' isn't being passed into the request 
created in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which 
seems like fairly hidden functionality) and instead directly spark-submit the 
portable jars that are created. 

 

 

 

 



[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn

2020-01-25 Thread Enis Nazif (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023657#comment-17023657
 ] 

Enis Nazif edited comment on BEAM-8970 at 1/25/20 10:11 PM:


Looking at this issue: to run a pipeline on YARN-backed Spark, a user should 
be able to specify the following runner options:
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, 'spark_master_url' isn't being passed into the request created in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN.

Failing this, an alternative may be to bypass the Spark REST API (which seems 
to be fairly hidden functionality) and instead spark-submit the portable jars 
that are created directly.

 

 

 

 


was (Author: enazif):
looking at this issue, to run a pipeline on YARN backed sparked, a user should 
be able to specify runner options of
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, the `spark_master_url` isn't being passed into the request 
created in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which 
seems like fairly hidden functionality) and instead directly `spark-submit` the 
portable jars that are created. 

 

 

 
