> So hopefully setting --spark-master-url to be yarn will work too.

This is not supported.
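The supported route mentioned further down in the thread is to build a self-contained jar with --output_executable_path and hand that jar to spark-submit. A rough, untested sketch of that flow (the --output_executable_path flag comes from the thread; --runner=SparkRunner, the file paths, and the pipeline module name are assumptions/placeholders):

```shell
# Hypothetical sketch: build a runnable jar from the Python pipeline,
# then submit that jar to YARN via spark-submit. All paths and the
# runner choice are placeholders, not a verified recipe.
python my_pipeline.py \
  --runner=SparkRunner \
  --output_executable_path=/tmp/my_pipeline.jar

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  /tmp/my_pipeline.jar
```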
On Tue, Jun 23, 2020 at 2:58 PM Xinyu Liu <xinyuliu...@gmail.com> wrote:

> I am doing some prototyping on this too. I used the spark-submit script
> instead of the REST API. In my simple setup, I ran
> SparkJobServerDriver.main() directly in the AM as a Spark job, which
> submits the Python job to the default Spark master URL pointing to
> "local". I also used --files in the spark-submit script to upload the
> Python packages and the boot script. On the Python side, I used the
> following pipeline options for submission (thanks to Thomas):
>
>     pipeline_options = PipelineOptions([
>         "--runner=PortableRunner",
>         "--job_endpoint=your-job-server:8099",
>         "--environment_type=PROCESS",
>         "--environment_config={\"command\": \"./boot\"}",
>     ])
>
> I used my own boot script for customized Python packaging. With this
> setup I was able to get a simple hello-world program running. I haven't
> tried to run the job server separately from the AM yet. So hopefully
> setting --spark-master-url to yarn will work too.
>
> Thanks,
> Xinyu
>
> On Tue, Jun 23, 2020 at 12:18 PM Kyle Weaver <kcwea...@google.com> wrote:
>
>> Hi Kamil, there is a JIRA for this:
>> https://issues.apache.org/jira/browse/BEAM-8970 It's theoretically
>> possible but remains untested as far as I know :)
>>
>> As I indicated in a comment, you can set --output_executable_path to
>> create a jar that you can then submit to YARN via spark-submit.
>>
>> If you can get this working, I'd additionally like to script the jar
>> submission in Python to save users the extra step.
>>
>> Thanks,
>> Kyle
>>
>> On Tue, Jun 23, 2020 at 9:16 AM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to run a Beam pipeline using Spark on YARN. My pipeline is
>>> written in Python, so I need to use a portable runner. Does anybody
>>> know how I should configure the job server parameters, especially
>>> --spark-master-url? Is there anything else I need to be aware of while
>>> using such a setup?
>>>
>>> If it makes a difference, I use Google Dataproc.
>>>
>>> Best,
>>> Kamil
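Xinyu's setup above, with the job server driver running inside the AM as a Spark job and the Python packages shipped via --files, might be launched roughly like this. This is a hypothetical sketch, not a tested command: the class name SparkJobServerDriver comes from the thread, but the fully qualified package, the jar name, and the file names are assumptions/placeholders.

```shell
# Hypothetical sketch of the spark-submit described above: run the Beam
# Spark job server driver as the Spark application on YARN, uploading
# the Python packages and the custom boot script with --files so they
# are available next to the driver. All names are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.beam.runners.spark.SparkJobServerDriver \
  --files python_packages.tar.gz,boot \
  beam-runners-spark-job-server.jar
```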