Thanks guys,
yes, it all makes sense. I actually have it implemented the way Ismaël is
proposing (using the --runner= parameter), but I just don't like the redundant
syntax when submitting to e.g. a Spark cluster (spark-submit bla bla bla
--runner SparkRunner). Not to mention that it allows submitting to e.g. Spark
while specifying a different runner (spark-submit bla bla bla --runner
FlinkRunner). So I was really just looking for a cleaner, non-redundant syntax
to submit the job.
Cheers, a.

   

 On Tuesday, 4 April 2017, 18:12, Thomas Groh <[email protected]> wrote:
 

 In addition to what Ismaël said, there was another reason why I wanted to be 
careful with this kind of automatic inference based on the classpath. When 
you're submitting a job that can potentially run forever, we want to be very 
explicit about it (since it can easily outlive the process you're submitting it 
from, and may not loudly signal that the job will still be active). The added 
complexity from requiring the runner type on the submitter's end is relatively 
low, especially given that most runners will already require additional 
configuration to function properly or at all.
On Tue, Apr 4, 2017 at 6:01 AM, Ismaël Mejía <[email protected]> wrote:

Antony, you can do this explicitly when building your pipeline from
the command args:

Options options =
    PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);

and when you run your app you pass --runner=YourFavoriteRunner and it
will resolve. However, different runners can need a bit of tuning. You
can look at the examples module for how to enable profiles per runner,
and for some instructions on how to execute this with Maven.

https://github.com/apache/beam/tree/master/examples/java

Also remember that if you run in a cluster you have to submit your
jar, e.g. with spark-submit or flink run, and the invocation will be
different in that style of deployment.

I am not sure that resolving the runners implicitly is a good thing.
As Dan mentions, each runner may need to be tuned with different
options. Additionally, if we have multiple runners in the classpath
we would need to define some priority to resolve them, and I don't
think it is a good thing to prefer one runner over the others.
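
To make the ambiguity argument concrete, here is a minimal sketch of an
"explicit only" policy in plain Java. It is not Beam code: the class
ExplicitRunnerResolver, its methods, and the idea of passing in a list of
runner names found on the classpath are all assumptions made for
illustration. It only shows that with two runners available there is nothing
sensible to pick unless --runner= is given.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper (not part of Beam): resolves a runner only when named explicitly. */
class ExplicitRunnerResolver {

    private static final String FLAG = "--runner=";

    /** Returns the value of the first --runner=<name> argument, if any. */
    static Optional<String> requestedRunner(String[] args) {
        for (String arg : args) {
            if (arg.startsWith(FLAG)) {
                return Optional.of(arg.substring(FLAG.length()));
            }
        }
        return Optional.empty();
    }

    /**
     * Resolves the runner name from the arguments. The list of available
     * runners is supplied by the caller here; how it would be discovered
     * is outside the scope of this sketch.
     */
    static String resolve(String[] args, List<String> available) {
        Optional<String> requested = requestedRunner(args);
        if (!requested.isPresent()) {
            // No priority order between runners: refuse to guess.
            throw new IllegalArgumentException(
                "No --runner given; refusing to choose among " + available);
        }
        String name = requested.get();
        if (!available.contains(name)) {
            throw new IllegalArgumentException(
                "Runner " + name + " is not available; found " + available);
        }
        return name;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList("SparkRunner", "FlinkRunner");
        System.out.println(resolve(new String[] {"--runner=SparkRunner"}, available));
    }
}
```

With both runners available, calling resolve without a --runner= argument
throws instead of silently preferring one of them, which is the behaviour
argued for above.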

Ismaël




   
