In addition to what Ismaël said, there was another reason why I wanted to be careful with this kind of automatic inference based on the classpath. When you submit a job that can potentially run forever, we want that choice to be very explicit: the job can easily outlive the process that submitted it, and nothing may loudly signal that it is still active. The added complexity of requiring the submitter to name the runner is relatively low, especially since most runners already need additional configuration to function properly, or at all.
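To make the "be explicit" policy concrete, here is a minimal sketch in plain Java. It is not how Beam itself resolves runners; `ExplicitRunnerArgs`, `findRunner`, and `requireRunner` are hypothetical names used only for illustration. It just shows the idea of refusing to guess when no `--runner=` argument is passed.

```java
import java.util.Optional;

// Hypothetical helper, not part of Beam: it illustrates the
// "fail loudly unless the runner is named" policy in isolation.
final class ExplicitRunnerArgs {

    // Returns the value of the first non-empty --runner=<name> argument, if any.
    static Optional<String> findRunner(String[] args) {
        String prefix = "--runner=";
        for (String arg : args) {
            if (arg.startsWith(prefix) && arg.length() > prefix.length()) {
                return Optional.of(arg.substring(prefix.length()));
            }
        }
        return Optional.empty();
    }

    // Refuses to guess: no inference from the classpath, no silent default.
    static String requireRunner(String[] args) {
        return findRunner(args).orElseThrow(() ->
            new IllegalArgumentException(
                "No --runner=<name> given; refusing to pick a runner implicitly"));
    }

    public static void main(String[] args) {
        System.out.println("Submitting with runner: " + requireRunner(args));
    }
}
```

Running this with `--runner=YourFavoriteRunner` reports that name, while running it with no runner argument fails immediately instead of quietly launching a long-lived job on whatever happens to be on the classpath.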
On Tue, Apr 4, 2017 at 6:01 AM, Ismaël Mejía <[email protected]> wrote:
> Antony, You can do this explicitly when building your pipeline from
> the command args:
>
> Options options =
> PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
>
> and when you run your app you pass --runner=YourFavoriteRunner and it
> will resolve, however different runners can need a bit of tuning. You
> can look at the examples module for how to enable profiles per runner,
> and some instructions in how to execute this with maven.
>
> https://github.com/apache/beam/tree/master/examples/java
>
> Also remember that if you run in a cluster you have to submit your
> jar, e.g. spark-submit or flink run, and this will be different in
> that style of deployment.
>
> I am not sure that resolving the runners implicitly is a good thing,
> for the issue that Dan mentions, each runner may need to be tuned with
> different options, and additionally because if we have multiple
> runners in the classpath we would need to define some priority to
> resolve them and I don't think it is a good thing to prefer one runner
> over the others.
>
> Ismaël
