Is this alternative still being considered? Creating a portable jar sounds like a good solution to re-use the existing runner specific deployment mechanism (e.g. Flink k8s operator) and in general simplify the deployment story.
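
To make the jar idea concrete, here is a minimal sketch of the "augment an existing jar" step done entirely from the Python side with the standard library. The function name and the BEAM-PIPELINE/ entry names are assumptions for illustration only; the thread says a standard layout would be chosen but does not say what it is.

```python
import zipfile


def augment_jar(base_jar, out_jar, pipeline_proto, artifacts):
    """Copy base_jar to out_jar and add a serialized pipeline plus artifacts.

    pipeline_proto: bytes of the serialized pipeline proto produced by the SDK.
    artifacts: dict mapping artifact name -> bytes.
    The BEAM-PIPELINE/ layout is hypothetical, not an agreed-upon standard.
    """
    with zipfile.ZipFile(base_jar, "r") as src, \
            zipfile.ZipFile(out_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        # Copy every existing entry (classes, manifest naming the main class).
        for info in src.infolist():
            dst.writestr(info, src.read(info.filename))
        # Add the pipeline description at a fixed, well-known location.
        dst.writestr("BEAM-PIPELINE/pipeline.pb", pipeline_proto)
        # Add each staged artifact under a fixed prefix.
        for name, data in artifacts.items():
            dst.writestr("BEAM-PIPELINE/artifacts/" + name, data)
```

The runner-specific main class in the base jar would then read those entries from its own classpath, so nothing Java has to run on the developer's machine.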
On Fri, Aug 9, 2019 at 12:46 AM Robert Bradshaw <[email protected]> wrote:
> The expansion service is a separate service. (The flink jar happens to bring both up.) However, there is negotiation to receive/validate the pipeline options.
>
> On Fri, Aug 9, 2019 at 1:54 AM Thomas Weise <[email protected]> wrote:
> >
> > We would also need to consider cross-language pipelines that (currently) assume the interaction with an expansion service at construction time.
> >
> > On Thu, Aug 8, 2019, 4:38 PM Kyle Weaver <[email protected]> wrote:
> >>
> >> > It might also be useful to have the option to just output the proto and artifacts, as an alternative to the jar file.
> >>
> >> Sure, that wouldn't be too big a change if we were to decide to go the SDK route.
> >>
> >> > For the Flink entry point we would need to allow for the job server to be used as a library.
> >>
> >> We don't need the whole job server, we only need to add a main method to FlinkPipelineRunner [1] as the entry point, which would basically just do the setup described in the doc then call FlinkPipelineRunner::run.
> >>
> >> [1] https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineRunner.java#L53
> >>
> >> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
> >>
> >>
> >> On Thu, Aug 8, 2019 at 4:21 PM Thomas Weise <[email protected]> wrote:
> >>>
> >>> Hi Kyle,
> >>>
> >>> It might also be useful to have the option to just output the proto and artifacts, as an alternative to the jar file.
> >>>
> >>> For the Flink entry point we would need to allow for the job server to be used as a library. It would probably not be too hard to have the Flink job constructed via the context execution environment, which would require no changes on the Flink side.
> >>>
> >>> Thanks,
> >>> Thomas
> >>>
> >>>
> >>> On Thu, Aug 8, 2019 at 9:52 AM Kyle Weaver <[email protected]> wrote:
> >>>>
> >>>> Re Javaless/serverless solution:
> >>>> I take it this would probably mean that we would construct the jar directly from the SDK. There are advantages to this: full separation of Python and Java environments, no need for a job server, and likely a simpler implementation, since we'd no longer have to work within the constraints of the existing job server infrastructure. The only downside I can think of is the additional cost of implementing/maintaining jar creation code in each SDK, but that cost may be acceptable if it's simple enough.
> >>>>
> >>>> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
> >>>>
> >>>>
> >>>> On Thu, Aug 8, 2019 at 9:31 AM Thomas Weise <[email protected]> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Aug 8, 2019 at 8:29 AM Robert Bradshaw <[email protected]> wrote:
> >>>>>>
> >>>>>> > Before assembling the jar, the job server runs to create the ingredients. That requires the (matching) Java environment on the Python developer's machine.
> >>>>>>
> >>>>>> We can run the job server and have it create the jar (and if we keep the job server running we can use it to interact with the running job). However, if the jar layout is simple enough, there's no need to even build it from Java.
> >>>>>>
> >>>>>> Taken to the extreme, this is a one-shot, jar-based JobService API. We choose a standard layout of where to put the pipeline description and artifacts, and can "augment" an existing jar (that has a runner-specific main class whose entry point knows how to read this data to kick off a pipeline as if it were a user's driver code) into one that has a portable pipeline packaged into it for submission to a cluster.
> >>>>>
> >>>>>
> >>>>> It would be nice if the Python developer doesn't have to run anything Java at all.
> >>>>>
> >>>>> As we just discussed offline, this could be accomplished by including the proto that is produced by the SDK into the pre-existing jar.
> >>>>>
> >>>>> And if the jar has an entry point that creates the Flink job in the prescribed manner [1], it can be directly submitted to the Flink REST API. That would allow for a Java-free client.
> >>>>>
> >>>>> [1] https://lists.apache.org/thread.html/6db869c53816f4e2917949a7c6992c2b90856d7d639d7f2e1cd13768@%3Cdev.flink.apache.org%3E
> >>>>>
>
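
For reference, the Java-free submission path discussed in the thread (upload the prepared jar to the Flink REST API, then run it) could look roughly like the sketch below. It only builds the two HTTP requests and does not send them. The helper names are hypothetical, and the entry class is whatever main class the prepared jar ends up with; the `/jars/upload` and `/jars/<jarid>/run` endpoints and the `jarfile` form field are from Flink's REST API as I understand it, so please check them against the Flink version in use.

```python
import json
import uuid
import urllib.request


def build_upload_request(base_url, jar_name, jar_bytes):
    """Build (but do not send) a multipart POST to Flink's /jars/upload."""
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="jarfile"; filename="{n}"\r\n'
        "Content-Type: application/x-java-archive\r\n\r\n"
    ).format(b=boundary, n=jar_name).encode("utf-8")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/jars/upload",
        data=head + jar_bytes + tail,
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
        method="POST",
    )


def build_run_request(base_url, jar_id, entry_class=None):
    """Build (but do not send) a POST to Flink's /jars/<jar_id>/run."""
    body = {}
    if entry_class is not None:
        # Only needed if the jar manifest does not already name a main class.
        body["entryClass"] = entry_class
    return urllib.request.Request(
        base_url.rstrip("/") + "/jars/" + jar_id + "/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each request can be sent with `urllib.request.urlopen(req)`; the jar id for the second call has to be taken from the response to the first.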
