Re: Portable Java Pipeline Support

2021-04-26 Thread Kyle Weaver
> For Samza Runner, we are looking to leverage Java portable mode to achieve “split deployment”, where the runner is independently packaged w/o user code and user code should only exist in the submission/worker process. I believe this is supported by portable mode and therefore we would prefer to use LO

Re: Portable Java Pipeline Support

2021-04-26 Thread Ke Wu
That makes sense. For Samza Runner, we are looking to leverage Java portable mode to achieve “split deployment”, where the runner is independently packaged w/o user code and user code should only exist in the submission/worker process. I believe this is supported by portable mode and therefore we w

Re: Portable Java Pipeline Support

2021-04-26 Thread Kyle Weaver
The reason is that the Flink and Spark runners are written in Java. So when the runner needs to execute user code written in Java, an EMBEDDED environment can be started in the runner. The runner cannot natively execute Python code, however, so it needs to call out to an external process. In the case of
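
As a rough illustration of the environment choice discussed above, the sketch below assembles the command-line flags a Java pipeline might pass when submitting through the portable runner. The class and helper names (PortableArgsSketch, portableArgs) are hypothetical, and the job endpoint localhost:8099 is only a placeholder. The flag names follow Beam's portable pipeline options as I understand them, so please verify them against the Beam version in use.

```java
import java.util.ArrayList;
import java.util.List;

public class PortableArgsSketch {
    // Environment types mentioned in this thread. EMBEDDED runs Java user code
    // inside the runner process; LOOPBACK and DOCKER run it outside the runner.
    enum EnvironmentType { EMBEDDED, LOOPBACK, DOCKER }

    // Hypothetical helper: builds the argument list for a portable submission.
    static List<String> portableArgs(String jobEndpoint, EnvironmentType env) {
        List<String> args = new ArrayList<>();
        args.add("--runner=PortableRunner");
        args.add("--jobEndpoint=" + jobEndpoint);              // placeholder address
        args.add("--defaultEnvironmentType=" + env.name());
        return args;
    }

    public static void main(String[] unused) {
        // Print the flags for an EMBEDDED run against a placeholder endpoint.
        List<String> args = portableArgs("localhost:8099", EnvironmentType.EMBEDDED);
        System.out.println(String.join(" ", args));
    }
}
```

Swapping EnvironmentType.EMBEDDED for LOOPBACK changes only the last flag, which is why the choice of environment can stay independent of the pipeline code itself.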

Re: Portable Java Pipeline Support

2021-04-26 Thread Ke Wu
Thank you, Kyle. I have created BEAM-12227 to track the unimplemented exception. Is there any specific reason that Java tests use EMBEDDED mode while Python tests usually use LOOPBACK mode? Best, Ke > On Apr 23, 2021, at 4:01 PM, Kyle Weaver w

Re: Portable Java Pipeline Support

2021-04-23 Thread Kyle Weaver
I couldn't find any existing ticket for this issue (you may be the first to discover it). Feel free to create one with your findings. (FWIW I did find a ticket for documenting portable Java pipelines [1]). For the Flink and Spark runners, we run most of our Java tests using EMBEDDED mode. For port

Re: Portable Java Pipeline Support

2021-04-23 Thread Ke Wu
Thank you, Kyle, for the detailed answer. Do we have a ticket to track fixing LOOPBACK mode? LOOPBACK mode will be essential, especially for local testing, as Samza Runner adopts portable mode and we intend to run it with Java pipelines a lot. In addition, I noticed that this issue does not

Re: Portable Java Pipeline Support

2021-04-23 Thread Kyle Weaver
Yes, we can expect to run Java pipelines in portable mode. I'm guessing the method unimplemented exception is a bug, and we haven't caught it because (as far as I know) we don't test the Java loopback worker. As an alternative, you can try building the Java docker environment with "./gradlew :sdks

Portable Java Pipeline Support

2021-04-23 Thread Ke Wu
Hi All, I am working on adding portability support for Samza Runner and have been playing around with the support in the Flink and Spark runners recently. One thing I noticed is the lack of documentation on how to run a Java pipeline in portable mode. Almost all documentation focuses on how to run a pyth