The order is: user Python code -> job server -> *Flink cluster -> SDK harness*
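
For concreteness, here is a minimal sketch (mine, not from this thread) of the
user-code end of that chain: a Python pipeline pointed at the job server via
the portable runner. The job endpoint below assumes the Flink job server's
default port (8099), and LOOPBACK just runs the SDK harness inside the
launching process, which is handy for local testing:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Assumed setup: a Beam job server for Flink listening on its
    # default port, localhost:8099.
    options = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",  # the job server
        "--environment_type=LOOPBACK",    # SDK harness runs in this process
    ])

    with beam.Pipeline(options=options) as p:
        (p
         | beam.Create(["hello", "beam", "on", "flink"])
         | beam.Map(str.upper)  # user code, executed by the SDK harness
         | beam.Map(print))

To break down each part's responsibility: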
1. User Python code defines the Beam pipeline.
2. The job server executes the Beam pipeline on the Flink cluster. To do
so, it must translate Beam operations into Flink-native operations.
3. The Flink cluster executes the transforms specified by Beam, the same
as it would execute any "normal" Flink pipeline.
4. When needed, a Beam-defined Flink transform (running on a Flink task
manager) will invoke the SDK harness to run the Python user code.

Might be a little overly simplistic, but that's the gist. For more info,
there are several public talks on this available online. I think this one
is the latest: https://youtu.be/hxHGLrshnCY?t=1769

On Thu, Dec 12, 2019 at 1:32 AM thinkdoom <[email protected]> wrote:

> 1. From what I grasp, there are 4 parts, and the data flow and call
> steps are described as below; is that right?
> user python code -> beam-runners-flink-1.8-job-server -> SDK harness ->
> flink cluster.
> 2. What is each of the 4 parts' responsibility?
