Hi Le, Let me try to answer to your multiple questions, one by one:
> I'm trying to understand the internal mechanism used by Flink Statefun to > dispatch functions to Flink cluster. In particular, I was trying to find a > good example demonstrating Statefun's "Logical Co-location, Physical > Separation" properties (as pointed out by [1]). > I'm not sure that I understand what dispatch functions to the Flink cluster mean here, but I will try to give you a general description of how things work with StateFun, and please follow up with any clarifications :-) In general, StateFun is a very specific Flink streaming job, and as such it will be running on a Flink cluster. Now, a remote function is a function that runs in a different process that executes (for now) an HTTP server and runs the StateFun SDK. These processes can be located at the same machine as the Flink's TaskManagers and communicate via a unix domain socket, or they can be at a different machine, or they can even be deployed behind a load balancer, and autoscaled up and down on demand. Now, as long as StateFun knows how to translate a function logical address to an HTTP endpoint that serves it, StateFun can dispatch function calls to these remote function processes. By logical co-location, physical separation: a Flink worker that executes the StateFun job, is responsible for the state and messaging of a specific key (address) but the function itself can be running at a different physical process. A good example of this kind of deployment you can find Gordon's talk [1], that demonstrates deploying the remote functions on AWS lambda. My understanding based on the doc was that there are three modes to deploy > statefun to Flink cluster ranging from remote functions to embedded > functions (decreasing storage locality and increasing efficiency). > Therefore, in the scenario of remote functions, functions are deployed with > its state and message independent from flink processes. And function should > be able to be executed in any Flink process as if it is a stateless > application. > Remote functions are indeed "effectively stateless" and state is being provided as part of an invocation request. But the state is managed in a fault tolerant way by Flink. > I have tried out a couple of examples from the statefun but judging by > allocation result the subtask of the job seems to bind statically with each > task slot in the Flink cluster (I'm assuming the example such as DataStream > uses embedded function instead?). > You are correct, the StateFun job has a fixed topology independent of the number of functions or function types. Therefore you can have many different function types and many billions of function instances. A single FunctionDispatcher operator, will handle transparently the multiplexing of different function types and instances behind the scenes. I hope that clarifies a bit. Igal. [1] https://www.youtube.com/watch?v=tuSylBadNSo On Tue, Dec 29, 2020 at 10:58 PM Le Xu <sharonx...@gmail.com> wrote: > Hello! > > I'm trying to understand the internal mechanism used by Flink Statefun to > dispatch functions to Flink cluster. In particular, I was trying to find a > good example demonstrating Statefun's "Logical Co-location, Physical > Separation" properties (as pointed out by [1]). > > My understanding based on the doc was that there are three modes to deploy > statefun to Flink cluster ranging from remote functions to embedded > functions (decreasing storage locality and increasing efficiency). > Therefore, in the scenario of remote functions, functions are deployed with > its state and message independent from flink processes. And function should > be able to be executed in any Flink process as if it is a stateless > application. I have tried out a couple of examples from the statefun but > judging by allocation result the subtask of the job seems to bind > statically with each task slot in the Flink cluster (I'm assuming the > example such as DataStream uses embedded function instead?). > > I also came across this tutorial [2] demonstrating the usage of remote > function. The README indicates [3] that "Since the functions are > stateless, and running in their own container, we can redeploy and rescale > it independently of the rest of the infrastructure." which seems to > indicate that the function performs scaling manually by the user that could > occupy arbitrary resources (e.g., task slots) from the Flink cluster on > demand. But I wasn't sure how to explicitly specify the amount of > parallelism for each function dynamically. > Is there a good example to visualize statefun "physical separation" > behavior by forcing the same function to be invoked at different task slots > / machines (either on-demand or automatically)? > > Any help will be appreciated! > > Thanks! > > Le > > > [1] > https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/concepts/distributed_architecture.html#remote-functions > [2] https://github.com/ververica/flink-statefun-workshop > [3] > https://github.com/ververica/flink-statefun-workshop#restart-the-functions >