Re: Visualizing Flink Statefun's "Logical Co-location, Physical Separation" Properties

Igal Shilman Mon, 04 Jan 2021 07:08:56 -0800

Hi Le,
Let me try to answer to your multiple questions, one by one:

> I'm trying to understand the internal mechanism used by Flink Statefun to
> dispatch functions to Flink cluster. In particular, I was trying to find a
> good example demonstrating Statefun's "Logical Co-location, Physical
> Separation" properties (as pointed out by [1]).
>

I'm not sure that I understand what dispatch functions to the Flink cluster
mean here, but I will try to give you a general description of how things
work with StateFun, and please follow up
with any clarifications :-)

In general, StateFun is a very specific Flink streaming job, and as such it
will be running on a Flink cluster. Now, a remote function is a function
that runs in a different process
that executes (for now) an HTTP server and runs the StateFun SDK. These
processes can be located at the same machine as the Flink's TaskManagers
and communicate via a unix domain socket, or they can be at a different
machine, or they can even be deployed behind a load balancer, and
autoscaled up and down on demand.
Now, as long as StateFun knows how to translate a function logical address
to an HTTP endpoint that serves it, StateFun can dispatch function calls to
these remote function processes.
By logical co-location, physical separation: a Flink worker that executes
the StateFun job, is responsible for the state and messaging of a specific
key (address) but the function itself can be running at a different
physical process.
A good example of this kind of deployment you can find Gordon's talk [1],
that demonstrates deploying the remote functions on AWS lambda.

My understanding based on the doc was that there are three modes to deploy
> statefun to Flink cluster ranging from remote functions to embedded
> functions (decreasing storage locality and increasing efficiency).
> Therefore, in the scenario of remote functions, functions are deployed with
> its state and message independent from flink processes. And function should
> be able to be executed in any Flink process as if it is a stateless
> application.
>

Remote functions are indeed "effectively stateless" and state is being
provided as part of an invocation request. But the state is managed in a
fault tolerant way by Flink.

> I have tried out a couple of examples from the statefun but judging by
> allocation result the subtask of the job seems to bind statically with each
> task slot in the Flink cluster (I'm assuming the example such as DataStream
> uses embedded function instead?).
>

You are correct, the StateFun job has a fixed topology independent of the
number of functions or function types. Therefore you can have many
different function types and many billions of function instances.
A single FunctionDispatcher operator, will handle transparently the
multiplexing of different function types and instances behind the scenes.

I hope that clarifies a bit.

Igal.

[1] https://www.youtube.com/watch?v=tuSylBadNSo

On Tue, Dec 29, 2020 at 10:58 PM Le Xu <sharonx...@gmail.com> wrote:

> Hello!
>
> I'm trying to understand the internal mechanism used by Flink Statefun to
> dispatch functions to Flink cluster. In particular, I was trying to find a
> good example demonstrating Statefun's "Logical Co-location, Physical
> Separation" properties (as pointed out by [1]).
>
> My understanding based on the doc was that there are three modes to deploy
> statefun to Flink cluster ranging from remote functions to embedded
> functions (decreasing storage locality and increasing efficiency).
> Therefore, in the scenario of remote functions, functions are deployed with
> its state and message independent from flink processes. And function should
> be able to be executed in any Flink process as if it is a stateless
> application. I have tried out a couple of examples from the statefun but
> judging by allocation result the subtask of the job seems to bind
> statically with each task slot in the Flink cluster (I'm assuming the
> example such as DataStream uses embedded function instead?).
>
> I also came across this tutorial [2] demonstrating the usage of remote
> function. The README indicates [3] that "Since the functions are
> stateless, and running in their own container, we can redeploy and rescale
> it independently of the rest of the infrastructure." which seems to
> indicate that the function performs scaling manually by the user that could
> occupy arbitrary resources (e.g., task slots) from the Flink cluster on
> demand. But I wasn't sure how to explicitly specify the amount of
> parallelism for each function dynamically.
> Is there a good example to visualize statefun "physical separation"
> behavior by forcing the same function to be invoked at different task slots
> / machines (either on-demand or automatically)?
>
> Any help will be appreciated!
>
> Thanks!
>
> Le
>
>
> [1]
> https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.2/concepts/distributed_architecture.html#remote-functions
> [2] https://github.com/ververica/flink-statefun-workshop
> [3]
> https://github.com/ververica/flink-statefun-workshop#restart-the-functions
>

Re: Visualizing Flink Statefun's "Logical Co-location, Physical Separation" Properties

Reply via email to