Hi, I'm glad things are getting clearer; looking forward to seeing how StateFun works out for you :-)
To change the parallelism you can simply set the "parallelism.default" [1]
key in flink-conf.yaml. It is located in the StateFun container at
/opt/flink/conf/flink-conf.yaml. To avoid rebuilding the container you can
mount flink-conf.yaml externally, and if you are using Kubernetes you can
simply define flink-conf.yaml as a ConfigMap.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#basic-setup

Good luck,
Igal.

On Wed, May 13, 2020 at 11:55 AM Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:

> Dear Igal, all,
>
> Thanks a lot. This is very helpful. I understand the architecture a bit
> better now: we can scale the stateful functions, put a load balancer in
> front, and Flink will contact them. The only part of the scaling I don't
> understand yet is how to scale the "Flink side". If I understand
> correctly, the Kafka ingress/egress part runs on the Flink cluster and
> contacts the remote workers through HTTP. How can I scale this Kafka
> part? For a normal Flink job I would just change the parallelism, but I
> couldn't find that option yet. Is there some value I need to set in
> module.yaml?
>
> Once again, thanks for the help so far. It has been useful.
>
> Regards,
> Wouter
>
> On Wed, 13 May 2020 at 00:03, Igal Shilman <i...@ververica.com> wrote:
>
>> Hi Wouter,
>>
>> Triggering a stateful function from a frontend indeed requires an
>> ingress between them, so the way you've approached this is also the way
>> we were thinking of. As Gordon mentioned, a potential improvement would
>> be an HTTP ingress that allows triggering stateful functions directly
>> from the front-end servers, but this kind of ingress is not implemented
>> yet.
>>
>> Regarding scaling: your understanding is correct, you can scale both
>> the Flink cluster and the remote "python-stateful-function" cluster
>> independently.
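[Editor's note: a minimal sketch of the ConfigMap approach described above. The ConfigMap name and the parallelism value are illustrative, not taken from the thread.]

```yaml
# Illustrative Kubernetes ConfigMap wrapping flink-conf.yaml, so the
# StateFun container does not need to be rebuilt. Mount it so the file
# appears at /opt/flink/conf/flink-conf.yaml inside the container.
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-config          # hypothetical name
data:
  flink-conf.yaml: |
    parallelism.default: 4    # example value; tune to your workload
```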
>> Scaling the Flink cluster, though, requires taking a savepoint, bumping
>> the job parallelism, and starting the cluster with more workers from
>> the savepoint taken previously.
>>
>> Scaling the "python-stateful-function" workers can be done
>> transparently to the Flink cluster, but the exact details are
>> deployment specific:
>> - For example, the python workers are a k8s service.
>> - Or the python workers are deployed behind a load balancer.
>> - Or you add new entries to the DNS record of your python workers.
>>
>> I didn't understand "ensuring that it ends up in the correct Flink
>> job" -- can you please clarify? Flink is the one contacting the remote
>> workers, not the other way around, so as long as the new instances are
>> visible to Flink they will be reached with the same shared state.
>>
>> I'd recommend watching [1] and the demo at the end, and [2] for a demo
>> using stateful functions on AWS Lambda.
>>
>> [1] https://youtu.be/NF0hXZfUyqE
>> [2] https://www.youtube.com/watch?v=tuSylBadNSo
>>
>> It seems like you are on the right path!
>> Good luck!
>> Igal.
>>
>> On Tue, May 12, 2020 at 11:18 PM Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:
>>
>>> Hi Igal, all,
>>>
>>> In the meantime we found a way to serve Flink stateful functions
>>> behind a frontend. We decided to add another (set of) Flask
>>> application(s) which link to Kafka topics. These Kafka topics then
>>> serve as ingress and egress for the StateFun cluster. However, we're
>>> wondering how we can scale this cluster. The documentation page
>>> provides some nice figures for different setups, but no implementation
>>> details are given. In our case we are using a remote cluster, so we
>>> have a Docker instance containing the `python-stateful-function` and
>>> of course the Flink cluster containing a `master` and `worker`. If I
>>> understood correctly, in a remote setting we can scale both the Flink
>>> cluster and the `python-stateful-function`.
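[Editor's note: the savepoint-based rescaling steps above can be sketched with the Flink CLI. The savepoint path, job ID, and jar name are placeholders, not from the thread.]

```
# 1) Stop the job, taking a savepoint (job ID and path are placeholders)
flink stop --savepointPath s3://my-bucket/savepoints <job-id>

# 2) Bump parallelism (e.g. parallelism.default in flink-conf.yaml)
#    and start additional task managers to provide more task slots

# 3) Resume from the savepoint at the new parallelism
flink run -s s3://my-bucket/savepoints/savepoint-xxxx -p 8 the-job.jar
```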
>>> Scaling the Flink cluster is trivial because I can just add more
>>> workers/task-managers (providing more task slots) by scaling the
>>> worker instance. However, how can I scale the stateful function while
>>> also ensuring that it ends up in the correct Flink job (because we
>>> need shared state there)? I tried scaling the Docker instance as well,
>>> but that didn't seem to work.
>>>
>>> Hope you can give me some leads there.
>>> Thanks in advance!
>>>
>>> Kind regards,
>>> Wouter
>>>
>>> On Thu, 7 May 2020 at 17:17, Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:
>>>
>>>> Hi Igal,
>>>>
>>>> Thanks for your quick reply. Getting back to point 2: I was wondering
>>>> if you could indeed trigger a stateful function directly from Flask
>>>> and also get the reply there, instead of using Kafka in between. We
>>>> want to experiment with running stateful functions behind a frontend
>>>> (which should be able to trigger a function), but we're a bit afraid
>>>> that using Kafka doesn't scale well if, on the frontend side, a user
>>>> has to consume all Kafka messages to find the correct reply/output
>>>> for a certain request/input. Any thoughts?
>>>>
>>>> Thanks in advance,
>>>> Wouter
>>>>
>>>> On Thu, 7 May 2020 at 10:51, Igal Shilman <i...@ververica.com> wrote:
>>>>
>>>>> Hi Wouter!
>>>>>
>>>>> Glad to read that you have been using Flink for quite some time, and
>>>>> are also experimenting with StateFun!
>>>>>
>>>>> 1) Yes, that is correct, and you can follow the Docker Hub
>>>>> contribution PR at [1].
>>>>>
>>>>> 2) I'm not sure I understand what you mean by triggering from the
>>>>> browser. If you mean triggering the function independently of
>>>>> StateFun, for testing/illustration purposes, you would need to write
>>>>> some JavaScript and perform the POST (assuming CORS is enabled).
>>>>> Let me know if you'd like further information on how to do it.
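[Editor's note: on the "consume all Kafka messages to find the correct reply" worry above, a common pattern is to tag each request with a correlation ID, have the function copy it into the egress record, and have the frontend match replies by that ID. A minimal in-memory sketch of the idea follows; plain lists stand in for the Kafka ingress/egress topics, and all names are illustrative.]

```python
import uuid

# Plain lists stand in for the Kafka ingress/egress topics; in a real
# deployment the frontend would produce/consume with a Kafka client.
ingress_topic = []
egress_topic = []

def frontend_send(payload):
    """Tag the request with a correlation ID before producing it."""
    correlation_id = str(uuid.uuid4())
    ingress_topic.append({"id": correlation_id, "payload": payload})
    return correlation_id

def stateful_function_step():
    """Stands in for the StateFun job: it must copy the correlation ID
    from the ingress record into the egress record it emits."""
    for record in ingress_topic:
        egress_topic.append({"id": record["id"],
                             "result": record["payload"].upper()})
    ingress_topic.clear()

def frontend_receive(correlation_id):
    """Pick out only the reply that matches our own request."""
    for record in egress_topic:
        if record["id"] == correlation_id:
            return record["result"]
    return None

req_id = frontend_send("hello statefun")
stateful_function_step()
print(frontend_receive(req_id))  # -> HELLO STATEFUN
```

In practice you would also partition the egress topic by the correlation ID (or by a frontend-instance key) so each frontend consumes only its own partition rather than scanning every reply.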
>>>>> Broadly speaking, GET is traditionally used to get data from a
>>>>> resource and POST to send data (the data being the invocation batch
>>>>> in our case).
>>>>>
>>>>> An easier workaround for you would be to expose another endpoint in
>>>>> your Flask application, and call your stateful function directly
>>>>> from there (possibly populating the function argument with values
>>>>> taken from the query params).
>>>>>
>>>>> 3) I would expect a performance loss when going from the embedded
>>>>> SDK to the remote one, simply because the remote function lives in a
>>>>> different process, so a round trip is required. There are different
>>>>> ways of deploying remote functions: for example, they can be
>>>>> co-located with the task managers and communicate via the loopback
>>>>> device / a Unix domain socket, or they can be deployed behind a load
>>>>> balancer with an auto-scaler, reacting to higher request rates or
>>>>> latency increases by spinning up new instances (something that is
>>>>> not yet supported with the embedded API).
>>>>>
>>>>> Good luck,
>>>>> Igal.
>>>>>
>>>>> [1] https://github.com/docker-library/official-images/pull/7749
>>>>>
>>>>> On Wednesday, May 6, 2020, Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've been using Flink for quite some time now, and for a university
>>>>>> project I'm planning to experiment with StateFun. During the
>>>>>> walkthrough I ran into some issues I hope you can help me with.
>>>>>>
>>>>>> 1) Is it correct that the Docker image of StateFun is not yet
>>>>>> published? I couldn't find it anywhere, but I was able to run it by
>>>>>> building the image myself.
>>>>>> 2) The example project using the Python SDK uses Flask to expose a
>>>>>> function via POST. Is there also a way to serve GET requests, so
>>>>>> that you can trigger a stateful function from, for instance, your
>>>>>> browser?
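[Editor's note: the workaround described above -- a GET endpoint that builds the function's argument from query parameters -- can be sketched without any web framework. The URL and parameter names are illustrative.]

```python
from urllib.parse import urlparse, parse_qs

def build_invocation_arg(url):
    """Turn a GET request URL into the argument dict that a Flask
    endpoint could forward to the stateful function (illustrative shape;
    the real payload depends on your function's expected input)."""
    query = parse_qs(urlparse(url).query)
    # parse_qs yields lists of values; flatten single-valued params
    return {key: values[0] for key, values in query.items()}

arg = build_invocation_arg("http://frontend/trigger?user=wouter&action=greet")
print(arg)  # -> {'user': 'wouter', 'action': 'greet'}
```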
>>>>>> 3) Do you expect a lot of performance loss when using the Python
>>>>>> SDK compared to the Java one?
>>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>> Regards,
>>>>>> Wouter