Hi, I'm glad things are getting clearer; looking forward to seeing how StateFun works out for you :-)
To change the parallelism you can simply set the "parallelism.default" [1]
key in flink-conf.yaml. It is located in the StateFun container at
/opt/flink/conf/flink-conf.yaml. To avoid rebuilding the container you can
mount flink-conf.yaml externally, and if you are using Kubernetes you can
simply define flink-conf.yaml as a ConfigMap.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#basic-setup

Good luck,
Igal.

On Wed, May 13, 2020 at 11:55 AM Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:

> Dear Igal, all,
>
> Thanks a lot. This is very helpful. I understand the architecture a bit
> better now: we can scale the stateful functions, put a load balancer in
> front, and Flink will contact them. The only part of the scaling I don't
> understand yet is how to scale the "Flink side". If I understand
> correctly, the Kafka ingress/egress part runs on the Flink cluster and
> contacts the remote workers through HTTP. How can I scale this Kafka
> part? For a normal Flink job I would just change the parallelism, but I
> couldn't find that option yet. Is there some value I need to set in
> module.yaml?
>
> Once again, thanks for the help so far. It has been useful.
>
> Regards,
> Wouter
>
> On Wed, 13 May 2020 at 00:03, Igal Shilman <i...@ververica.com> wrote:
>
>> Hi Wouter,
>>
>> Triggering a stateful function from a frontend indeed requires an
>> ingress between them, so the way you've approached this is also the way
>> we were thinking of. As Gordon mentioned, a potential improvement would
>> be an HTTP ingress that allows triggering stateful functions directly
>> from the front-end servers, but this kind of ingress is not implemented
>> yet.
>>
>> Regarding scaling: your understanding is correct, you can scale both
>> the Flink cluster and the remote "python-stateful-function" cluster
>> independently.
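[Editor's note: a minimal sketch of the ConfigMap approach described above. The ConfigMap name and the parallelism value are illustrative, not taken from the thread.]

```yaml
# Illustrative Kubernetes ConfigMap wrapping flink-conf.yaml, so the
# StateFun container does not need to be rebuilt. Mount it so the file
# appears at /opt/flink/conf/flink-conf.yaml inside the container.
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-config          # hypothetical name
data:
  flink-conf.yaml: |
    parallelism.default: 4    # example value; tune to your workload
```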
>> Scaling the Flink cluster, though, requires taking a savepoint, bumping
>> the job parallelism, and starting the cluster with more workers from
>> the savepoint taken previously.
>>
>> Scaling the "python-stateful-function" workers can be done
>> transparently to the Flink cluster, but the exact details are
>> deployment specific:
>> - For example, the python workers are a k8s service.
>> - Or the python workers are deployed behind a load balancer.
>> - Or you add new entries to the DNS record of your python workers.
>>
>> I didn't understand "ensuring that it ends up in the correct Flink
>> job" -- can you please clarify? Flink is the one contacting the remote
>> workers, not the other way around, so as long as the new instances are
>> visible to Flink they will be reached with the same shared state.
>>
>> I'd recommend watching [1] and the demo at the end, and [2] for a demo
>> using stateful functions on AWS Lambda.
>>
>> [1] https://youtu.be/NF0hXZfUyqE
>> [2] https://www.youtube.com/watch?v=tuSylBadNSo
>>
>> It seems like you are on the right path!
>> Good luck!
>> Igal.
>>
>> On Tue, May 12, 2020 at 11:18 PM Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:
>>
>>> Hi Igal, all,
>>>
>>> In the meantime we found a way to serve Flink stateful functions
>>> behind a frontend. We decided to add another (set of) Flask
>>> application(s) which link to Kafka topics. These Kafka topics then
>>> serve as ingress and egress for the StateFun cluster. However, we're
>>> wondering how we can scale this cluster. The documentation page
>>> provides some nice figures for different setups, but no implementation
>>> details are given. In our case we are using a remote cluster, so we
>>> have a Docker instance containing the `python-stateful-function` and
>>> of course the Flink cluster containing a `master` and `worker`. If I
>>> understood correctly, in a remote setting we can scale both the Flink
>>> cluster and the `python-stateful-function`.
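[Editor's note: the savepoint-based rescaling steps above can be sketched with the Flink CLI. The savepoint path, job ID, and jar name are placeholders, not from the thread.]

```
# 1) Stop the job, taking a savepoint (job ID and path are placeholders)
flink stop --savepointPath s3://my-bucket/savepoints <job-id>

# 2) Bump parallelism (e.g. parallelism.default in flink-conf.yaml)
#    and start additional task managers to provide more task slots

# 3) Resume from the savepoint at the new parallelism
flink run -s s3://my-bucket/savepoints/savepoint-xxxx -p 8 the-job.jar
```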
>>> Scaling the Flink cluster is trivial because I can just add more
>>> workers/task-managers (providing more task slots) by scaling the
>>> worker instance. However, how can I scale the stateful function while
>>> also ensuring that it ends up in the correct Flink job (because we
>>> need shared state there)? I tried scaling the Docker instance as well,
>>> but that didn't seem to work.
>>>
>>> Hope you can give me some leads there.
>>> Thanks in advance!
>>>
>>> Kind regards,
>>> Wouter
>>>
>>> On Thu, 7 May 2020 at 17:17, Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:
>>>
>>>> Hi Igal,
>>>>
>>>> Thanks for your quick reply. Getting back to point 2: I was wondering
>>>> if you could indeed trigger a stateful function directly from Flask
>>>> and also get the reply there, instead of using Kafka in between. We
>>>> want to experiment with running stateful functions behind a frontend
>>>> (which should be able to trigger a function), but we're a bit afraid
>>>> that using Kafka doesn't scale well if, on the frontend side, a user
>>>> has to consume all Kafka messages to find the correct reply/output
>>>> for a certain request/input. Any thoughts?
>>>>
>>>> Thanks in advance,
>>>> Wouter
>>>>
>>>> On Thu, 7 May 2020 at 10:51, Igal Shilman <i...@ververica.com> wrote:
>>>>
>>>>> Hi Wouter!
>>>>>
>>>>> Glad to read that you have been using Flink for quite some time, and
>>>>> are also experimenting with StateFun!
>>>>>
>>>>> 1) Yes, that is correct, and you can follow the Docker Hub
>>>>> contribution PR at [1].
>>>>>
>>>>> 2) I'm not sure I understand what you mean by triggering from the
>>>>> browser. If you mean triggering the function independently of
>>>>> StateFun, for testing/illustration purposes, you would need to write
>>>>> some JavaScript and perform the POST (assuming CORS is enabled).
>>>>> Let me know if you'd like further information on how to do it.
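[Editor's note: on the "consume all Kafka messages to find the correct reply" worry above, a common pattern is to tag each request with a correlation ID, have the function copy it into the egress record, and have the frontend match replies by that ID. A minimal in-memory sketch of the idea follows; plain lists stand in for the Kafka ingress/egress topics, and all names are illustrative.]

```python
import uuid

# Plain lists stand in for the Kafka ingress/egress topics; in a real
# deployment the frontend would produce/consume with a Kafka client.
ingress_topic = []
egress_topic = []

def frontend_send(payload):
    """Tag the request with a correlation ID before producing it."""
    correlation_id = str(uuid.uuid4())
    ingress_topic.append({"id": correlation_id, "payload": payload})
    return correlation_id

def stateful_function_step():
    """Stands in for the StateFun job: it must copy the correlation ID
    from the ingress record into the egress record it emits."""
    for record in ingress_topic:
        egress_topic.append({"id": record["id"],
                             "result": record["payload"].upper()})
    ingress_topic.clear()

def frontend_receive(correlation_id):
    """Pick out only the reply that matches our own request."""
    for record in egress_topic:
        if record["id"] == correlation_id:
            return record["result"]
    return None

req_id = frontend_send("hello statefun")
stateful_function_step()
print(frontend_receive(req_id))  # -> HELLO STATEFUN
```

In practice you would also partition the egress topic by the correlation ID (or by a frontend-instance key) so each frontend consumes only its own partition rather than scanning every reply.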
>>>>> Broadly speaking, GET is traditionally used to get data from a
>>>>> resource and POST to send data (the data being the invocation batch
>>>>> in our case).
>>>>>
>>>>> An easier workaround for you would be to expose another endpoint in
>>>>> your Flask application, and call your stateful function directly
>>>>> from there (possibly populating the function argument with values
>>>>> taken from the query params).
>>>>>
>>>>> 3) I would expect a performance loss when going from the embedded
>>>>> SDK to the remote one, simply because the remote function lives in a
>>>>> different process, so a round trip is required. There are different
>>>>> ways of deploying remote functions: for example, they can be
>>>>> co-located with the task managers and communicate via the loopback
>>>>> device / a Unix domain socket, or they can be deployed behind a load
>>>>> balancer with an auto-scaler, reacting to higher request rates or
>>>>> latency increases by spinning up new instances (something that is
>>>>> not yet supported with the embedded API).
>>>>>
>>>>> Good luck,
>>>>> Igal.
>>>>>
>>>>> [1] https://github.com/docker-library/official-images/pull/7749
>>>>>
>>>>> On Wednesday, May 6, 2020, Wouter Zorgdrager <zorgdrag...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've been using Flink for quite some time now, and for a university
>>>>>> project I'm planning to experiment with StateFun. During the
>>>>>> walkthrough I ran into some issues I hope you can help me with.
>>>>>>
>>>>>> 1) Is it correct that the Docker image of StateFun is not yet
>>>>>> published? I couldn't find it anywhere, but I was able to run it by
>>>>>> building the image myself.
>>>>>> 2) The example project using the Python SDK uses Flask to expose a
>>>>>> function via POST. Is there also a way to serve GET requests, so
>>>>>> that you can trigger a stateful function from, for instance, your
>>>>>> browser?
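[Editor's note: the workaround described above -- a GET endpoint that builds the function's argument from query parameters -- can be sketched without any web framework. The URL and parameter names are illustrative.]

```python
from urllib.parse import urlparse, parse_qs

def build_invocation_arg(url):
    """Turn a GET request URL into the argument dict that a Flask
    endpoint could forward to the stateful function (illustrative shape;
    the real payload depends on your function's expected input)."""
    query = parse_qs(urlparse(url).query)
    # parse_qs yields lists of values; flatten single-valued params
    return {key: values[0] for key, values in query.items()}

arg = build_invocation_arg("http://frontend/trigger?user=wouter&action=greet")
print(arg)  # -> {'user': 'wouter', 'action': 'greet'}
```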
>>>>>> 3) Do you expect a lot of performance loss when using the Python
>>>>>> SDK compared to the Java one?
>>>>>>
>>>>>> Thanks in advance!
>>>>>>
>>>>>> Regards,
>>>>>> Wouter