Re: Reducing Checkpoint Count for Chain Operator

2023-02-02 Thread Talat Uyarer via user
Hi Schwalbe, >- There is no way to have only one file unless you lower the >parallelism to 1 (= only one subtask) > > Even with single parallelism there are multiple checkpoint files for chained operators. >- So which files do you see: 1 “_metadata” + multiple data files (or >ju

Re: Standalone cluster memory configuration

2023-02-02 Thread Hang Ruan
Hi, Theodor, The description in https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#memory-configuration map help you to config the memory for flink. > Flink tries to shield users as much as possible from the complexity of > configuring the JVM for data-intensive pr

RE: Reducing Checkpoint Count for Chain Operator

2023-02-02 Thread Schwalbe Matthias
Hi Talat Uyarer, * There is no way to have only one file unless you lower the parallelism to 1 (= only one subtask) * So which files do you see: 1 “_metadata” + multiple data files (or just one)? * The idea of having multiple files is to allow multiple threads to be able to stare c

Clear global state

2023-02-02 Thread Dario Heinisch
Hey all, Is it somehow possible to hook into all states in a current Job and clear them all at once? Currently the way I do it is just to stop the job and then restarting it. Was wonderding if there is a way where I can do it without restarting the job. I know about adding TTL to states but

Re: Reducing Checkpoint Count for Chain Operator

2023-02-02 Thread Talat Uyarer via user
Hi Schwalbe, weijie, Thanks for your reply. >- Each state primitive/per subtask stores state into a separate file > > In this picture You can see Operator Chain https://nightlies.apache.org/flink/flink-docs-master/fig/tasks_chains.svg Source and Map are in the same chain. Today Flink create

Standalone cluster memory configuration

2023-02-02 Thread Theodor Wübker
Hello everyone, I have a Standalone Custer running in a docker-swarm with a very simple docker-compose configuration [3]. When I run my job there with a parallelism greater than one, I get an out of memory error. Nothing out of the ordinary, so I wanted to increase the JVM heap. I did that by

Re: beam + flink + k8

2023-02-02 Thread P Singh
Hi Jan, localhost:5 is not open i got this error.. I have a question if I have deployed with the same configuration on GKE and port-forward job manager to localhost.. can't I port forward the task manager to localhost:5. I don't know the IP of the running pod on GKE. That's why I'm getting

Re: beam + flink + k8

2023-02-02 Thread Jan Lukavský
That would suggest it is uploading (you can verify that using jnettop). But I'd leave this open for others to answer, because now it is purely Flink (not Beam) question. Best,  Jan On 2/2/23 10:46, bigdatadeveloper8 wrote: Hi Jan, Job manager is configured and working.. when I submit python

Re: Non-temporal watermarks

2023-02-02 Thread James Sandys-Lumsdaine
I can describe a use that has been successful for me. We have a Flink workflow that calculates reports over many days and have it currently set up to recompute the last 10 days or so when recovering this "deep history" from our databases and then switches over to live flow to process all subsequ

Re: Non-temporal watermarks

2023-02-02 Thread Gen Luo
Hi, This is an interesting topic. I suppose the watermark is defined based on the event time since it's mainly used, or designed, for the event time processing. Flink provides the event time processing mechanism because it's widely needed. Every event has its event time and we usually need to grou

Re: beam + flink + k8

2023-02-02 Thread bigdatadeveloper8
Hi Jan, Job manager is configured and working.. when I submit python Job to flink it's not showing or flink UI or simply hangs without any error. Sent from my Galaxy Original message From: Jan Lukavský Date: 02/02/2023 15:07 (GMT+05:30) To: user@flink.apache.org Subject: Re

Re: beam + flink + k8

2023-02-02 Thread Jan Lukavský
I'm not sure how exactly minikube exposes the jobmanager, but in GKE you likely need to port-forward it, e.g.  $ kubectl port-forward svc/flink-jobmanager 8081:8081 This should make jobmanager accessible via localhost:8081. For production cases you might want to use a different approach, like

Re: Non-temporal watermarks

2023-02-02 Thread Jan Lukavský
Hi, I will not speak about details related to Flink specifically, the concept of watermarks is more abstract, so I'll leave implementation details aside. Speaking generally, yes, there is a set of requirements that must be met in order to be able to generate a system that uses watermarks.

What is the state of Scala wrappers?

2023-02-02 Thread Erwan Loisant
Hi, Back in October, the Flink team announced that the Scala API was to be deprecated them removed. Which I think is perfectly fine, having third party develop Scala wrappers is a good approach. With the announce I expected those wrapper projects to get steam, however both projects linked in t