Re: Is there an HA solution to run a Flink job with multiple sources
Hi, our use case is that the data sources are independent: we use Flink to ingest data from multiple Kafka sources, do a bit of filtering, and then write it to S3. Since the sources are independent, we consider them all optional. Even if just one Kafka is up and running, we would like to process its data. We use a single Flink job because we find it easier to manage fewer Flink jobs, and that way we also use fewer resources.

So far, the idea from Xuyang seems doable to me. I'll explore subclassing the existing Kafka source and making sure that the source can function even if Kafka is down. In essence, we would like to treat the situation of Kafka being down the same as if Kafka were up but had no data. The caveat I can think of is to add metrics and logs for when Kafka is down, so we can still be aware of it if we need to.

Cheers,
Barisa

On Wed, 1 Jun 2022 at 23:23, Alexander Fedulov wrote:

> Hi Bariša,
>
> The way I see it, you either
> - need data from all sources because you are doing some conjoint
> processing. In that case, stopping the pipeline is usually the right
> thing to do; or
> - the streams consumed from multiple servers are not combined and hence
> could be processed in independent Flink jobs.
> Maybe you could explain where specifically your situation does not fit
> one of those two scenarios?
>
> Best,
> Alexander Fedulov
>
> On Wed, Jun 1, 2022 at 10:57 PM Jing Ge wrote:
>
>> Hi Bariša,
>>
>> Could you share the reason why your data processing pipeline should keep
>> running when one Kafka source is down? It seems like any one among the
>> multiple Kafka sources is optional for the data processing logic,
>> because any Kafka source could be the one that is down.
>>
>> Best regards,
>> Jing
>>
>> On Wed, Jun 1, 2022 at 5:59 PM Xuyang wrote:
>>
>>> I think you can try to use a custom source, so that even if one of the
>>> Kafka sources is down, the operator keeps running (it just does
>>> nothing). The only trouble is that you need to manage the checkpointing
>>> and a few other things yourself. But the good news is that you can copy
>>> the implementation of the existing Kafka source and change a little
>>> code conveniently.
>>>
>>> At 2022-06-01 22:38:39, "Bariša Obradović" wrote:
>>>
>>> Hi,
>>> we are running a Flink job with multiple Kafka sources connected to
>>> different Kafka servers.
>>>
>>> The problem we are facing is that when one of the Kafka clusters is
>>> down, the Flink job starts restarting. Is there any way for Flink to
>>> pause processing of the Kafka that is down, and yet continue processing
>>> from the other sources?
>>>
>>> Cheers,
>>> Barisa
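The "broker down looks the same as an empty topic" policy Bariša describes can be sketched in isolation, outside of Flink: wrap whatever fetch call the source makes, swallow connectivity errors, record a metric and a log line so the outage stays visible, and return no records. This is only an illustration of the policy; the class and method names below are made up, and the real work would be doing the equivalent inside a subclassed Flink Kafka source, as Xuyang suggests.

```python
import logging
from typing import Callable, List

log = logging.getLogger("optional-source")


class OptionalSource:
    """Wrap a fetch function so a connection failure looks like an empty poll."""

    def __init__(self, name: str, fetch: Callable[[], List[bytes]]):
        self.name = name
        self.fetch = fetch
        self.consecutive_failures = 0  # expose as a metric so outages stay visible

    def poll(self) -> List[bytes]:
        try:
            records = self.fetch()
            self.consecutive_failures = 0
            return records
        except ConnectionError as exc:
            # Treat "broker down" the same as "broker up, no data",
            # but log and count the failure so we can alert on it.
            self.consecutive_failures += 1
            log.warning("source %s unreachable (%s); treating as empty",
                        self.name, exc)
            return []
```

A health dashboard can then alert on `consecutive_failures` instead of on job restarts, which is exactly the trade-off discussed above: the pipeline keeps running, and awareness of the outage moves into metrics.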
Is there an HA solution to run a Flink job with multiple sources
Hi,
we are running a Flink job with multiple Kafka sources connected to different Kafka servers.

The problem we are facing is that when one of the Kafka clusters is down, the Flink job starts restarting. Is there any way for Flink to pause processing of the Kafka that is down, and yet continue processing from the other sources?

Cheers,
Barisa
Re: Is it possible to specify max process memory in flink 1.8.2, similar to what is possible in flink 1.11
Small update: we believe that the off-heap memory is used by the Parquet writer (used in the sink to write to S3).

On Wed, 24 Feb 2021 at 23:25, Bariša wrote:

> I'm running Flink 1.8.2 in a container, and under heavy load the
> container gets OOM-killed by the kernel. I'm guessing the reason for the
> kernel OOM is the large size of the off-heap memory. Is there a way I can
> limit it in Flink 1.8.2?
>
> I can see that newer versions of Flink have a config param for this, so
> I'm checking here whether it is possible to do something similar in Flink
> 1.8.2, without a Flink upgrade?
>
> Cheers,
> Barisa
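For readers landing here later: the newer option referred to is the unified `taskmanager.memory.process.size` introduced in Flink 1.10, which has no direct equivalent in 1.8.x. A rough sketch of approximating it on 1.8.2 is to cap the main off-heap JVM consumers explicitly via `flink-conf.yaml`. The values below are placeholders, and JVM flags only bound direct and metaspace memory, not all native allocations, so verify against the 1.8 docs and your container limit:

```yaml
# flink-conf.yaml (Flink 1.8.x) -- illustrative values only
taskmanager.heap.size: 4096m
# Cap JVM direct (off-heap) memory and metaspace explicitly, so heap +
# direct + metaspace stays safely below the container memory limit:
env.java.opts.taskmanager: "-XX:MaxDirectMemorySize=1g -XX:MaxMetaspaceSize=256m"
```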
Is it possible to specify max process memory in flink 1.8.2, similar to what is possible in flink 1.11
I'm running Flink 1.8.2 in a container, and under heavy load the container gets OOM-killed by the kernel. I'm guessing the reason for the kernel OOM is the large size of the off-heap memory. Is there a way I can limit it in Flink 1.8.2?

I can see that newer versions of Flink have a config param for this, so I'm checking here whether it is possible to do something similar in Flink 1.8.2, without a Flink upgrade?

Cheers,
Barisa
Re: Wondering how to check if taskmanager is registered with the jobmanager, without asking job manager
Thnx Piotr. I agree, that would work. It's a bit of a chicken-and-egg problem, since at that point we can't just spin up a task manager and have it register itself; the job manager health check needs to know how many task managers there should be. A bit more logic, but doable. Thnx for the tip.

Cheers,
Barisa

On Wed, 10 Oct 2018 at 09:05, Piotr Nowojski wrote:

> Hi,
>
> I don't think that's exposed on the TaskManager.
>
> Maybe it would simplify things a bit if you implemented this as a single
> "JobManager" health check instead of multiple TaskManager health checks:
> for example, verify that the expected number of TaskManagers is
> registered. It might cover your case.
>
> Piotrek
>
> On 9 Oct 2018, at 12:21, Bariša wrote:
>
> As part of deploying task managers and job managers, I'd like to expose a
> healthcheck on both task managers and job managers.
>
> For the task managers, one of the requirements for being healthy is that
> they have successfully registered themselves with the job manager.
>
> Is there a way to achieve this without making a call to the job manager?
> (To do that, I first need to make a call to ZooKeeper to find the job
> manager, so I'm trying to simplify the health check.)
>
> Ideally, the taskmanager would have a metric that says "am registered",
> but afaik that doesn't exist:
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#cluster
>
> P.S.
> This is my first post to the mailing list; happy to update/change my
> question if I messed up or misunderstood something.
>
> Cheers,
> Barisa
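The JobManager-side check Piotr suggests can be sketched against the JobManager REST API, which exposes the registered TaskManagers under `GET /taskmanagers`. The hostname, port, and expected count below are assumptions for illustration:

```python
import json
import urllib.request


def registered_taskmanagers(payload: dict) -> int:
    """Count TaskManagers in a JobManager GET /taskmanagers response."""
    return len(payload.get("taskmanagers", []))


def cluster_healthy(jobmanager_url: str, expected: int) -> bool:
    """Healthy iff the JobManager reports at least `expected` registered TMs."""
    with urllib.request.urlopen(f"{jobmanager_url}/taskmanagers",
                                timeout=5) as resp:
        payload = json.load(resp)
    return registered_taskmanagers(payload) >= expected


if __name__ == "__main__":
    # URL and expected count are placeholders; point this at your
    # JobManager's REST endpoint (default port 8081).
    print(cluster_healthy("http://localhost:8081", expected=3))
```

This avoids any per-TaskManager ZooKeeper lookup: the health checker only needs the JobManager address, and a TaskManager that fails to register simply shows up as a count below the expected number.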
Wondering how to check if taskmanager is registered with the jobmanager, without asking job manager
As part of deploying task managers and job managers, I'd like to expose a healthcheck on both task managers and job managers.

For the task managers, one of the requirements for being healthy is that they have successfully registered themselves with the job manager.

Is there a way to achieve this without making a call to the job manager? (To do that, I first need to make a call to ZooKeeper to find the job manager, so I'm trying to simplify the health check.)

Ideally, the taskmanager would have a metric that says "am registered", but afaik that doesn't exist: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#cluster

P.S.
This is my first post to the mailing list; happy to update/change my question if I messed up or misunderstood something.

Cheers,
Barisa