Hi Brent,

Sounds like a good plan to start with. Application Mode gives the best
isolation, but it can be costly for a large number of small jobs, since each
job requires at least 2 containers (1 JobManager and 1 TaskManager). If your
jobs are mostly small (1 or 2 TMs), you might want to try Session Mode with
the operator instead. Jobs can be grouped onto a few session clusters to
achieve higher resource utilization, at the cost of weaker isolation and
harder debugging. Running tens of jobs on the same session cluster would be a
good choice to start with; a rough sketch of the setup is below.
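
In operator terms, a FlinkDeployment without a "job" section gives you a
session cluster, and each job is then submitted as a FlinkSessionJob that
references it. A rough sketch (the names, resource sizes, and jar location
below are placeholders, not recommendations):

apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: rules-session-cluster        # shared session cluster (placeholder name)
spec:
  image: flink:1.18
  flinkVersion: v1_18
  serviceAccount: flink
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  # note: no "job" section here, so the operator creates a session cluster
---
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: rule-job-1                   # one such CR per rule job (placeholder)
spec:
  deploymentName: rules-session-cluster   # targets the session cluster above
  job:
    jarURI: https://example.com/rule-job-1.jar   # placeholder artifact URI
    parallelism: 1
    upgradeMode: stateless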

Best,
Zhanghao Chen
________________________________
From: Brent <brentwritesc...@gmail.com>
Sent: Saturday, February 17, 2024 3:01
To: user@flink.apache.org <user@flink.apache.org>
Subject: Flink use case feedback request

Hey everyone,

I've been looking at Flink to handle a fairly complex use case and was hoping
for some feedback on whether the approach I'm considering seems reasonable.
When researching what people build on Flink, the focus seems to mostly be on
running fewer heavyweight/complex jobs, whereas the approach I'm thinking
about involves executing many smaller, more lightweight jobs.

The core idea is that we have a lot (think 100s or 1000s) of incoming data 
streams (maybe via something like Apache Pulsar) and we have rules, of various 
complexities, that need to be executed against individual streams.  If a rule 
matches, an event needs to be emitted to an output stream.  The rules could be 
as simple as "In any event, if you see field X set to value 'foo', it's a 
match" or more complex like "If you see an event of type A followed by an event 
of type B followed by an event of type C in a certain time window, then it's a 
match."  These rules are long-running (could be hours, days, weeks, or longer).

It *seems* to me like Application Mode
(https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/overview/)
with the Kubernetes Operator
(https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/overview/#application-deployments),
which creates a new cluster per application, is what I'd want. I'm
envisioning each of these long-running rules (each potentially reading a
different data stream) as its own job in its own application (maybe later
some can be combined, but to start, they'll all be separate).

Does that seem like the right approach to running a number of somewhat small
jobs concurrently on Flink?  Are there any "gotchas" here that I'm not
thinking of?  Any alternative approaches worth considering?  Are there any
known users who do something like this currently?

Thanks for your time and insight!

~Brent
