Hi,

I am running a Flink cluster to process clickstream data (generating
user-level, page-level, and site-level statistics).

I want to understand the pros and cons of submitting multiple jobs (each
job handling one simple processing/computation step) vs one or a few
complex jobs. At present, the events are read from a single Kafka topic
(in the future this may change and I may have multiple topics). Here are
my thoughts:

Multiple simple jobs: a failure or bug in one job won't impact the other
computations, and each job can be stopped independently of the others.

Are there any overheads or performance penalties? If there are none and
this is the recommended way, then I have a follow-up question: can I
update the jar without stopping all Flink streaming jobs? I mean stop
only the job that has a bug (leaving the others running), replace the jar
[which contains the code for *all* jobs], and then restart the stopped
job.
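For what it's worth, the workflow I have in mind could be sketched with
Flink's standard CLI roughly as follows (just a sketch, assuming a shared
session cluster; the job ID, savepoint directory, paths, and entry class
name are all placeholders):

```shell
# 1. Find the job ID of the buggy job (the other jobs keep running).
flink list

# 2. Cancel only that job, taking a savepoint so it can resume
#    from where it left off.
flink cancel -s /tmp/savepoints <jobId>

# 3. Replace the jar that contains the code for all the jobs.
cp clickstream-jobs-fixed.jar /opt/jobs/clickstream-jobs.jar

# 4. Resubmit just the fixed job from its savepoint, selecting its
#    entry point with -c; the already-running jobs are untouched.
flink run -s /tmp/savepoints/<savepoint-dir> \
  -c com.example.PageStatsJob /opt/jobs/clickstream-jobs.jar
```

Is this a supported/sane way to do it, or does replacing a jar that
other running jobs were started from cause problems?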

Thanks,
Tarandeep
