Re: Increase batch interval in case of delay

2021-07-01 Thread Mich Talebzadeh
Just looking at this, what is your batch interval when ingesting ~1000 records per sec? As a rule of thumb, your capacity planning should account for twice the normal ingestion rate. Regarding your point: "... Hence, ideally I'd like to increase the number of batches/records that are being proce
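To make the rule of thumb above concrete, here is a minimal arithmetic sketch. The ~1000 records/sec rate and the factor of two come from the message; the 60-second batch interval and the function name `required_capacity` are assumptions for illustration only.

```python
def required_capacity(normal_rate_per_sec, batch_interval_sec, headroom_factor=2.0):
    """Records one batch should be able to absorb within a single interval.

    headroom_factor=2.0 encodes the "twice the normal ingestion rate" rule of thumb.
    """
    return int(normal_rate_per_sec * headroom_factor * batch_interval_sec)


normal_rate = 1000     # ~1000 records/sec, as stated in the thread
batch_interval = 60    # ASSUMPTION: a 1-minute batch interval

# Normal load per batch versus the load the job should be sized for.
print(normal_rate * batch_interval)                    # records per batch at the normal rate
print(required_capacity(normal_rate, batch_interval))  # records per batch with 2x headroom
```

If the job cannot process the second figure within one batch interval, it has little room to catch up after a delay.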

Re: Increase batch interval in case of delay

2021-07-01 Thread András Kolbert
After a 10-minute delay, processing a 10-minute batch will not take 10 times longer than a 1-minute batch. That is mainly because of the I/O write operations to HDFS, and also because certain users will be active in every 1-minute batch, so processing such a customer only once (if we take 10 batches together) will sa
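The argument above can be illustrated with a toy cost model: a fixed per-batch I/O cost plus a per-user cost that is paid once per distinct user. Every number here (15 s of fixed I/O, 1 ms per user, 30,000 users per minute, 40% of users repeating from minute to minute) is a hypothetical placeholder, not a measurement from the application discussed in the thread.

```python
def batch_time(minutes,
               fixed_io_sec=15.0,        # HYPOTHETICAL fixed cost per batch (e.g. the HDFS write)
               sec_per_user=0.001,       # HYPOTHETICAL processing cost per distinct user
               users_per_minute=30_000,  # HYPOTHETICAL active users in any one minute
               repeat_fraction=0.4):     # HYPOTHETICAL share of users already seen earlier
    """Estimated seconds to process one batch covering `minutes` of data."""
    # The first minute contributes all of its users; each later minute
    # contributes only the users not already seen in the batch.
    distinct_users = users_per_minute * (1 + (minutes - 1) * (1 - repeat_fraction))
    return fixed_io_sec + sec_per_user * distinct_users


one_minute = batch_time(1)
ten_minutes = batch_time(10)

# Ratio of one 10-minute batch to a single 1-minute batch; under these
# assumptions it stays well below 10.
print(ten_minutes / one_minute)
```

Under this model the ratio is below 10 whenever the fixed cost is positive or some users repeat across minutes, which is the effect the message describes.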

Re: Increase batch interval in case of delay

2021-07-01 Thread Sean Owen
Wouldn't this happen naturally? The large batches would just take a longer time to complete already.

On Thu, Jul 1, 2021 at 6:32 AM András Kolbert wrote:
> Hi,
>
> I have a spark streaming application which generally able to process the
> data within the given time frame. However, in certain hou