Just looking at this, what is your batch interval when ingesting ~1000
records per sec? As a rule of thumb, your capacity planning should account
for twice the normal ingestion rate.
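To put a number on that rule of thumb, here is a minimal sketch. The
60-second batch interval is only an assumption for illustration (I don't
know your actual interval), and required_capacity is a made-up helper, not
a Spark API:

```python
def required_capacity(normal_rate_per_sec, batch_interval_sec, headroom_factor=2):
    """Records per batch the job should be sized for.

    headroom_factor=2 encodes the "twice the normal ingestion rate" rule of thumb.
    """
    return normal_rate_per_sec * headroom_factor * batch_interval_sec


# ~1000 records/sec from the thread; the 60 s interval is an assumed value.
records_per_batch = required_capacity(1000, 60)
print(records_per_batch)  # 120000
```

So with those assumed numbers, each batch should be sized for about 120,000
records rather than the 60,000 you would see at the normal rate.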
Regarding your point:
"... Hence, ideally I'd like to increase the number of batches/records
that are being proce
After a 10-minute delay, processing a 10-minute batch will not take 10 times
longer than a 1-minute batch.
This is mainly because of the I/O write operations to HDFS, and also because
certain active users will be active in each 1-minute batch; processing such a
customer only once (if we take 10 batches) will sa
Wouldn't this happen naturally? The larger batches would just take longer
to complete anyway.
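
For what it's worth, the sub-linear cost you describe can be sketched with a
toy model. Everything here is hypothetical: the fixed write cost, the
per-user cost and the user counts are invented numbers, and batch_cost_ms is
not anything from Spark. It only shows why one combined batch can be cheaper
than ten separate ones when the same users show up every minute:

```python
def batch_cost_ms(minute_batches, fixed_write_ms=5000, per_user_ms=10):
    """Toy cost of processing the given 1-minute batches as ONE combined batch.

    minute_batches: list of sets, each holding the user ids active in that minute.
    The fixed write cost is paid once per processed batch, and each distinct
    user is processed once, however many minutes they were active in.
    """
    distinct_users = set().union(*minute_batches)
    return fixed_write_ms + per_user_ms * len(distinct_users)


# Ten 1-minute batches with the same 1000 active users in each (made-up data).
minutes = [set(range(1000)) for _ in range(10)]

separately = sum(batch_cost_ms([m]) for m in minutes)  # ten runs, ten writes
combined = batch_cost_ms(minutes)                      # one run, one write

print(separately, combined)  # 150000 15000
```

In this extreme case the 10-minute batch costs the same as a single 1-minute
batch; real workloads will land somewhere in between.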
On Thu, Jul 1, 2021 at 6:32 AM András Kolbert
wrote:
> Hi,
>
> I have a Spark Streaming application which is generally able to process the
> data within the given time frame. However, in certain hou