Hi,

*Context*:
We are using the Spark Streaming library.

We have created a StreamingListener to run some logic when the
onBatchCompleted() event is triggered. This StreamingListener is registered
with the StreamingContext.

We are running Spark on Kubernetes. The Spark version is 2.4.2, the
batchDuration is 500 ms, and checkpointing is enabled.

*Issue*:
We are observing a latency of 3 to 5 seconds between the time a batch
completes and the time the onBatchCompleted() event is triggered. This
happens for every batch.

We measured the batch completion time from the Spark driver UI and also
from the driver logs.

An example with a 5-second latency:
If a batch completes at 10:25:30 (HH:MM:SS), the onBatchCompleted() event
for that batch is triggered at approximately 10:25:35.

We were expecting sub-second latency between the time a batch completes
and the time the onBatchCompleted() event for that batch is triggered.
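
For anyone who wants to reproduce the measurement, here is a minimal sketch
of one way the gap could be logged from inside the listener itself. It is
not our production code: the names DispatchLag and DispatchLagListener are
hypothetical, and it assumes that BatchInfo.processingEndTime and
System.currentTimeMillis() are both read on the driver, so the two
timestamps are comparable.

```scala
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Hypothetical helper: time between the end of batch processing and the
// moment the listener callback actually runs. Returns None when the batch
// end time is not available.
object DispatchLag {
  def lagMs(processingEndTimeMs: Option[Long], callbackTimeMs: Long): Option[Long] =
    processingEndTimeMs.map(end => callbackTimeMs - end)
}

// Hypothetical listener that only logs timing information for each batch.
class DispatchLagListener extends StreamingListener {
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    // Capture the callback time first, before doing any other work.
    val callbackTimeMs = System.currentTimeMillis()
    val info = batchCompleted.batchInfo
    val lag = DispatchLag.lagMs(info.processingEndTime, callbackTimeMs)

    println(
      s"batchTime=${info.batchTime.milliseconds} " +
      s"processingEnd=${info.processingEndTime.getOrElse(-1L)} " +
      s"callback=$callbackTimeMs " +
      s"dispatchLagMs=${lag.getOrElse(-1L)} " +
      s"schedulingDelayMs=${info.schedulingDelay.getOrElse(-1L)} " +
      s"processingDelayMs=${info.processingDelay.getOrElse(-1L)}"
    )
  }
}
```

It would be registered with ssc.addStreamingListener(new DispatchLagListener),
where ssc is the StreamingContext. Logging schedulingDelay and
processingDelay next to the dispatch lag should help show whether the extra
seconds are spent before the batch finishes or only after it.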

Has anyone faced this issue before?

What factors can contribute to this latency?

Thanks for any pointers for debugging the issue.

Regards,
Rahul
