One thing I noticed: when the trigger interval in foreachBatch is set to
something low (in this case 2 seconds, equal to the interval at which the
source sends data to the Kafka topic), i.e.

trigger(processingTime='2 seconds')

Spark warns that the query is falling behind:

```
batchId is 22,  rows is 40
21/03/18 16:29:05 WARN
org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor: Current
batch is falling behind. The trigger interval is 2000 milliseconds, but
spent 13686 milliseconds
batchId is 23,  rows is 40
21/03/18 16:29:21 WARN
org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor: Current
batch is falling behind. The trigger interval is 2000 milliseconds, but
spent 15316 milliseconds
batchId is 24,  rows is 40
```
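The check behind that warning can be sketched in plain Python. This is a simulation for illustration, not Spark's actual code (the real logic lives in the Scala class ProcessingTimeExecutor shown in the log): whenever a batch takes longer than the trigger interval, Spark logs the warning and starts the next batch immediately instead of waiting for the next tick.

```python
from typing import Optional

def check_batch(trigger_interval_ms: int, batch_duration_ms: int) -> Optional[str]:
    """Return a warning string if the batch took longer than the trigger
    interval, otherwise None. Mirrors (in simplified form) the message
    ProcessingTimeExecutor emits when a batch overruns its trigger."""
    if batch_duration_ms > trigger_interval_ms:
        return (f"Current batch is falling behind. The trigger interval is "
                f"{trigger_interval_ms} milliseconds, but spent "
                f"{batch_duration_ms} milliseconds")
    return None

# The two overrunning batches from the log above:
for duration_ms in (13686, 15316):
    print(check_batch(2000, duration_ms))
```

In other words, the warning fires because each micro-batch (13-15 seconds) takes far longer than the 2-second trigger, so batches queue up back to back.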
So, assuming that the batch interval is fixed, one needs to look at how to
adjust the resources so that the topic is processed in a timely manner.
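For example, one might increase the parallelism given to the job at submit time (the flags below are standard spark-submit options, but the values are purely illustrative, not a recommendation), or alternatively lengthen the trigger interval so it exceeds the observed batch duration:

```shell
# Illustrative resource tuning for the streaming job
# (values are assumptions; my_streaming_job.py is a placeholder name)
spark-submit \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_streaming_job.py
```

Which of the two helps depends on whether the bottleneck is compute (more executors/cores help) or an inherently slow sink inside foreachBatch (a longer trigger merely hides the lag).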

Any comments welcome



   view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
