One thing I noticed is that when the trigger interval in foreachBatch is set to something low (in this case 2 seconds, matching the interval at which the source sends data to the Kafka topic, i.e. every 2 seconds):

trigger(processingTime='2 seconds')

Spark warns that the query is falling behind:

```
batchId is 22, rows is 40
21/03/18 16:29:05 WARN org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor: Current batch is falling behind. The trigger interval is 2000 milliseconds, but spent 13686 milliseconds
batchId is 23, rows is 40
21/03/18 16:29:21 WARN org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor: Current batch is falling behind. The trigger interval is 2000 milliseconds, but spent 15316 milliseconds
batchId is 24, rows is 40
```

So, assuming that the trigger interval is fixed, one needs to look at adjusting the resources that process the topic so that each micro-batch completes within the trigger interval.

Any comments welcome.

view my LinkedIn profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
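As a rough illustration of the condition behind that warning: Spark logs it whenever a micro-batch's wall-clock duration exceeds the configured trigger interval. A minimal sketch in plain Python, using the durations from the log above (the helper name `is_falling_behind` is mine, not a Spark API):

```python
# Sketch of the condition behind Spark's "Current batch is falling behind"
# warning: a micro-batch falls behind when the time spent processing it
# exceeds the trigger interval.

TRIGGER_INTERVAL_MS = 2000  # trigger(processingTime='2 seconds')

def is_falling_behind(batch_duration_ms: int,
                      interval_ms: int = TRIGGER_INTERVAL_MS) -> bool:
    # True when the batch took longer than the trigger interval,
    # i.e. the next trigger fires late.
    return batch_duration_ms > interval_ms

# Durations reported in the warnings above (milliseconds).
observed = [13686, 15316]
for duration in observed:
    print(duration, is_falling_behind(duration))
```

At roughly 13-15 seconds per 2-second trigger, each batch here takes about 7x its budget, which is why scaling the processing resources (or raising the trigger interval) is the lever to look at.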