[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360793#comment-16360793 ]
Shahbaz Hussain commented on SPARK-23397:
-----------------------------------------

Yes, if the current batch's processing time is greater than the batch interval, the next batch is delayed or queued. In this case, however, with a complex Spark application, batch execution is missed entirely. For example: if my application starts at 12:20:00 with a batch interval of 5 seconds, and job creation takes 20 seconds, it misses that many batches; in the Spark UI we would see the next batch appear at 12:20:25, while the batches of 12:20:05, 12:20:10, 12:20:15 and 12:20:20 are never triggered.

> Scheduling delay causes Spark Streaming to miss batches.
> --------------------------------------------------------
>
>                 Key: SPARK-23397
>                 URL: https://issues.apache.org/jira/browse/SPARK-23397
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
> Affects Versions: 2.2.1
>            Reporter: Shahbaz Hussain
>            Priority: Major
>
> * For complex Spark (Scala) DStream-based applications that create many jobs per batch (e.g. 40), batches are not created at the expected times. For example: start a Spark Streaming application with a batch interval of 20 seconds that creates about 40 jobs, and observe that the next batch is not created 20 seconds after the previous job-creation time.
> * This happens because job creation is single-threaded: if the job-creation delay is greater than the batch interval, batch execution misses its schedule.
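The missed-batch behavior described above can be sketched with a small timing simulation. This is not Spark code: it is a hypothetical model of a single-threaded job generator that blocks for `creation_time` seconds after each trigger, so any interval boundaries that fall inside the busy window never fire. The function name and parameters are illustrative, not part of any Spark API.

```python
def triggered_batches(start, interval, creation_time, horizon):
    """Return the batch times (seconds after `start`) that actually fire
    within `horizon` seconds, assuming a single-threaded job generator
    that is busy for `creation_time` seconds after startup and after
    each triggered batch. Boundaries inside a busy window are skipped,
    not queued -- modeling the missed batches reported in SPARK-23397."""
    fired = []
    free = start + creation_time          # generator busy creating jobs at startup
    boundary = start + interval           # first scheduled batch boundary
    while boundary <= start + horizon:
        if boundary > free:               # generator is idle: batch fires
            fired.append(boundary)
            free = boundary + creation_time
        # else: boundary falls inside the busy window and is missed
        boundary += interval
    return fired

# Scenario from the comment: start 12:20:00 (t=0), 5 s interval, 20 s job creation.
# The 12:20:05..12:20:20 boundaries (t=5..20) are missed; t=25 (12:20:25) fires.
print(triggered_batches(0, 5, 20, 50))   # -> [25, 50]
```

With a creation time below the interval (e.g. 2 s), every boundary fires on schedule, which is why the problem only shows up for complex applications with long job-creation times.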