Shahbaz Hussain commented on SPARK-23397:

Yes, if the current batch's processing time is greater than the batch 
interval, the next batch is delayed or queued. However, in this case, with a 
complex Spark application, batches are missed entirely. For example, if my 
application starts at 12:20:00 with a batch interval of 5 seconds and job 
creation takes 20 seconds, it misses the batches in between: in the Spark UI 
the next batch to appear is 12:20:25, and the batches for 12:20:05, 12:20:10, 
12:20:15 and 12:20:20 are never triggered.
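
To make the arithmetic in this example concrete, here is a minimal sketch of 
the behaviour as I am describing it. It is a toy model, not Spark's scheduler 
code: the function name, its parameters and the calendar date are made up for 
illustration, and it assumes a timer tick is dropped whenever the single job 
creation thread is still busy (or only just finishing) at that tick.

```python
from datetime import datetime, timedelta


def simulate_batches(start, batch_interval_s, job_creation_s, horizon_s):
    """Toy model: split timer ticks into triggered and missed batches.

    Assumption (not taken from Spark's source): job creation runs on one
    thread, and a tick that fires at or before the moment the previous job
    creation finishes is dropped instead of being queued.
    """
    interval = timedelta(seconds=batch_interval_s)
    creation = timedelta(seconds=job_creation_s)
    end = start + timedelta(seconds=horizon_s)

    triggered, missed = [], []
    busy_until = None  # time at which the job creation thread becomes free
    tick = start
    while tick <= end:
        label = tick.strftime("%H:%M:%S")
        if busy_until is None or tick > busy_until:
            triggered.append(label)
            busy_until = tick + creation
        else:
            missed.append(label)
        tick += interval
    return triggered, missed


# Numbers from the example above; the calendar date is arbitrary.
start = datetime(2018, 1, 1, 12, 20, 0)
triggered, missed = simulate_batches(
    start, batch_interval_s=5, job_creation_s=20, horizon_s=25
)
print("triggered:", triggered)
print("missed:   ", missed)
```

Under those assumptions the model reports 12:20:00 and 12:20:25 as triggered 
and 12:20:05 through 12:20:20 as missed, which matches what the Spark UI 
shows in my case.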

> Scheduling delay causes Spark Streaming to miss batches.
> --------------------------------------------------------
>                 Key: SPARK-23397
>                 URL: https://issues.apache.org/jira/browse/SPARK-23397
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.2.1
>            Reporter: Shahbaz Hussain
>            Priority: Major
> * For complex Spark (Scala) DStream-based applications that need to create 
> many jobs per batch (e.g. 40), it has been observed that batches are not 
> created at the expected time. For example, if I start a Spark Streaming 
> application with a batch interval of 20 seconds and the application creates 
> around 40 jobs, the next batch is not created 20 seconds after the previous 
> job creation time.
>  * This is because job creation is single-threaded: if the job creation 
> delay is greater than the batch interval, batch execution misses its 
> schedule.

This message was sent by Atlassian JIRA
