[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-23397.
-------------------------------
    Resolution: Not A Problem

The part inside "foreachRDD" is what gets executed at each batch; one answer is to make sure you don't re-execute logic at each batch that doesn't need to run every time. But in general you're just saying that complex operations sometimes take a long time, or that you'd prefer a certain operation were faster. Neither relates to the original issue here, which is about scheduling.

> Scheduling delay causes Spark Streaming to miss batches.
> --------------------------------------------------------
>
>                 Key: SPARK-23397
>                 URL: https://issues.apache.org/jira/browse/SPARK-23397
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.2.1
>            Reporter: Shahbaz Hussain
>            Priority: Major
>
> * For complex Spark (Scala) DStream-based applications that need to create
> many jobs for every batch (for example, 40), it has been observed that batches
> are not created at the scheduled time. For example, if I start a Spark
> Streaming application with a batch interval of 20 seconds and the application
> creates about 40 jobs, the next batch is not created 20 seconds after the
> previous job creation time.
> * This is because job creation is single-threaded: if the job creation delay
> is greater than the batch interval, batch execution misses its schedule.
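
The timing behaviour the reporter describes can be sketched with a small stand-alone simulation. This is plain Python, not Spark code, and it assumes a deliberately simplified model: one thread generates jobs for one batch at a time, and each batch takes a fixed time to generate. The function name `generation_delays` and the 25-second and 10-second generation times are hypothetical, chosen only to contrast with the 20-second batch interval mentioned in the report.

```python
def generation_delays(batch_interval, generation_time, num_batches):
    """Return, for each batch, how many seconds late its job generation starts.

    Simplified model (an assumption, not Spark's implementation): a single
    thread generates jobs for one batch at a time, and every batch takes
    `generation_time` seconds to generate.
    """
    delays = []
    generator_free_at = 0  # time at which the single generator thread is idle again
    for k in range(num_batches):
        scheduled = k * batch_interval            # when the batch should be created
        start = max(scheduled, generator_free_at)  # it must wait for the previous batch
        delays.append(start - scheduled)
        generator_free_at = start + generation_time
    return delays


# Generation slower than the 20 s interval: the delay grows with every batch.
print(generation_delays(20, 25, 4))
# Generation faster than the interval: every batch starts on schedule.
print(generation_delays(20, 10, 4))
```

Under this model the delay grows by the difference between generation time and batch interval at every batch, which is why moving work that does not need to run per batch out of "foreachRDD" is the suggested remedy.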