Github user sidhavratha commented on the issue:

    https://github.com/apache/spark/pull/21685
  
    If the batch duration is 10 seconds, a new batch starts every 10 seconds, 
irrespective of whether the previous batch has completed.
    
    If a particular batch (10-second duration, so it is supposed to complete 
in 10 seconds) takes longer to complete (e.g. 50 seconds in the attached 
screenshot), the additional 40 seconds are added to the scheduling delay of 
the next batch. If poll time is included in processing time, it can cause this 
sudden jump in the scheduling delays of batches.
    
    This scheduling delay is cleared when some batches take less than 10 
seconds. For example, the first batch in the screenshot had a 4s scheduling 
delay, which was cleared for the next batch because that batch took only 5s to 
process.
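
    A minimal sketch of this carry-over in Python, assuming batches are 
processed strictly one at a time and a new batch is generated every batch 
interval. The function name scheduling_delays and the trailing 5s processing 
times are made up for illustration; only the 10s interval, the 4s initial delay, 
and the 5s and 50s batches come from the description above. This is a simplified 
model, not Spark's scheduler code.

```python
def scheduling_delays(processing_times, batch_interval=10, initial_delay=0):
    """Return the scheduling delay (in seconds) seen by each batch.

    Simplified model: one batch runs at a time, and batch i is generated
    at i * batch_interval regardless of whether batch i-1 has finished.
    """
    delays = []
    delay = initial_delay
    for processing_time in processing_times:
        delays.append(delay)
        # The next batch is generated one interval later, but it can only
        # start once the current batch (delay + processing) has finished.
        delay = max(0, delay + processing_time - batch_interval)
    return delays


# First batch starts 4s late and takes 5s, so the delay clears.
# The 50s batch then pushes 40s of delay onto the batch after it.
print(scheduling_delays([5, 50, 5, 5], initial_delay=4))  # [4, 0, 40, 35]
```

    In this model, each later batch that finishes faster than the interval 
recovers only (interval - processing time) seconds of the backlog, so a single 
50s batch takes several short batches to drain.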
    
    We are using backpressure to automatically control the record count based 
on batch processing speed.
    
    <img width="1253" alt="screen shot 2018-07-02 at 6 51 13 pm" 
src="https://user-images.githubusercontent.com/2279976/42166788-c1c3890a-7e29-11e8-8d74-c2c251c7a6a1.png">



---
