yingshikong186 opened a new pull request #27286: only one batch
URL: https://github.com/apache/spark/pull/27286
 
 
   ## What changes were proposed in this pull request?
   While the current job is still running, block the commit of new streaming 
batches until it completes. The next job then merges all batches that arrived 
while commits were blocked.
   ## How was this patch tested?
   The input sequence is [1, 2, 3, 4, 5, 6].
   Batch duration: 1s.
   The 3rd batch takes a long time. The other batches normally complete 
quickly.
   We expect:
         1. The 4th batch is not committed while the 3rd batch is being computed; 
it is merged into the next batch, so the size of jobSets never exceeds 1.
         2. The number of completedBatches is less than the size of the input 
sequence.
         3. No data is lost.
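
   The expected behaviour above can be sketched as a small logical-time 
simulation. This is not the code from the patch and it uses no Spark API: the 
function `simulate`, its parameters, and the job costs (100 ms for a fast job, 
2500 ms for the slow one) are assumptions chosen only to illustrate the 
scenario.

```python
def simulate(inputs, slow_value, interval_ms=1000, fast_ms=100, slow_ms=2500):
    """Return the list of jobs, each a list of the input items it merged.

    Hypothetical model: one item arrives per batch interval, and a new job
    is committed only when no earlier job is still running.
    """
    pending = []     # items received but not yet committed to a job
    busy_until = 0   # logical time (ms) at which the running job finishes
    jobs = []
    for tick, item in enumerate(inputs, start=1):
        now = tick * interval_ms
        pending.append(item)
        if now < busy_until:
            continue  # a job is still running: hold the commit
        # The job is slow if it contains the designated slow item.
        cost = slow_ms if slow_value in pending else fast_ms
        jobs.append(list(pending))  # merge everything that accumulated
        pending.clear()
        busy_until = now + cost
    if pending:
        jobs.append(list(pending))  # flush whatever is left at the end
    return jobs


jobs = simulate([1, 2, 3, 4, 5, 6], slow_value=3)
print(jobs)  # [[1], [2], [3], [4, 5, 6]]
```

   Under these assumed costs, items 4 and 5 arrive while the job for item 3 is 
still running and are merged with item 6 into a single job, so four jobs 
complete for six inputs and every input appears exactly once.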

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
