smrosenberry commented on issue #23926: [SPARK-26872][STREAMING] Use a configurable value for final termination in the JobScheduler.stop() method
URL: https://github.com/apache/spark/pull/23926#issuecomment-469093849

With a "long" batch interval, you have to wait the interval time before processing starts for the first batch; with a "short" batch interval, you get a number of empty batches queued up until your first batch completes. Even though those additional batches are empty, at least a couple of them squeak through and produce empty results files.

We may not agree on the workaround mechanism (and even so, I agree the workaround is an abuse of the Spark Streaming feature), but I suspect we do agree that hard-coded values in software lead to inflexibility and limitations that are best avoided.

On Sun, Mar 3, 2019 at 8:38 PM Sean Owen <[email protected]> wrote:

> How about you wait for the batch to finish, and then shut it down?
> possibly with shutdownNow()? if there are no more batches, that should
> terminate quickly anyway, no?
>
> https://github.com/apache/spark/pull/23926#issuecomment-469089954
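For context, here is a minimal sketch of how one might guard against writing empty results files and shut down only after in-flight batches finish, roughly along the lines of the quoted suggestion. The socket source, output path, batch interval, and timeout are placeholders for illustration, not details from this thread or the actual workaround being discussed:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object EmptyBatchGuardSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("empty-batch-guard-sketch")
    // Hypothetical 5-second batch interval; the thread above discusses the
    // trade-off between "long" and "short" intervals.
    val ssc = new StreamingContext(conf, Seconds(5))

    val lines = ssc.socketTextStream("localhost", 9999) // assumed input source

    lines.foreachRDD { (rdd, time) =>
      // Skip the empty batches that queue up before the first real batch
      // completes, so they never produce empty results files.
      if (!rdd.isEmpty()) {
        rdd.saveAsTextFile(s"/tmp/results/batch-${time.milliseconds}") // assumed output path
      }
    }

    ssc.start()
    // Along the lines of the quoted reply: wait for outstanding work to
    // drain (here, up to 60 seconds), then stop gracefully so any in-flight
    // batch finishes before shutdown.
    ssc.awaitTerminationOrTimeout(60000L)
    ssc.stop(stopSparkContext = true, stopGracefully = true)
  }
}
```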
