Github user brkyvz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21220#discussion_r185678739

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala ---
    @@ -266,93 +276,62 @@ class MicroBatchExecution(
       }

       /**
    -   * Queries all of the sources to see if any new data is available. When there is new data the
    -   * batchId counter is incremented and a new log entry is written with the newest offsets.
    +   * Attempts to construct the next batch based on whether new data is available and/or updated
    --- End diff --

    this paragraph is highly confusing. Could you please reword? Maybe something like:
    ```
    Attempts to construct a batch according to:
     - Availability of new data
     - Existence of timeouts in stateful operators
    ```