GitHub user zsxwing opened a pull request:
https://github.com/apache/spark/pull/22230
[SPARK-25214][SS][FOLLOWUP] Fix the issue that Kafka v2 source may return
duplicated records when `failOnDataLoss=false`
## What changes were proposed in this pull request?
This is a follow-up PR for #22207 to fix a potentially flaky test.
`processAllAvailable` doesn't work for continuous processing, so we should not
use it for a continuous query.
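
As an illustration only (not the code in this patch), a test can poll the sink until the last expected record appears instead of blocking on `processAllAvailable`. The sketch below is plain Python with no Spark dependency. `wait_until`, `sink_rows`, and `fake_continuous_writer` are hypothetical names standing in for a polling helper, the sink table, and a running continuous query.

```python
import threading
import time


def wait_until(condition, timeout_s=10.0, interval_s=0.05):
    """Poll `condition` until it returns True or `timeout_s` elapses.

    Hypothetical helper: returns True if the condition held in time,
    False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical stand-in for a sink table that a continuous query appends to.
sink_rows = []


def fake_continuous_writer(n=50, delay_s=0.001):
    # Simulates a continuous query writing records "0".."49" over time.
    for i in range(n):
        sink_rows.append(str(i))
        time.sleep(delay_s)


writer = threading.Thread(target=fake_continuous_writer, daemon=True)
writer.start()

# Wait for the last expected record rather than asking the query to drain.
saw_last_record = wait_until(lambda: "49" in sink_rows, timeout_s=5.0)
writer.join()
```

In a real test the condition would read the sink table, and the query would be stopped in a `finally` block once the wait returns or times out.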
## How was this patch tested?
Jenkins.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zsxwing/spark SPARK-25214-2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22230.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22230
----
commit a52425676ddcaa6a2737a95aedc80b4d8452023e
Author: Shixiong Zhu <zsxwing@...>
Date: 2018-08-24T21:42:11Z
don't use query.processAllAvailable for continuous processing
----