GitHub user QuentinAmbard opened a pull request:
https://github.com/apache/spark/pull/21917
[SPARK-24720][STREAMING-KAFKA] add option to align ranges with offset
having records to support kafka transaction
## What changes were proposed in this pull request?
This change adds an option to align the offset range of each partition
with offsets that actually contain records.
To enable this behavior, set
spark.streaming.kafka.alignRangesToCommittedTransaction = true
Note that if many transactions are aborted, multiple polls of 1 second each
might be executed for each partition.
We rewind the partition by spark.streaming.kafka.offsetSearchRewind offsets
to search for the last offset with records.
spark.streaming.kafka.offsetSearchRewind should be set to a value greater than
the number of records in one typical transaction, depending on the use case
(default: 10).
The first rewind is executed at
(TO_OFFSET - spark.streaming.kafka.offsetSearchRewind^1); if no data is found,
we retry at (TO_OFFSET - spark.streaming.kafka.offsetSearchRewind^2), and so on,
until we reach FROM_OFFSET.
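The rewind schedule described above can be illustrated with a small, self-contained sketch. It is not the code from this pull request: the function names are hypothetical, `^` is assumed to mean exponentiation, the last attempt is assumed to be clamped to FROM_OFFSET, and a plain list of offsets stands in for what a Kafka consumer poll would return.

```python
from typing import Iterator, Optional, Sequence


def rewind_positions(from_offset: int, to_offset: int, rewind: int = 10) -> Iterator[int]:
    """Yield the offsets at which each successive search would start.

    Assumption: the n-th attempt starts at to_offset - rewind**n, and the
    final attempt is clamped to from_offset.
    """
    if rewind < 2:
        raise ValueError("rewind must be >= 2 so that the step grows")
    step = rewind
    while to_offset - step > from_offset:
        yield to_offset - step
        step *= rewind
    # Last resort: scan the whole range.
    yield from_offset


def last_offset_with_record(
    record_offsets: Sequence[int],
    from_offset: int,
    to_offset: int,
    rewind: int = 10,
) -> Optional[int]:
    """Return the highest offset in [from_offset, to_offset) holding a record.

    record_offsets is a hypothetical stand-in for the offsets a consumer
    would see; gaps model transaction markers or aborted transactions.
    The end of the range is treated as exclusive.
    """
    for start in rewind_positions(from_offset, to_offset, rewind):
        found = [o for o in record_offsets if start <= o < to_offset]
        if found:
            return max(found)
    return None


if __name__ == "__main__":
    print(list(rewind_positions(0, 1000, 10)))
    print(last_offset_with_record([100, 101, 102, 940, 941], 0, 1000))
```

With the default rewind of 10, each failed attempt widens the search window by a factor of ten, which is why a value larger than a typical transaction size usually finds a record on the first attempt.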
## How was this patch tested?
Unit tests cover the rewinder. There is no integration test for transactions
since the current Kafka version doesn't support them. The change was also
tested against a custom streaming use case.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/QuentinAmbard/spark SPARK-24720
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21917.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21917
----
commit a5b52c94b9f7eaa293d7882bde0fb432ef3fa632
Author: quentin <quentin.ambard@...>
Date: 2018-07-30T14:43:56Z
SPARK-24720 add option to align ranges with offset having records to
support kafka transaction
commit 79d83db0f535fe1e9e5f534a6a0b4fe7c3d6257f
Author: quentin <quentin.ambard@...>
Date: 2018-07-30T14:47:33Z
correction indentation
commit 05c7e7fb96806c07bc9b0513ef59fbcdd5ae9118
Author: quentin <quentin.ambard@...>
Date: 2018-07-30T14:53:45Z
remove wrong comment edit
----