GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/4301

    [FLINK-7143] [kafka] Fix indeterminate partition assignment in 
FlinkKafkaConsumer

    This PR changes the mod operation for partition assignment from `i % 
numTasks == subtaskIndex` to `partition.hashCode % numTasks == subtaskIndex`.
    
    The bug was initially caused by #3378, when moving away from sorting the 
partition list. Apparently, the tests for partition assignment was not strict 
enough and did not catch this. This PR additionally adds verifications that the 
partitions end up in the expected subtasks, and that different partition 
ordering will still have the same partition assignments.
    
    Note: a fix is not required for the `master` branch, since the partition 
discovery changes already indirectly fixed the issue. However, test coverage 
for deterministic assignment should likewise be improved in `master` as well. A 
separate PR will be opened for that.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tzulitai/flink FLINK-7143-1.3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4301.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4301
    
----
commit 563f605d00f5d184fce2eb505c59033f22d3d0ab
Author: Tzu-Li (Gordon) Tai <[email protected]>
Date:   2017-07-11T17:03:01Z

    [FLINK-7143] [kafka] Fix indeterminate partition assignment in 
FlinkKafkaConsumer

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to