Github user tzulitai commented on a diff in the pull request:
https://github.com/apache/flink/pull/5108#discussion_r154368266
--- Diff:
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/partitioner/FlinkFixedPartitioner.java
---
@@ -68,6 +78,13 @@ public int partition(T record, byte[] key, byte[] value, String targetTopic, int
partitions != null && partitions.length > 0,
"Partitions of the target topic is empty.");
- return partitions[parallelInstanceId % partitions.length];
+ if (topicToFixedPartition.containsKey(targetTopic)) {
--- End diff --
@aljoscha yes, the semantics are a bit odd / need some clarification before
we move on. I've been having a go at implementing state checkpointing for the
`FlinkFixedPartitioner` today, and one unclear case I bumped into, for
example, was the following:
Subtask 1 writes to partition X for "some-topic"
Subtask 2 writes to partition Y for "some-topic"
On restore, say the sink is rescaled to a DOP of 1: should the single
subtask continue writing to partition X or to Y for "some-topic"?
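To make the ambiguity concrete, here is a minimal sketch (class and method names are hypothetical, not Flink's actual connector code) of the core rule on the line under discussion, `partitions[parallelInstanceId % partitions.length]`, showing how the subtask-to-partition mapping shifts when the DOP changes:

```java
// Hypothetical sketch of the FlinkFixedPartitioner's core rule:
// each sink subtask is pinned to one partition by its parallel instance id.
public class FixedPartitionerSketch {

    // Mirrors: return partitions[parallelInstanceId % partitions.length];
    static int partitionFor(int parallelInstanceId, int[] partitions) {
        return partitions[parallelInstanceId % partitions.length];
    }

    public static void main(String[] args) {
        int[] partitions = {0, 1, 2};

        // Before rescaling, DOP = 2:
        // subtask 0 -> partition 0, subtask 1 -> partition 1.
        System.out.println(partitionFor(0, partitions)); // 0
        System.out.println(partitionFor(1, partitions)); // 1

        // After rescaling to DOP = 1, the single remaining subtask 0
        // maps to partition 0 only; records that subtask 1 used to route
        // to partition 1 have no obvious "restored" target -- the
        // ambiguity described above.
        System.out.println(partitionFor(0, partitions)); // 0
    }
}
```

Without checkpointed state, the mapping after rescaling is purely a function of the new parallel instance ids, which is why "continue with X or Y" has no well-defined answer.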
Regarding the default Kafka behaviour:
It's hash partitioning on the key attached to each record. I've also
thought about using that as the default instead of the fixed partitioner; see
the relevant discussion here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaProducerXX-td16951.html
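For illustration, key-based hash partitioning can be sketched as below. This is a simplified stand-in, not Kafka's actual implementation (Kafka's `DefaultPartitioner` uses murmur2 on the serialized key bytes); the class and method names are hypothetical:

```java
import java.util.Arrays;

// Simplified sketch of key-hash partitioning, the default Kafka producer
// behaviour mentioned above. NOTE: a generic array hash stands in for
// Kafka's real murmur2 hash.
public class KeyHashPartitionerSketch {

    // Records with the same key always land in the same partition,
    // regardless of which sink subtask produced them -- so the mapping
    // is stable across rescaling, unlike the fixed partitioner.
    static int partitionFor(byte[] key, int numPartitions) {
        // Mask to non-negative before taking the modulo.
        return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "user-42".getBytes();
        int first = partitionFor(key, 3);
        int second = partitionFor(key, 3);
        System.out.println(first == second); // true: same key, same partition
    }
}
```

The upside is that the partition assignment depends only on the record key, not on the sink's parallelism, which sidesteps the restore question entirely.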
---