Github user tzulitai commented on a diff in the pull request:
https://github.com/apache/flink/pull/5108#discussion_r154368266
--- Diff:
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/partitioner/FlinkFixedPartitioner.java
---
@@ -68,6 +78,13 @@ public int partition(T record, byte[] key, byte[] value, String targetTopic, int
partitions != null && partitions.length > 0,
"Partitions of the target topic is empty.");
- return partitions[parallelInstanceId % partitions.length];
+ if (topicToFixedPartition.containsKey(targetTopic)) {
--- End diff --
@aljoscha yes, the semantics are a bit odd / need some clarification before
we move on. I've been having a go at implementing state checkpointing for the
`FlinkFixedPartitioner` today, and one unclear case I bumped into, for
example, was the following:
Subtask 1 writes to partition X for "some-topic"
Subtask 2 writes to partition Y for "some-topic"
On restore, say the sink is rescaled to a DOP of 1: should the single
subtask continue writing to partition X or to Y for "some-topic"?
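To make the ambiguity concrete, here is a minimal sketch (class and method names are hypothetical, not Flink's actual connector code) of the core rule on the line under discussion, `partitions[parallelInstanceId % partitions.length]`, showing how the subtask-to-partition mapping shifts when the DOP changes:

```java
// Hypothetical sketch of the FlinkFixedPartitioner's core rule:
// each sink subtask is pinned to one partition by its parallel instance id.
public class FixedPartitionerSketch {

    // Mirrors: return partitions[parallelInstanceId % partitions.length];
    static int partitionFor(int parallelInstanceId, int[] partitions) {
        return partitions[parallelInstanceId % partitions.length];
    }

    public static void main(String[] args) {
        int[] partitions = {0, 1, 2};

        // Before rescaling, DOP = 2:
        // subtask 0 -> partition 0, subtask 1 -> partition 1.
        System.out.println(partitionFor(0, partitions)); // 0
        System.out.println(partitionFor(1, partitions)); // 1

        // After rescaling to DOP = 1, the single remaining subtask 0
        // maps to partition 0 only; records that subtask 1 used to route
        // to partition 1 have no obvious "restored" target -- the
        // ambiguity described above.
        System.out.println(partitionFor(0, partitions)); // 0
    }
}
```

Without checkpointed state, the mapping after rescaling is purely a function of the new parallel instance ids, which is why "continue with X or Y" has no well-defined answer.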
Regarding the default Kafka behaviour:
It's hash partitioning on the key attached to each record. I've also
thought about using that as the default instead of the fixed partitioner; see
the relevant discussion here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaProducerXX-td16951.html
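For illustration, key-based hash partitioning can be sketched as below. This is a simplified stand-in, not Kafka's actual implementation (Kafka's `DefaultPartitioner` uses murmur2 on the serialized key bytes); the class and method names are hypothetical:

```java
import java.util.Arrays;

// Simplified sketch of key-hash partitioning, the default Kafka producer
// behaviour mentioned above. NOTE: a generic array hash stands in for
// Kafka's real murmur2 hash.
public class KeyHashPartitionerSketch {

    // Records with the same key always land in the same partition,
    // regardless of which sink subtask produced them -- so the mapping
    // is stable across rescaling, unlike the fixed partitioner.
    static int partitionFor(byte[] key, int numPartitions) {
        // Mask to non-negative before taking the modulo.
        return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "user-42".getBytes();
        int first = partitionFor(key, 3);
        int second = partitionFor(key, 3);
        System.out.println(first == second); // true: same key, same partition
    }
}
```

The upside is that the partition assignment depends only on the record key, not on the sink's parallelism, which sidesteps the restore question entirely.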
---