Re: [DISCUSSION] Using `DataStreamUtils.reinterpretAsKeyedStream` on a pre-partitioned Kafka topic

2019-12-03 Thread Robin Cassan
Thanks for your answers, we will have a look at adapting the Kafka source to assign the input partitions depending on the assigned Keygroups. If anyone has already done such a thing I'd love your advice! Cheers Robin Le lun. 2 déc. 2019 à 08:48, Gyula Fóra a écrit : > Hi! > > As far as I know,

Re: [DISCUSSION] Using `DataStreamUtils.reinterpretAsKeyedStream` on a pre-partitioned Kafka topic

2019-12-01 Thread Gyula Fóra
Hi! As far as I know, even if you prepartition the data exactly the same way in kafka using the key groups, you have no guarantee that the kafka consumer source would pick up the right partitions. Maybe if you have exactly as many kafka partitions as keygroups/max parallelism, partitioned corre

Re: [DISCUSSION] Using `DataStreamUtils.reinterpretAsKeyedStream` on a pre-partitioned Kafka topic

2019-12-01 Thread Congxian Qiu
Hi >From the doc[1], the DataStream MUST already be pre-partitioned in EXACTLY the same way Flink’s keyBy would partition the data in a shuffle w.r.t. key-group assignment. you should make sure that the key locates in the right key-group, and the key-group locates in the right parallelism. you can