Re: Kafka and parallelism

2018-02-07 Thread Christophe Jolif
Ok thanks! I should have seen this. Sorry. -- Christophe On Wed, Feb 7, 2018 at 10:27 AM, Tzu-Li (Gordon) Tai wrote: > Hi Christophe, > > Yes, you can achieve writing to different topics per-message using the > `KeyedSerializationSchema` provided to the Kafka producer. > The schema interface ha

Re: Kafka and parallelism

2018-02-07 Thread Tzu-Li (Gordon) Tai
Hi Christophe, Yes, you can achieve writing to different topics per-message using the `KeyedSerializationSchema` provided to the Kafka producer. The schema interface has a `getTargetTopic` method which allows you to override the default target topic for a given record. I agree that the method is
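
A minimal sketch of the routing decision that an implementation of `getTargetTopic` could delegate to. It is plain Java with no Flink dependency, so it shows only the lookup, not the producer wiring. `TopicRouter`, the `out-*` topic names, and the idea that each record carries the name of the topic it was consumed from are assumptions made for illustration; they do not come from the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Flink): chooses a destination topic
// from the topic a record was originally consumed from.
final class TopicRouter {
    private final Map<String, String> originToDestination;
    private final String defaultTopic;

    TopicRouter(Map<String, String> originToDestination, String defaultTopic) {
        // Defensive copy so later changes to the caller's map do not affect routing.
        this.originToDestination = new HashMap<>(originToDestination);
        this.defaultTopic = defaultTopic;
    }

    // Returns the mapped destination, or the default topic when no mapping exists.
    String targetTopic(String originTopic) {
        return originToDestination.getOrDefault(originTopic, defaultTopic);
    }
}
```

With a mapping from `origin-topic-1` to a hypothetical `out-1`, `targetTopic("origin-topic-1")` returns `out-1`, and any unmapped origin falls back to the default topic. Inside a real schema, the record type would need to expose its origin topic for this lookup to work.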

Re: Kafka and parallelism

2018-02-07 Thread Christophe Jolif
Hi Gordon, or anyone else reading this, I am still on this idea of consuming a Kafka topic pattern. I then want to sink the result of the processing into a set of topics depending on where the original message came from (i.e. if it comes from origin-topic-1 I will serialize the result in des

Re: Kafka and parallelism

2018-02-05 Thread Tzu-Li (Gordon) Tai
Yes, the answer to that would be no. If you do not explicitly set a parallelism for the consumer, the parallelism by default will be whatever the parallelism of the job is, and is independent of how many Kafka partitions there are. Cheers, Gordon On 5 February 2018 at 11:42:21 AM, Christophe J

Re: Kafka and parallelism

2018-02-05 Thread Christophe Jolif
Thanks. It helps indeed. I guess the last point it does not explicitly answer is "does just creating a Kafka consumer reading from multiple partitions set the parallelism to the number of partitions". But reading between the lines I think the answer is clearly no. You have to set your parallelism

Re: Kafka and parallelism

2018-02-05 Thread Tzu-Li (Gordon) Tai
Hi Christophe, You can set the parallelism of the FlinkKafkaConsumer independently of the total number of Kafka partitions (across all subscribed streams, including newly created streams that match a subscribed pattern). The consumer deterministically assigns each partition to a single consumer
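
To illustrate why the consumer parallelism can be chosen independently of the partition count, here is a minimal sketch of a deterministic partition-to-subtask assignment. It uses a simple modulo rule as an assumption for illustration; it is not Flink's actual assignment code, and `PartitionAssignment` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration (not Flink code): each partition goes to exactly
// one subtask, whatever the ratio of partitions to subtasks.
final class PartitionAssignment {
    // Returns, for each subtask index, the partition ids that subtask reads.
    static List<List<Integer>> assign(int numPartitions, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        List<List<Integer>> bySubtask = new ArrayList<>();
        for (int s = 0; s < parallelism; s++) {
            bySubtask.add(new ArrayList<>());
        }
        // Same inputs always give the same assignment, so it is deterministic.
        for (int p = 0; p < numPartitions; p++) {
            bySubtask.get(p % parallelism).add(p);
        }
        return bySubtask;
    }
}
```

Under this rule, `assign(5, 2)` gives subtask 0 partitions 0, 2, 4 and subtask 1 partitions 1, 3, while `assign(2, 4)` leaves two of the four subtasks with no partitions. Either way every partition is read by exactly one subtask.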

Kafka and parallelism

2018-02-03 Thread Christophe Jolif
Hi, If I'm sourcing from a KafkaConsumer, do I have to explicitly set the Flink job parallelism to the number of partitions, or will it adjust automatically? In other words, if I don't call setParallelism, will I get 1 or the number of partitions? The reason I'm asking is that I'm listening to