How does Flink recover uncommitted Kafka offsets in AsyncIO?

2019-07-01 Thread wang xuchen
Hi Flink experts, I am prototyping a real-time system that reads from a Kafka source with Flink and calls out to an external system as part of the event processing. One of the most important requirements is that reading from Kafka should NEVER stall, even in the face of slowness in some async external calls
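
For reference, a minimal sketch of the AsyncIO pattern this thread asks about, assuming a String payload, placeholder broker/group settings, and a stand-in computation for the external call (a real client would hand work to its own thread pool or async API):

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class AsyncEnrichmentSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(2000);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");   // placeholder
        props.setProperty("group.id", "async-demo");                // placeholder

        DataStream<String> messageStream =
            env.addSource(new FlinkKafkaConsumer<>("kf-events", new SimpleStringSchema(), props));

        // Each record is handed to an asynchronous call; the task thread is never
        // blocked, so the Kafka source keeps polling while results are pending.
        DataStream<String> enriched = AsyncDataStream.unorderedWait(
            messageStream,
            new RichAsyncFunction<String, String>() {
                @Override
                public void asyncInvoke(String event, ResultFuture<String> out) {
                    CompletableFuture
                        .supplyAsync(() -> event + "|enriched")     // stand-in for the external call
                        .thenAccept(r -> out.complete(Collections.singleton(r)));
                }

                @Override
                public void timeout(String event, ResultFuture<String> out) {
                    out.complete(Collections.singleton(event));     // emit input unchanged on timeout
                }
            },
            1000, TimeUnit.MILLISECONDS,   // per-request timeout
            100);                          // capacity: max in-flight requests

        enriched.print();
        env.execute("async enrichment sketch");
    }
}

Note that the capacity argument still bounds in-flight requests, so a sufficiently slow external system will eventually backpressure the source; AsyncIO removes the per-record blocking but does not make the pipeline unbounded.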

Flink Kafka ordered offset commit & unordered processing

2019-06-29 Thread wang xuchen
Hi Flink experts, I am prototyping a real-time system that reads from a Kafka source with Flink and calls out to an external system as part of the event processing. One of the most important requirements is that reading from Kafka should NEVER stall, even in the face of slowness in some async external calls
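
A minimal sketch of the commit behaviour behind this question, assuming a placeholder topic and connection properties: Kafka offsets are part of the consumer's checkpointed state, so they advance in checkpoint order even when downstream processing completes out of order.

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class OffsetCommitSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // On recovery, Flink rewinds to the offsets stored in the last completed
        // checkpoint, regardless of the order in which individual records finished.
        env.enableCheckpointing(2000, CheckpointingMode.EXACTLY_ONCE);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");   // placeholder
        props.setProperty("group.id", "offset-demo");               // placeholder

        FlinkKafkaConsumer<String> consumer =
            new FlinkKafkaConsumer<>("kf-events", new SimpleStringSchema(), props);

        // Commit offsets back to Kafka only when a checkpoint completes; this commit
        // is informational (for monitoring tools), not what recovery relies on.
        consumer.setCommitOffsetsOnCheckpoints(true);

        env.addSource(consumer).print();
        env.execute("offset commit sketch");
    }
}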

Does making a synchronous call choke the whole pipeline?

2019-06-22 Thread wang xuchen
Dear Flink experts, I am testing the following code: env.enableCheckpointing(2000); FlinkKafkaConsumer consumer = new FlinkKafkaConsumer<>("kf-events", new SimpleStringSchema(), properties2); ... messageStream.rebalance().map(new MapFunction() {
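
The quoted code is truncated in the archive; below is a hedged reconstruction of the kind of job being described, with Thread.sleep standing in for the synchronous external call to show why it can choke the pipeline:

import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class BlockingMapSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(2000);

        Properties properties2 = new Properties();
        properties2.setProperty("bootstrap.servers", "localhost:9092");   // placeholder
        properties2.setProperty("group.id", "blocking-demo");             // placeholder

        FlinkKafkaConsumer<String> consumer =
            new FlinkKafkaConsumer<>("kf-events", new SimpleStringSchema(), properties2);

        DataStream<String> messageStream = env.addSource(consumer);

        messageStream.rebalance().map(new MapFunction<String, String>() {
            @Override
            public String map(String value) throws Exception {
                // A synchronous call here holds the task thread. Once the network
                // buffers fill up, backpressure propagates to the Kafka source and
                // consumption stalls; checkpoint barriers queue behind the blocked
                // records, so checkpoints slow down as well.
                Thread.sleep(500);   // stand-in for the slow external call
                return value;
            }
        }).print();

        env.execute("blocking map sketch");
    }
}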

Flink Kafka consumer with low latency requirement

2019-06-20 Thread wang xuchen
Dear Flink experts, I am experimenting with Flink for a use case with a tight latency requirement. A Stack Overflow article suggests that I can use setParallelism(n) to process a Kafka partition in a multi-threaded way. My understanding is that there is still one Kafka consumer per partition, but
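
A minimal sketch of the relationship being asked about, assuming a topic with 4 partitions and placeholder connection properties: the source runs at most one consuming subtask per partition, and rebalance() is what lets a higher-parallelism downstream operator share the work.

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class ConsumerParallelismSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");   // placeholder
        props.setProperty("group.id", "parallelism-demo");          // placeholder

        // With 4 partitions, at most 4 source subtasks actually read from Kafka;
        // any extra source parallelism sits idle.
        DataStream<String> messages = env
            .addSource(new FlinkKafkaConsumer<>("kf-events", new SimpleStringSchema(), props))
            .setParallelism(4);

        // rebalance() round-robins records to the downstream operator, so the
        // (possibly slower) processing step can run with more subtasks than
        // there are partitions.
        messages
            .rebalance()
            .map((String value) -> value.toUpperCase())
            .returns(Types.STRING)
            .setParallelism(8)
            .print();

        env.execute("consumer parallelism sketch");
    }
}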

Does the Flink Kafka connector have a max_pending_offsets concept?

2019-06-04 Thread wang xuchen
Hi Flink users, when the # of Kafka consumers = # of partitions and I use setParallelism(>1), something like 'messageStream.rebalance().map(lambda).setParallelism(3).print()', how do I tune the # of outstanding uncommitted offsets? Something similar to
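
The Kafka connector does not expose a direct max_pending_offsets setting. A hedged sketch (placeholder topic and properties) of the knobs that play a similar role: offsets only advance with checkpoints, so the checkpoint interval and normal backpressure are what bound the outstanding uncommitted work.

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PendingWorkSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // There is no per-record offset acking: everything read since the last
        // completed checkpoint is "outstanding" and is replayed after a failure.
        // The practical knobs are therefore the checkpointing settings.
        env.enableCheckpointing(2000);
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);
        env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");   // placeholder
        props.setProperty("group.id", "pending-demo");              // placeholder

        env.addSource(new FlinkKafkaConsumer<>("kf-events", new SimpleStringSchema(), props))
            .rebalance()
            .map((String value) -> value.trim())
            .returns(Types.STRING)
            .setParallelism(3)
            .print();

        env.execute("pending work sketch");
    }
}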

Question about Kafka Flink consumer parallelism

2019-05-30 Thread wang xuchen
Hi Flink users, I am trying to figure out how to leverage parallelism to improve the throughput of a Kafka consumer. From my research, I understand the scenarios where the # of Kafka partitions is <, =, or > the # of consumers, and that rebalance can be used to spread messages evenly across workers. Also use setParallelism(#) to achieve