Re: Duplicate record writes to sink after job failure

2019-01-15 Thread Andrey Zagrebin
Hi Chris, there is no way to provide "exactly-once" delivery and avoid duplicates without the transactions that have been available since Kafka 0.11. The only way I can think of is to build a custom deduplication step on the consumer side, e.g. an in-memory cache with eviction, or some other temporary storage, that keeps a set of already-processed record IDs and drops any record seen a second time.
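
A minimal sketch of such a consumer-side deduplication step in plain Java, assuming each record carries a unique ID the consumer can extract; the class name, the cap of 100,000 tracked IDs, and the synchronized access are placeholders to adapt to your setup:

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: remembers the IDs of recently seen records in a bounded
// LRU set and reports whether a record is being seen for the first time.
public class DeduplicatingFilter {

    private static final int MAX_TRACKED_IDS = 100_000; // assumption: size of the duplicate window

    // LinkedHashMap in access-order mode evicts the least recently used ID once the cap is exceeded.
    private final Set<String> seenIds = Collections.newSetFromMap(
            new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                    return size() > MAX_TRACKED_IDS;
                }
            });

    // Returns true the first time an ID is seen, false for a duplicate still inside the window.
    public synchronized boolean isFirstOccurrence(String recordId) {
        return seenIds.add(recordId);
    }
}

The consumer would call isFirstOccurrence(id) for every record and skip processing when it returns false; this only filters duplicates whose IDs are still in the tracked window, so the cap has to be sized to cover the replay volume after a restart.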

Duplicate record writes to sink after job failure

2019-01-14 Thread Slotterback, Chris
We are running a Flink job that uses FlinkKafkaProducer09 as a sink, with consumer checkpointing enabled. When the job runs into communication issues with our Kafka cluster and throws an exception after the configured retries, it restarts. We want to guarantee at-least-once processing, so after the restart the records since the last checkpoint are replayed and written to the sink again, which leaves duplicates downstream. Is there a way to avoid or handle these duplicate writes?
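
For context, a rough sketch of how a job like this is typically wired up (the broker address, topic, source, and 60-second checkpoint interval below are placeholder assumptions); flush-on-checkpoint plus fail-on-error is what gives the at-least-once behaviour, and also why a restart replays and re-writes everything since the last successful checkpoint:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer09;

public class AtLeastOnceKafkaSinkJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing is required for the producer's at-least-once guarantee.
        env.enableCheckpointing(60_000); // assumption: 60s checkpoint interval

        DataStream<String> records = env.socketTextStream("localhost", 9999); // placeholder source

        FlinkKafkaProducer09<String> producer = new FlinkKafkaProducer09<>(
                "kafka-broker:9092",   // assumption: broker list
                "output-topic",        // assumption: target topic
                new SimpleStringSchema());

        // Flush pending writes on every checkpoint and fail the job on write errors,
        // so a restart replays (and re-emits) all records since the last checkpoint.
        producer.setFlushOnCheckpoint(true);
        producer.setLogFailuresOnly(false);

        records.addSink(producer);
        env.execute("at-least-once kafka sink");
    }
}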