My naive solution can't work because a dump can be quite long. So, yes, I have to find a way to stop consumption from the topic used for streaming mode while a dump is in progress :( Terry, I am trying to implement something based on your reply and on this thread: https://stackoverflow.com/questions/59201503/flink-kafka-gracefully-close-flink-consuming-messages-from-kafka-source-after-a Any suggestions are welcome. Thanks.
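For what it's worth, the approach in that Stack Overflow thread boils down to having the dump producer write a sentinel record and letting the consumer's deserialization schema treat it as end-of-stream (in the legacy FlinkKafkaConsumer this is the `isEndOfStream` hook of `KafkaDeserializationSchema`). A dependency-free sketch of just the marker check, with all names illustrative rather than part of any Flink API:

```java
// Sketch of the end-of-stream check behind the linked approach: the dump
// job appends a sentinel record as its last message, and the consumer
// stops when it sees it. In a real job this predicate would live in
// KafkaDeserializationSchema#isEndOfStream; names here are illustrative.
public class DumpEndDetector {

    // Hypothetical sentinel value the dump producer writes last.
    static final String END_OF_DUMP_MARKER = "__END_OF_DUMP__";

    // Returns true when the record is the sentinel, i.e. the dump is done.
    public static boolean isEndOfStream(String record) {
        return END_OF_DUMP_MARKER.equals(record);
    }

    public static void main(String[] args) {
        System.out.println(isEndOfStream("row-42"));          // false
        System.out.println(isEndOfStream("__END_OF_DUMP__")); // true
    }
}
```

One caveat with this pattern: once the source reaches end-of-stream it finishes, so it fits a one-shot dump job better than pausing and resuming a long-running CDC consumer.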
David

On 2020/01/06 09:35:37, David Morin <morin.david....@gmail.com> wrote:
> Hi,
>
> Thanks for your replies.
> Yes, Terry, you are right. I can try to create a custom source.
> But according to my use case, I figured out that I can use a technical field in my data. It is a timestamp, and I think I just have to ignore late events, either with watermarks or later in the pipeline based on metadata stored in the Flink state. I am testing it now...
> Thanks
>
> David
>
> On 2020/01/03 15:44:08, Chesnay Schepler <ches...@apache.org> wrote:
> > Are you asking how to detect from within the job whether the dump is
> > complete, or how to combine these 2 jobs?
> >
> > If you had a way to notice whether the dump is complete, then I would
> > suggest creating a custom source that wraps 2 Kafka sources and
> > switching between them at will based on your conditions.
> >
> > On 03/01/2020 03:53, Terry Wang wrote:
> > > Hi,
> > >
> > > I'd like to share my opinion here. It seems that you need the Kafka
> > > consumers to communicate with each other. When you begin the dump
> > > process, you need to notify the CDC-topic consumer to wait idle.
> > >
> > > Best,
> > > Terry Wang
> > >
> > > > On Jan 2, 2020, at 16:49, David Morin <morin.david....@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > Is there a way to temporarily stop consuming one Kafka source in
> > > > streaming mode?
> > > > Use case: I have to consume 2 topics, but one of them has higher
> > > > priority.
> > > > One topic is dedicated to ingesting data from a database (change
> > > > data capture) and the other to synchronization (a dump, i.e. a
> > > > SELECT ... from the database). At the moment the latter is performed
> > > > by a separate Flink job, which we start manually after stopping the
> > > > previous one (CDC).
> > > > I want to merge these 2 modes and automatically stop consumption of
> > > > the topic dedicated to CDC mode while a dump is running.
> > > > How can I handle that with Flink in a streaming way? Backpressure? ...
> > > > Thanks in advance for your insights.
> > > >
> > > > David
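For reference, the "ignore late events" idea from the reply above can be sketched without any Flink dependency: remember, per key, the technical timestamp of the most recent dump row and drop CDC events that are older. In a real Flink job this per-key map would be keyed `ValueState`; here a plain `HashMap` stands in for it, and all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the timestamp-based late-event filter discussed in
// the thread. A plain map stands in for Flink keyed state; names are
// illustrative, not part of any Flink API.
public class LateEventFilter {

    // Per key, the technical timestamp of the latest dump row seen so far.
    private final Map<String, Long> lastDumpTimestamp = new HashMap<>();

    // Record a row coming from the dump (SELECT ...) topic.
    public void onDumpRow(String key, long ts) {
        lastDumpTimestamp.merge(key, ts, Math::max);
    }

    // A CDC event is kept only if it is strictly newer than the dump
    // snapshot for its key (or no dump row exists for that key).
    public boolean keepCdcEvent(String key, long ts) {
        Long dumpTs = lastDumpTimestamp.get(key);
        return dumpTs == null || ts > dumpTs;
    }
}
```

Usage would look like: after `onDumpRow("user-1", 100L)`, a CDC event for `user-1` with timestamp 90 is dropped while one with timestamp 101 passes through, and keys never touched by the dump are always kept.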