Hi Jiayi,

This would allow me to call the Kafka producer without risking a race condition, but it comes with its own problem: unless the source has a parallelism of 1, it will trigger multiple times. I can create a dedicated source that doesn't produce anything, has a parallelism of 1, and calls the producer from its open method: it's a bit ugly, but it would get rid of the race condition.
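A rough sketch of what I have in mind, assuming a plain Kafka producer inside the source is acceptable (broker address, topic and message content below are placeholders):

import java.util.Properties;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class StartupMessageSource extends RichSourceFunction<String> {

    private volatile boolean running = true;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Send the "resend all rules" request once, when the task starts.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("rules-requests", "send-all-rules")).get();
        }
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // Emit nothing, but keep the source alive so it never finishes
        // and save-points stay possible.
        while (running) {
            Thread.sleep(1000L);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

Adding it with env.addSource(new StartupMessageSource()) keeps the parallelism at 1, since a non-parallel SourceFunction can only run with a single subtask.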
On Mon, Nov 4, 2019 at 3:59 PM bupt_ljy <[email protected]> wrote:

> Hi Gael,
>
> I had a similar situation before. Actually, you don't need to accomplish
> this in such a complicated way. I guess you already have a rules source,
> and you can send the rules in the #open function on startup if your rules
> source inherits from #RichParallelSourceFunction.
>
> Best,
> Jiayi Liao
>
> Original Message
> *Sender:* Gaël Renoux <[email protected]>
> *Recipient:* user <[email protected]>
> *Date:* Monday, Nov 4, 2019 22:50
> *Subject:* Finite source without blocking save-points
>
> Hello everyone,
>
> I have a job which runs continuously, but it also needs to send a single
> specific Kafka message on startup. I tried the obvious approach of using
> StreamExecutionEnvironment.fromElements with a Kafka sink, but that
> doesn't work: since the source is finished, it becomes impossible to stop
> the job with a save-point later.
>
> The best solution I found is creating a basic Kafka producer to send the
> message and running that producer inside the job's startup script (before
> calling StreamExecutionEnvironment.execute()). However, there's a race
> condition: the message could be sent and trigger things before the job is
> ready to receive messages. In addition, it forces me to have a separate
> Kafka producer, while Flink already comes with Kafka sinks. And finally,
> it's pretty specific to my use case (sending a Kafka message), while it
> looks like there should be a generic solution here.
>
> Do you know of any better way to do this? Is there any way to set up a
> finite source that will not block save-points?
>
> Just in case, the overall use case is nothing special: my job maintains a
> set of rules as broadcast state in operators and handles input according
> to those rules. On startup, I need to request that all rules be sent at
> once (the emitter normally sends only updated rules), in case the rule
> state has been lost (which happens when we evolve the rule model, for
> instance), and this is done through a Kafka message.
>
> Thanks in advance!
>
> Gaël Renoux

--
Gaël Renoux
Senior R&D Engineer, DataDome
M +33 6 76 89 16 52
E [email protected]
W www.datadome.co
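For comparison, a sketch of the workaround described in the original message above: a standalone producer fired from the job's entry point before the job is submitted (broker, topic and job name are placeholders). The race condition is that the request can be answered before the job is actually running:

import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class JobEntryPoint {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // ... build the actual topology (rules broadcast, main stream, sinks) here ...

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Race condition: this fires before env.execute(), so the rules
            // may be re-emitted before the job is ready to receive them.
            producer.send(new ProducerRecord<>("rules-requests", "send-all-rules")).get();
        }

        env.execute("rules-job");
    }
}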
