The simplest way would be to have your job send a message to another kafka topic when it’s done.
Rick > On Dec 1, 2015, at 3:44 PM, Anton Polyakov <polyakov.an...@gmail.com> wrote: > > Hi > > I am looking at Samza to process some incoming stream of trades. Processing > pipeline is a complex DAG where some nodes might create zero to many > descendant events. Ultimately they got to the end sink (and now these are > completely different events all originated by one source trade) > > I am looking for a answer to quite fundamental (in my view) question - given > source trade, how could I know if it was fully processed by DAG or is it > still in flight? It is called lineage tracking in Storm afaik. > > In my situation when trade event is fired I need a reliable way to determine > when processing is done in order to treat results. I googled a lot of > algorithms - checkpointing in Flink, transactions in Storm. Checkpointing in > Flink could work but unfortunately I can’t use them from my source to mark a > trade with checkpoint barrier - API is not available. Looking at Samza > there’s nothing obvious either. > > I have a very uncomfortable feeling that the problem should be very basic and > fundamental, maybe I am using wrong terms to explain it. take it to extreme - > one can send 1 event to Samza and processing is takes 15 minutes on some > node. How on the end of the pipeline should I know when its done? > > Thank you
signature.asc
Description: Message signed with OpenPGP using GPGMail