Hello Raúl,

The more I analyze the requirements, the more complicated it seems to get. It looks like I need a per-customer state machine to manage the flow.

To answer your questions:

> - How many batches per day will you receive per customer?

One per day per customer: 3 predecessor events in, 1 result event out.

> - Can the event batches interleave?

Yes, for sure, which raises another issue: now I'm thinking I need a
per-customer, dynamically added route/route-policy.

> - What identifies a concrete batch of events?

* a per-customer arrival of config info from an external system via JMS
* a per-customer arrival of a message, from a Camel-based process,
  indicating that certain external processing is ready for the next step
* a per-customer indication that a set of files has arrived via SFTP

> - Do you have a finite list of customers? (and customer ids?)

* I don't see how that factors in; assume the less restrictive case.

> - What is the payload of the fourth event?

* an advisory JMS message stating that processing of the SFTP files has
  completed and that the external system can proceed based on these results.

Thanks,

Chris

On Wed, May 15, 2013 at 4:42 PM, Raul Kripalani <r...@evosent.com> wrote:

> Let me try and understand the timeline of these events. I will call a
> logical grouping of events "event batch".
> If you don't mind answering the following questions, I can assist better:
>
> - How many batches per day will you receive per customer?
> - Can the event batches interleave?
> - What identifies a concrete batch of events?
> - Do you have a finite list of customers? (and customer ids?)
> - What is the payload of the fourth event? Am I right asserting that you
>   receive 3 raw events, and you build some sort of composite or merged
>   message and release it as the 4th event?
>
> Regards,
>
> *Raúl Kripalani*
> Enterprise Architect, Open Source Integration specialist, Program
> Manager | Apache Camel Committer
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> http://blog.raulkr.net | twitter: @raulvk
>
> On Wed, May 15, 2013 at 9:30 PM, Chris Wolf <cwolf.a...@gmail.com> wrote:
>
>> Raúl,
>>
>> Thanks for your ideas.
>> Now it's 3 events without guaranteed order to
>> arrive before triggering the fourth event. As for correlation key, I
>> guess it would be customer-id plus a timestamp
>> that shows the event happened in the last several hours.
>>
>> The more I look into it, the more it seems like the Camel resequencer
>> process may apply here: http://camel.apache.org/resequencer.html
>>
>> The thing is, I'm not sure about setting batch size and/or timeout.
>> The documentation states:
>>
>> "...messages are collected into a batch, either by a maximum number of
>> messages per batch or using a timeout..."
>>
>> So it seems I would set batch size to 3 in my case, but our process
>> supports multiple customers, so really, it's 3 predecessor events
>> *per customer*. Also, I think I would need persistent state in the
>> event of a crash. Although I'd like to avoid it, I can't help thinking
>> I may have to implement yet another custom Processor to support this
>> use case. (I already had to implement a custom SFTP Processor to handle
>> dynamic endpoint settings based on the customer-id.)
>>
>> I know that Camel is intended for ETL-ish sorts of problems, but I
>> think the Camel approach is also appropriate for interactive, online
>> processing. In that case, though, it would be really helpful if endpoint
>> configurations had the additional dimension of settings/configs indexed
>> by id (such as customer-id, user-id, account-id, etc.)
>>
>> I could be wrong, but it seems that the static URI way of configuring
>> endpoints doesn't scale to process-flow/per-identity. Those endpoints
>> that do take expressions help in this regard, but not all of them work
>> with expression-language configuration, e.g. SFTP Consumer.
>>
>> Thanks,
>>
>> Chris
>>
>> On Wed, May 15, 2013 at 3:46 PM, Raul Kripalani <r...@evosent.com> wrote:
>> > Hi Chris,
>> >
>> > I like this kind of problem ;-) Do these two messages share a
>> > correlation key?
>> >
>> > If yes, you can create a bean which acts like a Repository, accumulating
>> > message bodies or Exchanges under the correlation key. Could be
>> > implemented using Guava's Multimap, or a DB if you need durable
>> > persistence, or a distributed cache if you require clustering without
>> > persistence.
>> >
>> > When a message arrives, you query the Repository for a previous message
>> > with the same correlation key. If it exists, you pick it up, do whatever
>> > manipulation is needed, and release the 3rd event. Kind of like a
>> > CyclicBarrier that stores the messages for later usage.
>> >
>> > P.S.: You could consider using the Aggregator EIP, but it'll block a
>> > thread until the 2nd event comes through.
>> >
>> > Regards,
>> >
>> > *Raúl Kripalani*
>> > Enterprise Architect, Open Source Integration specialist, Program
>> > Manager | Apache Camel Committer
>> > http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
>> > http://blog.raulkr.net | twitter: @raulvk
>> >
>> > On Wed, May 15, 2013 at 6:56 PM, Chris Wolf <cwolf.a...@gmail.com> wrote:
>> >
>> >> In my process, I have two events that must be completed before the
>> >> third can proceed. One event is the arrival of a certain JMS message
>> >> and the other is the arrival of a certain file type. The problem is,
>> >> I cannot represent this in a route pipeline because one event may
>> >> occur before the other and it's totally random from one run to another.
>> >>
>> >> In the abstract, I'm thinking one of the EIP patterns, either
>> >> "resequencer" or "scatter-gather" applies, but I'm not certain how to
>> >> do this in a concrete way with Camel. If anyone has ideas, that would
>> >> be great...
>> >>
>> >> Thanks,
>> >>
>> >> Chris
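
For reference, the Repository idea quoted above could be sketched roughly as follows, extended to the three-events-per-customer case described at the top of the thread. This is plain Java with no Camel dependency, and it is only an illustration: the class name EventBarrier, the event-type strings ("config", "ready", "files") and the "customer|date" correlation key are all assumptions, not anything taken from Camel or from Chris's actual system. State is kept in memory only, so the durable store mentioned in the thread would still be needed to survive a crash.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper (not a Camel class): collects events per correlation key
// and releases the batch once every required event type has arrived, in any order.
class EventBarrier {
    private final Set<String> requiredTypes;
    // correlation key -> (event type -> payload); in-memory only, lost on a crash
    private final Map<String, Map<String, Object>> pending = new HashMap<>();

    EventBarrier(Set<String> requiredTypes) {
        this.requiredTypes = new HashSet<>(requiredTypes);
    }

    // Records one event. Returns the complete batch if this event was the last
    // one missing for the key, otherwise Optional.empty(). A repeated event of
    // the same type overwrites the earlier payload.
    synchronized Optional<Map<String, Object>> offer(String key, String type, Object payload) {
        if (!requiredTypes.contains(type)) {
            throw new IllegalArgumentException("Unexpected event type: " + type);
        }
        Map<String, Object> batch = pending.computeIfAbsent(key, k -> new HashMap<>());
        batch.put(type, payload);
        if (batch.keySet().containsAll(requiredTypes)) {
            pending.remove(key); // batch is complete, forget its state
            return Optional.of(batch);
        }
        return Optional.empty();
    }

    // Number of correlation keys still waiting for at least one event.
    synchronized int pendingKeys() {
        return pending.size();
    }

    public static void main(String[] args) {
        EventBarrier barrier =
            new EventBarrier(new HashSet<>(Arrays.asList("config", "ready", "files")));
        // Events for two customers interleave and arrive in no fixed order.
        barrier.offer("cust-A|2013-05-15", "files", "a.csv");
        barrier.offer("cust-B|2013-05-15", "config", "cfg-B");
        barrier.offer("cust-A|2013-05-15", "config", "cfg-A");
        Optional<Map<String, Object>> done = barrier.offer("cust-A|2013-05-15", "ready", "go");
        System.out.println("cust-A complete: " + done.isPresent());
        System.out.println("keys still pending: " + barrier.pendingKeys());
    }
}
```

Inside a Camel route, each of the three consumers (JMS, the Camel-based process, SFTP) would call offer() with the customer's key, and only the call that returns a non-empty batch would go on to send the advisory JMS message. Keying on the customer rather than on a global batch size is what lets batches from different customers interleave.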