Re: What is the best approach to do an asynchronous rendezvous?
Hi Chris, I like this kind of problems ;-) Do these two messages share a correlation key? If yes, you can create a bean which acts like a Repository, accumulating message bodies or Exchanges under the correlation key. Could be implemented using Guava's MultiMap, or a DB if you need durable persistence, or a distributed cache if you require clustering without persistence. When a message arrives, you query the Repository for a previous message with the same correlation key. If it exists, you pick it up, do whatever manipulation is needed, and release the 3rd event. Kind of like a CyclicBarrier that stores the messages for later usage. P.S.: You could consider using the Aggregator EIP, but it'll block a thread until the 2nd event comes through. Regards, *Raúl Kripalani* Enterprise Architect, Open Source Integration specialist, Program Manager | Apache Camel Committer http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani http://blog.raulkr.net | twitter: @raulvk On Wed, May 15, 2013 at 6:56 PM, Chris Wolf cwolf.a...@gmail.com wrote: In my process, I have two events that must be completed before the third can proceed. One event is the arrival of a certain JMS message and the other is the arrival of a certain file type. The problem is, I cannot represent this in a route pipeline because one event may occur before the other and it's totally random from one run to another. In the abstract, I'm thinking one of the EIP patterns, either resequencer or scatter-gather applies, but I'm not certain how to do this in a concrete way with Camel. If anyone has ideas, that would be great... Thanks, Chris
Re: What is the best approach to do an asynchronous rendezvous?
Raúl, Thanks for your ideas. Now it's 3 events without guaranteed order to arrive before triggering the fourth event. As for correlation key, I guess it would be customer-id plus a timestamp that shows the event happened in the last several hours. The more I look into it, the more it seems like the Camel resequencer process may apply here: http://camel.apache.org/resequencer.html The thing is, I'm not sure about setting batch size and/or timeout. The documentation states: ...messages are collected into a batch, either by a maximum number of messages per batch or using a timeout... So it seems I would set batch size to 3 in my case, but our process is supporting multiple customers, so really, it's 3 predecessor events *per customer*. Also, I think I would need persistent state in the even of a crash. Although I'd like to avoid it, I can't help thinking I may have to implement yet another custom Processor to support this use case. (I already had to implement a custom SFTP Processor to handle dynamic endpoint settings based on the customer-id). I know that Camel is intended for ETL-ish sorts of problems, but I think the Camel approach is also appropriate for interactive, online processing, but in that case, it would be really helpful if endpoint configurations had the additional dimension of settings/configs indexed by id (such as customer-id, user-id, account-id, etc.) I could be wrong, but it seems that the static URI way of configuring endpoints doesn't scale to process-flow/per-identity. Those endpoints that do take expressions help in this regard, but not all of them work with expression-language configuration, e.g. SFTP Consumer. Thanks, Chris On Wed, May 15, 2013 at 3:46 PM, Raul Kripalani r...@evosent.com wrote: Hi Chris, I like this kind of problems ;-) Do these two messages share a correlation key? If yes, you can create a bean which acts like a Repository, accumulating message bodies or Exchanges under the correlation key. Could be implemented using Guava's MultiMap, or a DB if you need durable persistence, or a distributed cache if you require clustering without persistence. When a message arrives, you query the Repository for a previous message with the same correlation key. If it exists, you pick it up, do whatever manipulation is needed, and release the 3rd event. Kind of like a CyclicBarrier that stores the messages for later usage. P.S.: You could consider using the Aggregator EIP, but it'll block a thread until the 2nd event comes through. Regards, *Raúl Kripalani* Enterprise Architect, Open Source Integration specialist, Program Manager | Apache Camel Committer http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani http://blog.raulkr.net | twitter: @raulvk On Wed, May 15, 2013 at 6:56 PM, Chris Wolf cwolf.a...@gmail.com wrote: In my process, I have two events that must be completed before the third can proceed. One event is the arrival of a certain JMS message and the other is the arrival of a certain file type. The problem is, I cannot represent this in a route pipeline because one event may occur before the other and it's totally random from one run to another. In the abstract, I'm thinking one of the EIP patterns, either resequencer or scatter-gather applies, but I'm not certain how to do this in a concrete way with Camel. If anyone has ideas, that would be great... Thanks, Chris
Re: What is the best approach to do an asynchronous rendezvous?
Let me try and understand the timeline of these events. I will call a logical grouping of events event batch. If you don't mind answering the following questions, I can assist better: - How many batches per day will you receive per customer? - Can the event batches interleave? - What identifies a concrete batch of events? - Do you have a finite list of customers? (and customer ids?) - What is the payload of the fourth event? Am I right asserting that you receive 3 raw events, and you build some sort of composite or merged message and release it as the 4th event? Regards, *Raúl Kripalani* Enterprise Architect, Open Source Integration specialist, Program Manager | Apache Camel Committer http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani http://blog.raulkr.net | twitter: @raulvk On Wed, May 15, 2013 at 9:30 PM, Chris Wolf cwolf.a...@gmail.com wrote: Raúl, Thanks for your ideas. Now it's 3 events without guaranteed order to arrive before triggering the fourth event. As for correlation key, I guess it would be customer-id plus a timestamp that shows the event happened in the last several hours. The more I look into it, the more it seems like the Camel resequencer process may apply here: http://camel.apache.org/resequencer.html The thing is, I'm not sure about setting batch size and/or timeout. The documentation states: ...messages are collected into a batch, either by a maximum number of messages per batch or using a timeout... So it seems I would set batch size to 3 in my case, but our process is supporting multiple customers, so really, it's 3 predecessor events *per customer*. Also, I think I would need persistent state in the even of a crash. Although I'd like to avoid it, I can't help thinking I may have to implement yet another custom Processor to support this use case. (I already had to implement a custom SFTP Processor to handle dynamic endpoint settings based on the customer-id). I know that Camel is intended for ETL-ish sorts of problems, but I think the Camel approach is also appropriate for interactive, online processing, but in that case, it would be really helpful if endpoint configurations had the additional dimension of settings/configs indexed by id (such as customer-id, user-id, account-id, etc.) I could be wrong, but it seems that the static URI way of configuring endpoints doesn't scale to process-flow/per-identity. Those endpoints that do take expressions help in this regard, but not all of them work with expression-language configuration, e.g. SFTP Consumer. Thanks, Chris On Wed, May 15, 2013 at 3:46 PM, Raul Kripalani r...@evosent.com wrote: Hi Chris, I like this kind of problems ;-) Do these two messages share a correlation key? If yes, you can create a bean which acts like a Repository, accumulating message bodies or Exchanges under the correlation key. Could be implemented using Guava's MultiMap, or a DB if you need durable persistence, or a distributed cache if you require clustering without persistence. When a message arrives, you query the Repository for a previous message with the same correlation key. If it exists, you pick it up, do whatever manipulation is needed, and release the 3rd event. Kind of like a CyclicBarrier that stores the messages for later usage. P.S.: You could consider using the Aggregator EIP, but it'll block a thread until the 2nd event comes through. Regards, *Raúl Kripalani* Enterprise Architect, Open Source Integration specialist, Program Manager | Apache Camel Committer http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani http://blog.raulkr.net | twitter: @raulvk On Wed, May 15, 2013 at 6:56 PM, Chris Wolf cwolf.a...@gmail.com wrote: In my process, I have two events that must be completed before the third can proceed. One event is the arrival of a certain JMS message and the other is the arrival of a certain file type. The problem is, I cannot represent this in a route pipeline because one event may occur before the other and it's totally random from one run to another. In the abstract, I'm thinking one of the EIP patterns, either resequencer or scatter-gather applies, but I'm not certain how to do this in a concrete way with Camel. If anyone has ideas, that would be great... Thanks, Chris
Re: What is the best approach to do an asynchronous rendezvous?
Hello Raúl, The more I analyze the requirements, the more complicated it seems to get - it's looking like I need to have a per-customer state machine to manage the flow. To answer your questions: - How many batches per day will you receive per customer? one per day per customer - 3 predecessor events in, 1 result event out. - Can the event batches interleave? Yes, for sure, which raises another issue - now I'm thinking I need per-customer, dynamically added route/route-policy - What identifies a concrete batch of events? * a per-customer arrival of config info from external system via JMS * a per-customer arrival of message indicating certain external processing is ready for next step from Camel-based process * a per-customer indication of a set of files have arrived via SFTP - Do you have a finite list of customers? (and customer ids?) * I don't see how that factors in - assume the less restrictive case - What is the payload of the fourth event? * an advisory JMS message that SFTP-processed files has completed and that the external system can proceed base on these results. Thanks, Chris On Wed, May 15, 2013 at 4:42 PM, Raul Kripalani r...@evosent.com wrote: Let me try and understand the timeline of these events. I will call a logical grouping of events event batch. If you don't mind answering the following questions, I can assist better: - How many batches per day will you receive per customer? - Can the event batches interleave? - What identifies a concrete batch of events? - Do you have a finite list of customers? (and customer ids?) - What is the payload of the fourth event? Am I right asserting that you receive 3 raw events, and you build some sort of composite or merged message and release it as the 4th event? Regards, *Raúl Kripalani* Enterprise Architect, Open Source Integration specialist, Program Manager | Apache Camel Committer http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani http://blog.raulkr.net | twitter: @raulvk On Wed, May 15, 2013 at 9:30 PM, Chris Wolf cwolf.a...@gmail.com wrote: Raúl, Thanks for your ideas. Now it's 3 events without guaranteed order to arrive before triggering the fourth event. As for correlation key, I guess it would be customer-id plus a timestamp that shows the event happened in the last several hours. The more I look into it, the more it seems like the Camel resequencer process may apply here: http://camel.apache.org/resequencer.html The thing is, I'm not sure about setting batch size and/or timeout. The documentation states: ...messages are collected into a batch, either by a maximum number of messages per batch or using a timeout... So it seems I would set batch size to 3 in my case, but our process is supporting multiple customers, so really, it's 3 predecessor events *per customer*. Also, I think I would need persistent state in the even of a crash. Although I'd like to avoid it, I can't help thinking I may have to implement yet another custom Processor to support this use case. (I already had to implement a custom SFTP Processor to handle dynamic endpoint settings based on the customer-id). I know that Camel is intended for ETL-ish sorts of problems, but I think the Camel approach is also appropriate for interactive, online processing, but in that case, it would be really helpful if endpoint configurations had the additional dimension of settings/configs indexed by id (such as customer-id, user-id, account-id, etc.) I could be wrong, but it seems that the static URI way of configuring endpoints doesn't scale to process-flow/per-identity. Those endpoints that do take expressions help in this regard, but not all of them work with expression-language configuration, e.g. SFTP Consumer. Thanks, Chris On Wed, May 15, 2013 at 3:46 PM, Raul Kripalani r...@evosent.com wrote: Hi Chris, I like this kind of problems ;-) Do these two messages share a correlation key? If yes, you can create a bean which acts like a Repository, accumulating message bodies or Exchanges under the correlation key. Could be implemented using Guava's MultiMap, or a DB if you need durable persistence, or a distributed cache if you require clustering without persistence. When a message arrives, you query the Repository for a previous message with the same correlation key. If it exists, you pick it up, do whatever manipulation is needed, and release the 3rd event. Kind of like a CyclicBarrier that stores the messages for later usage. P.S.: You could consider using the Aggregator EIP, but it'll block a thread until the 2nd event comes through. Regards, *Raúl Kripalani* Enterprise Architect, Open Source Integration specialist, Program Manager | Apache Camel Committer http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani http://blog.raulkr.net | twitter: