Re: What is the best approach to do an asynchronous rendezvous?

2013-05-15 Thread Raul Kripalani
Hi Chris,

I like this kind of problem ;-) Do these two messages share a correlation
key?

If yes, you can create a bean which acts like a Repository, accumulating
message bodies or Exchanges under the correlation key. Could be implemented
using Guava's Multimap, or a DB if you need durable persistence, or a
distributed cache if you require clustering without persistence.

When a message arrives, you query the Repository for a previous message
with the same correlation key. If it exists, you pick it up, do whatever
manipulation is needed, and release the 3rd event. Kind of like a
CyclicBarrier that stores the messages for later usage.
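
A minimal sketch of such a repository, assuming exactly two events per correlation key and that an in-memory map is good enough (swap in a DB or a distributed cache if you need durability or clustering). The names RendezvousRepository and offer are made up for illustration; this is plain JDK code, not a Camel API.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory repository: parks the first message and hands it back when its partner arrives. */
final class RendezvousRepository<K, V> {
    private final ConcurrentMap<K, V> waiting = new ConcurrentHashMap<>();

    /**
     * Returns the message previously parked under this key (and removes it),
     * or parks the incoming message and returns empty if nothing was waiting.
     */
    Optional<V> offer(K correlationKey, V message) {
        while (true) {
            V parked = waiting.putIfAbsent(correlationKey, message);
            if (parked == null) {
                return Optional.empty();      // first arrival: wait for the partner
            }
            if (waiting.remove(correlationKey, parked)) {
                return Optional.of(parked);   // second arrival: claim the partner
            }
            // Another thread claimed the parked message first; retry.
        }
    }
}
```

Both routes would call offer() from a bean. Whichever message arrives second gets the first one back and can go on to release the 3rd event; the caller that gets an empty result simply stops there.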

P.S.: You could consider using the Aggregator EIP, but it'll block a thread
until the 2nd event comes through.

Regards,

*Raúl Kripalani*
Enterprise Architect, Open Source Integration specialist, Program Manager |
Apache Camel Committer
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
http://blog.raulkr.net | twitter: @raulvk

On Wed, May 15, 2013 at 6:56 PM, Chris Wolf cwolf.a...@gmail.com wrote:

 In my process, I have two events that must be completed before the third
 can proceed. One event is the arrival of a certain JMS message and the
 other is the arrival of a certain file type. The problem is, I cannot
 represent this in a route pipeline because one event may occur before the
 other and it's totally random from one run to another.

 In the abstract, I'm thinking one of the EIP patterns, either resequencer
 or scatter-gather, applies, but I'm not certain how to do this in a
 concrete way with Camel. If anyone has ideas, that would be great...

 Thanks,

 Chris



Re: What is the best approach to do an asynchronous rendezvous?

2013-05-15 Thread Chris Wolf
Raúl,

Thanks for your ideas.  Now it's 3 events that must arrive, in no
guaranteed order, before the fourth event is triggered.  As for the
correlation key, I guess it would be customer-id plus a timestamp
that shows the event happened in the last several hours.
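
To make that concrete, here is a rough sketch of such a key, assuming the timestamp is bucketed into fixed windows so that events from the same customer in the same window compare equal. CorrelationKey and its factory method are hypothetical names, and the window size is a guess; events that straddle a window boundary would get different keys, so a batch id supplied by the upstream system would be more robust if one exists.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Objects;

/** Hypothetical correlation key: customer id plus the index of the time window the event fell into. */
final class CorrelationKey {
    private final String customerId;
    private final long window;

    private CorrelationKey(String customerId, long window) {
        this.customerId = customerId;
        this.window = window;
    }

    /** Buckets the event time into windows of the given size, counted from the epoch (UTC). */
    static CorrelationKey of(String customerId, Instant eventTime, Duration windowSize) {
        long index = Math.floorDiv(eventTime.toEpochMilli(), windowSize.toMillis());
        return new CorrelationKey(Objects.requireNonNull(customerId), index);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CorrelationKey)) {
            return false;
        }
        CorrelationKey other = (CorrelationKey) o;
        return window == other.window && customerId.equals(other.customerId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(customerId, window);
    }

    @Override
    public String toString() {
        return customerId + "@" + window;
    }
}
```

Because equals() and hashCode() are defined, the key can be used directly as a map key or as a message header value for correlation.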

The more I look into it, the more it seems like the Camel resequencer
process may apply here:  http://camel.apache.org/resequencer.html

The thing is, I'm not sure about setting batch size and/or timeout.
The documentation states:

...messages are collected into a batch, either by a maximum number of
messages per batch or using a timeout...

So it seems I would set batch size to 3 in my case, but our process
supports multiple customers, so really, it's 3 predecessor events
*per customer*.  Also, I think I would need persistent state in the event
of a crash.  Although I'd like to avoid it, I can't help thinking I may
have to implement yet another custom Processor to support this use case.
(I already had to implement a custom SFTP Processor to handle dynamic
endpoint settings based on the customer-id.)

I know that Camel is intended for ETL-ish sorts of problems, but I think
the Camel approach is also appropriate for interactive, online processing.
In that case, it would be really helpful if endpoint configurations had
the additional dimension of settings/configs indexed by id (such as
customer-id, user-id, account-id, etc.)

I could be wrong, but it seems that the static URI way of configuring
endpoints doesn't scale to process-flow/per-identity.  Those endpoints that
do take expressions help in this regard, but not all of them work with
expression-language configuration, e.g. SFTP Consumer.

Thanks,

Chris




Re: What is the best approach to do an asynchronous rendezvous?

2013-05-15 Thread Raul Kripalani
Let me try to understand the timeline of these events. I will call a
logical grouping of events an "event batch".
If you don't mind answering the following questions, I can assist better:

   - How many batches per day will you receive per customer?
   - Can the event batches interleave?
   - What identifies a concrete batch of events?
   - Do you have a finite list of customers? (and customer ids?)
   - What is the payload of the fourth event? Am I right in asserting that
   you receive 3 raw events, and you build some sort of composite or merged
   message and release it as the 4th event?

Regards,

*Raúl Kripalani*




Re: What is the best approach to do an asynchronous rendezvous?

2013-05-15 Thread Chris Wolf
Hello Raúl,

The more I analyze the requirements, the more complicated it seems to get.
It's looking like I need a per-customer state machine to manage the flow.

To answer your questions:
- How many batches per day will you receive per customer?
  One per day per customer: 3 predecessor events in, 1 result event out.

- Can the event batches interleave?
  Yes, for sure, which raises another issue: now I'm thinking I need
  per-customer, dynamically added routes/route policies.

- What identifies a concrete batch of events?
  * a per-customer arrival of config info from an external system via JMS
  * a per-customer arrival of a message indicating that certain external
    processing is ready for the next step from the Camel-based process
  * a per-customer indication that a set of files has arrived via SFTP

- Do you have a finite list of customers? (and customer ids?)
  * I don't see how that factors in; assume the less restrictive case.

- What is the payload of the fourth event?
  * an advisory JMS message saying that processing of the SFTP files has
    completed and that the external system can proceed based on these
    results.
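
For what it's worth, the per-customer state machine could be as small as "which of the three predecessor events have I seen for this customer?". Below is a rough sketch under that assumption. The enum constants and the CustomerBarrier class are names I made up for the three events listed above, and the state lives in memory only, so it would not survive a crash without a persistent store behind it.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical names for the three predecessor events. */
enum Predecessor { CONFIG_RECEIVED, EXTERNAL_READY, FILES_ARRIVED }

/** Tracks, per customer, which predecessor events have arrived so far. */
final class CustomerBarrier {
    private final Map<String, EnumSet<Predecessor>> seen = new ConcurrentHashMap<>();

    /**
     * Records one event for a customer. Returns true only for the call that
     * completes the set of three; the customer's state is then reset.
     */
    boolean record(String customerId, Predecessor event) {
        boolean[] complete = {false};
        // compute() runs atomically per key, so concurrent routes cannot both "win".
        seen.compute(customerId, (id, events) -> {
            EnumSet<Predecessor> updated =
                    (events == null) ? EnumSet.noneOf(Predecessor.class) : events;
            updated.add(event);
            if (updated.size() == Predecessor.values().length) {
                complete[0] = true;
                return null;   // returning null removes the entry: ready for the next batch
            }
            return updated;
        });
        return complete[0];
    }
}
```

Each of the three routes would call record() with its own event type, and whichever call returns true sends the advisory JMS message. Duplicate events for the same customer are absorbed by the EnumSet and do not trigger anything.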

Thanks,

Chris
