Re: [DISCUSS] Clustering in ServiceMix 4

Gert Vanthienen Tue, 10 Feb 2009 04:15:36 -0800

Guillaume,

From a user perspective, registering the endpoint through an OSGi bundle should be fine. For the JBI packaging, we should take care that the implementation doesn't tie the components' build to ServiceMix 4 or we run into the same problem we used to have with ServiceMix 3. We should probably also take care of ServiceMix 3 users by adding a warning or something to the log if they configure that flag because it won't do anything there, right? Given that these ClusterRegistrations are in the OSGi service registry, I suppose we can also add some console commands to enable/disable clustering at runtime.

Once we have this done, we can look into making this cluster engine more intelligent, e.g. by automatically detecting exchanges to be sent to the engine as you described and perhaps by adding a communication mechanism for ServiceMix 4 instances to share information about endpoints (and thus allow them to turn on/off clustering if the endpoint is available on another node automatically).

Just for clarity: this feature is meant as a replacement for the JCA/JMS Flow in ServiceMix 3. Would this solution solve all of our concerns with those flows or are there still situations where a set of JMS endpoints on both ServiceMix instances would be better than using the built-in clustering solution?

Regards,

Gert

Guillaume Nodet wrote:

On Mon, Feb 9, 2009 at 13:31, Jean-Baptiste Onofré <[email protected]> wrote:

Hi Guillaume,

If I have understood correctly, the ClusterRegistration in the registry provides a filter.
After that, the user needs to define whether the endpoint is a "member" of the cluster
or not. It means that, in a cluster, we can have:
- standalone endpoints (dealing only with the local SMX instance)
- clustered endpoints (dealing with all SMX cluster members). Does it mean that
these endpoints are federated across all cluster members (the endpoint is
present on every cluster member)?


The ClusterRegistration provides a filter that will match a set of
endpoints.  If the endpoint that created and sent the exchange matches,
the exchange will be re-routed.  In that case, the cluster engine will
send a JMS message, which will be consumed by another cluster engine.
Consumption is based on JMS selectors, so an engine will only consume
messages that will be transformed into a JBI exchange whose target
endpoint exists locally.  So you can achieve both load-balancing and
remoting.
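
To illustrate the selector-based consumption described above, here is a minimal Python sketch. It is not ServiceMix code: `SharedQueue`, `ClusterNode` and the `target` field are hypothetical names, and an in-memory list stands in for the JMS queue and its selectors.

```python
from dataclasses import dataclass, field


@dataclass
class SharedQueue:
    """Stand-in for the single JMS queue shared by all cluster engines."""
    messages: list = field(default_factory=list)

    def send(self, target, body):
        self.messages.append({"target": target, "body": body})

    def receive(self, selector):
        """Return and remove the first message accepted by `selector`, or None."""
        for i, msg in enumerate(self.messages):
            if selector(msg):
                return self.messages.pop(i)
        return None


@dataclass
class ClusterNode:
    name: str
    local_endpoints: set

    def poll(self, queue):
        # Only take messages whose target endpoint exists on this node,
        # which is the role the JMS selector plays in the description above.
        return queue.receive(lambda m: m["target"] in self.local_endpoints)


queue = SharedQueue()
node_a = ClusterNode("smx-a", {"http-consumer"})
node_b = ClusterNode("smx-b", {"eip-routing-slip"})

# The clustered HTTP consumer on node A hands its exchange to the queue.
queue.send("eip-routing-slip", "<request/>")

taken_by_a = node_a.poll(queue)   # node A has no such endpoint: nothing consumed
taken_by_b = node_b.poll(queue)   # node B hosts the target: message consumed
```

This is the remoting case. If both nodes hosted `eip-routing-slip`, whichever node polled first would take the message, which is the load-balancing case.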

Let's take a real example:
   HTTP consumer -> EIP routing slip -> (xslt transformation, HTTP
provider, xslt transformation)
This route defines a transformation on the request, sends it to another
web service, and transforms the response back.
If you configure the HTTP consumer to be "clustered", then it will
look somewhat like:
   HTTP -> JMS -> EIP
If you deploy this whole application on two instances (assuming the
underlying JMS brokers are connected together), an http request coming
to the first smx instance could be processed (eip and the rest of
the flow) by another instance (load-balancing / some failover).   You
could also imagine deploying the HTTP consumer only on one instance,
and the rest of the application only on a second instance
(remoting).

If I compare with application server clustering (with EJB session
replication, entity turns, etc), when you set up the application server in
cluster mode, your application works in cluster mode (you can't choose whether
it's cluster-compliant or not). Using JBoss Groups, you can register applications
in the cluster or not.

Maybe it would be interesting to let the user choose which endpoints are
clustered, for example by defining a ClusterRegistration service
that stores the set of clustered endpoints. This way, we can handle OSGi and JBI
packaging, implicit clustering and manual clustering. But the user needs to define
the clustered endpoints.


Not sure if we can really compare with app server clustering.  If you
want true clustering, you need to make sure the whole state can fail
over onto another node.  Unfortunately, most of the components
maintain some state.  For example, if an endpoint receives an exchange,
creates a copy of it to send to another endpoint, waits for the
response and then copies it as the response of the original exchange,
this original exchange is lost if smx crashes, unless it is stored
somewhere.  The servicemix-eip component has a configurable storage
that is used to store all such data.  It defaults to a simple map
without persistence, but you can configure it to use a database if you
want.  However, this degrades performance a lot.
Usually, the best way is to put a single cluster endpoint for a whole
flow at the beginning of this flow.  In our example above, clustering
the HTTP consumer should be more or less sufficient: if something goes
wrong when processing the eip, xslt transformations or the call to the
other web service, the transaction can be rolled back and retried on
another smx instance.  However, if the http client is waiting for a
response and the first smx instance (hosting the http consumer)
crashes, there's really nothing we can do ...
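
The trade-off between a non-persistent map and a database-backed store can be sketched as follows. This is only an illustration in Python, not the servicemix-eip store API: `MemoryStore`, `SqliteStore` and `simulate_restart` are hypothetical names, and SQLite stands in for whatever database would be configured.

```python
import os
import sqlite3
import tempfile


class MemoryStore:
    """Non-persistent store: fast, but its content is lost on a crash."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def take(self, key):
        return self._data.pop(key, None)


class SqliteStore:
    """Persistent store: survives a restart, at the cost of disk I/O."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS pending (k TEXT PRIMARY KEY, v TEXT)"
        )
        self._conn.commit()

    def put(self, key, value):
        self._conn.execute(
            "INSERT OR REPLACE INTO pending VALUES (?, ?)", (key, value)
        )
        self._conn.commit()  # one commit per exchange: this is the slow part

    def take(self, key):
        row = self._conn.execute(
            "SELECT v FROM pending WHERE k = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        self._conn.execute("DELETE FROM pending WHERE k = ?", (key,))
        self._conn.commit()
        return row[0]

    def close(self):
        self._conn.close()


def simulate_restart(make_store):
    """Store a pending exchange, 'crash', then try to read it back."""
    before = make_store()
    before.put("exchange-1", "<original/>")
    if hasattr(before, "close"):
        before.close()
    after = make_store()  # a fresh instance, as after a restart
    return after.take("exchange-1")


db_path = os.path.join(tempfile.mkdtemp(), "pending.db")
lost = simulate_restart(MemoryStore)
recovered = simulate_restart(lambda: SqliteStore(db_path))
```

With the in-memory store the pending exchange is gone after the simulated restart, while the database-backed store returns it, paying for that with a commit on every write.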

Anyway, an interesting option would be to have a more complex filter
that automatically detects exchanges that need to be clustered if, for
a given flow, no exchange has gone through the cluster engine yet.
It would automatically cluster all consumer endpoints, but not
subsequent exchanges.
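
One possible shape for such a filter, as a rough Python sketch. Everything here is an assumption made for illustration: the `flow_id` correlation field, the `Exchange` class and `FlowAwareFilter` do not come from the ServiceMix code.

```python
from dataclasses import dataclass, field


@dataclass
class Exchange:
    flow_id: str      # hypothetical correlation id shared by a whole flow
    source: str       # endpoint that created the exchange


@dataclass
class FlowAwareFilter:
    """Cluster only the first exchange seen for each flow."""
    clustered_flows: set = field(default_factory=set)

    def should_cluster(self, exchange):
        if exchange.flow_id in self.clustered_flows:
            return False          # this flow already went through the engine
        self.clustered_flows.add(exchange.flow_id)
        return True


flt = FlowAwareFilter()
decisions = [
    flt.should_cluster(Exchange("flow-1", "http-consumer")),     # entry point
    flt.should_cluster(Exchange("flow-1", "eip-routing-slip")),  # same flow
    flt.should_cluster(Exchange("flow-2", "http-consumer")),     # new flow
]
```

The set of seen flows grows without bound in this sketch. A real filter would need to evict finished flows, or carry the "already clustered" marker on the exchange itself instead of keeping it in memory.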

These are only quick thoughts about SMX clustering. The topic is very interesting.

Regards
JB

 On Mon 09/02/09 09:28, "Guillaume Nodet" [email protected] wrote:

I've committed my ongoing work on servicemix 4 clustering at
https://svn.apache.org/repos/asf/servicemix/smx4/nmr/trunk/jbi/cluster/

It's not in the build yet and I committed it for discussion purposes.

This work has two goals:
* provide some persistence in the JBI layer
* provide some transparent remoting between JBI endpoints

The way I've begun implementing that is to use an ExchangeListener in
the NMR to re-route exchanges to a cluster endpoint (I guess it should
be renamed to something like "cluster engine" to avoid being confused
with "clustered endpoints").

The org.apache.servicemix.cluster.requestor package derives from the
spring message listener container and implements a jms layer which is
able to provide request / response in an asynchronous way.

I've experimented with different things, and the one I've been focusing
on lately is to use a single JMS queue and selectors.  Let me explain a
bit.

The JMS flow in servicemix 3 was using lots of different destinations
(one per container + one per endpoint + one per service qname + one per
interface qname).  The problem with such a design is that a jms consumer
can easily consume from only one destination (unless we use some
specific activemq features).  Another problem is that if not using
activemq, setting up lots of JMS destinations can be really tedious.
The use of a single destination leads to fewer consumers, at the
expense of using jms selectors.  Previously, I've tried to use two
queues (one for requests and another one for responses) but there's no
real benefit in doing this imho.
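
As an illustration of what a selector on a single queue could look like, here is a small Python helper that builds a JMS selector string from the endpoints known locally. The property name `targetEndpoint` is an assumption made for this sketch, not the name used in the committed code; the `IN (...)` form and the doubled single quote for escaping are standard JMS selector syntax.

```python
def build_selector(local_endpoints, prop="targetEndpoint"):
    """Build a JMS selector matching messages aimed at locally known endpoints.

    `prop` is a hypothetical message property name, not one taken from the
    ServiceMix code.
    """
    if not local_endpoints:
        raise ValueError("at least one local endpoint is required")
    # String literals in JMS selectors use single quotes, doubled to escape.
    quoted = ", ".join(
        "'" + name.replace("'", "''") + "'" for name in sorted(local_endpoints)
    )
    return f"{prop} IN ({quoted})"


selector = build_selector({"eip-routing-slip", "xslt-request"})
```

A single consumer created with such a selector covers every local endpoint, where the servicemix 3 design needed one destination (and so one consumer) for each of them.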

The other thing I've been focusing on is to make sure that processing a
jms message does not block a thread, and yet be able to use jms or xa
transactions.  This is not so easy.  For example, the spring jms
listener containers use a thread for consuming the jms message and
processing it, expecting the processing to happen synchronously.
However, in servicemix synchronous processing is a bad idea: if it
involves sending an http request and waiting for a response, this means
blocking a thread for nothing.

For scalability, we need to not block threads if possible.  But spring
message listener containers only support synchronous processing, so
I've hacked together two new containers, one being JMS compliant, and
another one specific to ActiveMQ which is much more performant.  It
uses a MessageAvailableListener to be notified when consumers have
messages to be processed instead of wasting threads to poll actively
for messages.
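
The difference between blocking and non-blocking processing can be shown with a small analogy in Python's asyncio. This is not the Spring container nor ActiveMQ's MessageAvailableListener: `call_http_provider`, `on_message` and `consume` are hypothetical names, and a short sleep stands in for the remote http call.

```python
import asyncio
import threading


async def call_http_provider(body):
    # Stand-in for a remote HTTP call; awaiting yields the thread to others.
    await asyncio.sleep(0.05)
    return f"response to {body}"


async def on_message(body, results, threads):
    # Record which thread handles the message, then wait without blocking it.
    threads.add(threading.get_ident())
    results.append(await call_http_provider(body))


async def consume(bodies):
    results, threads = [], set()
    # All messages are in flight at once, yet no extra thread is created.
    await asyncio.gather(*(on_message(b, results, threads) for b in bodies))
    return results, threads


results, threads = asyncio.run(consume(["m1", "m2", "m3"]))
```

Three messages are processed concurrently on one thread. With a synchronous listener, each message waiting on its http response would hold a thread of its own for the whole wait.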

Anyway, both containers can support client ack, jms local transactions
or xa transactions in asynchronous mode.

I haven't really worked on how to register such endpoints (from the
user point of view).  At the moment, we need to register a
ClusterRegistration in the OSGi registry.  Such registrations contain a
filter that will be used to decide if a new active / consumer exchange
should be re-routed to the cluster engine or not.

The simplest filter would be one that checks the source endpoint and
will thus cluster all exchanges outgoing from a given endpoint.
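
A minimal sketch of that simplest filter, written in Python for illustration only. The `ClusterRegistration` class below is a hypothetical stand-in that merely wraps a predicate; it is not the interface from the committed code, and `Exchange`, `source_endpoint_filter` and `route` are likewise invented names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Exchange:
    source: str       # name of the endpoint that created the exchange
    target: str


@dataclass(frozen=True)
class ClusterRegistration:
    """Hypothetical stand-in: wraps the filter that selects exchanges."""
    matches: Callable[[Exchange], bool]


def source_endpoint_filter(endpoint_name):
    # Matches every exchange sent by the given endpoint.
    return ClusterRegistration(lambda ex: ex.source == endpoint_name)


def route(exchange, registrations):
    """Return 'cluster-engine' if any registration matches, else the target."""
    if any(reg.matches(exchange) for reg in registrations):
        return "cluster-engine"
    return exchange.target


registrations = [source_endpoint_filter("http-consumer")]
rerouted = route(Exchange("http-consumer", "eip-routing-slip"), registrations)
direct = route(Exchange("eip-routing-slip", "xslt-request"), registrations)
```

Exchanges leaving `http-consumer` are diverted to the cluster engine, while exchanges created further down the flow go straight to their target.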

As for how to register such objects, one way would be to put that on
the endpoint exporter that is used in smx4 to register jbi endpoints
with the OSGi packaging, but this would not work with JBI packaging
(such registrations would have to be deployed in a separate osgi
bundle).  I was also thinking about adding a simple boolean property on
all endpoints, something like clustered="true".

Sorry for the long rant, but I should have sent this email way earlier ...

Feedback welcome.


--
Cheers,
Guillaume Nodet
------------------------
Blog: http://gnodet.blogspot.com/

------------------------

Open Source SOA
http://fusesource.com
