Re: [DISCUSS] Clustering in ServiceMix 4

Gert Vanthienen Tue, 10 Feb 2009 11:56:18 -0800

Guillaume,

Thank you for the clarifications!  This kind of clustering support is
yet another good reason for people to migrate to ServiceMix 4, very
nice work indeed!


Regards,

Gert


Guillaume Nodet wrote:

2009/2/10 Gert Vanthienen <[email protected]>:

Guillaume,

From a user perspective, registering the endpoint through an OSGi bundle should 
be fine.  For the JBI packaging, we should take care that the implementation 
doesn't tie the components' build to ServiceMix 4 or we run into the same 
problem we used to have with ServiceMix 3.  We should probably also take care 
of ServiceMix 3 users by adding a warning or something to the log if they 
configure that flag because it won't do anything there, right?  Given that 
these ClusterRegistrations are in the OSGi service registry, I suppose we can 
also add some console commands to enable/disable clustering at runtime.


You're referring to adding a property on the endpoint, I suppose?  If
we add that on the endpoint exporter (which is smx4 specific), we would
not have to worry about smx3 users.  Implementing a set of
commands is a really nice idea.

Once we have this done, we can look into making this cluster engine more 
intelligent, e.g. by automatically detecting exchanges to be sent to the engine 
as you described and perhaps by adding a communication mechanism for ServiceMix 
4 instances to share information about endpoints (and thus allow them to turn 
on/off clustering if the endpoint is available on another node automatically).


Yes, we should be able to add more intelligence later.

Just for clarity: this feature is meant as a replacement for the JCA/JMS Flow in 
ServiceMix 3.  Would this solution solve all of our concerns with those flows 
or are there still situations where a set of JMS endpoints on both ServiceMix 
instances would be better than using the built-in clustering solution?


Yes, I think so.  The cluster engine is more scalable and more
efficient.  The fact that the user has to configure it explicitly on a
given endpoint puts a slight burden on the user, but the benefit is
that not all exchanges in the NMR have to go through the cluster.  The
engine also has some throughput control: it can limit the maximum
number of exchanges sent into the NMR at a given time so as not to
overload it.  I can't really think of a use case where it would be
better to have explicit endpoints, unless you need really fine-grained
control over the jms messages sent: for example one could want
some jms messages to be persistent and others not.  It is currently
not possible to configure that, though we should be able to enhance
the ClusterRegistration later to hold such information.
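
To picture the throughput control, here is a minimal sketch that treats
it as a counting semaphore.  The class and method names
(ExchangeThrottle, tryStart, finished) are hypothetical and are not
taken from the cluster engine code; the real engine may well do this
differently.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: caps the number of exchanges in flight in the NMR.
// This is an illustration only, not the actual cluster engine code.
final class ExchangeThrottle {
    private final Semaphore permits;

    ExchangeThrottle(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Returns true if one more exchange may be sent into the NMR right now.
    boolean tryStart() {
        return permits.tryAcquire();
    }

    // Called when an exchange has completed, freeing one slot.
    void finished() {
        permits.release();
    }

    public static void main(String[] args) {
        ExchangeThrottle throttle = new ExchangeThrottle(2);
        System.out.println(throttle.tryStart()); // true
        System.out.println(throttle.tryStart()); // true
        System.out.println(throttle.tryStart()); // false: limit reached
        throttle.finished();
        System.out.println(throttle.tryStart()); // true: a slot was freed
    }
}
```

When tryStart() returns false, the engine would simply leave the JMS
message on the queue until a slot is free.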

Regards,

Gert

Guillaume Nodet wrote:

On Mon, Feb 9, 2009 at 13:31, Jean-Baptiste Onofré <[email protected]> wrote:

Hi Guillaume,

If I have understood correctly, the ClusterRegistration in the registry provides a filter. 
After that, the user needs to define whether the endpoint is a "member" of the cluster 
or not. It means that, in a cluster, we can have:
- standalone endpoints (dealing only with the local SMX instance)
- clustered endpoints (dealing with all SMX cluster members). Does it mean that 
this kind of endpoint is federated across all cluster members (the endpoint is 
present on every cluster member)?

The ClusterRegistration provides a filter that will match a set of
endpoints.  If the endpoint that created and sent the exchange matches,
the exchange will be re-routed.  In that case, the cluster engine will
send a JMS message that will be consumed by another cluster engine.
The consumption is based on JMS selectors, so an engine will only
consume messages that will be transformed into a JBI exchange whose
target endpoint exists locally.  So you can achieve both load-balancing
and remoting.
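
As an illustration of selector-based consumption, the sketch below
builds a JMS selector that only matches messages whose target endpoint
is registered locally.  The message property name "targetEndpoint" and
the class name are assumptions made for the example; the properties
actually set by the cluster engine may be named differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: builds a JMS selector that matches only messages
// whose target endpoint is known locally. The property name
// "targetEndpoint" is an assumption, not the cluster engine's own name.
final class LocalEndpointSelector {

    static String build(List<String> localEndpoints) {
        if (localEndpoints.isEmpty()) {
            // With no local endpoint there is nothing this node could consume.
            throw new IllegalArgumentException("no local endpoints");
        }
        return localEndpoints.stream()
                // String literals in JMS selectors use single quotes;
                // an embedded quote is escaped by doubling it.
                .map(name -> "'" + name.replace("'", "''") + "'")
                .collect(Collectors.joining(", ", "targetEndpoint IN (", ")"));
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("eip-routing-slip", "xslt-request")));
        // prints: targetEndpoint IN ('eip-routing-slip', 'xslt-request')
    }
}
```

The selector would have to be rebuilt (and the consumer recreated)
whenever an endpoint is registered or unregistered locally.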

Let's take a real example:
  HTTP consumer -> EIP routing slip -> (xslt transformation , HTTP
provider, xslt transformation)
This route defines a transformation on the request, sends it to another
web service, and transforms the response back.
If you configure the HTTP consumer to be "clustered", then it will
look somewhat like:
  HTTP -> JMS -> EIP
If you deploy this whole application on two instances (assuming the
underlying JMS brokers are connected together), an http request coming
to the first smx instance could be processed (eip and the remainder of
the flow) by another instance (load-balancing / some failover).   You
could also imagine deploying the HTTP consumer only on one instance,
and the rest of the application only on a second instance
(remoting).

If I compare with application server clustering (with EJB session 
replication, entity turns, etc), when you set up the application server in 
cluster mode, your application works in cluster mode (you can't choose whether 
it's cluster compliant or not). Using JBoss Groups, you can register 
applications in the cluster or not.

Maybe it would be interesting to let the user choose which endpoints are 
clustered, with something like a ClusterRegistration service 
that stores the set of clustered endpoints. That way, we can manage OSGi and JBI 
packaging, implicit clustering and manual clustering. But the user needs to 
define the clustered endpoints.

Not sure if we can really compare with app server clustering.  If you
want true clustering, you need to make sure the whole state can fail
over onto another node.  Unfortunately, most of the components
maintain some state.  For example, if an endpoint receives an exchange,
creates a copy of it to send to another endpoint, waits for the
response and then copies it as the response of the original exchange,
this original exchange is lost if smx crashes unless it is stored
somewhere.
The servicemix-eip component has a configurable storage that is used
to store all such data.  It defaults to a simple map without
persistence, but you can configure it to use a database if you want.
However, this degrades performance a lot.
Usually, the best way is to put a single cluster endpoint for a whole
flow at the beginning of this flow.  In our example above, clustering
the HTTP consumer should be mostly sufficient:  if something goes
wrong while processing the eip, the xslt transformations or the
call to the other web service, the transaction can be rolled back and
retried on another smx instance.  However, if the http client is
waiting for a response and the first smx instance (hosting the http
consumer) crashes, there's really nothing we can do ...
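
To make the storage trade-off concrete, here is a minimal sketch of a
pluggable store with an in-memory default.  The interface and class
names are hypothetical and are not the servicemix-eip API; a
database-backed variant would implement the same interface and pay for
durability with a round trip on every save and take.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a pluggable store for in-flight exchange state.
// The names are illustrative only and are not the servicemix-eip API.
final class InFlightState {

    interface Store {
        void save(String exchangeId, String state);

        // Returns and removes the saved state, or null if there is none.
        String take(String exchangeId);
    }

    // In-memory flavour: fast, but everything is lost if the JVM crashes.
    static final class MemoryStore implements Store {
        private final Map<String, String> data = new ConcurrentHashMap<>();

        @Override
        public void save(String exchangeId, String state) {
            data.put(exchangeId, state);
        }

        @Override
        public String take(String exchangeId) {
            return data.remove(exchangeId);
        }
    }

    public static void main(String[] args) {
        Store store = new MemoryStore();
        store.save("exchange-1", "waiting for provider response");
        System.out.println(store.take("exchange-1"));
        System.out.println(store.take("exchange-1")); // null: already taken
    }
}
```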

Anyway, an interesting option would be to have a more complex filter
that automatically detects exchanges that need to be clustered: those
for which, in a given flow, no exchange has gone through the cluster
engine yet.  It would automatically cluster all consumer endpoints, but
not subsequent exchanges.
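
One possible shape for such a filter, as a rough sketch: remember which
flows have already gone through the cluster engine and only match the
first exchange of each flow.  The "flowId" below stands for whatever
correlation id ties the exchanges of one flow together; that such an id
is available to the filter is an assumption, and all names are
hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: cluster only the first exchange seen for each flow.
// "flowId" is an assumed correlation id shared by all exchanges of a flow.
final class FirstInFlowFilter {
    private final Set<String> seenFlows = ConcurrentHashMap.newKeySet();

    // True only for the first exchange of a flow, which should be re-routed
    // to the cluster engine; later exchanges of the same flow stay local.
    boolean shouldCluster(String flowId) {
        return seenFlows.add(flowId);
    }

    // Forget a flow once it has completed so the set does not grow forever.
    void flowCompleted(String flowId) {
        seenFlows.remove(flowId);
    }

    public static void main(String[] args) {
        FirstInFlowFilter filter = new FirstInFlowFilter();
        System.out.println(filter.shouldCluster("flow-1")); // true
        System.out.println(filter.shouldCluster("flow-1")); // false
        System.out.println(filter.shouldCluster("flow-2")); // true
    }
}
```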

These are only some quick thoughts about SMX clustering. The topic is very interesting.

Regards
JB

 On Mon 09/02/09 09:28, "Guillaume Nodet" [email protected] wrote:

I've committed my ongoing work on servicemix 4 clustering at
https://svn.apache.org/repos/asf/servicemix/smx4/nmr/trunk/jbi/cluster/

It's not in the build yet; I committed it for discussion purposes.

This work has two goals:
* provide some persistence in the JBI layer
* provide some transparent remoting between JBI endpoints

The way I've begun implementing that is to use an ExchangeListener in
the NMR to re-route exchanges to a cluster endpoint (I guess it should
be renamed to something like "cluster engine" to avoid being confused
with "clustered endpoints").

The org.apache.servicemix.cluster.requestor package derives from the
spring message listener container and implements a jms layer which is
able to provide request / response in an asynchronous way.

I've experimented with different things, and the one I've been focusing
on lately is to use a single JMS queue and selectors.  Let me explain a
bit.

The JMS flow in servicemix 3 was using lots of different destinations
(one per container + one per endpoint + one per service qname + one
per interface qname).  The problem with such a design is that a jms
consumer can easily consume only from one destination (unless we use
some specific activemq features).  Another problem is that if not
using activemq, setting up lots of JMS destinations can be really
tedious.  The use of a single destination leads to fewer consumers, at
the expense of using jms selectors.  Previously, I've tried to use two
queues (one for requests and another one for responses) but there's no
real benefit in doing this imho.

The other thing I've been focusing on is to make sure that processing
a jms message does not block a thread, while still being able to use
jms or xa transactions.  This is not so easy.  For example, the spring
jms listener containers use a thread for consuming the jms message and
processing it, expecting the processing to happen synchronously.
However, in servicemix synchronous processing is a bad idea: if it
involves sending an http request and waiting for a response, this
means blocking a thread for nothing.

For scalability, we need to avoid blocking threads if possible.  But
spring message listener containers only support synchronous
processing, so I've hacked two new containers, one being JMS
compliant, and another one specific to ActiveMQ which performs much
better.  It uses a MessageAvailableListener to be notified when
consumers have messages to be processed instead of wasting threads to
poll actively for messages.

Anyway, both containers can support client ack, jms local transactions
or xa transactions in asynchronous mode.
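
The idea of acknowledging without blocking the consuming thread can be
sketched with plain Java futures.  This is only an illustration: the
class below is hypothetical, it does not use the JMS API, and it
ignores the threading rules that real JMS sessions impose on
acknowledgement and commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: the consuming thread hands a message off and returns
// at once; the acknowledgement (or rollback) runs when processing completes.
final class AsyncConsumerSketch {
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    // Registers a completion callback and returns immediately. "processing"
    // completes later, for example when an http response arrives on another
    // thread.
    void onMessage(String message, CompletableFuture<String> processing) {
        log.add("received " + message);
        processing.whenComplete((result, error) -> {
            if (error == null) {
                log.add("ack " + message);      // acknowledge or commit
            } else {
                log.add("rollback " + message); // message gets redelivered
            }
        });
    }

    public static void main(String[] args) {
        AsyncConsumerSketch consumer = new AsyncConsumerSketch();
        CompletableFuture<String> pending = new CompletableFuture<>();
        consumer.onMessage("m1", pending); // returns without blocking
        pending.complete("response");      // later: processing finishes
        System.out.println(consumer.log);  // [received m1, ack m1]
    }
}
```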

I haven't really worked on how to register such endpoints (from the
user point of view).  At the moment, we need to register a
ClusterRegistration in the OSGi registry.  Such registrations contain
a filter that will be used to decide if a new active / consumer
exchange should be re-routed to the cluster engine or not.

The simplest filter would be a filter that checks the source endpoint
and will thus cluster all exchanges outgoing from a given endpoint.
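
Such a source-endpoint filter could look roughly like the sketch below.
The Exchange class here is a stand-in that only carries the name of the
source endpoint; it is not the NMR Exchange type, and the filter is not
the ClusterRegistration interface from the committed code.

```java
import java.util.function.Predicate;

// Hypothetical sketch of the simplest filter: match every exchange that was
// sent by one given endpoint. All types here are stand-ins, not NMR classes.
final class SourceEndpointFilter {

    // Minimal view of an exchange: only the name of its source endpoint.
    static final class Exchange {
        final String sourceEndpoint;

        Exchange(String sourceEndpoint) {
            this.sourceEndpoint = sourceEndpoint;
        }
    }

    // Returns a filter matching exchanges that originate from endpointName.
    static Predicate<Exchange> fromEndpoint(String endpointName) {
        return exchange -> endpointName.equals(exchange.sourceEndpoint);
    }

    public static void main(String[] args) {
        Predicate<Exchange> clustered = fromEndpoint("http-consumer");
        System.out.println(clustered.test(new Exchange("http-consumer"))); // true
        System.out.println(clustered.test(new Exchange("xslt-request")));  // false
    }
}
```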

As for how to register such objects, one way would be to put that on
the endpoint exporter that is used in smx4 to register jbi endpoints
with the OSGi packaging, but this would not work with JBI packaging
(such registrations would have to be deployed in a separate osgi
bundle).  I was also thinking about adding a simple boolean property
on all endpoints, something like clustered="true".

Sorry for the long rant, but I should have sent this email way earlier ...

Feedback welcome.

--
Cheers,
Guillaume Nodet
------------------------
Blog: http://gnodet.blogspot.com/

------------------------

Open Source SOA
http://fusesource.com
