Re: jabberd2 in cluster? ideas, proof of concept and questions...

Tomasz Sterna Wed, 29 Aug 2012 06:14:02 -0700

On Wed, 2012-08-29 at 08:08 +0200, Alexandre Jousset wrote:

> I wondered if it was possible to run it in a cluster.
> The answer is clearly "no", as far as I understood it.
[...]
> I decided to think about what would be needed to "clusterize" jabberd2. The 
> result of my thoughts can be seen on a graph here: 
> http://dbx.gtmp.org/jabberd2-cluster.png


Clustering is a setup where several instances of the same object appear
and work like a single object, distributing the load.

jabberd2 is able to cluster c2s, s2s and, recently, sm components.


What you are referring to is building a mesh of independent objects
working cooperatively to perform a task.

This idea has been discussed since the very beginning of jabberd2.
See e.g. http://codex.xiaoka.com/wiki/jabberd2:oldtodo#global - Router mesh


The problem with using JID hashing is that the number of components you
distribute load to has to be constant.
When you add a component to the sm cluster or remove one, you disturb
(almost) every sm connection, since most JIDs then hash to a different
instance.
In the typical case, when you have a fixed number of sm instances
handling a domain, that is acceptable, but in a distributed, dynamic
router mesh it is not.
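
To illustrate the point, here is a minimal sketch of hash-modulo
distribution. It is not jabberd2 code: the message does not say which
hash the router uses, so SHA-1 and the names pick_component and
moved_fraction are assumptions made up for this example. It counts how
many JIDs land on a different sm instance when the instance count grows
from 3 to 4.

```python
import hashlib


def pick_component(jid: str, n_components: int) -> int:
    """Map a JID to a component index: stable hash modulo the count.

    SHA-1 is an assumption for illustration, not jabberd2's actual hash.
    """
    digest = hashlib.sha1(jid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_components


def moved_fraction(jids, before: int, after: int) -> float:
    """Fraction of JIDs whose component changes when the count changes."""
    moved = sum(
        1 for jid in jids
        if pick_component(jid, before) != pick_component(jid, after)
    )
    return moved / len(jids)


# Hypothetical user population, only for the demonstration.
jids = [f"user{i}@example.com" for i in range(10000)]
fraction = moved_fraction(jids, 3, 4)
print(f"{fraction:.0%} of JIDs moved when going from 3 to 4 instances")
```

With a uniform hash a JID keeps its instance only when the hash gives
the same remainder modulo 3 and modulo 4, which happens for about a
quarter of the values, so roughly three quarters of the sessions would
be remapped.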


> About the DB, with my method the best is still to host it on a separate 
> server,

Shared storage is basically a requirement for any distributed setup.
You may use a separate one (a (No)SQL DB, NFS etc.) or implement it
yourself in-house. I would prefer not to implement a
router-mesh-distributed-storage myself, but it may be a fun project for
someone who is into these things. :-)


> As having ideas is not enough, I started an implementation of it. Everything 
> seems to run OK with the simple tests I've made with 3 hosts.

Good for you!
We lack people willing to actually bang out some code. :-)



> My first question is: is this idea good?

It's not bad, but keep on reading. ;-)

> I know I should have asked this question before starting to implement it,

Not really, as you explained yourself in 1) and 2) (most importantly 2).


> [...]maintainers like me to give them the patches for review? As a pull 
> request on github?

For bugfixes do not hesitate to create pull-requests.
It's a good way of submitting code.

For review, just point us to the GitHub branch where you committed the
proposed code, and we'll review it. You can create a pull request later.


 --- 8< --- >8 ---


Now, for another mesh concept.
My idea is to implement it using dynamically adapting routing.


Currently the router routes stuff based on domain only (e.g.
example.com).
This is good when only one component connection handles stuff for the
domain.
But what happens when there is more than one connection handling a
domain name? We need to somehow decide which connection should handle
the packet. Right now we use a simple hack: we decide based on a hash
of the JID.

Consider the following scenario:
- we have a router up
- component1 connects and wants to bind the 'example.com' name with
priority 0
- as there is no one handling 'example.com' yet, the router registers
component1 as handling packets destined to it and accepts the bind
- packet destined for '[email protected]' arrives
- router pushes it to component1 connection
  [ so far everything as usual ]
- component2 connects and wants to bind 'example.com' with priority 0
- router sees that component1 has already bound 'example.com', so it:
  - registers component2 as handling 'example.com'
  - sends component1 a request to also bind bare-JIDs
    from now on
  - sends component2 a request to also bind bare-JIDs
    from now on
- component1 binds connections for '[email protected]' and
'[email protected]' with priority 0 as it has these sessions online
- component2 does not bind any bare-JIDs as it does not have any
sessions online yet
- packet for '[email protected]' arrives
- router pushes the packet to component1 as it knows that it bound
'[email protected]'
- packet destined for '[email protected]' arrives
- router picks component1 or component2 at random (as both have the
same, highest priority) and pushes the packet
- component3 connects and wants to bind 'example.com' with priority 0
- router accepts component3 and notifies it that it needs to do bare-JID
binds
- component3 requests '[email protected]' bind with priority 0
- router detects that '[email protected]' is also bound on component1
  - router notifies component1 that it needs to bind full-JIDs
    from now on
  - router notifies component3 that it needs to bind full-JIDs
    from now on
- component1 binds '[email protected]/Home' with priority 1
- component3 binds '[email protected]/Work' with priority 100
- packet destined for '[email protected]' arrives
- router pushes the packet to component3
- packet destined for '[email protected]/Home' arrives
- router pushes the packet to component1
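
The lookup side of the scenario above can be sketched roughly as
follows. This is only a toy model, not router code: the Router class,
its bind and route methods, and the JIDs alice, bob and carol are
hypothetical stand-ins chosen for the example. It assumes that the most
specific bound name wins, that a bare-JID packet goes to the highest
priority full-JID bind when one exists, and it leaves out the
"please start binding bare/full JIDs" notifications entirely.

```python
import random


class Router:
    """Toy routing table: most specific bind wins, then highest priority."""

    def __init__(self, rng=None):
        self.binds = {}  # bound name -> {component name: priority}
        self.rng = rng or random.Random()

    def bind(self, component, name, priority=0):
        """Register component as handling name (domain, bare or full JID)."""
        self.binds.setdefault(name, {})[component] = priority

    def _best(self, entries):
        """Pick a component with the top priority; ties are broken randomly."""
        top = max(entries.values())
        return self.rng.choice(sorted(c for c, p in entries.items() if p == top))

    def route(self, jid):
        bare, _, resource = jid.partition("/")
        domain = bare.rpartition("@")[2]
        # 1. An exact full-JID bind is the most specific match.
        if resource and jid in self.binds:
            return self._best(self.binds[jid])
        # 2. Otherwise consider all full-JID binds below the bare JID.
        under = {}
        for name, entries in self.binds.items():
            if name.startswith(bare + "/"):
                for comp, prio in entries.items():
                    under[comp] = max(prio, under.get(comp, prio))
        if under:
            return self._best(under)
        # 3. Then bare-JID binds, and finally 4. plain domain binds.
        for name in (bare, domain):
            if name in self.binds:
                return self._best(self.binds[name])
        return None  # nobody handles this destination


router = Router()
router.bind("component1", "example.com")
router.bind("component2", "example.com")
router.bind("component1", "alice@example.com")
router.bind("component1", "bob@example.com")
print(router.route("bob@example.com"))         # component1
router.bind("component3", "example.com")
router.bind("component3", "alice@example.com")
router.bind("component1", "alice@example.com/Home", 1)
router.bind("component3", "alice@example.com/Work", 100)
print(router.route("alice@example.com"))       # component3
print(router.route("alice@example.com/Home"))  # component1
```

A JID nobody has bound, such as carol@example.com here, falls through
to the domain level and goes to one of the three components at random.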

How does this work in a router mesh?
Simple:
- require unique component names
- share the routing tree


What do you think about this idea?

It should be easy to implement with minimal changes to the router code,
as a lot of preparation was done while implementing the current
clustering. Just replace the clustering tables with trees and implement
the dynamic routing adaptation.

Then the hard part - routing information sharing between interconnected
routers and conflict resolution. ;-)


-- 
Tomasz Sterna
Instant Messaging Consultant : Open Source Developer
http://tomasz.sterna.tv/  http://www.xiaoka.com/portfolio
