Hi all,
I tested jabberd2 and, like a lot of people, I wondered whether it was
possible to run it in a cluster.
As far as I understand, the answer is clearly "no".
I liked the way jabberd2 was implemented and, after roughly testing other
technologies, I decided to think about what would be needed to "clusterize"
jabberd2. The result of my thoughts can be seen on a graph here:
http://dbx.gtmp.org/jabberd2-cluster.png
On this drawing, there is a dbsession and a dbauth on each host. This
has some implications (see below).
Roughly, the idea is to set up each host like an independent host,
except that the routers are interconnected. Each router then needs a way to
know which host the recipient is on. My idea is to reuse the JID hash
algorithm that already decides which sm handles the recipient (in the
non-clustered jabberd2 implementation).
Also, I automatically add a (numerical) resource to each component's "from" (which
has none) and "to" (if there isn't one already) when it sends a message to the router,
and I use that resource to route the messages to the right cluster node.
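To make the routing idea concrete, here is a minimal sketch in C. It is not taken from my patches: the hash is a stand-in (32-bit FNV-1a) because the point is only the "hash modulo number of nodes" step, and the names jid_hash, node_for_jid and tag_component_addr, as well as the "sm/2" address form, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in hash (32-bit FNV-1a). ASSUMPTION: jabberd2's real JID hash
 * is a different function; only the modulo step below matters here. */
static uint32_t jid_hash(const char *bare_jid)
{
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)bare_jid; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

/* Map a bare JID to a cluster node index in [0, n_nodes). */
static uint32_t node_for_jid(const char *bare_jid, uint32_t n_nodes)
{
    assert(n_nodes > 0);
    return jid_hash(bare_jid) % n_nodes;
}

/* Append a numeric node resource to a component address that has none;
 * an address that already carries a resource is copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int tag_component_addr(const char *addr, uint32_t node,
                              char *out, size_t outlen)
{
    int n;
    if (strchr(addr, '/') != NULL)
        n = snprintf(out, outlen, "%s", addr);
    else
        n = snprintf(out, outlen, "%s/%u", addr, (unsigned int)node);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With three nodes, node_for_jid("alice@example.org", 3) always yields the same index in 0..2 on every router, and tag_component_addr("sm", 2, ...) produces "sm/2", which the router can then match against the node number.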
About the DB: with my method, the best option is still to host it on a separate
server. A local DB on each server is possible, but then the user list must be
split across the DBs using the same hash algorithm.
Worse, if one adds or removes a node, the DB must be split again across hosts.
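To give a feeling for how expensive that re-split is, here is a small sketch that counts how many users change node when the node count changes. It assumes users are placed with a plain "hash modulo node count" rule; count_moved is a hypothetical helper, and it takes precomputed hash values so it does not depend on any particular hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the users whose node changes when the cluster goes from n_old
 * to n_new nodes, assuming placement is "hash % node count". */
static size_t count_moved(const uint32_t *hashes, size_t count,
                          uint32_t n_old, uint32_t n_new)
{
    size_t moved = 0;
    assert(n_old > 0 && n_new > 0);
    for (size_t i = 0; i < count; i++) {
        if (hashes[i] % n_old != hashes[i] % n_new)
            moved++;
    }
    return moved;
}
```

For twelve users with hash values 0 to 11, going from 3 to 4 nodes moves 9 of them, so with local DBs most of the user data would have to be migrated on every resize.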
As having ideas is not enough, I started an implementation of it.
Everything seems to run OK with the simple tests I've made with 3 hosts. I
haven't modified s2s yet, but the idea for it is to, again, use the hash algo
to know to which router to send the message. So the s2s component should be,
like the routers, connected to all routers (as their default route).
My first question is: is this idea good? I know I should have asked
this question before starting to implement it, but 1) I was not sure I would be able
to implement it and 2) I learned the source code this way, and if better
ideas show up I would volunteer to implement them.
Second question: if the answer to the first question is either "yes" or
"maybe, let's see", how would the authors / maintainers like me to submit the patches
for review? As a pull request on GitHub? (I forked the repository as user mid1221213 but I haven't
pushed the patches yet), ...? I still have to do some further tests (and maybe a stress test)
and to verify / clean up the code before I can send patches.
About my patches: I took care to #ifdef them (using a --enable-cluster
configure option) so that jabberd2's behavior is unchanged when the option is off,
in case this feature is included but considered experimental. In any case, clustering
adds some overhead, so a one-node cluster would be a bad idea as well.
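The guarding pattern looks roughly like the sketch below. The macro name ENABLE_CLUSTER and the function route_target are hypothetical; the actual symbol defined by the configure option may be named differently.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: --enable-cluster makes configure define ENABLE_CLUSTER.
 * Without it, the function compiles down to the non-clustered behavior. */
static uint32_t route_target(uint32_t jid_hash, uint32_t n_nodes,
                             uint32_t local_node)
{
#ifdef ENABLE_CLUSTER
    /* Clustered build: pick the node from the JID hash. */
    (void)local_node;
    assert(n_nodes > 0);
    return jid_hash % n_nodes;
#else
    /* Default build: everything stays on the local node. */
    (void)jid_hash;
    (void)n_nodes;
    return local_node;
#endif
}
```

Keeping both branches behind one function means callers do not need their own #ifdef blocks.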
A requirement for the cluster: all hosts have to use the same arch
because of the hash algorithm.
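As an illustration of how a string hash can become arch-dependent, the sketch below shows one common cause: whether plain char is signed or unsigned differs between platforms, and that changes the result for non-ASCII bytes. This is a made-up multiply-by-31 hash for demonstration, not the jabberd2 hash, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hash as computed where bytes are read as signed values. */
static uint32_t hash_signed_bytes(const char *s)
{
    uint32_t h = 0;
    assert(s != NULL);
    for (const signed char *p = (const signed char *)s; *p; p++)
        h = h * 31u + (uint32_t)(int32_t)*p;   /* bytes >= 0x80 sign-extend */
    return h;
}

/* Same hash where bytes are read as unsigned values. */
static uint32_t hash_unsigned_bytes(const char *s)
{
    uint32_t h = 0;
    assert(s != NULL);
    for (const unsigned char *p = (const unsigned char *)s; *p; p++)
        h = h * 31u + (uint32_t)*p;
    return h;
}
```

The two functions agree on pure ASCII input such as "abc" but disagree on a UTF-8 string such as "\xC3\xA9", so two nodes built with different char signedness would route a non-ASCII JID to different hosts. Reading bytes through an explicit unsigned char pointer and using fixed-width integer types avoids this particular problem.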
A serious drawback of my implementation: c2s sends a "see-other-host"
message to redirect clients to the right host. However, in my tests, neither Pidgin nor
Empathy seems to use it to reconnect. Pidgin even translates the message into
French (my locale) when displaying it! I don't know what the right way to do this
redirection is.
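For reference, RFC 6120 defines see-other-host as a stream-level error whose text content is the host (optionally with a port) to reconnect to. Below is a minimal sketch of formatting that element; format_see_other_host is a hypothetical helper, not the code in c2s, and it only handles plain hostnames or IPv4 addresses (an IPv6 literal would have to be passed in already bracketed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format a <see-other-host/> stream error pointing at host:port.
 * Returns 0 on success, -1 if the host contains XML-special characters
 * or the output buffer is too small. */
static int format_see_other_host(const char *host, unsigned int port,
                                 char *out, size_t outlen)
{
    int n;
    assert(host != NULL && out != NULL);
    if (host[0] == '\0' || strpbrk(host, "<>&'\"") != NULL)
        return -1;
    n = snprintf(out, outlen,
                 "<stream:error>"
                 "<see-other-host xmlns='urn:ietf:params:xml:ns:xmpp-streams'>"
                 "%s:%u"
                 "</see-other-host>"
                 "</stream:error>",
                 host, port);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The server then closes the stream, and it is up to the client to open a new connection to the indicated host, which is the step Pidgin and Empathy do not seem to perform.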
I hope my code will be useful; either way, it was great fun to
write it :-)
--
-- \^/ --
-- -/ O \--------------------------------------- --
-- | |/ \| Alexandre (Midnite) Jousset | --
-- -|___|--------------------------------------- --