Re: Stepping down

2018-01-23 Thread Alexandre Jousset

On 21/01/2018 at 16:24, Tomasz Sterna wrote:

On Sun, 21.01.2018 at 15:01 +0100, Alexandre
Jousset wrote:


[...]



I don't know if I'm skilled enough but instead of letting it
die, I would like to become the maintainer if nobody with better
skills wants to :-)


Judging by your contributions to jabberd2, I see no problem in passing
the project to you.


Thanks for this :-)

I suggest, of course, giving other people some time to volunteer if 
they wish to.

Some said they could host the jabberd2.org website or the mailing list, 
and that may help another maintainer, but in the case I become the new 
maintainer I think I could do that too.

However, note that I only have access to Linux boxes, so I may need 
help from someone else with the ports to other OSes. That may be a problem.


BTW I was recently doing some load test and having thought
about solving the SPOF of the router process, [...]


We already had a _lengthy_ discussion on list on my vision of how to
multiply the router:
https://www.mail-archive.com/jabberd2@lists.xiaoka.com/msg01909.html

Your work still lives in:
https://github.com/jabberd2/jabberd2/tree/mesh


I know :-)

My concern is that the work I've done is complex and not thoroughly 
tested. That does not mean it should be abandoned, but maybe a shorter term 
solution could be found just to remove the SPOF (without necessarily 
implementing the whole router mesh). And as I said at that time, I'm not at all 
happy with the routing graph discovery feature I implemented.

I see 3 approaches here:
- a heartbeat / failover solution, for cases where the single router 
(which could be switched without disconnecting users) is not a bottleneck?
- a simplification of my work to get a router mesh less dynamic but easier to 
implement? I haven't thought a lot about this yet but I checked out the "mesh" 
branch again on my computer and I'm currently digging into it to see how to simplify it. 
Any advice welcome :-)
- work fast and hard to simplify / finish the mesh branch code :-)

I also experimented with a "multi-router" setup, which works great, but 
it requires all the c2s's and all the sm's to be connected to all routers at the same time 
in order for them to work as expected, thus not allowing a failover setup, just a kind of 
multithread (actually multiprocess / multihost) setup.


But my latest approach was to ditch the router component in favor of
a message bus (using 0MQ). See discussion at
https://gitter.im/jabberd2/jabberd2?at=56b8b4e9939ffd5d15f671e1

This is what https://github.com/jabberd2/jabberd2/commits/ashnazg
branch implemented and jabberd3 code (which was born of ashnazg branch) was 
going for.


Just a question about this, you are talking about 0MQ, but the source 
points to nanomsg...?

Anyway, I think it is indeed a better solution and it looks promising, and this 
is one of the reasons I think a shorter-term / simpler solution should be found for the 
router / SPOF in j2. I mean, priority should be given to j3 (while of course 
continuing to maintain j2 "normally").

And about j3, I'm afraid I didn't fully understand what you said, I'm 
sorry. You said in your first message that you are going away from all XMPP 
work, and that you're opening the source of j3 (and another project), but you 
only say you're stepping down from the j2 lead. Just to be clear, is it the 
same for the other projects?



The rest of this post is about other topics I wanted to raise with / ask 
the list.

I started to implement a Redis storage backend based on the BDB one. It 
is still experimental but I get good performance with it, especially with 
millions of users in my tests (see below). I'd like to know if people would be 
interested in it? I made the assumption (based on some comparison pages found 
on the web) that BDB doesn't scale well with a lot of users.

I also started to make some load tests, using a home made XMPP tester (I tried 
Tsung but I'm not at ease with it nor with Erlang to customize it) on a 
"few-nodes-setup", and I managed to connect up to 5M simultaneous users using a 
simple scenario (connect / send message randomly from time to time but not getting roster 
nor presences for the moment). About this my questions to the list are the following:
- What is your biggest known j2 installation in terms of account number 
and simultaneously connected users?
- Same question in a test environment?
- Do you know about an XMPP stress-tester (apart from Tsung) that is 
able to connect millions of users? My searches only led me to Tsung in that 
category.

Thanks,
--
--  \^/--
---/ O \-----
--   | |/ \|  Alexandre (Midnite) Jousset  |   --
---|___|-----




Re: Clustered Jabberd2

2013-05-21 Thread Alexandre Jousset

Hi Sylvain,

On 17/05/2013 13:41, Sylvain Guglielmi wrote:

I've been reading the mailing-list archives (especially this thread 
http://www.mail-archive.com/jabberd2@lists.xiaoka.com/msg01908.html ), and I 
think the discussed changes would suit my needs pretty well. The feature I dig 
most is allowing multiple routers and Session Managers (SM) on a single domain 
name to make jabberd2 service more resilient/easily scalable. I think that 
works fine with multiple c2s already (I'll try this in the upcoming days).

- I think the branch made by Alexandre Jousset has not been merged in the 
master. Am I right on this ?


You're right. Unfortunately I had to stop working on this for a while, 
but I'll publish my (unfinished) changes ASAP and I may be able to finish them 
soon.


- If not, is there any plan to do it, or to implement similar features ?

Also, when having multiple SMs, I guess there are two main ways to do it:
- each SM only hosts the data of a subset of the users (if one SM goes down, 
some people will have issues).


This is the way it is currently done. And it is also the way, albeit a different one, it 
will be done in my implementation. In my implementation the advantage is that 
if one SM node is down, new users (including users that have been disconnected) 
will be able to connect using the other SMs.


or
- every user's session data is on every SM (more reliable, more storage 
required, and I guess harder to implement to guarantee one user cannot open 
two sessions on different SMs at once/would require the admin to set up 
replication for the databases of each SM... etc... ).
Both views are valid I guess. I still don't get which approach was most 
favoured or was going to be tried.


This one would be tricky I think...


I'm still not very at ease with the jabberd2 code. I've just started writing my own 
highly customisable roster module to plug it on already-existing roster 
databases. But I'd gladly help with that if needed.

Thanks.

@Alexandre Jousset : if you're still living near Paris, I would gladly invite 
you to discuss this around a drink. ^_^. Anyone else in the area ?


Yes I live near Paris and I work in Paris. I'm OK to have a drink 
sometime somewhere with you ;-) Contact me in private for this :-)




Router mesh, some news

2012-11-22 Thread Alexandre Jousset

Hi Tomasz, hi all,

FYI I'm still working (when possible, mainly in my spare time) on this 
feature, even if I'm quite silent.

I could say that I'm at ~70% of code modifications, the remaining 30% 
are the remote routers management, i.e. connections / disconnections / 
reconnections / (un)binding propagation.

For the moment there are 384 '+' and 269 '-' in the diffs ;-)

When I have something that can be shown and which is a bit tested I'll 
push it on my repo on github and I'll send a message here explaining what I've 
done.




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-26 Thread Alexandre Jousset

On 18/10/2012 at 16:42, Tomasz Sterna wrote:

On Thu, 2012-10-18 at 16:12 +0200, Alexandre Jousset wrote:

What if you do not manage all the routers in the mesh?
And you were given a password to access only one or two routers of the
mesh?


I think it is pretty unusual for the admin not to have access to all 
routers (at least all routers managing the same domains). I'm sure there could 
be cases and this would add a lot of flexibility, but see below for the 
drawbacks.


I've been building collaborative mesh networks (ircd, eggdrop) a lot.
Believe me, the situations when you have just an entry point to the
network are not that rare.

Besides, it goes along the philosophy of jabberd2.
router-users.xml is there for a reason.
If it was assumed one administrator controls the whole components
network, there would be no need for separate users.



The problem with the multi-hop proposal is that you have to manage cases where 
there are cyclic connections, e.g. A = B = C = A


What exactly is the problem with cyclic connections?



A solution may be to add the ID of the component binding the domain / 
bare JID to the bound route, and to check if that combination is already bound, 
but this will increase CPU usage and the data structure sizes.


TTL/distance would be enough.
This does not increase data structures that much and CPU use is
negligible - you have to choose the route anyway.
Premature optimisation is the root of all evil. - let's concentrate on
the design first.

Now that I think of it, implementing distance would be beneficial
anyway, as we could mark routers on slow connections as less preferable.
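
As a rough illustration of the TTL/distance idea quoted above (this is not code from jabberd2; `Router`, `advertise` and `MAX_DISTANCE` are hypothetical names, and Python is used only for brevity), each router could keep the shortest known distance per bound name and drop any advertisement that does not improve on it. That alone stops a bind from circulating forever in a cycle such as A = B = C = A:

```python
MAX_DISTANCE = 8  # assumed hop limit (TTL); not a jabberd2 setting


class Router:
    """Toy router that learns binds from its neighbours (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []
        self.routes = {}  # bound name -> (distance, name of the next hop)

    def link(self, other):
        # Connections are symmetric, as in "A = B".
        self.neighbours.append(other)
        other.neighbours.append(self)

    def advertise(self, bound_name, distance=0, via=None):
        """Record a bind and forward it, unless it is no improvement."""
        known = self.routes.get(bound_name)
        if distance > MAX_DISTANCE or (known and known[0] <= distance):
            return  # longer or equal path: this is what breaks the cycle
        self.routes[bound_name] = (distance, via)
        for n in self.neighbours:
            if n.name != via:
                n.advertise(bound_name, distance + 1, via=self.name)


a, b, c = Router("A"), Router("B"), Router("C")
a.link(b)
b.link(c)
c.link(a)                    # cyclic mesh A = B = C = A
a.advertise("example.com")   # the domain is bound locally on A
```

The stored distance could also serve as the "less preferable" marker for routers on slow connections, for instance by adding a per-link cost instead of a flat 1.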



For the moment I have 2 hash tables (finally I differentiated them), one for 
domains where we don't really care about the size of the values, and one for bare JID 
bindings where the value is just the component_t. This component_t can be the local 
component for local connections, or another router's connection for remote ones. So we would 
have to add a char * malloc'ed, strcpy'ed, etc., for each new connection in 
bare JID binding mode. So this would add CPU and memory consumption just for multi-hop 
support. It's a choice to make; if you really want me to do this I'll do it, but I'm a bit 
against that solution.


I'm sorry, I don't understand.
Give me a for-instance.


Forget about all this ;-) As I was writing a response to this post I 
understood that my problem was an implementation issue indeed, and I found a 
solution for it ;-)

Roughly, instead of storing a component_t as the bare JID hash table value I'll 
store a pointer to an element of a pool of component_t / component ID / 
distance associations.
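
One way to picture the pool described above, as a Python sketch rather than the C that jabberd2 is written in: the bare-JID table stores references to shared association objects, so no per-bind string copy is needed. `Association`, `AssociationPool` and the field names are illustrative assumptions, not jabberd2 identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One (connection, binding component ID, distance) triple."""
    connection: str    # stand-in for the C component_t
    component_id: str  # ID of the component that bound the name
    distance: int      # number of router hops needed to reach it


class AssociationPool:
    """Hands out one shared Association per distinct triple."""

    def __init__(self):
        self._items = {}

    def get(self, connection, component_id, distance):
        key = (connection, component_id, distance)
        # setdefault returns the pooled object if the triple already exists
        return self._items.setdefault(key, Association(*key))


pool = AssociationPool()
bare_jid_routes = {}  # bare JID -> shared Association

# Two users whose sessions live on the same remote SM share one pool entry.
bare_jid_routes["alice@example.com"] = pool.get("router-b", "sm2", 1)
bare_jid_routes["bob@example.com"] = pool.get("router-b", "sm2", 1)
```

With this layout the per-user cost stays at one reference, while the component ID and the distance are stored once per distinct route.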




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-18 Thread Alexandre Jousset

About this topic, I have some more comments and questions:

On 15/10/2012 at 02:22, Alexandre Jousset wrote:

On 12/10/2012 at 19:53, Tomasz Sterna wrote:

We do. In the simplest way to do it, routers don't forward other routers' 
binding requests. Of course it is possible to implement it to allow multi-hops, 
but I'm afraid this could lead to problems (and inefficiency) for no real gain 
(except simpler configuration). Of course it would be easier to list 
just one router of the mesh when adding a router, but I would prefer 
sacrificing this convenience in favor of efficiency. After all, the administrator 
knows everything about their server architecture. So when adding a router, the 
config file should list *all* other already running routers in the mesh.


What if you do not manage all the routers in the mesh?
And you were given a password to access only one or two routers of the
mesh?


I think it is pretty unusual for the admin not to have access to all 
routers (at least all routers managing the same domains). I'm sure there could 
be cases and this would add a lot of flexibility, but see below for the 
drawbacks.


In my proposal nothing stops you from making each router know all the
others to make it more efficient, but it shouldn't be _required_.


The problem with the multi-hop proposal is that you have to manage cases where 
there are cyclic connections, e.g. A = B = C = A

A solution may be to add the ID of the component binding the domain / 
bare JID to the bound route, and to check if that combination is already bound, 
but this will increase CPU usage and the data structure sizes.

For the moment I have 2 hash tables (finally I differentiated them), one for 
domains where we don't really care about the size of the values, and one for bare JID 
bindings where the value is just the component_t. This component_t can be the local 
component for local connections, or another router's connection for remote ones. So we would 
have to add a char * malloc'ed, strcpy'ed, etc., for each new connection in 
bare JID binding mode. So this would add CPU and memory consumption just for multi-hop 
support. It's a choice to make; if you really want me to do this I'll do it, but I'm a bit 
against that solution.

Of course, if you have a better solution... :-)




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-15 Thread Alexandre Jousset

On 15/10/2012 at 10:03, Tomasz Sterna wrote:

On Mon, 2012-10-15 at 02:22 +0200, Alexandre Jousset wrote:

 We talked earlier about weighted randomization instead of
priorities. With weighted randomization it is impossible to be sure
that a local component will be preferred, this is why I made an
implicit priority for local components, still using weighted random
between local components, or between remote components when needed.


Right.
But I still don't see a rationale, why local components are better than
remote ones?

Why should a local component be preferred just because the connection
happened to come from a local c2s?


Going to a remote component involves going through the local router, then 
through the remote router, then the remote component. It adds a hop + a (physical) 
network access.


 To do otherwise, we should use weighted random + priorities,
this would add more complexity and misunderstanding in the
configuration process.


I was thinking more of a binary "prefer local components" switch than
of reintroducing priorities.


Ok.




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-15 Thread Alexandre Jousset

On 15/10/2012 at 14:43, Tomasz Sterna wrote:

On Mon, 2012-10-15 at 12:15 +0200, Alexandre Jousset wrote:

But I still don't see a rationale, why local components are better than
remote ones?

Why should a local component be preferred just because the connection
happened to come from a local c2s?


 Going to a remote component involves going through the local
router, then through the remote router, then the remote component. It adds a
hop + a (physical) network access.


That's a technical detail that should not affect the load balancing
algorithm.

If I set two components, one with weight 1 and one with weight 2, the
second one should get twice as many requests as the first one, regardless
of where it is connected and where the requests are coming from.


You're right.

I think both behaviors make sense so I agree with you to add a binary 
switch in the configuration to tell the router whether or not it should distinguish 
between local and remote components. This would make, for example, the use of a 
(separate and protocol agnostic) load balancer in front of the cluster more 
efficient. But it's true that in that example the weighted randomization would 
be useless. But as it (the binary switch) is easy to implement, letting the 
admin choose is better.
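
A minimal sketch of how such a switch could combine with weighted randomization, assuming each component is described by a hypothetical `(name, weight, is_local)` tuple. The function name and the data layout are illustrative only and do not come from the jabberd2 sources:

```python
import random


def choose_component(components, prefer_local, rng=random):
    """Pick one component name by weighted random.

    components: list of (name, weight, is_local) tuples.
    prefer_local: the binary switch; when True, remote components are
    considered only if no local component is available.
    """
    candidates = components
    if prefer_local:
        local = [c for c in components if c[2]]
        if local:
            candidates = local
    names = [c[0] for c in candidates]
    weights = [c[1] for c in candidates]
    return rng.choices(names, weights=weights, k=1)[0]


components = [("sm-local", 1, True), ("sm-remote", 2, False)]
rng = random.Random(0)
# Switch off: weights alone decide, so sm-remote gets about 2/3 of the picks.
picks = [choose_component(components, prefer_local=False, rng=rng)
         for _ in range(3000)]
```

With the switch off, the weights behave as described in the quoted message (weight 2 gets about twice the requests of weight 1, wherever the component is connected). With the switch on, the same weights apply only among local components.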




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-15 Thread Alexandre Jousset

I'm replying to this older message to ask a question:

On 11/09/2012 at 13:35, Tomasz Sterna wrote:
[...]

Components have their own names.
Each component needs to be uniquely named.


Is it because components could previously have the same names that there is « 
switch(targets->rtype) » at router/router.c:502, and all the multi attribute, 
route_MULTI_TO and route_MULTI_FROM?

If not, I have misunderstood something about the purpose of these...?

If yes, I think I can remove that functionality completely...?




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-15 Thread Alexandre Jousset

On 15/10/2012 at 19:38, Tomasz Sterna wrote:

On Mon, 2012-10-15 at 18:29 +0200, Alexandre Jousset wrote:

 Is it because components could previously have the same names that
there is « switch(targets->rtype) » at router/router.c:502, and all
the multi attribute, route_MULTI_TO and route_MULTI_FROM?
 If not, I have misunderstood something about the purpose of
these...?


Take a look at the code to see how these are set. :-)

In short, these choose which attribute the router computes the hash on.
In the case of jabberd2 component protocol connections we have
route_MULTI_TO, which computes the hash of the stanza's 'to' attribute.
In the case of legacy component connections we compute the hash of the 'from'
attribute.

Legacy connections are used to connect transports, so we need to make
sure all your packets are directed to the same transport instance - so
we use the 'from' attribute (route_MULTI_FROM).


Ok, I was confused by the 'from' attribute determining which component the 
packet is sent to... I understand now.


jabberd2 component protocol connections are jabberd2 components like sm,
s2s and we need to make sure all stanzas sent to you are directed to the
same sm instance - thus we compute the hash of the 'to' attribute to select
the sm instance to route to.

So, you may remove this support from jabberd2 protocol connections, but
we need to keep route_MULTI_FROM behavior for legacy component
connections, as we cannot extend this protocol.


Ok, thanks for the explanation.
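
The hash-based selection explained above can be sketched as follows. This is only an illustration: `pick_instance` is a hypothetical name, `zlib.crc32` is a stand-in for whatever hash the router really uses, and hashing the attribute exactly as written (full JID rather than bare JID) is an assumption.

```python
import zlib


def pick_instance(stanza, instances, legacy):
    """Choose an instance by hashing one stanza attribute.

    legacy=True models a legacy component connection (hash of 'from',
    the route_MULTI_FROM case); legacy=False models a jabberd2 component
    connection (hash of 'to', the route_MULTI_TO case).
    """
    key = stanza["from"] if legacy else stanza["to"]
    # crc32 is only a stand-in for the real hash function.
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]


sms = ["sm1", "sm2", "sm3"]
a = {"from": "alice@example.com/home", "to": "bob@example.com"}
b = {"from": "carol@example.com/work", "to": "bob@example.com"}
# Both stanzas are addressed to the same JID, so they reach the same sm.
same_sm = pick_instance(a, sms, legacy=False) == pick_instance(b, sms, legacy=False)
```

The same function with `legacy=True` keeps all packets from one sender on a single transport instance, which is the behavior that has to stay for legacy component connections.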




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-10-14 Thread Alexandre Jousset

On 12/10/2012 at 19:53, Tomasz Sterna wrote:

We do. In the simplest way to do it, routers don't forward other 
routers' binding requests. Of course it is possible to implement it to allow 
multi-hops, but I'm afraid this could lead to problems (and inefficiency) for 
no real gain (except simpler configuration). Of course it would be easier to 
list just one router of the mesh when adding a router, but I would prefer 
sacrificing this convenience in favor of efficiency. After all, the administrator 
knows everything about their server architecture. So when adding a router, the 
config file should list *all* other already running routers in the mesh.


What if you do not manage all the routers in the mesh?
And you were given a password to access only one or two routers of the
mesh?

In my proposal nothing stops you from making each router know all the
others to make it more efficient, but it shouldn't be _required_.


Ok.


Also, in the pseudo-code I've written (and started to implement) I had 
to make a distinction between local components and remote routers, just for 
efficiency, to allow the use of a local component preferably before trying a 
remote one. So the local components have greater priorities than remote ones, 
and both are chosen with weighted random in their category. What do you think 
about this?


Explicit is better than implicit.
If you want local components to have higher priority - just say so in
the configuration file. But default should be that remote binds are as
equal as local ones.


We talked earlier about weighted randomization instead of priorities. 
With weighted randomization it is impossible to be sure that a local component 
will be preferred, this is why I made an implicit priority for local 
components, still using weighted random between local components, or between 
remote components when needed.

To do otherwise, we should use weighted random + priorities, this would 
add more complexity and misunderstanding in the configuration process.

But maybe I've misunderstood something?


Finally, I've added a routers.xml file (with a final 's', naming can 
be changed of course) so that it can be reloaded dynamically to change the router's 
connection settings if needed. What do you think about this? Do you think it could be necessary?


Seems reasonable and simple.
remote-routers.xml maybe?


Ok.




Re: Working around client bugs in server software

2012-10-12 Thread Alexandre Jousset

On 12/10/2012 at 10:42, Tomasz Sterna wrote:

There is a SMACK-324 [1] bug affecting a lot of Java client applications
(including most Android clients).

It would be trivial to work around it in jabberd2 codebase, but it just
doesn't feel right.
From a practical point of view: there is a trivial fix we can apply -
let's just do it and make our users happy.
And the current jabberd2 development philosophy is a stable server that
just works.

But there is a danger of:
- never fixing the original issue if we stop exposing it
- jabberd becoming an unmaintainable bag of patches for problems not in
jabberd

I would like to hear your opinions on how we should approach such issues.

[1] http://issues.igniterealtime.org/browse/SMACK-324


According to this link the bug is marked as resolved, fixed.

Do you think clients haven't upgraded and still have the bug?




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-18 Thread Alexandre Jousset

On 17/09/2012 at 17:50, Tomasz Sterna wrote:

 This case could be resolved if the router auto-binds the
user@domain route. There could still be problems if there is more
than one router. But that case (2+ routers auto-binding user@domain
at the same time) could be fixed by the conflict resolution we thought
of before, just by canceling all the binds for user@domain...?

 What do you think?


Brilliant idea!
Works for me.


Thanks :-)


I would just make it temporary and extend it to all routing levels.
Whenever the router makes a (random) decision to choose one of equal
binds to route to, it sticks to this decision for a predefined time.


Ok.

I'm going to change my pseudo-code according to our latest decisions 
and when everything is OK I'll start to write real code.

I'll keep you informed when there is something interesting and / 
or more questions.
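
The "stick to this decision for a predefined time" idea quoted above could look roughly like the sketch below. `StickyChooser`, `hold_seconds` and the explicit `now` argument are assumptions made for illustration (passing the time in keeps the sketch easy to test); nothing here is taken from the jabberd2 code.

```python
import random


class StickyChooser:
    """Remembers a random choice per key for `hold_seconds`."""

    def __init__(self, hold_seconds, rng=random):
        self.hold_seconds = hold_seconds
        self.rng = rng
        self._held = {}  # key -> (choice, expiry time)

    def choose(self, key, options, now):
        held = self._held.get(key)
        if held and now < held[1] and held[0] in options:
            return held[0]  # still within the hold period
        # No valid held choice: decide again and remember the result.
        choice = self.rng.choice(options)
        self._held[key] = (choice, now + self.hold_seconds)
        return choice
```

A held choice is dropped early if the chosen bind disappears from `options`, so a component that goes away is not kept as a target until the timer runs out.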




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-18 Thread Alexandre Jousset

On 19/09/2012 at 00:19, Tomasz Sterna wrote:

On Tue, 2012-09-18 at 21:50 +0200, Alexandre Jousset wrote:

About routing levels and the user@domain binding... With this
solution, there's no more domain-only level at the beginning, so each
SM should bind directly bare JIDs and domains (still with
auto-binding).

 Moreover, as the same SM will manage all user@domain, there
is no more full JID binding level either...

 This simplifies binding and routing (and all changes needed to
the code), as we only have to maintain 2 hash tables (preferably), one
for domains (with multiple routes/priorities) and one for bare JIDs
(with only 1 route and no priority, everything would be managed by the
SM)...


Do we need priorities in case of domain binds?
This would cause the sm with the highest priority to take all the load.

Maybe weights instead of priorities. This would help tune load balancing.


I thought it was what you meant previously, sorry. But you're right, ok 
for this.


 To sum it up, it is the end of the adaptive routing... And
this has an implication, it is that the router memory consumption will
be higher than before with single-router architecture... Then wouldn't
it be a good solution to --enable-multi-router at ./configure time? It
would add the burden of maintaining both binding/routing solutions
(hopefully only in the router) but it will make the changes invisible
to the currently deployed servers.


Not really.
You need bare-jid level binds only when there is more than one component
handling the domain. If there is only one, you can do domain based
routing without the need for user@domain binding.



This approach does not use more memory than current solution.


I was thinking about multiple SMs handling one domain. Maybe it would 
be a good thing to say this (only 1 SM per domain to use less memory) in the 
install / config guide after the changes.


I won't accept #ifdef'd implementation.
Even dynamic modules make releases difficult. Having switchable code is
even more PITA. There were some screw ups with libsubst and mio
implementations in the past.


Ok again :-) Sorry for insisting ;-)

Now I have a question about the router interconnections. In my proof 
of concept, each router had the IPs/ports of every other router (including itself, 
just to copy/paste this part of the config file), and each one connected to 
every other one in a client-server way, trying to reconnect roughly every X seconds 
(with a customizable X) in case of error / lost connection.

The client router component bound the domains it managed on the server, so when the 
server wanted to send a packet it used that bind information to route it. Each router was at the same time a client and a 
server, receiving packets from other routers on its client side and routing outgoing packets on its 
server side.

This architecture is far from perfect of course, but it has the 
advantage of being symmetric, avoiding problems about which router has to 
connect to which router and some other issues.

How do you see these interconnections? How should they be configured? What if 
(and how) one adds or removes a router/host from the cluster?

I'm thinking about something like editing the router.xml file on each 
host, or a specific routers.xml containing only this information that could 
be copied verbatim to each host, and sending a SIGsomething to reload it, but 
if you have a better idea, it is welcome :-)




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-18 Thread Alexandre Jousset

About this, and after reading/writing other posts in other parts of 
this thread:

On 17/09/2012 at 17:50, Tomasz Sterna wrote:

I would just make it temporary and extend it to all routing levels.
Whenever the router makes a (random) decision to choose one of equal
binds to route to, it sticks to this decision for a predefined time.


I don't see any need to stick with a (now weighted) random decision, other than 
in the user@domain auto-bind case.

According to the pseudo code I've just written there are 3 cases where 
the router makes a random decision:

1) to=ad...@example.com (with or without resource) or to=example.com, and no 
example.com domain bound = weighted random on default routes (whether we accept multiple default 
routes or not, and how, is another question)
2) to=ad...@example.com (with or without resource), no ad...@example.com bare JID bound and more 
than one component accepting example.com = weighted random on example.com routes + auto-bind 
ad...@example.com to the chosen route
3) to=example.com and more than one component accepting example.com = 
weighted random on example.com routes

Do you see any other case?
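
The three cases listed above can be sketched as one routing function. This is an illustrative reading of the list, not the pseudo-code mentioned in the message: the function names, the `(component, weight)` pairs and the example JIDs used with it are all hypothetical.

```python
import random


def split_jid(to):
    """Return (bare JID or None, domain) for a 'to' address."""
    bare = to.split("/", 1)[0]           # drop the resource, if any
    if "@" in bare:
        return bare, bare.split("@", 1)[1]
    return None, bare


def weighted_pick(routes, rng):
    """routes: list of (component, weight) pairs."""
    return rng.choices([r[0] for r in routes],
                       weights=[r[1] for r in routes])[0]


def route(to, domains, bare_jids, defaults, rng=random):
    bare, domain = split_jid(to)
    if bare in bare_jids:                # already bound: no random decision
        return bare_jids[bare]
    routes = domains.get(domain)
    if not routes:                       # case 1: domain not bound
        return weighted_pick(defaults, rng)
    target = weighted_pick(routes, rng)  # cases 2 and 3 (trivial if one route)
    if bare is not None and len(routes) > 1:
        bare_jids[bare] = target         # case 2: auto-bind the bare JID
    return target
```

Once case 2 has auto-bound a bare JID, later stanzas for that user skip the random step entirely, which is why only the domain-level decisions remain random.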




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-17 Thread Alexandre Jousset

On 17/09/2012 at 10:05, Tomasz Sterna wrote:

On Sun, 2012-09-16 at 23:06 +0200, Alexandre Jousset wrote:

 Err... Sorry again, but in case of delivery I've found that:
http://xmpp.org/rfcs/rfc3921.html#rules (see 11.1.4.1 for messages)...
This page was what I saw before posting my solution (I remember now).

 Anyway, it is for messages, where one can deliver them to
*all* with the same priority without ACK problems and there are other
treatments (see same URL) for other types of stanzas.

 Am I still wrong and mixing things or...? ;-)


I still don't see how this might work.

Could you give an example protocol flow?


The question was:

 But then - what happens if two resources of the same priority get
 connected to two different sm instances?

After reading the link I posted in my previous message, I don't see 
what could be the problem with this...?

The router should send the messages to both SMs where the resources 
with same priority are bound, they will know what to do with them.




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-17 Thread Alexandre Jousset

On 17/09/2012 at 11:02, Tomasz Sterna wrote:

On Mon, 2012-09-17 at 10:55 +0200, Alexandre Jousset wrote:


 The question was:

   But then - what happens if two resources of the same priority get
   connected to two different sm instances?

 After reading the link I posted in my previous message, I
don't see what could be the problem with this...?

 The router should send the messages to both SMs where the
resources with same priority are bound, they will know what to do with
them.


What about iq and/or presence?

- iq-get would get two responses then
- presence-subscribe could get both accept and deny, which to use?


This would cause problems only if the stanza is sent to user@domain. And in that case, 
the link I've posted (rfc3921) says that for IQs the server should reply with an error on behalf of the user, 
and that "For presence stanzas other than those of type "probe", the server MUST deliver the stanza 
to all available resources". I suppose that in the latter case the response includes the full 
JID...?
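
For reference, the RFC 3921 rules for stanzas addressed to user@domain with at least one available resource can be summarized in a small sketch. The function name, the `"<server>"` marker and the choice of delivering messages to all resources that tie on the highest priority (one of the options the RFC allows) are assumptions of this example; RFC 3921 lets the server answer an IQ with either a result or an error.

```python
def bare_jid_targets(kind, resources):
    """Decide who handles a stanza addressed to user@domain.

    kind: "message", "presence" (non-probe) or "iq".
    resources: dict mapping available resource name -> priority.
    Returns a list of resource names, or ["<server>"] when the server
    itself must answer on behalf of the user.
    """
    if kind == "iq":
        return ["<server>"]          # never forwarded to a resource
    if kind == "presence":
        return sorted(resources)     # every available resource
    # Messages: never deliver to negative priorities; on a tie, this sketch
    # takes the "deliver to all such resources" option.
    eligible = {r: p for r, p in resources.items() if p >= 0}
    if not eligible:
        return []                    # offline-style handling, not modelled
    top = max(eligible.values())
    return sorted(r for r, p in eligible.items() if p == top)
```

In a multi-SM setup the open question from the thread is who plays the role of "the server" for the IQ case when the user's resources are spread over several SMs.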




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-17 Thread Alexandre Jousset

On 17/09/2012 at 13:10, Tomasz Sterna wrote:

Again.
There is 'u...@example.com/foo' resource with priority 1 bound on sm1.
There is 'u...@example.com/bar' resource with priority 1 bound on sm2.

1. There is an incoming iq-get request for u...@example.com vCard.
  - it is being sent to sm1 and sm2
  - sm1 and sm2 answer on behalf of the user
  - querying user gets two responses


I see a possibility for this but it looks hackish...: the router looks into the stanzas 
when there is more than 1 possible recipient component (in the user@domain case). 
If it is an IQ, it generates the error itself. Or it passes the stanza to one of the 
components (at random), which will generate the error message...

I'm not happy with these ideas, though...

I'm still trying to think about a better solution.


2. Presence case
  - you're right. Presence packets are replicated to all resources, so
we're good here.


Ok.




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-17 Thread Alexandre Jousset

On 14/09/2012 at 16:08, Tomasz Sterna wrote:

Let's say that we won't allow several SM instances to handle resources of
the same user.
How? This needs a tiny modification of the C2S/SM protocol. Instead of sending
the user session creation request to the user domain, let's send it to
the user bare-JID.

This way if there is no session for u...@example.com bound on
example.com SMs, router will revert to routing by domain and will pick
up one SM at random. Then this SM will bind 'u...@example.com' name and
all subsequent user sessions will be created with this SM.
So there will be no need for communication between SMs.

(There is a possibility of race - handling several session creation
requests by router and pushing to several random SMs, before first one
binds user bare JID.)


This case could be resolved if the router auto-binds the user@domain route. There could 
still be problems if there is more than one router. But that case (2+ routers auto-binding user@domain at 
the same time) could be fixed by the conflict resolution we thought of before, just by canceling all the binds for 
user@domain...?
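As a rough illustration of the auto-bind idea for a single router, here is a Python sketch. The class, its method names and the `alice@example.com` address used when trying it are all hypothetical; the real router is C code and the real C2S/SM protocol is not modelled.

```python
import random

class Router:
    """Toy single-router model of bare-JID auto-binding."""

    def __init__(self, domain_sms, rng=None):
        self.domain_sms = domain_sms   # domain -> list of SM names bound to it
        self.bare_binds = {}           # bare JID -> SM name (auto-bound routes)
        self.rng = rng or random.Random()

    def route_session_create(self, bare_jid):
        """Return the SM that must handle a session creation request."""
        sm = self.bare_binds.get(bare_jid)
        if sm is None:
            # No bind yet: fall back to domain routing and pick one SM at random...
            domain = bare_jid.split("@", 1)[1]
            sm = self.rng.choice(self.domain_sms[domain])
            # ...and bind immediately, before the request is forwarded, so a
            # second request for the same user cannot reach another SM.
            self.bare_binds[bare_jid] = sm
        return sm

    def cancel_binds(self, bare_jid):
        """Conflict resolution: drop the bind for this bare JID."""
        self.bare_binds.pop(bare_jid, None)
```

Because the lookup and the bind happen in one step inside the router, the race between two session creation requests disappears for one router; with several routers, `cancel_binds` stands in for the conflict resolution mentioned above.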

What do you think?




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-16 Thread Alexandre Jousset

On 14/09/2012 21:17, Tomasz Sterna wrote:

There is nothing in XMPP about delivering to most recent resource.
I would like to stick to the specification :-)


Sorry, I mixed up the notions of binding and delivering :-(




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-16 Thread Alexandre Jousset

On 16/09/2012 21:54, Alexandre Jousset wrote:

On 14/09/2012 21:17, Tomasz Sterna wrote:

There is nothing in XMPP about delivering to most recent resource.
I would like to stick to the specification :-)


 Sorry, I mixed the notions of binding and delivering :-(


Err... Sorry again, but for delivery I've found this: 
http://xmpp.org/rfcs/rfc3921.html#rules (see 11.1.4.1 for messages)... This 
page is what I saw before posting my solution (I remember now).

Anyway, that rule is for messages, which can be delivered to *all* resources with the 
same priority without ACK problems, and there are other treatments (see same URL) 
for other types of stanzas.
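For reference, the bare-JID rules at that URL, as I read section 11.1 of RFC 3921, can be sketched like this in Python. The function name and return labels are hypothetical; presence probes are left out, and for equal-priority messages the sketch takes the "deliver to all such resources" option that the RFC allows.

```python
def deliver_to_bare_jid(stanza_kind, resources):
    """Apply the user@domain delivery rules to a set of available resources.

    stanza_kind: "message", "presence" (non-probe) or "iq".
    resources:   dict mapping resource name -> presence priority.
    Returns an (action, targets) tuple.
    """
    if stanza_kind == "iq":
        # The server replies on behalf of the user; no resource sees the IQ.
        return ("server_replies", [])
    if stanza_kind == "presence":
        # Non-probe presence goes to every available resource.
        return ("deliver", sorted(resources))
    if stanza_kind == "message":
        # Resources with a negative priority never receive bare-JID messages.
        eligible = {r: p for r, p in resources.items() if p >= 0}
        if not eligible:
            return ("no_resource", [])
        top = max(eligible.values())
        # Option taken here: deliver to all resources sharing the top priority.
        return ("deliver", sorted(r for r, p in eligible.items() if p == top))
    raise ValueError("unknown stanza kind: %r" % stanza_kind)
```

In a cluster, the IQ branch is the delicate one: if two SMs each act as "the server" for the same user, both will reply.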

Am I still wrong and mixing things or...? ;-)




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-14 Thread Alexandre Jousset

Hi,

On 13/09/2012 16:15, James Wilson wrote:

I've been watching this discussion unfold and thought I might contribute.


Thanks for your contribution.


Personally, I have not run a jabberd2 instance in a long time, but this 
question below:

On 13/09/2012, at 11:35 PM, Tomasz Sterna wrote:


On Thu, 2012-09-13 at 15:07 +0200, Alexandre Jousset wrote:



But then - what happens if two resources of the same priority get
connected to two different sm instances?


   *This* was my real question ;-)


I don't have answer.

Will have to think about it.


leads me to believe that they should act like so:

[...]


I don't know if this would work in practice, but this is one way I see the 
issue of the above question being resolved.


I don't know what Tomasz thinks about this, but I think it is quite 
complicated for just that case.

I was thinking about another idea: AFAIK the protocol says that in that case 
the message should either be duplicated (and we've seen previously that this may lead 
to problems with IQs and ACKs) or sent to one of the recipients based on the 
implementation's choice. Maybe we can just record the time when the session was 
started, add this information to each related bind request, keep it at router 
level in the hash table values' structure, and use it in that case. 
The message would then be delivered to the recipient with the most recently started 
session.
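A small Python sketch of that tie-break, assuming the bind request carries the session start time. The `Bind` record and its field names are hypothetical and only stand in for the values stored in the router's hash table.

```python
from dataclasses import dataclass

@dataclass
class Bind:
    component: str        # component (SM) that sent the bind request
    priority: int         # priority of the bound resource
    session_start: float  # hypothetical field: session start time, epoch seconds

def pick_recipient(binds):
    """Pick the highest priority bind; on a tie, the most recent session wins."""
    best = max(binds, key=lambda b: (b.priority, b.session_start))
    return best.component
```

With binds `sm1` (priority 1, started at 100.0) and `sm2` (priority 1, started at 200.0), the message goes to `sm2`. Two sessions started at exactly the same instant would still tie, so a real implementation would need one more deterministic criterion.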

...?




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-14 Thread Alexandre Jousset

On 14/09/2012 16:08, Tomasz Sterna wrote:
[...]

(There is a possibility of race - handling several session creation
requests by router and pushing to several random SMs, before first one
binds user bare JID.)


This race condition has, in theory, a small probability of happening, but 
I actually see some cases where it is possible (e.g. c2s crash and / or 
restart, some network problems...), especially with a lot of users and 
auto-reconnecting clients.

What exactly will happen if this race condition occurs? I think 
recovering from it may be complicated too.

You haven't given your opinion about my idea...? Its advantage is that 
no such race condition can occur. The drawback is just that SM needs to 
keep track of the session start time (which may already be the case, I haven't 
checked) and to send it with each bind related to that session. In any case SM 
has to be patched at least to implement the new binding algorithm, so this 
addition would be just a small change.

What do you think?




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-13 Thread Alexandre Jousset

On 13/09/2012 00:05, Tomasz Sterna wrote:
[...]

Looks simple.
Too simple? ;-)


It's never too simple :-) I think that, as you said before, the 
current implementation was designed open enough to be adapted and that will greatly 
simplify the coding of these new features.


In real life the incoming part of the split would get disabled parts
already handled on the accepting part, and all disconnected sessions
would have to cope.


Err... I'm not sure I understand this one. Sorry if my English and / or 
understanding is too bad :-/

BTW I have another question.

AFAIK the routing to a domain is done only from c2s to sm when a user connects. 
Then the sm answers with the domain in the from part and gives its ID too for 
further communication. So, after this moment c2s knows to which component it should send 
messages for that user session.

My question is: is this the only case where the routing to a domain is 
needed?

If yes, in case of domain routing (e.g. when to is example.com) one 
should only route to one of the bound components serving that domain, maybe randomly. If no, ...?

That is not the same as routing to u...@example.com without resource 
because in that case we said we should duplicate the message to all components bound to 
this user (whatever their resources).




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-13 Thread Alexandre Jousset

On 13/09/2012 14:57, Tomasz Sterna wrote:

On Thu, 2012-09-13 at 13:45 +0200, Alexandre Jousset wrote:

 AFAIK the routing to a domain is done only from c2s to sm when
a user connects. Then the sm answers with the domain in the from
part and gives its ID too for further communication. So, after this
moment c2s knows to which component it should send messages for that
user session.

 My question is: is this the only case where the routing to a
domain is needed?


The other case is communicating with the jabber server (sm) itself.
You can disco the domain to see the server features. You can xmpp-ping
the domain, you can even get server presence (it has some resources
answering).


 If yes, in case of domain routing (e.g. when to is
example.com) one should only route to one of the bound component
serving that domain, maybe randomly. If no, ...?


Not randomly. To the highest priority bind.


Yes, sorry, I meant randomly between components bound with the same 
priority.
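In other words, something like the following Python sketch, where a bind is a hypothetical `(component, priority)` pair and the function name is made up for illustration:

```python
import random

def route_domain(binds, rng=random):
    """Route a stanza addressed to the bare domain.

    binds: non-empty list of (component, priority) tuples bound to the domain.
    Only the binds sharing the highest priority are candidates; one of them
    is picked at random.
    """
    top = max(priority for _, priority in binds)
    candidates = [component for component, priority in binds if priority == top]
    return rng.choice(candidates)
```

This is only adequate for stanzas that a single component can answer alone; the objection quoted just below (messages must reach all highest priority resources) is not addressed by it.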


But what happens if there are many binds of the same priority?

Cannot do randomly, as messages need to go to all highest priority
resources.
Cannot do all, as iq requests would get response many times.


See above.


 That is not the same as routing to u...@example.com without
resource because in that case we said we should duplicate the message
to all components bound to this user (whatever their resources).


Not all.
I suggested that the priority of the bare-JID bind should be equal to
the highest priority resource connected to sm.


Yes, sorry again, I meant taking priority into account.


But then - what happens if two resources of the same priority get
connected to two different sm instances?


*This* was my real question ;-)




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-05 Thread Alexandre Jousset

Hello,

On 03/09/2012 18:10, Tomasz Sterna wrote:

I didn't get to designing routing exchange protocol yet.
Building a working implementation of adapting binding of components
should shine some light on what is required to be exchanged.


Ok.

I started to look at the process you sent earlier, and I could start to 
think about the tree structure just by following the process.

That led me to a few questions:

1) A minor one: is it right that it's a typo when you wrote example.org 
instead of example.com in some places?

2) Is it OK to assume that all non full-JIDs are of priority 0 (or, 
better said, have no priority at all)?

3) I have a doubt when you say at different places that a component needs to bind 
bare-JIDs / full-JIDs from now on. Does that mean that it needs to send bind 
requests for all sessions it already has, or just make *new* bind requests of the type 
you mentioned?

4) In this process, what if component1 disconnects? I suppose that the router 
needs to crawl through its tree to update it, but that can be CPU intensive and can cause 
lags... I know that this event is not a normal event anyway and is not 
supposed to happen often, so it can be negligible.

5) A Jabber protocol question, I know I could find the answer in the online docs, 
but as I have you at hand ;-) Is it possible to have 2 full-JIDs connected at the same 
time? With equal or different priorities? I suppose the answer is no to both 
questions, but just to be sure...

6) The tree structure I'm thinking about uses hashes to find the next node (root => domain => 
user => resource), and finally a pointer or an array of pointers to components for the leaves. If the 
answer to the previous questions at 5) is no, there is only one case where there can be more than 
one leaf: at each node there is a default route leading to one or more components. If I use a statically 
sized array to store them, that will use more memory. So the best would be to use a linked list, but that 
would make the process slower. I would tend to use linked lists because the cases where one has to crawl 
through them are (relatively) less frequent. What do you think?
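To make questions 4) and 6) more concrete, here is a Python sketch of such a tree, with dictionaries playing the role of the hashes. All names are hypothetical, and falling back to the closest ancestor that has a route is my reading of "default route", not something settled in the thread.

```python
class Node:
    def __init__(self):
        self.children = {}  # hash: next JID part -> Node
        self.routes = []    # components reachable at this level (default route)

class RoutingTree:
    """root => domain => user => resource, with component lists at each node."""

    def __init__(self):
        self.root = Node()

    def bind(self, component, domain, user=None, resource=None):
        node = self.root
        for part in (domain, user, resource):
            if part is None:
                break
            node = node.children.setdefault(part, Node())
        if component not in node.routes:
            node.routes.append(component)

    def lookup(self, domain, user=None, resource=None):
        """Return the routes of the deepest matching node that has any."""
        node, best = self.root, []
        for part in (domain, user, resource):
            if part is None or part not in node.children:
                break
            node = node.children[part]
            if node.routes:
                best = node.routes
        return list(best)

    def unbind_component(self, component):
        """Question 4): crawl the whole tree to drop a disconnected component."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            if component in node.routes:
                node.routes.remove(component)
            stack.extend(node.children.values())
```

A lookup costs at most three hash accesses, while `unbind_component` visits every node, which is the CPU cost worried about in 4). Keeping a per-component list of its bound nodes would avoid the full crawl at the price of extra memory.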

That's it for the questions. If you need, I could draw a diagram of the 
tree structure I'm thinking about to make things clearer, just tell me.




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-09-05 Thread Alexandre Jousset

Hm, I'm answering myself after thinking some more ;-)

On 05/09/2012 21:04, Alexandre Jousset wrote:

 1) A minor one: is it right that it's a typo when you wrote example.org 
instead of example.com at some places?


Obviously, yes.


 2) Is it OK to assume that all non full-JIDs are of priority 0 (or, better 
said, have no priority at all)?


Obviously, yes.


 3) I have a doubt when you say at different places that a component needs to bind 
bare-JIDs / full-JIDs from now on. Does that mean that it needs to send bind 
requests for all sessions it already has, or just make *new* bind requests of the type 
you mentioned?


I think the component has to send all its already online sessions.

And I think this can be a hint for the routers' synchronisation in a multi-router 
implementation. In a multi-router implementation, why not consider a router as a more or 
less normal component (when connected to other routers)? With one exception: 
when a component has to be chosen at random, only the local components are 
candidates. I have to think more about this...
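A possible reading of that exception, sketched in Python. The `(name, is_local)` pairs and the function name are hypothetical, and the fallback to a peer router when no local component exists is an assumption of mine, not something stated above.

```python
import random

def choose_component(candidates, rng=random):
    """Pick one component among the candidates for a route.

    candidates: list of (name, is_local) tuples; a peer router connected
                as a component is listed with is_local=False.
    """
    local = [name for name, is_local in candidates if is_local]
    if local:
        # The exception: only local components take part in the random draw.
        return rng.choice(local)
    # Assumption, not from the thread: with nothing local, hand the stanza
    # to a peer router, which repeats the same choice on its side.
    remote = [name for name, is_local in candidates if not is_local]
    return rng.choice(remote) if remote else None
```

Preferring local components keeps a stanza from bouncing between routers when a local component could have handled it.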


 4) In this process, what if component1 disconnects? I suppose that the router needs 
to crawl through its tree to update it, but that can be CPU intensive and can cause 
lags... I know that this event is not a normal event anyway and is not 
supposed to happen often, so it can be negligible.

 5) A Jabber protocol question, I know I could find the answer in the online docs, 
but as I have you at hand ;-) Is it possible to have 2 full-JIDs connected at the same 
time? With equal or different priorities? I suppose the answer is no to both 
questions, but just to be sure...


Obviously, no.


 6) The tree structure I'm thinking about uses hashes to find the next node (root => domain => 
user => resource), and finally a pointer or an array of pointers to components for the leaves. If the 
answer to previous questions at 5) is no, there is only one case where there can be more than 
one leaf: at each node is a default route leading to one or more components. If I use a static 
sized array to store them, that will use more memory. So the best would be to use a linked list, but that 
would make the process slower. I would tend to use linked lists because the cases where one has to crawl 
through them are (relatively) less frequent. What do you think?


Stupid question: the only case where there is a choice between multiple components is 
the domain case. So I can use a reasonably sized array, configurable at 
compile time.




Re: jabberd2 in cluster? ideas, proof of concept and questions...

2012-08-31 Thread Alexandre Jousset

Hi Tomasz,

Thanks for your answer. I'll study your message in detail when I 
have time. I think I'll be able to work on this topic during the weekend.

Regards,