Hi,

TL;DR: checking if you are still in a MUC is broken and needs to be
fixed, either with new IQs or with a new rejoin-if-needed presence.


MUC presence tends to break
===========================

Most of us have experienced this: our client shows us as present in a
MUC, but nothing happens for hours or even days, and when we finally
send a "ping", it's bounced with a 'not-acceptable' error.

Or we are in the MUC, but can't see the subject and only see a handful
of participants, with messages from apparently offline users coming in
all the time.

XEP-0045 assumes a perfect network connection between the client, its
server and the MUC service, and perfect uptimes of all of them. Whenever
these assumptions fail, users end up in a desynchronized state, with a
good chance of not noticing it or having no way to fix it.


Technical Background
====================

The reasons for the trouble are:

1. A MUC misunderstanding a join for a presence update.

The MUC thinks the user is still joined, while the client was actually
restarted / reconnected and needs to synchronize. In this case, the
client will not receive the initial presence but only presence updates,
ending up with a slowly-growing partial participant list.

This issue should be addressed with https://github.com/xsf/xeps/pull/499
but there are still broken service implementations out there.

Ugly workaround hack: the client needs to precede each join with an
explicit unavailable presence (<presence type='unavailable'/>) to force
the MUC to do a real join.
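
As a rough illustration of that workaround, the following Python sketch
builds the two stanzas a client would send back to back. The room JID
and nick are made-up examples, `join_stanzas` is a hypothetical helper,
and the stanzas are assembled as plain strings rather than through any
particular XMPP library.

```python
from xml.sax.saxutils import quoteattr

MUC_NS = "http://jabber.org/protocol/muc"  # namespace of the XEP-0045 join <x/>


def join_stanzas(room_jid: str, nick: str) -> list[str]:
    """Return [unavailable presence, join presence] for the occupant JID.

    Hypothetical helper: sending the unavailable presence first forces
    the MUC to treat the following presence as a real join.
    """
    to = quoteattr(f"{room_jid}/{nick}")  # escapes the value and adds quotes
    leave = f'<presence to={to} type="unavailable"/>'
    join = f'<presence to={to}><x xmlns={quoteattr(MUC_NS)}/></presence>'
    return [leave, join]


if __name__ == "__main__":
    for stanza in join_stanzas("room@muc.example.org", "nick"):
        print(stanza)
```

The cost is an extra stanza per join and, on a MUC that still had the
old session, a visible leave/join for the other participants.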

2. A MUC misunderstanding a presence update for a GC1.0 join.

The client doesn't notice that it was kicked (or the MUC service was
restarted), sends a presence update caused by user (in)action, and
receives a full join package in response. The client will receive a full
MUC history, subject change and presences, and parse these elements as
live updates. It will end up with a union set of the old and new
participants, but will be able to participate in the discussion.

Workaround: none. Burn GC1.0 with fire.

3. A silent intermittent network failure that eats some stanzas.

When the MUC realizes the client's absence (typically on the first
bounced error), the participant gets kicked. However, there is no
equivalent way for the client to realize it is gone from the MUC.

The only workaround for the client is to periodically check whether it
is still there by sending something and waiting for it to arrive, get
bounced or run into a timeout. However, there is no proper mechanism to
do that, and all "improper" mechanisms suffer from various drawbacks:

 - send a periodic presence to the MUC: will be interpreted as a GC1.0
   join, and will be reflected to all participants, causing O(N²)
   traffic in the MUC.

 - send a silent message to the MUC (also O(N²)).

 - send a MUC-PM to yourself (will be rejected by MUCs that forbid PMs).

 - send an IQ to yourself (works in theory (*)).

(*) poezio and yaxim do this by sending a XEP-0199 ping to your own
participant JID. However, in Multi-Session Nick scenarios, the ping IQ
will be routed to a "random" client of yours, and if that client is
currently suffering from a bad connection, your desktop client will run
into a ping timeout and erroneously conclude that it got disconnected
from the MUC.
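
To make the ambiguity concrete, here is a minimal Python sketch of how
a client might interpret the outcome of such a self-ping. The names
(`MucState`, `classify_self_ping`) are hypothetical, and treating a
timeout or any other error as "unknown" rather than "not joined" is an
assumption of this sketch, not something XEP-0045 specifies.

```python
from enum import Enum
from typing import Optional


class MucState(Enum):
    JOINED = "joined"
    NOT_JOINED = "not joined"
    UNKNOWN = "unknown"


def classify_self_ping(reply_type: Optional[str],
                       condition: Optional[str] = None) -> MucState:
    """Interpret the reply to a ping sent to our own participant JID.

    reply_type is "result", "error", or None if the ping timed out;
    condition is the stanza error condition, if any.
    """
    if reply_type == "result":
        return MucState.JOINED
    if reply_type == "error" and condition == "not-acceptable":
        # The MUC bounced the ping: we are no longer a participant.
        return MucState.NOT_JOINED
    # Timeout or another error (assumption): with Multi-Session Nicks the
    # ping may have gone to a different client of ours, so we cannot tell.
    return MucState.UNKNOWN
```

A client using this would rejoin on NOT_JOINED, but would have to retry
or fall back to some other check on UNKNOWN, which is exactly the gap
described above.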


Proposed Solutions
==================

All of the following change behavior and would need to be advertised as
a feature in the MUC service's caps:

1. Mandate distinct responses to a XEP-0199 ping sent to the MUC JID
   (not to the self-participant JID), e.g.
   'not-acceptable' error --> you are not joined
   OK result --> you are joined

2. Create a new, explicit am-I-joined IQ that a client can send to the
   MUC JID.

3. Create a new <presence> or <x> sub-element <rejoin-if-needed/>.

After joining a MUC, a client would add that element to all subsequent
presence stanzas sent to the MUC (and would periodically send such
presence to check whether it is still joined).

The MUC service would treat this presence as follows:

If the client is joined: only reflect the presence to participants if it
is different from the last-reflected presence (to avoid the O(N²)
overhead).

If the client is not joined: send the client a presence-unavailable to
let it know that it is going to rejoin (and so it can flush the
participant list), then send presence broadcast, (partial) MUC history,
subject, participant presence.
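
The service-side handling for option 3 could look roughly like the
following Python sketch. It is a toy in-memory model, not a real MUC
implementation: `Room`, `handle_rejoin_if_needed` and the tuple-based
"stanzas" are all hypothetical, only the stanzas sent back to the
sending client are modeled, and the order of the join sequence
(presences, history, subject) is an assumption borrowed from a normal
join.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Toy room; stanzas are modeled as (kind, payload) tuples."""
    subject: str = ""
    history: list = field(default_factory=list)
    # nick -> last reflected presence ("show" value) of each participant
    last_presence: dict = field(default_factory=dict)

    def handle_rejoin_if_needed(self, nick: str, show: str) -> list:
        """Return what is sent to `nick` for a <rejoin-if-needed/> presence."""
        if nick in self.last_presence:
            if self.last_presence[nick] == show:
                return []  # joined and unchanged: reflect nothing
            self.last_presence[nick] = show
            return [("presence", (nick, show))]  # joined and changed: reflect
        # Not joined: tell the client it is rejoining so that it can flush
        # its participant list, then run the usual join sequence.
        out = [("presence-unavailable", nick)]
        self.last_presence[nick] = show
        out += [("presence", (n, s)) for n, s in self.last_presence.items()]
        out += [("history", msg) for msg in self.history]
        out.append(("subject", self.subject))
        return out
```

The point of the sketch is the first branch: a periodic presence from a
client that is still joined produces no traffic at all, while a
desynchronized client gets a complete resync from the same stanza.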

The last solution is the most complex, but allows for a
single-round-trip rejoin if the client got desynchronized.

Opinions please?


Kind regards,

Georg
-- 
|| http://op-co.de ++  GCS d--(++) s: a C+++ UL+++ !P L+++ !E W+++ N  ++
|| gpg: 0x962FD2DE ||  o? K- w---() O M V? PS+ PE-- Y++ PGP+ t+ 5 R+  ||
|| Ge0rG: euIRCnet ||  X(+++) tv+ b+(++) DI+++ D- G e++++ h- r++ y?   ||
++ IRCnet OFTC OPN ||_________________________________________________||
