On 4 Oct 2017, at 09:19, Georg Lukas <[email protected]> wrote:
> 
> Hi,
> 
> TL;DR: checking if you are still in a MUC is broken and needs to be
> fixed, either with new IQs or with a new rejoin-if-needed presence.
> 
> 
> MUC presence tends to break
> ===========================
> 
> Most of us have experienced this: our client shows us as present in a
> MUC, but nothing happens for hours or even days, and when we finally
> send a "ping", it's bounced with a 'not-acceptable' error.
> 
> Or we are in the MUC, but can't see the subject and only see a handful
> of participants, with messages from apparently offline users coming in
> all the time.
> 
> XEP-0045 assumes a perfect network connection between the client, its
> server and the MUC service, and perfect uptimes of all of them. Whenever
> these assumptions fail, users end up in a desynchronized state, with a
> good chance of not noticing it or having no way to fix it.
> 
> 
> Technical Background
> ====================
> 
> The reasons for the trouble are:
> 
> 1. A MUC misunderstanding a join for a presence update.
> 
> The MUC thinks the user is still joined, while the client was actually
> restarted / reconnected and needs to synchronize. In this case, the
> client will not receive the initial presence but only presence updates,
> ending up with a slowly-growing partial participant list.
> 
> This issue should be addressed with https://github.com/xsf/xeps/pull/499
> but there are still broken service implementations out there.
> 
> Ugly workaround hack: the client needs to prepend each join with an
> explicit <presence unavailable> to force the MUC to do a real join.
> 
> 2. A MUC misunderstanding a presence update for a GC1.0 join.
> 
> The client doesn't notice that it was kicked (or the MUC service was
> restarted), sends a presence update caused by user (in)action, and
> receives a full join package in response. The client will receive a full
> MUC history, subject change and presences, and parse these elements as
> live updates. It will end up with a union set of the old and new
> participants, but will be able to participate in the discussion.
> 
> Workaround: none. Burn GC1.0 with fire.
> 
> 3. A silent intermittent network failure that eats some stanzas.
> 
> When the MUC realizes the client's absence (typically on the first
> bounced error), the participant gets kicked. However, there is no
> equivalent way for the client to realize it is gone from the MUC.
> 
> The only workaround for the client is to periodically check whether it
> is still there by sending something and waiting for it to arrive, get
> bounced or run into a timeout. However, there is no proper mechanism to
> do that, and all "improper" mechanisms suffer from various drawbacks:
> 
> - send a periodic presence to the MUC: will be interpreted as GC1.0
>   join, and will be reflected to all participants, causing O(N²)
>   traffic to the MUC.
> 
> - send a silent message to the MUC (also O(N²)).
> 
> - send a MUC-PM to yourself (will be rejected by MUCs that forbid PMs)
> 
> - send an IQ to yourself (works in theory (*))
> 
> (*) poezio and yaxim solve that by sending a 0199 ping to your own
> participant JID. However, in Multi-Session Nick scenarios, the ping IQ
> will be routed to a "random" client of yours, and if that client is
> currently suffering from a bad connection, your desktop client will run
> into a ping timeout and erroneously think it got disconnected from the
> MUC.
> 
> Proposed Solutions
> ==================
> 
> All of the following change behavior and would need to be feature-coded
> in the MUC/service caps:
> 
> 1. Mandate different response codes to 0199 ping on the MUC JID (not on
>   the self-participant-JID), e.g.
>   'not-acceptable' --> you are not joined
>   OK result --> you are joined
> 
> 2. Create a new, explicit am-I-joined IQ that a client can send to the
>   MUC JID.
> 
> 3. Create a new <presence> or <x> sub-element <rejoin-if-needed/>.
> 
> After joining a MUC, a client would add that element to all subsequent
> presence stanzas sent to the MUC (and would periodically send such
> presence to check whether it is still joined).
> 
> The MUC service would treat this presence as follows:
> 
> If the client is joined: only reflect the presence to participants if it
> is different from the last-reflected presence (to avoid the O(N²)
> overhead).
> 
> If the client is not joined: send the client a presence-unavailable to
> let it know that it is going to rejoin (and so it can flush the
> participant list), then send presence broadcast, (partial) MUC history,
> subject, participant presence.
> 
> The last solution is the most complex, but allows for a
> single-round-trip rejoin if the client got desynchronized.
> 
> Opinions please?

Thanks for the write-up. I agree this is a problem worth solving.

(3) has the nice property of a single-round-trip rejoin, but I think (2) is
the preferable option in practice. It’s simple for everyone to implement
((3) is quite complex), and I think it also makes it easier to write a MIX
proxy (letting users join MUCs as if they were MIX channels, with the server
doing all the work for them, which would be a nice thing).
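
To illustrate where the complexity in (3) comes from, here is a rough sketch
of the service-side handling described in the proposal. It is only an
illustration: the Room class, its last_reflected field and the action labels
are hypothetical names, and history, subject and participant presence
delivery are reduced to labels rather than real stanzas.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    # Hypothetical in-memory state: occupant full JID -> last reflected presence.
    last_reflected: dict = field(default_factory=dict)

    def handle_rejoin_if_needed(self, sender: str, presence: str) -> list:
        """Return labels for the actions the service would take, in order."""
        if sender in self.last_reflected:
            # Already joined: reflect only if the presence really changed,
            # so periodic checks do not cause an O(N^2) broadcast.
            if self.last_reflected[sender] == presence:
                return []
            self.last_reflected[sender] = presence
            return ["broadcast-presence"]
        # Not joined: tell the client it was gone, then run a full join.
        self.last_reflected[sender] = presence
        return [
            "send-unavailable-to-sender",
            "broadcast-presence",
            "send-history",
            "send-subject",
            "send-participant-presence",
        ]
```

Even in this reduced form the service has to remember the last reflected
presence per occupant, which is state that (2) does not need.
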
Since this is a sticking plaster, and we’re trying to fix things properly,
going with a sticking-plaster IQ seems OK to me (as it is more likely to get
the wide deployment it needs).
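
For comparison, the client side of an IQ-based check could be as small as the
sketch below. The payload for (2) has not been defined, so this borrows the
XEP-0199 ping payload and the result / 'not-acceptable' mapping from (1) as a
stand-in. The function names are hypothetical, and a timeout is represented
as None.

```python
import xml.etree.ElementTree as ET

PING_NS = "urn:xmpp:ping"  # XEP-0199 payload, used here as a stand-in
STANZA_ERR_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"


def build_check(room_jid: str, stanza_id: str) -> str:
    """Build the IQ sent to the bare MUC JID (not the participant JID)."""
    iq = ET.Element("iq", {"type": "get", "to": room_jid, "id": stanza_id})
    ET.SubElement(iq, f"{{{PING_NS}}}ping")
    # ElementTree emits a generated namespace prefix; the XML is equivalent.
    return ET.tostring(iq, encoding="unicode")


def classify_reply(reply_xml):
    """Map a reply (or None on timeout) to 'joined', 'not-joined' or 'unknown'."""
    if reply_xml is None:
        return "unknown"  # timeout: says nothing about the join state
    reply = ET.fromstring(reply_xml)
    if reply.get("type") == "result":
        return "joined"
    if reply.get("type") == "error":
        cond = next(reply.iter(f"{{{STANZA_ERR_NS}}}not-acceptable"), None)
        if cond is not None:
            return "not-joined"
    return "unknown"  # any other error is treated as inconclusive
```

A client would rejoin only on 'not-joined' and treat 'unknown' as a reason to
retry later, so a timeout alone never makes it conclude that it was removed
from the room.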

/K
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: [email protected]
_______________________________________________