On 4 Oct 2017, at 09:19, Georg Lukas <[email protected]> wrote:
>
> Hi,
>
> TL;DR: checking if you are still in a MUC is broken and needs to be
> fixed, either with new IQs or with a new rejoin-if-needed presence.
>
>
> MUC presence tends to break
> ===========================
>
> Most of us have experienced this: our client shows us as present in a
> MUC, but nothing happens for hours or even days, and when we finally
> send a "ping", it's bounced with a 'not-acceptable' error.
>
> Or we are in the MUC, but can't see the subject and only see a handful
> of participants, with messages from apparently offline users coming in
> all the time.
>
> XEP-0045 assumes a perfect network connection between the client, its
> server and the MUC service, and perfect uptimes of all of them. Whenever
> these assumptions fail, users end up in a desynchronized state, with a
> good chance of not noticing it or having no way to fix it.
>
>
> Technical Background
> ====================
>
> The reasons for the trouble are:
>
> 1. A MUC misunderstanding a join for a presence update.
>
> The MUC thinks the user is still joined, while the client was actually
> restarted / reconnected and needs to synchronize. In this case, the
> client will not receive the initial presence but only presence updates,
> ending up with a slowly-growing partial participant list.
>
> This issue should be addressed with https://github.com/xsf/xeps/pull/499
> but there are still broken service implementations out there.
>
> Ugly workaround hack: the client needs to prepend each join with an
> explicit <presence unavailable> to force the MUC to do a real join.
>
> 2. A MUC misunderstanding a presence update for a GC1.0 join.
>
> The client doesn't notice that it was kicked (or the MUC service was
> restarted), sends a presence update caused by user (in)action, and
> receives a full join package in response. The client will receive a full
> MUC history, subject change and presences, and parse these elements as
> live updates. It will end up with a union set of the old and new
> participants, but will be able to participate in the discussion.
>
> Workaround: none. Burn GC1.0 with fire.
>
> 3. A silent intermittent network failure that eats some stanzas
>
> When the MUC realizes the client's absence (typically on the first
> bounced error), the participant gets kicked. However, there is no
> equivalent way for the client to realize it is gone from the MUC.
>
> The only workaround for the client is to periodically check whether it
> is still there by sending something and waiting for it to arrive, get
> bounced or run into a timeout. However, there is no proper mechanism to
> do that, and all "improper" mechanisms suffer from various drawbacks:
>
> - send a periodic presence to the MUC: will be interpreted as GC1.0
>   join, and will be reflected to all participants, causing O(N²)
>   traffic to the MUC.
>
> - send a silent message to the MUC (also O(N²)).
>
> - send a MUC-PM to yourself (will be rejected by MUCs that forbid PMs)
>
> - send an IQ to yourself (works in theory (*))
>
> (*) poezio and yaxim solve that by sending a 0199 ping to your own
> participant JID. However, in Multi-Session Nick scenarios, the ping IQ
> will be routed to a "random" client of yours, and if that client is
> currently suffering from a bad connection, your desktop client will run
> into a ping timeout and erroneously think it got disconnected from the
> MUC.
>
> Proposed Solutions
> ==================
>
> All of the following change behavior and would need to be feature-coded
> in the MUC/service caps:
>
> 1. Mandate different response codes to 0199 ping on the MUC JID (not on
> the self-participant-JID), e.g.
>   'not-acceptable' --> you are not joined
>   OK result --> you are joined
>
> 2. Create a new, explicit am-I-joined IQ that a client can send to the
> MUC JID.
>
> 3. Create a new <presence> or <x> sub-element <rejoin-if-needed/>.
>
> After joining a MUC, a client would add that element to all subsequent
> presence stanzas sent to the MUC (and would periodically send such
> presence to check whether it is still joined).
>
> The MUC service would treat this presence as follows:
>
> If the client is joined: only reflect the presence to participants if it
> is different from the last-reflected presence (to avoid the O(N²)
> overhead).
>
> If the client is not joined: send the client a presence-unavailable to
> let it know that it is going to rejoin (and so it can flush the
> participant list), then send presence broadcast, (partial) MUC history,
> subject, participant presence.
>
> The last solution is the most complex, but allows for a
> single-round-trip rejoin if the client got desynchronized.
>
> Opinions please?
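
To make the service-side rules of option (3) concrete, here is a minimal Python sketch of the decision the MUC would take on a presence carrying <rejoin-if-needed/>. It is only an illustration of the two cases described in the quoted text: the `Room` class, the `handle_rejoin_presence` method and the action names are hypothetical, and presence payloads are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Room:
    # Hypothetical bookkeeping: occupant JID -> last presence the room reflected.
    last_reflected: Dict[str, str] = field(default_factory=dict)

    def handle_rejoin_presence(self, jid: str, presence: str) -> List[str]:
        """Return the actions the service would take, in order."""
        if jid in self.last_reflected:
            # Joined: reflect only if the presence actually changed,
            # which avoids the O(N^2) traffic of periodic presences.
            if self.last_reflected[jid] == presence:
                return []
            self.last_reflected[jid] = presence
            return ["reflect-presence"]
        # Not joined: tell the client it is being rejoined (so it can flush
        # its participant list), then run the join sequence from the proposal.
        self.last_reflected[jid] = presence
        return [
            "send-unavailable",
            "broadcast-presence",
            "send-history",
            "send-subject",
            "send-participant-presence",
        ]
```

A periodic "am I still here" presence from a joined client therefore produces no traffic at all, while the same stanza from a desynchronized client triggers the full rejoin in one round-trip.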
Thanks for the write-up. I agree this is a problem worth solving.

Option (3) has the nice property of needing only a single round-trip, but I think (2) is the preferable option in practice. It's simple for everyone to implement ((3) is quite complex), and I think it also makes it easier to write a MIX proxy (allowing users to join MUCs as if they were MIX channels, with the server doing all the work for them, which would be a nice thing, I think).

Since this is a sticking plaster while we're trying to fix things properly, going with a sticking-plaster IQ seems OK to me, as it is more likely to get the wide deployment it needs.

/K
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
_______________________________________________
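
For illustration, the client side of an am-I-joined check could be as small as the Python sketch below. The wire format of the IQ in option (2) is not defined in this thread, so the sketch assumes the reply mapping suggested for option (1): a result means joined, a 'not-acceptable' error means not joined. The `JoinState` enum and `classify_reply` function are hypothetical names, and treating a timeout or any other error as "unknown" is an assumption, not part of the proposal.

```python
from enum import Enum
from typing import Optional


class JoinState(Enum):
    JOINED = "joined"
    NOT_JOINED = "not joined"
    UNKNOWN = "unknown"


def classify_reply(iq_type: Optional[str],
                   error_condition: Optional[str] = None) -> JoinState:
    """Map the reply to an am-I-joined IQ onto a join state.

    iq_type is "result", "error", or None when the request timed out.
    """
    if iq_type == "result":
        return JoinState.JOINED
    if iq_type == "error" and error_condition == "not-acceptable":
        return JoinState.NOT_JOINED
    # Assumption: timeouts and unrelated errors say nothing about the join
    # state, so the client should retry rather than drop the room.
    return JoinState.UNKNOWN
```

Because the IQ is addressed to the MUC JID rather than to the client's own participant JID, the answer does not depend on any other session sharing the nick, which is the failure mode of the self-ping workaround.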
