On Thu Jun 24 21:52:22 2010, Matthew Wild wrote:
On 24 June 2010 21:33, Justin Karneges
<[email protected]> wrote:
> It's a common problem to join a muc that already thinks you are joined, and
> then the presence you send is interpreted as a mere status change rather
> than a full join.  Then you don't get the room roster, history, etc.  Kev
> informs me that the <x xmlns="http://jabber.org/protocol/muc"> element
> (hereby referred to as "the muc element") is supposed to solve this problem.
> You include it only on join stanzas, but not on status change stanzas.  This
> way, if a muc sees the element but thought you were already joined, it can do
> a proper rejoin.
>

Yes, Prosody has had this code since the early days, however we
currently have it commented out due to Google Talk's issues. Gajim
also included the element on nick changes, but we ensured this was
fixed, and added a workaround for it.

But there's little way we can work around Google's oddity (well
technically there is, but none I'd be happy with releasing).


There is, in fact, a workaround in M-Link, too, inasmuch as it's possible to strip out the XEP-0045 control element on inbound presence from a domain before the processing code ever sees it. I'd be loath to put that into production.

But don't be coy about this - this is an interop bug, not a mere oddity. While I don't see anything in the spec suggesting that directed presence should be repeated, I admit there's nothing in the spec about it not being repeated either, so we either have a bug in the spec (if Google insist the spec allows them to do this) or a bug in GTalk (if they admit they shouldn't). Either way, it needs resolution.


> However, this seems to break with servers that replay directed presence.
> Allegedly gtalk does this.  Every 5 minutes, the client's server replays the
> directed presence to the muc, which includes the muc element, causing the
> user to constantly rejoin the muc (at least, for those mucs that respect the
> muc element properly).
>
> Some solutions:
>  1) Servers shouldn't replay directed presence.

I don't see why randomly re-sending join requests shouldn't result in
multiple joins to a room. Broadcast presence is a different case: it
is more of a "state" than an instruction.

>  2) If presence is replayed, replay only the elements safe to replay.
>

That requires the server to know which elements those are - MUC
obviously isn't, but what's to say more won't come along? (e.g.
temporary presence subscriptions).


There's actually a (3), which is "If you replay directed presence, then add a delay".

We already add a delay element to other cases of repeated presence, after all, such as in response to probes.

In this particular case, this'd leave the following cases:

                User not joined    User joined
Delay           Join               Ignore
No delay        Join               Rejoin
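
As a rough illustration of the table above, here is a minimal Python sketch of the decision a MUC service might make when available presence carrying the muc element arrives. The names (Action, handle_muc_presence) are hypothetical and not taken from M-Link, Prosody, or any other implementation.

```python
from enum import Enum


class Action(Enum):
    JOIN = "join"
    REJOIN = "rejoin"
    IGNORE = "ignore"


def handle_muc_presence(already_joined: bool, has_delay: bool) -> Action:
    """Pick an action for available presence that carries the muc element.

    already_joined: the room believes this user is already an occupant.
    has_delay: the presence carries a delay element, i.e. it is a replay.
    """
    if not already_joined:
        # Someone who is not in the room joins, replayed or not.
        return Action.JOIN
    # Already an occupant: a delayed (replayed) presence is ignored,
    # while a fresh one is treated as a genuine rejoin.
    return Action.IGNORE if has_delay else Action.REJOIN
```

The point of the sketch is only that the delay element is what separates a replay from a real rejoin; it says nothing about nickname changes.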

Of course, I'm not convinced that this solves nickname changes, which is another long-standing Google interop bug.

There's another issue caused by this bug: every repeated presence causes the MUC to resend a presence stanza to every occupant, unless we also start to filter presence stanzas.

(Stanzas per sec from this is G*(G+N)/300, where G is Google users and N is non-Google users, kids. In jdev, as I write, that's G==3 and N==25, hence averaging 0.3 stanzas a second - nothing we can't handle, but add in another 6 Googlers and it's already reached 1/sec.)
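
For anyone who wants to plug in their own room sizes, the arithmetic above can be sketched in Python as follows. The function name is made up for illustration, and the 300 second default is simply the five-minute replay interval reported earlier in the thread.

```python
def replay_stanzas_per_second(google_users: int, other_users: int,
                              replay_interval_s: float = 300.0) -> float:
    """Average stanzas per second caused by replayed directed presence.

    Each of the G Google users has presence replayed once per interval,
    and the room fans each replay out to all G + N occupants.
    """
    occupants = google_users + other_users
    return google_users * occupants / replay_interval_s


# The jdev figures quoted above: G == 3, N == 25.
jdev_now = replay_stanzas_per_second(3, 25)
# The same room with another 6 Googlers: G == 9, N == 25.
jdev_more = replay_stanzas_per_second(9, 25)
print(f"{jdev_now:.2f} {jdev_more:.2f}")
```

With the jdev numbers this comes to roughly 0.28 stanzas a second, and just over 1 a second with nine Google users.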

Agreed, but what's done is done, and without using presence for MUC we
wouldn't have the unavailable on disconnect.

Right - MUC is a presence-based system, so needs to operate over presence.

The only other solution would be to make distinct the history, current occupants, and current subject retrieval, which is also an option. Ideally, you'd also need to include nick changes here which starts to radically impinge on the design.



> Kev additionally informs me that M-Link's muc service may be the only one that
> performs rejoins properly when receiving the muc element.

It wasn't the first, but it likely is the only one at the moment. I
didn't consider it acceptable to release logic that is broken with one
of the largest XMPP deployments on the internet, so as I said, we
removed it from Prosody with a view to re-adding it if/when Google
finally cleaned up their act.


It's not clear to me why Google does this. I have attempted to contact them to resolve the interop issue, and I've yet to have a response, but I'd hope that they're keen to sort this out. It's entirely possible I'm using the wrong contact details or something, so if Google people are reading this, please get in touch: I'm keen to get a solution.

But aside from being rather spectacularly wasteful of bandwidth, a client should be able to tell that at least the history is just that - due again to the delay. In addition, it's only the last presence that Google replays, as far as I understand things, so clients can work around this bug by sending an "update" presence after joining. They can even work around the nickname changing bug, as Kev pointed out to me, by sending unavailable to the old nickname after successfully changing nickname. So if client developers find their users are suffering, there are workarounds they could deploy, which are much easier to do there.
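
To make the two client-side workarounds concrete, here is a small Python sketch that builds the relevant presence stanzas with the standard library's xml.etree.ElementTree. The helper names and the room address are hypothetical, and a real client would use its own XMPP library rather than hand-built elements.

```python
import xml.etree.ElementTree as ET

MUC_NS = "http://jabber.org/protocol/muc"


def join_presence(room_jid: str, nick: str) -> ET.Element:
    """Join stanza: directed presence that carries the muc element."""
    presence = ET.Element("presence", {"to": f"{room_jid}/{nick}"})
    ET.SubElement(presence, f"{{{MUC_NS}}}x")
    return presence


def update_presence(room_jid: str, nick: str) -> ET.Element:
    """Plain "update" presence without the muc element, sent right after
    joining so that the last directed presence is not a join request."""
    return ET.Element("presence", {"to": f"{room_jid}/{nick}"})


def leave_old_nick(room_jid: str, old_nick: str) -> ET.Element:
    """Unavailable presence to the old nickname, sent after a successful
    nick change so the old directed presence is withdrawn."""
    return ET.Element(
        "presence", {"to": f"{room_jid}/{old_nick}", "type": "unavailable"}
    )


# Hypothetical room address, for illustration only.
ROOM = "room@conference.example.org"
join = join_presence(ROOM, "dwd")
update = update_presence(ROOM, "dwd")
leave = leave_old_nick(ROOM, "dwd")
```

If only the last directed presence is replayed, as I understand it, then the update stanza is what gets repeated rather than the join.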

The alternative is that servers ignore Google, carry a special workaround for Google, or else avoid improvements across the board due to Google.

Personally, I'm willing to dig my heels in a bit on this. I can't think of any cases where repeating directed presence is a good idea, and in the absence of the specification mentioning this, I don't believe that implementations should be doing so. Unless I see a reason to change my mind, and/or unless customers start to complain, I see no reason to change our behaviour.


> If indeed few mucs
> are supporting this, then maybe we have an opportunity to amend this problem
> in XEP-0045.  That is, change the XEP to make it clear that the muc element
> does not cause rejoins, and possibly look for a different rejoin solution
> that does not break the presence model.
>

That would be fine, we can update our code easily, and would be glad
to see it back in action. However the issue can be solved more
generally by implementing XEP-0198 for s2s, and logic to make
unavailable any remote users when it's detected that their server has
crashed. This is my current goal.

Yes, also if we ensure servers respond correctly to probes when directed presence is involved we can probe in various cases - that said, I know M-Link doesn't respond correctly in this case, although we're working on that. (I'm curious as to whether other servers do, as well - this is a bis thing we've not caught up with yet, AFAIK).

As an aside, here, it may be required that clients send unavailable to their old nickname after a nick change, as suggested above as a workaround to Google, since the server has to track the directed presence in order to send unavailable and respond to pings - if the client never sends the unavailable to match the directed presence, then various state mismatches could occur.

Dave.
--
Dave Cridland - mailto:[email protected] - xmpp:[email protected]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
