On Thu Jun 24 21:52:22 2010, Matthew Wild wrote:
On 24 June 2010 21:33, Justin Karneges
<[email protected]> wrote:
> It's a common problem to join a muc that already thinks you are joined, and
> then the presence you send is interpreted as a mere status change rather
> than a full join. Then you don't get the room roster, history, etc. Kev
> informs me that the <x xmlns="http://jabber.org/protocol/muc"> element
> (hereby referred to as "the muc element") is supposed to solve this problem.
> You include it only on join stanzas, but not on status change stanzas. This
> way, if a muc sees the element but thought you were already joined, it can do
> a proper rejoin.
>
Yes, Prosody has had this code since the early days, however we
currently have it commented out due to Google Talk's issues. Gajim
also included the element on nick changes, but we ensured this was
fixed, and added a workaround for it.
But there's little way we can work around Google's oddity (well
technically there is, but none I'd be happy with releasing).
There is, in fact, a workaround in M-Link, too, in as much as it's
possible to strip out the XEP-0045 control element on inbound
presence from a domain before the processing code ever sees it. I'd
be loath to put that into production.
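
For illustration only, stripping of this kind might look roughly like the Python sketch below. This is not M-Link's code: the function name, the example domain, and the idea of keying on a set of sender domains are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

MUC_NS = "http://jabber.org/protocol/muc"

def strip_muc_element(presence_xml, replaying_domains):
    """Drop the XEP-0045 <x/> child from presence sent by listed domains.

    Hypothetical helper: replaying_domains is an assumed set of domains
    known to replay directed presence.
    """
    presence = ET.fromstring(presence_xml)
    sender = presence.get("from", "")
    # "user@domain/resource" -> "domain"
    domain = sender.split("/", 1)[0].rsplit("@", 1)[-1]
    if domain not in replaying_domains:
        return presence_xml  # leave other senders untouched
    for child in list(presence):
        if child.tag == "{%s}x" % MUC_NS:
            presence.remove(child)
    return ET.tostring(presence, encoding="unicode")
```

The obvious cost is that a genuine join from such a domain can no longer be told apart from a status change, which is why it is only a workaround.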
But don't be coy about this - this is an interop bug, not a mere
oddity. While I don't see anything in the spec suggesting that
directed presence should be repeated, I admit there's nothing in the
spec about it not being repeated either, so we either have a bug in
the spec (if Google insist the spec allows them to do this) or a bug
in GTalk (if they admit they shouldn't). Either way, it needs
resolution.
> However, this seems to break with servers that replay directed presence.
> Allegedly gtalk does this. Every 5 minutes, the client's server replays the
> directed presence to the muc, which includes the muc element, causing the
> user to constantly rejoin the muc (at least, for those mucs that respect the
> muc element properly).
>
> Some solutions:
> 1) Servers shouldn't replay directed presence.
I don't see why randomly re-sending join requests shouldn't result in
multiple joins to a room. Broadcast presence is a different case: it
is more of a "state" than an instruction.
> 2) If presence is replayed, replay only the elements safe to replay.
>
That requires the server to know which elements those are - MUC
obviously isn't, but what's to say more won't come along? (e.g.
temporary presence subscriptions).
There's actually a (3), which is "If you replay directed presence,
then add a delay".
We already add a delay element to other cases of repeated presence,
after all, such as in response to probes.
In this particular case, this'd leave the following cases:
User        Not Joined    Joined
Delay       Join          Ignore
No Delay    Join          Rejoin
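
As a rough sketch, the table above amounts to the following decision function. The function and argument names are hypothetical, and it assumes the presence being handled carries the muc element.

```python
def action_for_muc_presence(already_joined, has_delay):
    """Decide what a MUC does with presence that carries the muc element.

    already_joined: the room already lists this user as an occupant.
    has_delay: the stanza carries a delay element, i.e. it is a replay.
    """
    if not already_joined:
        return "join"    # delay or not, a user who isn't in the room joins
    if has_delay:
        return "ignore"  # replayed presence from an existing occupant
    return "rejoin"      # fresh join request from a stale occupant
```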
Of course, I'm not convinced that this solves nickname changes, which
is another long-standing Google interop bug.
There's another issue caused by this bug - for every repeated
presence, that causes a presence element to be resent by the MUC to
every occupant, unless we also start to filter presence stanzas.
(Stanzas per sec from this is G*(G+N)/300, where G is Google users
and N is non-Google users, kids. In jdev, as I write, that's G==3 and
N==25, hence averaging 0.3 stanzas a second - nothing we can't
handle, but add in another 6 Googlers and it's already reached 1/sec.)
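
To make the arithmetic concrete, here is the same formula as a small Python sketch. The function name is made up, and the 300 is the five-minute replay interval in seconds.

```python
def replay_stanzas_per_second(google_users, other_users, replay_interval=300):
    """Each Google user's replay fans out to every occupant (G + N)."""
    return google_users * (google_users + other_users) / replay_interval

print(replay_stanzas_per_second(3, 25))      # 3 * 28 / 300 = 0.28
print(replay_stanzas_per_second(3 + 6, 25))  # 9 * 34 / 300 = 1.02
```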
Agreed, but what's done is done, and without using presence for MUC we
wouldn't have the unavailable on disconnect.
Right - MUC is a presence-based system, so needs to operate over
presence.
The only other solution would be to make distinct the history,
current occupants, and current subject retrieval, which is also an
option. Ideally, you'd also need to include nick changes here which
starts to radically impinge on the design.
> Kev additionally informs me that M-Link's muc service may be the only one
> that performs rejoins properly when receiving the muc element.
It wasn't the first, but it likely is the only one at the moment. I
didn't consider it acceptable to release logic that is broken with one
of the largest XMPP deployments on the internet, so as I said, we
removed it from Prosody with a view to re-adding it if/when Google
finally cleaned up their act.
It's not clear to me why Google does this. I have attempted to
contact them to resolve the interop issue, and I've yet to have a
response, but I'd hope that they're keen to sort this out - it's
entirely possible I'm using the wrong contact details or something,
so if Google people are reading this, please, I'm keen to get a
solution.
But aside from being rather spectacularly wasteful of bandwidth, a
client should be able to tell that at least the history is just that
- due again to the delay. In addition, it's only the last presence
that Google replays, as far as I understand things, so clients can
work around this bug by sending an "update" presence after joining.
They can even work around the nickname changing bug, as Kev pointed
out to me, by sending unavailable to the old nickname after
successfully changing nickname. So if client developers find their
users are suffering, there are workarounds they could deploy, which
are much easier to do there.
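
As a sketch of what those client-side workarounds might look like on the wire, the Python below just builds the stanzas as strings. The function names and the example room JID are assumptions for illustration, not taken from any client.

```python
from xml.sax.saxutils import quoteattr

MUC_NS = "http://jabber.org/protocol/muc"

def join_then_update(room_jid, nick):
    """Join stanza, then a plain update so the last presence has no muc element."""
    to = quoteattr("%s/%s" % (room_jid, nick))
    join = '<presence to=%s><x xmlns="%s"/></presence>' % (to, MUC_NS)
    update = "<presence to=%s/>" % to
    return [join, update]

def leave_old_nick(room_jid, old_nick):
    """Unavailable presence to the old nickname, sent after a successful change."""
    to = quoteattr("%s/%s" % (room_jid, old_nick))
    return '<presence to=%s type="unavailable"/>' % to

for stanza in join_then_update("jdev@conference.example.org", "dwd"):
    print(stanza)
```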
The alternative is that servers either ignore Google, have special
workarounds for Google, or else avoid improvements across the board
due to Google.
Personally, I'm willing to dig my heels in a bit on this. I can't
think of any cases where repeating directed presence is a good idea,
and in the absence of the specification mentioning this, I don't
believe that implementations should be doing so. Unless I see a
reason to change my mind, and/or unless customers start to complain,
I see no reason to change our behaviour.
> If indeed few mucs are supporting this, then maybe we have an opportunity to
> amend this problem in XEP-0045. That is, change the XEP to make it clear that
> the muc element does not cause rejoins, and possibly look for a different
> rejoin solution that does not break the presence model.
>
That would be fine, we can update our code easily, and would be glad
to see it back in action. However the issue can be solved more
generally by implementing XEP-0198 for s2s, and logic to make
unavailable any remote users when it's detected that their server has
crashed. This is my current goal.
Yes, also if we ensure servers respond correctly to probes when
directed presence is involved we can probe in various cases - that
said, I know M-Link doesn't respond correctly in this case, although
we're working on that. (I'm curious as to whether other servers do,
as well - this is a bis thing we've not caught up with yet, AFAIK).
As an aside, here, it may be required that clients send unavailable
to their old nickname after a nick change, as suggested above as a
workaround to Google, since the server has to track the directed
presence in order to send unavailable and respond to pings - if the
client never sends the unavailable to match the directed presence,
then various state mismatches could occur.
Dave.
--
Dave Cridland - mailto:[email protected] - xmpp:[email protected]
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade