Re: [Standards] LAST CALL: XEP-0313 (Message Archive Management)

Georg Lukas Wed, 31 Mar 2021 09:49:02 -0700

* Jonas Schäfer <[email protected]> [2021-03-16 21:23]:
> 1. Is this specification needed to fill gaps in the XMPP protocol
> stack or to clarify an existing protocol?


Yes.

> 2. Does the specification solve the problem stated in the introduction
> and requirements?

Mostly yes.

> 3. Do you plan to implement this specification in your code? If not,
> why not?

Implemented it in yaxim in 2019.

> 4. Do you have any security concerns related to this specification?

Yes. Time and again, specifications that use Message Forwarding have
fallen victim to impersonation attacks (there are a number of CVEs
around, such as CVE-2017-5589, CVE-2019-16235, and CVE-2020-26547).

The XEP urgently needs a corresponding section in the Security
Considerations, and ideally also a negative example like
https://xmpp.org/extensions/xep-0280.html#example-11
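
To illustrate the kind of check I mean, here is a minimal Python sketch of
a client-side guard against forged archive results. It assumes that results
from the user's own archive carry either no 'from' or the account's bare
JID, and that results from another archive (e.g. a MUC) carry the JID that
was queried. The function name, the example JIDs and the plain string
comparison are my own illustration; real code would use proper JID
comparison and also match the queryid.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"


def bare(jid):
    """Strip the resource part from a JID (simplified, no stringprep)."""
    return jid.split("/", 1)[0]


def is_trusted_mam_result(stanza_xml, own_jid, queried_archive=None):
    """Return True only if a <result/> wrapper comes from the expected archive.

    queried_archive is None when the client queried its own account archive.
    """
    msg = ET.fromstring(stanza_xml)
    if msg.find(f"{{{MAM_NS}}}result") is None:
        return False  # not a MAM result at all
    sender = msg.get("from")
    if sender is None:
        # A missing 'from' means "from my own account"; that is only
        # acceptable if the own archive was queried.
        return queried_archive is None
    expected = bare(own_jid) if queried_archive is None else queried_archive
    # Exact match on purpose: a full JID (even of the own account) or any
    # third-party JID must not be able to inject archive results.
    return sender == expected
```

Anything that fails this check should be treated as a regular (and rather
suspicious) message, not as archive content.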

> 5. Is the specification accurate and clearly written?

I think it's still missing a number of important points.


§6.1.1: the business rules for user archives are inadequate in two
regards:

MUC messages in user archive: I think that implementation practice has
clearly shown that storing MUC messages in the user archive is a Bad
Idea, and nobody is doing it anyway. Also the server is probably not in
a position to track a user's MUC activity and query all MUCs for whether
they implement some sort of message storage. This part should be
converted into "SHOULD NOT" or "MUST NOT".

Storage rules: those look very much like the original ones from the
initial specification, and I think we have learned much since then.

Prosody will store "normal" messages with a body, or "chat"
messages that are not empty after being stripped. By default, it will
strip chat states, but it will count origin-id or <x muc> as elements
that are worth storing.

Part of the problem is an implementation issue with storing the stripped
message and not the original <https://issues.prosody.im/1423> but the
general problem of clearly defined storage rules remains.
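
To make the rule as I described it above concrete, here is a rough Python
sketch. It is only my reading of the behaviour, not Prosody's actual code;
the function name is made up, and the only thing stripped is chat states.

```python
import xml.etree.ElementTree as ET

CHATSTATES_NS = "http://jabber.org/protocol/chatstates"
BODY_TAGS = ("body", "{jabber:client}body")


def should_store(stanza_xml):
    """Decide whether a message stanza would be archived under the rule above."""
    msg = ET.fromstring(stanza_xml)
    mtype = msg.get("type", "normal")  # a missing type defaults to "normal"
    # "Strip" chat states: ignore every child in the chat states namespace.
    remaining = [c for c in msg
                 if not c.tag.startswith("{" + CHATSTATES_NS + "}")]
    if mtype == "normal":
        return any(c.tag in BODY_TAGS for c in remaining)
    if mtype == "chat":
        # Anything left over counts, including origin-id or <x muc>.
        return len(remaining) > 0
    return False
```

Note how a "chat" message carrying nothing but an origin-id is stored,
while a pure chat state notification is not.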

This XEP needs something like the 0280 Recommended Rules
<https://xmpp.org/extensions/xep-0280.html#recommended-rules> but it
should be part of the XEP and not a later addition guarded by a separate
namespace. Maybe.

Also it would be great to persist message errors for sent messages. But
this is a separate can of worms.


My comment from the last 0313 LC about letting the client know if the
MAM preferences are still "undefined", so that the client can ask the
user once, now applies to XEP-0441, so I hope I'll remember to bring it
up again in the corresponding Last Call.


The Business Rules section needs clear guidance to client
implementations that want to do "full sync", i.e. obtain all messages
received by the account since the client was last online, without too
many duplicates.

This is a complex problem because offline messages might contain
everything that is also in MAM, or might have been drained by another
client in the meantime, so that offline messages will only give you the
end of chat history.

Furthermore, I'm not sure if messages received by a client from offline
history are supposed to contain the respective MAM-ID, so deduplicating
here might be very adventurous, as the same message might arrive from
offline history without a MAM ID and from MAM with a MAM ID, and the @id
attribute might not be unique.

There is a separate standards@ thread regarding how to treat offline
history for MAM-enabled clients, but it only solves part of the above
problem.


There is no "atomic" switch between fetching messages from MAM and
receiving live traffic, so a client needs to remember the last locally
known MAM ID before sending initial presence, then request MAM after
that last-known-MAM-ID until it either catches up with offline history
or depletes the MAM archive, duplicating messages between offline
history and MAM fetching.

The naive approach of first fetching MAM, then sending initial presence
causes a subtle race condition for messages that are delivered to your
account in the brief moment after you completed fetching from MAM.

There is also a problem if a client crashes during this catch-up phase
(this is more common on mobile systems than you'd hope), as it needs to
either persist the last-known-MAM-ID or keep the incoming MAM fetch in
memory until everything is processed, as otherwise it would populate the
message database with new MAM-IDs that would be incorrectly considered
as the new last-known-MAM-ID after a client restart.
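
A minimal Python sketch of the "keep the fetch in memory" variant, to show
what I mean. The fetch_page callback and the dict used as a message store
are hypothetical stand-ins for the paged MAM query and the client's
database; in a real client the final commit would be a single database
transaction.

```python
def catch_up(fetch_page, store, page_size=50):
    """Fetch everything after the persisted last-known-MAM-ID.

    fetch_page(after_id, limit) is assumed to return
    (list of (mam_id, payload) tuples, complete_flag).
    Nothing is written to the store until the whole fetch has succeeded.
    """
    cursor = store.get("last_known_mam_id")
    pending = []
    while True:
        page, complete = fetch_page(cursor, page_size)
        pending.extend(page)
        if page:
            cursor = page[-1][0]  # page forward in memory only
        if complete or not page:
            break
    # Commit step: only reached if no exception (crash) happened above.
    messages = store.setdefault("messages", {})
    for mam_id, payload in pending:
        messages.setdefault(mam_id, payload)
    if pending:
        store["last_known_mam_id"] = pending[-1][0]
    return len(pending)
```

If the client dies in the middle of the loop, the persisted
last-known-MAM-ID is untouched and the next start simply repeats the fetch,
at the cost of holding the whole backlog in memory.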


We are still missing a "MAM subscription" mechanism, where a client
would automatically receive the MAM-ID of all messages it sends, so that
it can properly de-duplicate them from a later MAM fetch. As it is now,
a client needs to exclude sent messages from the "obtain
last-known-MAM-ID" algorithm, and then assign the MAM-ID for sent
messages that are reflected to it on the next MAM fetch.
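
Roughly, the workaround looks like the following Python sketch. The record
layout and the function names are invented for illustration, and matching
reflected messages by origin-id is an assumption on my side; a client might
just as well match on some other identifier it controls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalMessage:
    seq: int                      # local insertion order
    outgoing: bool
    origin_id: Optional[str]
    mam_id: Optional[str] = None  # unknown for freshly sent messages


def last_known_mam_id(db):
    """Newest MAM-ID, ignoring sent messages as described above."""
    for m in sorted(db, key=lambda m: m.seq, reverse=True):
        if not m.outgoing and m.mam_id is not None:
            return m.mam_id
    return None


def apply_mam_result(db, mam_id, outgoing, origin_id):
    """Merge one MAM result; return True if a new message was inserted."""
    if any(m.mam_id == mam_id for m in db):
        return False  # already known by MAM-ID
    if outgoing and origin_id is not None:
        for m in db:
            if m.outgoing and m.mam_id is None and m.origin_id == origin_id:
                m.mam_id = mam_id  # reflection of a message we sent
                return False
    db.append(LocalMessage(len(db) + 1, outgoing, origin_id, mam_id))
    return True
```

A subscription mechanism would make the second loop unnecessary, because
the MAM-ID would already be known at the time the message is stored
locally.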


Not sure which parts of that belong in bind2 though.


Georg

