When implementing MAM in a client, there is a case where MAM results
contain duplicates of messages the client has already seen. To prevent this
duplication, the MAM ID for a stanza would need to appear on the newly
generated non-MAM (live) copy of that stanza.

As background, imagine a client which, when it receives a new stanza from
the server, presents a view that renders the new stanza and then queries MAM
for the chat history between the two JIDs. When JID1 sends a message to
JID2, it is logged in the MAM store and forwarded on to JID2. JID2 then
requests MAM results for JID1 and gets back the last 50 messages, which
include the stanza that indirectly triggered the MAM request. The result is
two copies of that stanza in the message view between JID1 and JID2.

Note that while the common case is that only the most recent stanza is
duplicated, more than one can be: the MAM IQ response is asynchronous, so
its results may arrive interleaved with new live messages.

If the MAM ID were shown on newly generated inbound messages, the client
would be able to ask MAM for all messages before that ID, preventing
duplication while still allowing new messages to be shown in the correct
order.
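
To make the difference concrete, here is a rough Python sketch that models
the archive as an in-memory list rather than a real server. The `Archive`
class, its `query_last` and `query_before` methods, and the `a1`, `a2`, ...
ID format are all made up for illustration; `query_before` just stands in
for a "page of results before this ID" request.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Archived:
    mam_id: str  # ID assigned by the archive (hypothetical format)
    body: str


@dataclass
class Archive:
    """Toy stand-in for a MAM store; not a real server API."""
    items: List[Archived] = field(default_factory=list)

    def store(self, body: str) -> Archived:
        entry = Archived(mam_id=f"a{len(self.items) + 1}", body=body)
        self.items.append(entry)
        return entry

    def query_last(self, limit: int) -> List[Archived]:
        # "Give me the most recent `limit` messages."
        return self.items[-limit:]

    def query_before(self, mam_id: str, limit: int) -> List[Archived]:
        # "Give me up to `limit` messages strictly before `mam_id`."
        index = next(i for i, m in enumerate(self.items) if m.mam_id == mam_id)
        return self.items[max(0, index - limit):index]


archive = Archive()
for text in ["hi", "how are you?"]:
    archive.store(text)

# A new message is archived and then delivered live. Here the live copy
# carries the archive ID, which is the change being proposed.
live = archive.store("are you there?")
# Another message is archived before the history query is answered.
later = archive.store("hello?")

# Without the ID: the view shows the live message plus the last 50.
naive_view = [live.body] + [m.body for m in archive.query_last(50)]

# With the ID: history strictly before the live message, then the live one.
history = archive.query_before(live.mam_id, 50)
fixed_view = [m.body for m in history] + [live.body]
```

In this toy model `naive_view` contains "are you there?" twice, while
`fixed_view` contains it once and in order. The later "hello?" message is
left out of the history and would show up through normal live delivery.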

Querying MAM by message time will not work either, given the potential
differences in clocks between arbitrary clients and the MAM store.
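
A small sketch of why clock differences break a time-based query, again in
Python with a toy in-memory archive. The timestamps are plain integers in
seconds, and `query_before_time` and `history_for` are hypothetical helpers,
not part of any MAM implementation.

```python
from typing import List, Tuple

# (server_timestamp, body) pairs as recorded by the archive's own clock.
archive: List[Tuple[int, str]] = [
    (100, "hi"),
    (110, "how are you?"),
    (120, "are you there?"),
]


def query_before_time(end: int) -> List[str]:
    """Toy time-based query: everything the archive stamped before `end`."""
    return [body for stamp, body in archive if stamp < end]


def history_for(live_body: str, client_receive_time: int) -> List[str]:
    # The client only knows its own clock reading at receipt time.
    return query_before_time(client_receive_time) + [live_body]


# Client clock 30 s ahead of the archive: the live message is duplicated.
ahead = history_for("are you there?", client_receive_time=150)
# Client clock 15 s behind the archive: an older message is dropped.
behind = history_for("are you there?", client_receive_time=105)
```

With the client clock ahead, the live message comes back from the archive
as well and is shown twice. With the client clock behind, "how are you?"
falls outside the query window and is missing from the view.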
Thoughts?
-bjc