Should I migrate the Chat Markers XEP to a message-based solution and assume that the issues around storing messages with no body will be resolved in some way?
Regards
Spencer

On Fri, May 31, 2013 at 4:05 PM, David Laban <[email protected]> wrote:
> On Thu, 2013-05-30 at 18:16 +0100, Matthew Wild wrote:
> > A more general issue is whether this XEP (or rather the specific
> > protocol it defines) is necessary at all. I'm not saying it definitely
> > isn't, but need a little more persuasion. For example it seems that
> > the primary issue it is working around is that XEP-0136 and XEP-0313
> > might not save messages with no body. Might it not be easier to solve
> > this problem instead?
>
> Sorry I'm late to the party. I have actually been discussing this with
> Spencer in the office over the last couple of weeks, so maybe I can give
> some more motivation for why we feel that an XEP is required. I'm not
> too attached
>
> There are actually a bunch of things that Chat Markers is trying to
> solve:
>
> 1) Atomic "Read Receipt" messages (Or "Seen Receipts" or whatever[1]).
> 1.1.1) Currently, our pre-alpha implementation of "Seen by $user"
> markers requires each client to keep track of the state machine
> (xep-0085) for *each* remote resource. Any incoming <active/>
> notification, or any incoming <received/> notification from a resource
> that is in the 'active' state currently updates the "Seen by $user"
> marker.
>
> 1.2.1) xep-0022 basically solves this problem, but it is marked as
> obsolete. I didn't feel like digging it out of the grave, but maybe we
> should re-consider it?
>
> 1.3.1) I suggested that we could simply include our state (if active) as
> a sister element to <received/>, but Spencer pointed out that xep-0184
> section 7 states: "When the recipient sends an ack message, it SHOULD
> ensure that the message stanza contains only one child element". What
> would break if we did this?
>
>
> 2) State recovery for disconnected clients that come online.
> 2.1.1) Currently, this is impossible, so our "Seen by $user" marker
> stays where it is, and messages appear as "Not delivered" when they are
> retrieved from MAM until a reply is received from the remote party (at
> which point, we assume that their client has done state recovery from
> MAM, and mark all messages as received).
>
> 2.1.2) xep-0184 section 5.5 Archived Messages states "An entity MUST NOT
> send an ack message when a user views messages that have been archived
> or stored on the client or the server (e.g., via Message Archiving [8]),
> only when first receiving the message."
>
> This is annoying, but quite understandable (e.g. what should a client do
> if it gets <received id=1234/> when it doesn't have any knowledge of
> <message id=1234/> or when it might have been sent?)
>
> 2.3.1) We could allow MAM to store *all messages*, but then a query
> for "how many unread messages do I have since $time" returns a hugely
> inflated answer. The only way to get an accurate count would then be to
> retrieve *all messages* and classify them.
>
> 2.3.2) We could create a clone of the MAM XEP (let's call it Message
> State Recovery: MSR) that stores everything without a body, and let the
> clients query that in order to do state recovery.
>
> A little benchmarking of our client's <message/> datastore and a simple
> thought experiment suggest that this will be many times as large as the
> MAM database in a naive implementation (when Kate sends a message to
> Pete, her client will send <active/>; <composing/>; (<paused/>;
> <composing/>)* <body/>... and each of Pete's clients will send
> <received/> and 0 or more will send <active/>).
>
> Note that we would need to bend the rules for xep-0184 (see 2.1.2) for
> this to be useful. Specifically (after retrieving all messages from MSR
> and MAM) for each incoming message in MAM that doesn't have a
> corresponding outgoing entry in MSR, send a receipt anyway.
>
> 2.3.3) We could get the server to store markers for "delivered" and
> "seen" etc. This is what Chat States attempts to do.
>
>
> 3) Efficiency
> 3.1.1) 2.3.2 and 1.1.1 cover a couple of the obvious problems with what
> we have now.
>
> 3.3.1) If we create an MSR XEP, what is the minimum amount of information
> that we can store? If we have solved problem 1) then we can make a lot
> of optimizations. For example, if we used my 1.3.1 proposal, then could
> we simply store the last message of each type?
>
> Concretely, could we have a database with the following uniqueness
> constraint:
> (sender_barejid, receiver_barejid, sibling_ns, sibling_name)
>
> where sibling_ns is the namespace of the element after <received/> in
> the <message/> stanza, and sibling_name is its name (e.g. 'active' or
> NULL)?
>
>
> And in reply to Matthew Wild's specific comments:
>
> > For XEP-0136, it appears to be configurable already in the archiving
> > preferences (surprise surprise!). For XEP-0313, I'm open to discussion
> > about what it recommends.
>
> We have gone for a Message Archive Management + Message Carbons approach
> so far, which means that clients only need to know how to unpack
> <forwarded/>. I don't fancy forcing 3 teams to implement XEP-0136 if I
> can avoid it.
>
> > XEP-0313 intentionally remains silent on most policy decisions like
> > that. However it seemed sensible at the time that nobody would want to
> > archive messages without a body, which on the network today are
> > primarily chat states and notifications of various sorts. The XEP is
> > still experimental, perhaps we can come up with better rules? I don't
> > know, that's a discussion for another thread.
>
> See 2.3.1 for why I think that MAM's rules are probably correct, and if
> anything, we should have a parallel store for messages without bodies.
>
> > Forgetting archiving completely for the moment, offline messages might
> > do enough already, no?
> > XEP-0160 doesn't actually have any
> > recommendations about what to store or what not to store. It seems
> > that servers are expected to identify things like chat states already
> > (XEP-0085 says that servers "SHOULD NOT store them offline"). This
> > doesn't seem like a good model, but it's what we currently have.
>
> XEP-0160 breaks in any use-case that involves multiple mobile devices
> per account. I am actually thinking of disabling support for it on our
> server completely, since all of our supported clients understand Message
> Archive Management.
>
>
> David.
>
>
> [1] The jdev thread was repeatedly derailed by people querying the
> semantics of "read", so if I say "read" and it annoys you, translate it
> to "seen" in your head. There are also use-cases for states like
> "notified about" and "sent/delivered out-of-band" (e.g. via Apple Push
> Notifications or SMS) and "acknowledged". I would prefer to avoid going
> down that particular rabbit-hole yet, but any protocol should be
> extensible in that direction (the benchmark for extensibility here is
> the set of SIP status codes 100 Trying (= reached first server), 180
> Ringing (= notified about), 200 OK (= acknowledged/accepted)).
>
> --
>
> Section numbers are of the form x.y.z) where x = topic, y = status: (1=
> where are we now, 2= where did we come from, 3= where could we be), z =
> incrementing integer. Sometimes I have nothing to say about x.2.z.
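
To make the 3.3.1 question concrete, here is a rough sketch of a store with that uniqueness constraint, using Python's sqlite3 module. It is only an illustration under assumptions, not anything the thread specifies: the table name msr, the message_id column, and the helpers open_store, record_receipt and latest are all made up for the example. An empty string stands in for the NULL sibling, because SQLite treats NULLs in a unique key as distinct and so would never replace the old row.

```python
import sqlite3

# Hypothetical schema: at most one row per
# (sender_barejid, receiver_barejid, sibling_ns, sibling_name).
# '' stands in for "no sibling element" (NULL in the email).
SCHEMA = """
CREATE TABLE msr (
    sender_barejid   TEXT NOT NULL,
    receiver_barejid TEXT NOT NULL,
    sibling_ns       TEXT NOT NULL DEFAULT '',
    sibling_name     TEXT NOT NULL DEFAULT '',
    message_id       TEXT NOT NULL,
    PRIMARY KEY (sender_barejid, receiver_barejid, sibling_ns, sibling_name)
)
"""

# Namespace of XEP-0085 chat state elements such as <active/>.
CHATSTATES_NS = "http://jabber.org/protocol/chatstates"


def open_store():
    """Create an in-memory store (assumed name, for illustration only)."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def record_receipt(db, sender, receiver, message_id,
                   sibling_ns="", sibling_name=""):
    """Keep only the most recent receipt for this key."""
    db.execute(
        "INSERT OR REPLACE INTO msr VALUES (?, ?, ?, ?, ?)",
        (sender, receiver, sibling_ns, sibling_name, message_id),
    )


def latest(db, sender, receiver):
    """Return {sibling_name: message_id} for one sender/receiver pair."""
    rows = db.execute(
        "SELECT sibling_name, message_id FROM msr "
        "WHERE sender_barejid = ? AND receiver_barejid = ?",
        (sender, receiver),
    )
    return dict(rows)
```

With this shape the table holds at most one row per receipt type for each pair of bare JIDs, so a plain <received/> row would act as the "delivered" marker and a <received/> plus <active/> row as the "seen" marker.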

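The recovery rule at the end of 2.3.2 (send a receipt for every incoming message in MAM that has no corresponding outgoing entry in MSR) could look roughly like the sketch below. The dictionary keys ("id", "to", "from", "acked_id") and the function name receipts_to_send are assumptions made for the example, not part of MAM or of any proposed MSR protocol. As David notes, acting on the result means bending xep-0184 section 5.5.

```python
def receipts_to_send(mam_messages, msr_entries, my_barejid):
    """Return ids of incoming archived messages we never acknowledged.

    mam_messages: dicts with "id" and "to" (bare JID), from the MAM query.
    msr_entries:  dicts with "from" (bare JID) and "acked_id", i.e. the
                  receipts found in the hypothetical MSR store.
    """
    # Ids that one of our own clients has already acknowledged.
    already_acked = {
        entry["acked_id"]
        for entry in msr_entries
        if entry["from"] == my_barejid
    }
    # Incoming messages with no matching outgoing receipt, in MAM order.
    return [
        message["id"]
        for message in mam_messages
        if message["to"] == my_barejid and message["id"] not in already_acked
    ]
```

For example, if MAM holds incoming messages m1 and m2 and MSR holds only our receipt for m1, the function returns ["m2"], and the client would send one receipt for m2.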