Right, thoughts about 301 (consider them early Last Call feedback, I guess. I think it would be worth addressing them, or at least producing an errata list of your expected edits, before asking too many other people to review this (e.g. LC) as it took me a considerable time and it'd be a shame to waste people's effort commenting on things due to be changed):
== Introduction ==
This seems mostly fine. I wonder about the reference to realjabber.org. Partly because it's a reference to a potentially less stable URL, and partly because I think the name is inflammatory - did the XSF or Cisco grant the trademark use?
Do we need two references to how much deaf people like this within ten lines?

== Requirements ==
2.3.4 doesn't seem quite right - what we want is for it to be possible to produce gateways for interoperability - not that XEP 301 implementations themselves interop with other networks?
2.4 Doesn't seem to be about Accessibility.
2.4.4 Doesn't make much sense to me.

== Glossary ==
"real-time text"'s definition seems wrong - it isn't necessarily transmitted instantly in 301. It would seem more natural to define this in terms of real-time, defined on the immediately preceding line.

== Protocol ==
"to allow the recipient to see the sender type the message" - I'd suggest "to allow the recipient to receive the latest state of the message as it is being typed" - RTT doesn't allow us to see the sender :)
Example 1: I suggest that this could be better demonstrated by not cutting at the word boundaries - "He", "llo, m", "y Juliet!" maybe, or something like that. Experience and/or cynicism say that implementers are quite likely to look at the examples, ignore the text, and misunderstand what's going on if the examples provide convenient semantics not required by the protocol.
"The bounds of seq is 31-bits, the range of positive values of a signed integer" - I'd be inclined to make this something like "The seq attribute has an upper bound of 2147483647 (2^31 - 1). If this upper bound is reached, the following RTT element will reset the seq attribute to 0 (i.e. at the upper bound the values would be, in successive stanzas, 2147483646, 2147483647, 0, 1)" or words to that effect.
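To make the wraparound I'm suggesting for seq concrete, here's a rough sketch (Python, purely illustrative; next_seq and SEQ_MAX are names I've made up, and this reflects my proposed wording rather than anything currently in the XEP):

```python
SEQ_MAX = 2**31 - 1  # 2147483647, the proposed upper bound (my suggestion, not XEP text)


def next_seq(seq: int) -> int:
    """Return the seq for the following <rtt/> element, wrapping to 0 after SEQ_MAX."""
    if not 0 <= seq <= SEQ_MAX:
        raise ValueError(f"seq out of range: {seq}")
    return 0 if seq == SEQ_MAX else seq + 1


# Successive stanzas around the upper bound.
values = [SEQ_MAX - 1]
for _ in range(3):
    values.append(next_seq(values[-1]))
print(values)  # [2147483646, 2147483647, 0, 1]
```

Spelling the rule out like this avoids leaving "31-bits" open to interpretation by implementers.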
It's not clear to me why setting seq to a random initial value should help with MUC or multi-resource cases - in these cases you know the full JID of the entities involved and a random start point seems to make it harder to understand what's going on, rather than easier.
"The event attribute MAY be omitted from the <rtt/> element during regular real-time text transmission" - what is the alternative you're allowing clients, and what is "regular real-time text transmission"?
4.2.2 - "Recipient clients MUST initialize a new real-time message for display" - how things are rendered in clients is generally not in scope for XEPs, maybe just remove 'for display'?
4.2.2 - "Senders MAY send subsequent <rtt/> elements that do not contain an event attribute" - if clients want to always send event attributes, what would they send?
4.2.2 - "Recipients MUST treat 'reset' the same as 'new'." - I'm not sure that's quite right. If recipients want to render 'new' differently that seems to be fine. Maybe "Recipients MUST reset the state of the current real-time message when receiving a 'reset' (returning the real-time message to the same state as when receiving a 'new')"?
4.2.2 - event='init' - I'm reading the XEP linearly so maybe this will be clear later, but at this point in reading the XEP it's not clear to me what the inclusion of event='init' buys us.
4.2.2 - The normatives here don't seem to be congruent. event='cancel' is OPTIONAL, yet we have a SHOULD for behaviour on receiving them. Why not require recipient support?
4.2.3 - I don't think the intent here is clear. Particularly it's not OPTIONAL if you're doing RTT correction. So I think we need to tighten this up. There's a choice on discovery and it'll affect what needs to be said.
Choice 1) If you implement 308 and you also implement 301 you MUST support (at least receiving) RTT correction and ids are not OPTIONAL and MUST be included on the correction RTT.
Choice 2) You can implement 308 and 301 yet not support RTT correction - in which case supporting RTT correction is OPTIONAL, but if you do you MUST advertise appropriate disco features and MUST include ids etc.
4.3 - "The delivered message in the <body/> element is displayed instead of the real-time message" - maybe "The content of the <body/> element is considered the final text, rather than the state of the RTT calculations"?
4.3 - "In the ideal case, the message from <body/> is redundant since this delivered message is identical to the final contents of the real-time message." - can we s/message/text/ here? Calling child elements of <message/> stanzas messages seems potentially confusing.
4.3.1 - Is this redundant?
4.4 - The discussion of throttling here feels a bit odd. I don't like having references to servers dropping messages as part of congestion handling, as that's not compliant behaviour. The comments about 0.7 seconds being fine for not hitting throttles but smaller values hitting it seems a bit hit-and-miss - servers are free to implement whatever throttling they want, and I'm a little worried about recommending here what we think the state of the network is likely to be now or in the future.
4.5 - "the recipient can watch the sender" - this isn't quite right (similar to previous comment).
4.5.1 - I'm not sure that the use of quite cryptic one-character elements here is terribly useful.
4.5.1 - I think this has been commented on elsewhere, but using 'characters' here seems to be less clear than talking about code points. I understand the desire to mask implementers from needing exposure to code points, but I don't think that's going to ultimately help uptake or interoperability.
4.5.1 - I think if there are going to be SHOULDs in supported features we should try to explain in what circumstances it's acceptable to ignore the SHOULDs.
4.5.2 - Talking about message length here probably needs clarification - is it the number of characters (whatever they mean to different people), code points, normalised code points, octets on the wire...
4.5.3.2 - This might become clearer later, but at this stage it's not clear what 'positions' are.
4.5.3.3 - Apart from adding complexity I'm not sure what forward delete is buying us vs. backspace.
4.5.4 - I don't think trusting that nothing in the chain is going to transform unicode in any way is going to be sufficient for interoperability here. I think we need to consider normalising the text before RTT calculations are performed on it. I'm not entirely convinced, without going through specs in some detail, that an implementation that does choose to do normalisation somewhere en route is non-compliant, which is what's asserted here.
4.5.4.1 - Ah, OK. So you do require normalisation here - you need to say which type is required.
4.5.4.2 - This then forbids normalisation again.
4.5.4.2 - Question for Unicode experts. Are there any code points that would be illegal to transmit on their own, but are legal in combination with others? If so, they'd get rejected with stream errors, which would probably be bad. This section seems to imply that illegal UTF-8 encoding is expected, which is in turn illegal XMPP.
4.5.4.3 - "A single UTF-8 encoded character equals one code point" - this isn't true, is it?
4.5.4.3 - "different internal encodings (i.e. string formats) that is different" - s/is/are/
4.6 - "XMPP servers may drop <message/> elements (e.g. flooding protection)." - They can't.
4.6.1 - I think they need to do more than increment and check they increment - I think they need to increment/check in steps of 1.
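Roughly what I mean by checking in steps of 1, as a hedged sketch rather than proposed XEP text (Python; is_next_seq and SEQ_MAX are my own names, and the wrap to 0 assumes the upper bound I suggested for seq earlier):

```python
SEQ_MAX = 2**31 - 1  # assumed wrap point, per my earlier suggestion, not XEP text


def is_next_seq(last_seq: int, received_seq: int) -> bool:
    """True only if received_seq is exactly one step after last_seq."""
    expected = 0 if last_seq == SEQ_MAX else last_seq + 1
    return received_seq == expected


# A plain "did it go up?" check accepts 5 -> 7 even though the stanza
# carrying seq 6 never arrived; the step-of-1 check catches the gap.
print(7 > 5)                    # True
print(is_next_seq(5, 7))        # False
print(is_next_seq(5, 6))        # True
print(is_next_seq(SEQ_MAX, 0))  # True
```

The point is only that "it incremented" and "it incremented by exactly 1" are different checks, and just the latter detects a missing <rtt/> element.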
4.6.1 - "Recipients MUST keep track of separate real-time messages per sender, including maintaining independent seq values" - I think what you mean is that they "MUST track RTT per full-JID, and not collate across multiple full JIDs", rather than the present text, which suggests that they must track multiple RTT streams for a single full JID without providing guidance how. I think this needs tightening up to be clear of the intent.
4.6.2 - "Recipients MUST freeze the current real-time message" - it's not clear what freezing a message means.
4.6.3 - "Retransmission SHOULD be done at an average interval of 10 seconds during active typing or composing." - this seems like a lot of data getting sent across if these messages are large. I'd be much happier saying something like "Retransmission SHOULD NOT be done more frequently than once every 10 seconds".
6.1.4 - "it is acceptable for the transmission interval of <rtt/> to vary" - yet earlier there was a SHOULD saying it doesn't vary, wasn't there?
6.2.1 - I suspect this should be more prominent than buried inside Implementation Notes.
6.2.1 - I think that presence decloaking is probably a better approach to this than sending init.
6.2.1 - That said, if people disagree and want another 85-ish non-disco mess, I think this can be clarified a bit - at the moment it sounds like disco and init discovery are alternatives, rather than init only being a fallback for when disco isn't available. Perhaps something like:
"""
Activation of real-time text in a chat session (immediate or user-initiated) can be done by:
* Immediately transmitting real-time text (if the feature is advertised by the recipient, as described in Determining Support); or
* Where Disco knowledge isn't available (e.g. sending to an entity for which presence information isn't available, and thus the full JID isn't known and can't be queried) by sending a <message/> stanza containing only a "<rtt event='init'/>".
In this case there MUST be no further transmission of RTT elements until the recipient indicates support - either by exposing information necessary to use service discovery, or by replying with a (non-cancel event) RTT element of its own.
"""
6.3 - "All action elements only have absolute positioning, and positioning does not depend on previous action elements" - this isn't true, positioning is dependent upon processing of previous action elements - a deletion will effect a change of index in all subsequent code points.
6.4.1 - It might be useful to reference some method of calculating this. It's not immediately obvious to me that it's trivial to work out edits without resorting to something that ends up polynomial in the worst case (or oversimplifying the edit), so some guidance would be handy here.
6.4.3 - this says that implementations "may" do this, and I suspect that it really is discouraged rather than truly optional (indeed, the language elsewhere says as much).
6.4.4 - this looks like something discouraged, too, but this isn't mentioned that I can see.
6.5 - "Upon receiving Action Elements in incoming <rtt/> elements, they are added to a queue in the order they are received. This provides immunity to variable network conditions, since the queueing action smooth out the latency fluctuations of incoming transmission." - it's not clear to me that it's the queuing that does anything to the latency. Also 'action *will* smooth out'.
6.5 - " In addition, it is best to process <w/> elements using non-blocking programming techniques." - I don't really know what this is doing here.
6.6 - "There are other special basic considerations" - isn't that nearly oxymoronic?
6.6.1 - "For specialized clients that send continuous real-time text (e.g. news ticker, captioning, transcription, TTY gateway), a Body Element can be automatically sent when messages reach a certain length. This allows continuous real-time text without real-time messages becoming excessively large."
- Is this true? Sending a body means you reset the state to the content of the body and terminate that RTT message, which doesn't seem consistent with continuing RTT.
6.6.3.1 - This doesn't seem like the wrong approach if RTT is wanted in a MUC (at least until we have per-MUC disco stuff), but I'm somewhat worried about the effect this has as an amplification attack. I don't know what we should say here, but if people can have a think it'd be good.
6.6.3.2 - this seems inconsistent with an earlier section that (I think) was recommending or mandating support for multiple full JIDs.
6.6.5 - seems somewhat out of place. How many systems are there these days that can't keep up with a human typist? And telling people that they need to make their applications flicker-free just seems odd.
6.6.6 seems redundant.
7 - these examples seem to be to a bare JID, and therefore can't have had caps already indicate support, but lack support discovery. It'd be good to note this.
7.4.2 - this includes an RTT including a wait in the element with the body - but once the body is received the RTT state is discarded and the body replaces it, if I remember earlier in the XEP correctly (and it was quite a while ago now).
8 - Why are we picking out Google Talk as an XMPP exemplar?
8 - Why are we telling SIP clients what specs to use?
8 - All of this section seems somewhat out of place in a XEP.
10.1 - "It is important for implementers of real-time text to educate users about real-time text. " - this doesn't really seem right.
10.1 - I think a sensible Privacy note would be to make RTT opt-in.
10.2 - "also needs to also "
10.2 - "(e.g. deferred XEP-0200)" - just XEP-0200, I'd have thought.
10.2 - I think blaming encryption for the increased number of stanzas RTT generates is a little disingenuous.
10.3 - "The nature of real-time text result in"
10.3 - "than may otherwise happen in a non-real-time text conversation.
This may lead to increased" s/may/would/ s/may/will/ respectively will remove normative language.
10.3 - "including stanzas dropped by an overloaded server" - I think "including stanzas dropped during a network or server error" would be more appropriate.
10.3 - "Use of this specification in the recommended way will cause a load that is only marginally higher than a user communicating without this specification." - do you have numbers for this? It seems quite counterintuitive, I'd expect it to increase the server load due to message routing roughly by a factor of the number of RTT transmitted between each typical <body/>.
10.3 - "Bandwidth overhead of real-time text is very low compared to many other activities possible on XMPP networks including in-band file transfers and audio" - This is a little disingenuous where IBB is a fallback, and audio never travels over the XMPP network. I'd remove the line completely.
14 - (I appreciate the acknowledgement, thank you)
14 - It's usual in XEPs that acknowledgements are done personally rather than by affiliation, so I think it'd be sensible to just leave the names in and remove affiliations.
14 - I find the comment acknowledging the invention a bit odd. It's assumed that the XEP is your own work, and "invention" is a term I've more commonly come across in relation to patents - I assume there isn't a patent associated with this that you're assigning to the XSF?
Appendix B - it's usual to just have author name, email and JID here. We don't generally link out to the authors' websites.

/K
