(Peter, please go ahead and publish the 0.5 I submitted as soon as possible, so that people have context when reading this. Fixes can go into 0.6, if deemed important.)
On Sun, Jul 22, 2012 at 6:00 PM, Gunnar Hellström wrote:

>> 11. Section 4.1, Example 1, Line 9, make the text part "my Ju", so
>> that it is obvious that it is not about word by word transmission.
>> 12. Section 4.1, Example 1, Line 15, make the text part "liet" only,
>> so that it is obvious that it is not about word by word transmission.
>> 13. Section 4.2.2 event='new' third line. Change "display, and then
>> process" to "reception, and then process text and". Because we must not
>> assume that all applications display the text.
>>
>> 11/12. Edit Deferred -- It is merely an introductory example. Also, if
>> people chunk text instead of preserving key press intervals, then
>> whole-word burst transmission is greatly preferred over broken-word burst
>> transmission.
>
> But why do you want to confuse the reader with giving the impression that
> transmission is word-wise, when it is time-sampled in reality. I suggest to
> accept my edit proposal in order to not cause wrong impression what it is
> all about.

It's all a matter of perspective -- it is relative. I suspect more than half of the people here would agree that your suggestion is NOT simpler in this case, because:

Primary reason: My opinion is that the first introductory example MUST be as simple as possible. I think most would agree with me here. There is no wrong impression conveyed here, because the subsequent examples are self-explanatory about what's allowed (breaking up text, turning things into single key presses, key press intervals), and the new example I added to v0.5 makes key press intervals much easier to understand. But the bottom line is that it is an introductory example, and an introductory example must be as simple as possible to explain.

Secondary explanation: When displayed in forced-color-code XML on the website (i.e. published at http://www.xmpp.org/extensions/xep-0301.html)... the transmitted real-time text words are no longer separately color-highlighted as they are in the draft copy in the Word version. So full words are also easier to pick out at a glance than fragmented words.

Tertiary explanation: We need to view this specification from a less experienced developer's perspective. People who are less experienced with protocols (we are protocol authors, other people are not) need to be able to see the simplest possible example (see primary reason).

(Even if you only agree with one or two of the above reasons, that should be good enough, no?)

>> 18. Consider deleting the "Forward Delete" d action element. It cannot be
>> used with the default value for p because that would point outside the
>> real-time message. Therefore, a p must always be calculated and included.
>> Then it is equal in complexity to use it as Backspace. Having both just
>> seem to add complexity to implementations. (It would have been different
>> and of value if it worked from a current cursor position.) But if you
>> have good reasons, e.g. easily matching some editing operation result, you
>> can keep it.
>>
>> 18. Edit deferred -- Explanation given in long email.
>
> Forward delete just introduces complexity. Since you do not have the
> concept of "current position" in the specification, a forward delete and a
> backspace of anything else than the last character are equally long in
> coding. But, if you want to have these two codings of the same operation,
> I can accept it.

About complexity: It only adds 5 lines of complexity to the implementation:
http://code.google.com/p/realjabber/source/browse/trunk/Java/src/RealTimeText.java?r=24#551

About reasoning:

Reason 1. There are situations where it makes a lot of sense to keep the two separate, including recipient-side time-smoothed display, which was something you also suggested.
For example, <e n="5"/> can be automatically converted to the equivalent <e/><e/><e/><e/><e/> for time-smoothed display with the cursor animated backwards. And <d p='10' n='5'/> can automatically be converted to the equivalent <d p='10'/><d p='10'/><d p='10'/><d p='10'/><d p='10'/> for time-smoothed display with the cursor staying stationary. If we merged the two, then we couldn't have a distinctive time-smoothed display of either. (As I recall, you're a strong proponent of time-smoothed display.) But of course, it might not be that important, even to you.

Reason 2. Ability to do accurate journalling of edits, for emergency purposes. However, this reason can become moot, especially if we're not using the 'n' attribute, since a transmitted single-character backspace can be indistinguishable from a single-character delete operation (even for time-smoothed display).

Reason 3. It slightly simplifies "Monitoring Key Presses Directly" for senders:
http://xmpp.org/extensions/xep-0301.html#monitoring_key_presses_directly
(I know that's not the preferred method.)

Reason 4. It simplifies visualizing text block deletes (i.e. cut operations), since you're deleting from the normal start position.

There are other reasons. I'd like comments from other people once v0.5 is up on the page (hopefully by tomorrow), so that they have context on what we're talking about here. Let's wait till v0.5 is up so there's context...

>> 19. Edit deferred -- Explanation given in previous email. It helps reader
>> associate WHICH definition of "character" we are using. Even the RFC's say
>> that the word has multiple interpretations, so it's appropriate here in the
>> title. The title is like a glossary entry, and the contents explain we're
>> using code points as the method of counting characters.
>
> I still regard this dangerous and confusing. We are counting Unicode
> code points, and that needs to be clear in all explanations.
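To make concrete what "counting Unicode code points" means in this exchange, here is a minimal Python sketch. It is only an illustration, not text from the specification, and the helper names are hypothetical. It contrasts a code point count with a UTF-16 code unit count, which is what string types built on UTF-16 typically report.

```python
def count_code_points(text):
    """Count Unicode code points (Python strings are indexed by code point)."""
    return len(text)


def count_utf16_code_units(text):
    """Count UTF-16 code units, as a UTF-16 based string type would report."""
    return len(text.encode("utf-16-le")) // 2


# Illustrative samples only; none of these come from the specification.
samples = {
    "plain ASCII": "abc",
    "base letter + combining acute": "e\u0301",
    "character outside the BMP": "\U0001F600",
}

for label, text in samples.items():
    print(label, count_code_points(text), count_utf16_code_units(text))
```

The second sample is one displayed glyph but two code points, and the third is one code point but two UTF-16 code units, which is why the word "character" alone leaves room for different counts.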
We will have to agree to disagree -- I think it's safer and less confusing: Did you know there are 47 occurrences of the word "character" in the whole document? Therefore, I prefer not to remove the word "Character" from the heading "Unicode Character Counting". The heading works like an extended *glossary* definition here, which in my opinion is safer and less confusing. Obviously, the section is too big to move to the glossary section, but I am open to alternate ideas from this mailing list for defining the word "character". For this, I defer to public comment (once 0.5 is up).

>> 20. Edit deferred -- I didn't like adding the paragraph either, but
>> following your suggestion will complicate implementations. If I do your
>> suggestion, it will no longer be easy to do "Monitoring Message Changes
>> Instead Of Key Presses"
>> http://xmpp.org/extensions/xep-0301.html#sending_realtime_text because I
>> would no longer be able to treat the real-time message as easily as if it
>> was essentially "an array of code points". You are a strong advocate of
>> this method too, and I'm sure you agree with me you don't want to
>> complicate section 6.4.1
>
> I think that typing of characters resulting in a multiple of code points
> will result in these code points being submitted to display at the same
> time, and therefore easily can be put into the same <t/> element. This is
> valid for example for the combining diacritical marks 300-36F, that
> normally are displayed together with their base character.
> http://unicode.org/charts/PDF/U0300.pdf
> Usually nothing is displayed on the sending side until both have been
> typed.

That is generally true, but there are situations where a single letter is refreshed on the sender end as it gains additional combining marks.
You are familiar with this too, I imagine: a valid displayable glyph (or "displayable character", as some call it) rendered from multiple Unicode code points. An example is a standard Unicode character plus a single combining diacritical mark (such as an umlaut mark), which gets expanded into a more complex displayable glyph by the insertion of a second combining diacritical mark (such as a grave). Many environments require you to specify all marks beforehand to output the character all at once, but in some environments, the displayable character is immediately redisplayed after each added combining mark, in order to visually show the progress of adding multiple combining marks to the same displayed glyph. It can be a feature built into a textbox field that cannot be overridden, at least for certain diacritic operations.

In this specific situation, if you are doing "Monitoring Message Changes Instead Of Key Presses", this method would detect that the only changed code point is a single combining character. This would be transmitted by itself. That single code point is a valid transmission within an Insert Text action element <t>X</t>, where X is a single Unicode code point of a combining character (without any accompanying text), inserted into the destination string to add an additional diacritic mark to an existing displayed glyph. This is a valid operation that can realistically occur.

> Putting both in the same t-element simplifies for both the transmitter and
> the receiver. The receiver does not need to handle an outstanding
> combinable diacritical mark waiting for its base character.
> There would also be no risk that text in edits combine in an erroneous way
> with already existing code points, before next message arrives containing
> the correct second half of the character.
> So, keeping combined characters together is a good goal and simplification
> and should be adviced with a "SHOULD".
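As a minimal sketch of the lone combining mark case described above (an illustration only, not text from the specification): the real-time message is treated as a sequence of code points, and a single combining code point is inserted next to an existing base character. The function name `insert_text`, the sample message, and the 0-based position with an "append at end" default are assumptions made for this example.

```python
import unicodedata


def insert_text(message, text, position=None):
    """Insert `text` into `message` at a code point index.

    Assumption for this sketch: a missing position means "append at the
    end of the message", and positions are 0-based code point indexes.
    """
    if position is None:
        position = len(message)
    return message[:position] + text + message[position:]


# Hypothetical message: "n", then "u" followed by COMBINING DIAERESIS.
message = "nu\u0308"

# The only detected change is one combining code point (COMBINING GRAVE ACCENT).
lone_mark = "\u0300"
assert unicodedata.combining(lone_mark) != 0  # it really is a combining mark

# Insert it right after the existing base letter and its first mark.
updated = insert_text(message, lone_mark, 3)
print(len(message), len(updated))
```

The message grows by exactly one code point while the number of displayed glyphs stays the same, which is the situation a receiver has to tolerate if lone combining marks are allowed in a <t/> element.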
I do agree that the normative "SHOULD" is reasonable, though that adds an additional sentence to the paragraph. This can be considered during public comments, or can even wait until LAST CALL. I'd like to hear feedback from others about section "Accurate Processing of Action Elements", once 0.5 is up.

> Yes, good to distinguish between service discovery, and activating
> support.
> There is something missing in a sentence in version 0.4, chapter 5.
> In order for an application to determine whether an entity supports this
> protocol, where possible it SHOULD use the dynamic, presence-based profile
> of service discovery defined in .
>
> What was your intention after "in"?

I don't see the error. It refers to XEP-0115. Perhaps it is a browser cache issue -- try the Refresh button at http://xmpp.org/extensions/xep-0301.html -- It should say "In order for an application to determine whether an entity supports this protocol, where possible it SHOULD use the dynamic, presence-based profile of service discovery defined in Entity Capabilities <http://xmpp.org/extensions/xep-0115.html> [14]."

This is actually a copy-and-paste from another XEP, and is very consistent with how most XEPs treat XEP-0115, so "Determining Support" in v0.4 and v0.5 is more consistent with other XEPs. You will observe, however, that I still include XEP-0085 style implicit discovery, since I must have it work in all situations that chat states work in.

> In version 0.4, section 6.2 looks complex and need further restructuring
> now before I can judge the final result of the protocol.
Regarding: http://xmpp.org/extensions/xep-0301.html#activating_and_deactivating_realtime_text

Section 6.2 is actually rather simple if you interpret it in terms of flexibility of choice. I should point out:

- Activation and deactivation are optional, as I mentioned in the first paragraph.
- Some implementers definitely require activation/deactivation, including a method that can be similar to the activation of audio/video.
- Other implementers (like you) need it active at all times.
- If you support it, you don't need to support all activation methods -- just one (i.e. "Accept after confirm" is okay, or even "Accept" only). This is just a list of suggested activation methods; you can support just one, two, or three of those methods -- it is not a suggestion to support ALL activation methods! (Is this the part you're getting confused by? If so, I can adjust the wording to clarify.)
- Technically, implementers can do anything (and implementers have asked for the ability to do so) -- it can be a button, it can be a menu, it can be a preference/option, etc. We don't strictly define what's allowed and what's not allowed in how they do the UI -- section 6.2 simply points out general business rules of activation/deactivation and how they affect the protocol.

Even so, it's not even part of the Protocol section, since it's a purely optional section -- you can ignore this section and immediately begin transmitting <rtt/> if Section 5 Determining Support permits you to do so. That's it. For best interoperability, you can also listen for incoming <rtt/> in the absence of Determining Support (i.e. an invisible contact sending you an <rtt event='init'/>) -- the same as an invisible contact sending you an XEP-0085 Chat State without revealing themselves first and without providing disco first. (As you already know from previous emails, XEP-0085 section 5.1 also allows implicit discovery by sending a single chat state to signal support for chat states.)
So essentially that means you are allowed to simply begin transmitting <rtt/> immediately upon either (1) Determining Support permitting you to do so, or (2) receiving incoming <rtt/> elements (the single <rtt event='init'/> implicit discovery mentioned in section 6.2).

There is implementer demand -- I am relieved to be including section 6.2 in XEP-0301, because many implementers have talked to me about *their own ideas* of activation/deactivation methods, which technically may not interoperate very well. By covering general "business rules" for activation/deactivation methods, for those implementers that need real-time text to be activatable/deactivatable in whatever manner they need to implement it, I can ensure interoperability with other clients that implement a different kind of activation method.

I realize this may be a hassle for those accessible clients that assume real-time text should always be attempted and always be turned on, but such implementers have to see it as a necessary evil and accept that other implementers may not want to do that -- they might want an accept/reject mechanism of some kind. Section 6.2 simply provides generic business rules of activation/deactivation which, if followed, will eliminate situations of non-interoperability (i.e. two willing clients that decide not to talk to each other). It will interoperate between clients that always activate immediately (like what I described to you for assistive programs) and clients that choose to use an activation method. All combinations will work, even in situations where contacts are in private mode/invisible (just like XEP-0085 chat states will work, too).
Technically, even though I've strengthened "Determining Support" by including XEP-0115 for parity with other XEPs, the spec still allows you to do implicit discovery (a la XEP-0085 Chat State style), so that it works in all situations that XEP-0085 works in, using XEP-0301 as an enhanced "Typing" chat state in some implementations. Peter and Kevin have indicated that this is acceptable, although I know Matt had some misgivings. (I did, however, make it much closer in spirit to XEP-0085.) I'd like them to review 0.5 when it is up -- I just emailed it to Peter yesterday, so any changes will have to go into 0.6, but I want more public comment from multiple people on the gray areas we are still talking about.

Thanks
Mark Rejhon
