I am forwarding the reply that covered the bullets.
Some of this is already outdated, because I have since made some
changes independently. You might want to review which bullets still
apply and which are outdated.
---------- Forwarded message ----------
From: Mark Rejhon <[email protected]>
Date: Mon, Jul 9, 2012 at 5:26 AM
Subject: Re: [Standards] UPDATED: XEP-0301 (In-Band Real Time Text)
To: XMPP Standards <[email protected]>
On Mon, Jul 9, 2012 at 3:51 AM, Gunnar Hellström
<[email protected]> wrote:
This looks good. I have some comments, but very few influence the
protocol.
So even if there are minor adjustments to do, the spec looks mature.
Excellent comments. The vast majority of them are useful, and most of
your changes will be implemented.
I will address only the comments that need further discussion:
5. Section 2.4, Title. Change to "Usable for mainstream and
accessibility purposes."
The current heading is a single word: "Accessible" -- I prefer to keep
it because it's correct, clear, and short. It catches the attention of
accessibility people better, including the Access Board, which has
already contacted us about the specification. The word "mainstream"
is also used at the end of section 1, in section 2.4 itself, and in
section 6.2. I therefore believe that I've carefully balanced
accessibility and mainstream use and satisfied both targets, while
aiming for the goal of eventually becoming an important part of
future accessibility standards.
(That said, I'll fix the first bullet.)
14. Section 4.2.2 event='cancel'. How does this behave through
multi-user chat and multiple login situations? Is the
event='cancel' sent through to all? I see a risk that one user
sending event='cancel' would turn off rtt for all recipients. If
this is true, I see three solutions:
a) Delete event='cancel'. b) Add a sentence saying "event='cancel'
SHALL not be used in a MUC or multi-login session." c) Add a
sentence saying "event='cancel' SHOULD be ignored in MUC and
multi-login sessions."
1. It is appropriate for a multi-login session; there is no issue with
using the cancel during a multi-login -- it is completely appropriate
for multi-login. (Regardless of whether or not you cancel
before/after switching, and regardless of whether or not you
reactivate before/after switching logins. All scenarios result in
acceptable behavior.)
2. I already mention that it should not be used for MUC, in the MUC
section: http://xmpp.org/extensions/xep-0301.html#multiuser_chat
I have a slight preference for solution a), to delete cancel from
the specification.
If it is deleted, also the sections in 6.2.1 and 6.2.2 dealing
with "cancel" shall be deleted.
It is already optional. But some implementations need it.
For example, one party clicks a button to turn off real-time text.
This specific implementation requires the ability to synchronize the
disabling of real-time text.
How do we notify the other end of the intent to end a real-time text
session?
Example use case:
- A party activates real-time text by pressing a button.
- Both ends synchronize the enabling of real-time text via <rtt
event='start'/>.
- A party deactivates real-time text.
- Both ends synchronize the disabling of real-time text via <rtt
event='cancel'/>.
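The use case above can be sketched roughly as follows. This is only an illustration, not code from XEP-0301 or RealJabber: the RttSession class, its method names, and the outbox list are hypothetical, and a real client might show a confirmation prompt before mirroring the remote state.

```python
from enum import Enum


class RttState(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"


class RttSession:
    """Hypothetical per-conversation tracker for real-time text activation."""

    def __init__(self):
        self.state = RttState.INACTIVE
        self.outbox = []  # <rtt/> elements queued for sending to the other end

    def local_toggle(self, enable):
        """The local user pressed the real-time text button."""
        event = "start" if enable else "cancel"
        self.state = RttState.ACTIVE if enable else RttState.INACTIVE
        self.outbox.append("<rtt event='%s'/>" % event)

    def remote_event(self, event):
        """The other end sent <rtt event='...'/>; mirror its activation state."""
        if event == "start":
            self.state = RttState.ACTIVE
        elif event == "cancel":
            self.state = RttState.INACTIVE
        # Any other event value leaves the activation state unchanged here.
```

An implementation that only wants unidirectional real-time text could simply ignore remote_event and keep its own state.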
Various methods of synchronizing activation/deactivation of real-time
text are listed at:
http://xmpp.org/extensions/xep-0301.html#activating_and_deactivating_realtime_text
Certainly, not all implementations necessarily need to follow the
above behaviour (maybe your implementation doesn't need it).
However, other vendors definitely need to support this behaviour
(after displaying a confirmation prompt).
As a result, I cannot remove event='cancel' and deny the other vendors
the ability to synchronize the disabling of real-time text.
That said, unidirectional real-time text is allowed by XEP-0301, so
synchronizing the enabling/disabling of real-time text is not a
requirement, but some vendors require this ability (much like
synchronizing enabling/disabling of audio/video after a confirmation
prompt). I intend to respect both behaviours.
16. Section 4.4, line 3, after "conversation", add "in most
network conditions". On GPRS, having 1.5 s network latency, the
usability requirement will not be met, and that must be accepted.
( F.700 requires 2 seconds end-to-end for usable real-time text
and 1 second for good real-time text. )
Technically you're right. I'll make this wording adjustment, since it
is what F.700 says for technical compliance purposes.
That said, a real-world usability comment: the innovation of encoding
key press intervals (
http://www.realjabber.org/anim/real_time_text_demo.html ) gives an
approximately 1.5x-2x multiplier to the maximum usable latency. For
example, an NRTT (Near-Real-Time-Text) "bursty" conversation with a
2 second latency is more uncomfortable than an NRTT "smooth"
conversation with a 3 second latency in which key press intervals are
encoded. This is based on feedback from actual users: the RealJabber
open source software has a tester's latency interval adjustment for
usability trials, which I have tried with several people.
But I agree -- we need to be consistent with the F.700 definition of
"real-time" for real-time text conversations.
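To make the idea of encoding key press intervals concrete, here is a minimal sketch. It assumes the wait element is written as <w n='milliseconds'/> and text insertion as <t/>, which is my reading of XEP-0301; the function name and the (timestamp, text) input format are hypothetical and not taken from RealJabber.

```python
from xml.sax.saxutils import escape


def encode_with_intervals(keystrokes):
    """Turn (timestamp_ms, text) pairs into a flat list of action elements.

    A <w n='...'/> element is emitted before each keystroke that follows a
    pause, so the recipient can replay the typing rhythm smoothly instead of
    displaying the whole batch as one burst.
    """
    actions = []
    previous_ms = None
    for timestamp_ms, text in keystrokes:
        if previous_ms is not None and timestamp_ms > previous_ms:
            actions.append("<w n='%d'/>" % (timestamp_ms - previous_ms))
        actions.append("<t>%s</t>" % escape(text))
        previous_ms = timestamp_ms
    return actions
```

The recipient would play the elements back in order, sleeping for each wait interval, which is what makes a high-latency conversation feel "smooth" rather than "bursty".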
18. Consider deleting the "Forward Delete" d action element. It
cannot be used with the default value for p because that would
point outside the real-time message. Therefore, a p must always be
calculated and included. Then it is equal in complexity to use it
as Backspace. Having both just seem to add complexity to
implementations. ( It would have been different and of value if it
worked from a current cursor position.) But if you have good
reasons, e.g. easily matching some editing operation result, you
can keep it.
Merging Backspace and Delete is worth considering. Eliminating
Backspace is not appealing because it makes Backspacing more
inefficient (because <e/> can be used without the p and n attributes,
to backspace from the end of the message, while Forward Delete action
elements require a position attribute even when deleting text at the
very end of a message). Eliminating Delete would simply force the
requirement of using the Backspace element to simulate the Forward
Delete operation, making it slightly more complicated for
implementors, but would certainly be doable.
That said, I'm not sure it's the best way to proceed: eliminating
Forward Delete would force implementors to use Backspace for all block
deletes "backwards", as well as for forward deletes. I'd love to hear
comments from other people about the merits of merging Backspace and
Delete by eliminating the Forward Delete action element. (Backspace
can do forward deletes; you just need a little bit of math to pull it
off.) In testing with RealJabber, the Optional Remote Cursor behaves
very well in all combinations (senders having it or not interoperate
fine with recipients having it or not), and the action elements are
compact in all situations. But if one action element were removed, it
would be only Forward Delete, because Backspace is more common and it
is important that it stays simple and short (no attribute required)
in the simplest situation of backspacing at the end of a message.
Comments from others?
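The "little bit of math" can be illustrated like this. It is a sketch under my own assumptions: Backspace removes the n code points immediately before position p (defaulting to the end of the message), Forward Delete removes n code points starting at position p, and all values are in range. The function names are hypothetical.

```python
def backspace(text, n=1, p=None):
    """Backspace: remove n code points immediately before position p."""
    chars = list(text)
    if p is None:
        p = len(chars)  # default: backspace from the end of the message
    start = max(p - n, 0)
    del chars[start:p]
    return "".join(chars)


def forward_delete(text, p, n=1):
    """Forward Delete: remove n code points starting at position p."""
    chars = list(text)
    del chars[p:p + n]
    return "".join(chars)


def forward_delete_via_backspace(text, p, n=1):
    """Same result as forward_delete, expressed as a Backspace at p + n."""
    return backspace(text, n=n, p=p + n)
```

For example, forward-deleting 6 code points at position 5 of "Hello World" gives the same "Hello" as backspacing 6 code points from position 11.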
However, the danger of omitting the p attribute is overstated:
(1) It's harmless -- omitting the p attribute does no damage in a
properly implemented implementation.
(2) It's moot -- because two places already show that the p attribute
is required.
(1) Detailed info about "harmless". The use of a default value of p
is harmless if you follow "Summary of Attribute Values": It simply
"does nothing" because you're deleting non-existent text. It's the
same behaviour as trying to hit the Delete key when the cursor is
already at the very end in Microsoft Word: The action does nothing. I
already say the following:
*Note: Excess deletes MUST be ignored, with text being deleted only to
the end of the message in this case.*
/(Cited from
http://xmpp.org/extensions/xep-0301.html#element_d_forward_delete )/
Which is exactly what happens when you try to hit the Delete key at
the end of a document in Microsoft Word. The Delete key does nothing.
This is exactly the same thing. Therefore, it is harmless
(unless there was a spec violation, a bug, etc). Also observe that if
illegal values are used, it is already covered here:
*/Senders MUST NOT use negative values for any attribute, nor use
p values beyond the current message length. However, recipients
receiving such values MUST clip negative values to 0, and clip
excessively high p values to the current length of the real-time
message. Modifications only occur within the boundaries of the current
real-time message, and not other delivered messages./ *
/(Cited from
http://xmpp.org/extensions/xep-0301.html#summary_of_attribute_values)/
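A recipient following the two rules cited above might look roughly like this. It is only a sketch: the function names are hypothetical, and the message is treated as a plain list of code points.

```python
def clip(value, length):
    """Clip a received attribute value into the range 0..length."""
    return max(0, min(value, length))


def apply_forward_delete(message, p, n=1):
    """Apply a received Forward Delete to the current real-time message."""
    chars = list(message)
    p = clip(p, len(chars))  # negative p becomes 0, oversized p becomes length
    n = max(0, n)            # negative n is clipped to 0
    # Excess deletes are ignored: slicing past the end removes only what exists.
    del chars[p:p + n]
    return "".join(chars)
```

With p at the very end of the message, the delete simply does nothing, just like pressing Delete at the end of a document.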
(2) Detailed info about "moot". The argument is moot because I already
clearly show in two places that omitting p is not allowed anyway.
I show that it's not even valid to omit this attribute for Delete:
http://xmpp.org/extensions/xep-0301.html#action_elements
I also even show that it's a required attribute in the XML Schema:
http://xmpp.org/extensions/xep-0301.html#xml_schema
19. Section 4.5.2, third bullet point. I would like to see the
words "Unicode Code Points" replace "Unicode Character Counting".
Code points is the safe base that we count.
I use the word "characters" in other parts of the specification, such
as Action Elements:
http://xmpp.org/extensions/xep-0301.html#action_elements
Therefore, in my opinion, keeping the current heading helps people
figure it out faster, because:
(1) The Table of Contents index stays easy to read by "Unicode
Character Counting"
/(Someone might think: "huh? why does the section exist? I guess
there's some special gotchas about counting characters!")/;
(2) People who don't really know what "Code Points" are will more
quickly associate the specification's definition of "Unicode
Characters" with "Code Points", because I already use the word
"characters" several times elsewhere in the specification. The reader
then figures out that code points are simply the metric by which we
count "characters".
(3) It's compatible with "UTF-8 encoded characters", each of which is
a single code point.
Therefore, I think more people will figure it out if I keep the
heading "Unicode Character Counting" and go on to explain code
points, rather than use the heading "Unicode Code Points" and end up
with a more confusing Table of Contents. I'd like to hear other
people's opinions, as there may be several schools of thought. If
nobody raises objections, I will keep the heading the same, for the
above three reasons.
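For readers wondering what the "gotchas" of counting characters are, here is a small illustration of counting by code points. The helper names are hypothetical; the only facts relied on are that a Python str is a sequence of Unicode code points and that UTF-16 uses two code units for code points outside the Basic Multilingual Plane.

```python
def count_code_points(text):
    """Length as counted in Unicode code points."""
    return len(text)  # a Python str is indexed by code point


def count_utf16_units(text):
    """Length as counted by platforms whose strings are UTF-16 code units."""
    return len(text.encode("utf-16-le")) // 2


combining = "e\u0301"    # 'e' followed by a combining acute accent
emoji = "\U0001F600"     # a single code point outside the BMP

print(count_code_points(combining), count_utf16_units(combining))  # 2 2
print(count_code_points(emoji), count_utf16_units(emoji))          # 1 2
```

The accented letter looks like one character but counts as two code points, while the emoji is one code point but two UTF-16 units, so sender and recipient must agree on the code point as the unit for positions and lengths.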
20. Section 4.5.4.1 At the end, insert paragraph: "Characters
consisting of multiple Unicode code points SHOULD be sent together
in the same <t/> element. Values of p and n SHOULD NOT
result in pointing within such combinations of code points." (
this is to avoid the situations described with the long note to
section 4.5.4.2. The actions to avoid it should be more on the
sender side as I propose here. )
It is a good recommendation, but it does complicate "treat it as an
array of code points".
Consider my implementation of "Monitoring Message Changes Instead Of
Key Presses":
http://xmpp.org/extensions/xep-0301.html#monitoring_message_changes_instead_of_key_presses
(the most recommended method out of several "Real Time Text
Transmission Methodologies" in section 6.4).
The implementation is EncodeRawRTT() in the open source Java source code:
http://code.google.com/p/realjabber/source/browse/trunk/Java/src/RealTimeText.java?r=24#592
As you can see, it is easiest to implement this kind of method
without the further complexity that you suggest.
So I am not 100% convinced I should add the sentences that you suggest.
Although I do observe you are using the word "SHOULD" rather than
"MUST". (I'd use "strongly suggested" since I've eliminated RFC2119
for Implementation Notes now, upon Peter's advice)
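To show why treating the message as a plain array of code points keeps this approach simple, here is a rough sketch of monitoring message changes by comparing the old and new text. It is not the RealJabber EncodeRawRTT() code: the function names and the tuple representation of action elements are hypothetical, and it finds only a single changed region via the common prefix and suffix.

```python
def diff_to_actions(old, new):
    """Return action tuples that transform old into new.

    ("e", p, n) backspaces n code points ending at position p;
    ("t", p, text) inserts text at position p.
    """
    a, b = list(old), list(new)
    # Length of the common prefix
    prefix = 0
    while prefix < len(a) and prefix < len(b) and a[prefix] == b[prefix]:
        prefix += 1
    # Length of the common suffix that does not overlap the prefix
    suffix = 0
    while (suffix < len(a) - prefix and suffix < len(b) - prefix
           and a[-1 - suffix] == b[-1 - suffix]):
        suffix += 1
    removed = len(a) - prefix - suffix
    inserted = "".join(b[prefix:len(b) - suffix])
    actions = []
    if removed:
        actions.append(("e", prefix + removed, removed))
    if inserted:
        actions.append(("t", prefix, inserted))
    return actions


def apply_actions(text, actions):
    """Recipient side: replay the action tuples on the current message."""
    chars = list(text)
    for kind, p, arg in actions:
        if kind == "e":
            del chars[p - arg:p]
        else:
            chars[p:p] = list(arg)
    return "".join(chars)
```

Nothing in the comparison needs to know whether a code point is a combining mark, which is the simplicity that the suggested sender-side restriction would give up.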
23. Section 4.5.4.2 The Note is correct, but very long. I would
like to see it shortened but have no wording proposal at the
moment. It aims at avoiding situations that I suggest preventing
with my proposal 20 on the sender side.
Preventing this would significantly complicate implementors' ability
to implement:
http://xmpp.org/extensions/xep-0301.html#monitoring_message_changes_instead_of_key_presses
So, this is a trade-off.
28. Chapter 5. last paragraph. I hesitate a lot about this simple
way of detecting support. We need a proper way to detect RTT
capability before we start to use it. There may be systems that
have to select between different protocols for RTT, and they
should not need to start sending in one protocol to try to
discover if RTT is supported. Still, I realize the convenience of
this simple method, and would let a discussion decide if it shall
be kept.
If it is kept, the parenthesis characters on the second line
should be deleted so that the rapid response on this initiation is
made part of the protocol.
Several people already agreed with the method.
For accessibility compliance, I don't like blocking the sender's
ability to initiate.
(I mentioned it metaphorically: there's no padlock on the originating
phone's keypad.)
I should observe that sending a single <rtt/> element uses less
bandwidth than using disco.
Also, it produces a huge advantage in simplifying some implementations
of real-time text, by allowing detection, initiating and suspending
real-time text, completely "in-line" with the <rtt/> element. It's
not xmpp-ish, but:
*Disco is flawed from an Accessibility Perspective. It might not be
compatible with future accessibility legislation.*
(1) As you saw from previous messages, several people have convinced
me that there are implementations that will turn off RTT by turning
off disco. Unfortunately it looks like I can't enforce disco to be
always available, even when recipients turn off RTT. Therefore, I'm
not going to use it as the primary method anymore. I already wrote an
earlier public message, that I am not going to be a willing author of
XEP-0301 that does not give the "sender a chance".
(2) There is a risk that senders will "turn off disco" as the method of
deactivating real-time text, preventing the sender's ability to even
signal the desire to start real-time text.
/*[Mark Rejhon's Edit --- this section above is outdated --- please
refer to the newer 'disco fallback' thread instead.]*/
33. Section 6.5.4. The default action for a non-completed message
should be to regard it completed after some time, not to clear it.
So, replace "clear (and/or save)" with "save".
I don't think it's appropriate to mandate one specific action. An
example is empty real-time text. Sometimes, from a security
perspective, it's better to clear -- especially during a
denial-of-service attack. You don't want to consume disk space or
screen space because somebody started attacking you with fake
real-time text that's easily detected as fake. I've had at least one
private email from one big vendor, in the last 7 days, raising
concerns that convinced me to introduce the "Stale Messages"
clause: security (a resource consumption concern).
So clear is a perfectly legitimate action in a DoS situation.
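One possible policy that uses both actions could be sketched as follows. Everything here is an assumption for illustration: the 120 second timeout, the function name, and the rule "save non-empty text, clear empty text" are mine and are not mandated by the specification.

```python
STALE_TIMEOUT_S = 120.0  # hypothetical value; no timeout is named in this thread


def handle_stale(text, last_update_s, now_s, saved, timeout_s=STALE_TIMEOUT_S):
    """Return the text to keep displaying for an unfinished real-time message.

    `saved` is a list standing in for the chat history.
    """
    if now_s - last_update_s < timeout_s:
        return text          # still fresh: leave the message alone
    if text.strip():
        saved.append(text)   # stale but meaningful: save it as if completed
    return ""                # clear the real-time display either way
```

A client under attack could go further and always clear without saving, which is why I would rather not mandate "save".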
Thanks so much for your comments -- and I welcome discussion on my
replies to your comments!
Thanks!
Mark Rejhon