Re: [Standards] Review: XEP-xxxx: In-Band Real Time Text

Mark Rejhon Thu, 03 Mar 2011 07:11:13 -0800

On Thu, Mar 3, 2011 at 4:15 AM, Kevin Smith <[email protected]> wrote:
>> - Whole message retransmit: See scalability problems that I explained,
>> with Google Talk allowing 2000 character messages, and real time text
>> conversations often causing people to hit Enter/Send less often.
>
> This is unfortunate - XMPP mandates a minimum maximum stanza size of
> 10,000 bytes (it may be larger in deployments, but not smaller), and a
> server MUST return an error if this is reached.


Actually, 2,000 characters is the user message size, not the stanza size.
Google still meets the large stanza size requirement (i.e. 10K); they just
limit the textbox to 2,000 characters.
But that's still a large message to be retransmitting on every single
keypress, if I didn't use delay codes.


>> - Longer messages often happen in real time text.  When real time text
>> is enabled, it was observed that some people have a tendency to send
>> fewer but longer messages.  It was observed there was less incentive
>> to rush and hit Enter or click Send.
>
> Ok. I can't imagine writing an essay into my IM client, but maybe I'm strange 
> :)

A portion of my testers started to rarely hit Enter between sentences.
For a couple of testers, there were several messages during a
30-minute period that exceeded 1,000 characters.  (They treated a
single message as one continuous IM conversation!  Two people typing
real time text can have many back-and-forth responses in just one
message.  Some deafies are used to this with TTY.)  This observed change
in habits does indeed contribute to a spec decision, steering me to a
transform method of edits instead of a retransmit method.


>> I can eliminate talk about out-of-order detection after what you said.
>> We will still need missing message detection so I think the 'seq'
>> attribute needs to stay.
>
> Well, dropping messages without sending an error is a bug, and 198
> eliminates most of this, but ok - I'll mostly buy this.

I'm still trying to figure out ways to simplify this while meeting at
least the goal of lost-message detection.

There are other methods I can do instead of 'seq':
-- Use a rudimentary integrity check, checksum or CRC.  After client
transforms message, check its CRC against the transmitted CRC.
-- Determine sequence from the <message> attributes.  This would work
with Google Talk which contains incrementing identification string
containing alphanumeric text that is suffixed by an incrementing
number, if I parse the string to get at the incrementing number.  But
that's Google specific.
-- Just take the risk on mangled messages.  Personally, this is
something I would rather not do.
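
A minimal sketch of the first option (integrity check), assuming CRC-32
over the UTF-8 bytes of the message text.  The function names are
hypothetical and nothing here is taken from the spec; it only shows the
receiver comparing the CRC of its rebuilt text against the CRC the sender
transmitted.

```python
import zlib


def crc_of(text: str) -> int:
    """CRC-32 of the UTF-8 encoding of the message text."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def is_in_sync(rebuilt_text: str, transmitted_crc: int) -> bool:
    """True if the receiver's rebuilt text matches the sender's CRC."""
    return crc_of(rebuilt_text) == transmitted_crc


# Sender's current text, and the CRC it would transmit alongside the edit.
sender_text = "Hello world"
sent_crc = crc_of(sender_text)

# Receiver that applied every edit versus one that lost a correction.
receiver_ok = "Hello world"
receiver_stale = "Hello wprld"

print(is_in_sync(receiver_ok, sent_crc))
print(is_in_sync(receiver_stale, sent_crc))
```

A mismatch only says that something was lost; the client would still
need some way to resynchronize (for example, waiting for the next full
message).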

What's your opinion on the 'msg' attribute?
-- I can easily remove it, with relatively few disadvantages.  It is
useful for supplementary error detection, in case a critical message
was lost (i.e. the message with <body>, or the message with
<rtt type='new'>).  Without seq, such a loss would cause the real time
text of the NEXT message to start mangling the current message, because
the client thinks it has not yet moved on to the next message.  But if I
keep seq, this is mostly moot (though removing msg does eliminate some
creative methods of error recovery that a developer could use, but we're
simplifying the specification here).
-- However, the msg attribute provides a convenient mechanism for
future retroactive message editing extensions and text editing
extensions.
-- For a client that supports retroactive message editing, msg could
also be used in lieu of your (Kevin) last-message correction protocol.
 We need to make sure any potential conflict is resolved (i.e. that
our specs are at least compatible with each other).
-- msg can still be used as a private extension if I ever need to do a
niche-market product that absolutely requires 'msg' for improved
integrity and/or retroactive message editing.


> I don't think this is either here or there at this point, but that's
> still a client rendering issue - clients can choose to render these
> changes however they want, and bright pink in size 72 Comic Sans is an
> option if they want to draw attention to this. I'll buy the cursor
> movement as well, though, in the two-tier model.

You're right that part is a client rendering issue.  My reply was
chiefly to justify the original inclusion of cursor movements in the
spec.   But the visual implementation of the edits and cursor is
definitely a client rendering issue -- it could even be special
color-coding of edits (edits automatically being colored differently).
 That's up to the programmer.   Needless to say, that is stuff beyond the
scope of the spec.


>> - Fragment system: Makes it difficult to do message editing.  If we
>> disable editing, it forces the clients 'send box' to behave
>> differently when real time text is enabled.  If you observe my
>> animated video of real time text (already mentioned earlier), you'll
>> see that real time text can be a bolt-on enhancement to a pre-existing
>> user interface.
>
> I don't think (until now) that anyone has proposed a solution that
> didn't allow editing.

Briefly, I thought of a new idea: a fragment system that allows
editing.  A fragment delta system.  Loop through 100-character blocks
of the message, find those 100-character blocks that have changed
since the last check of the message, and only transmit the changed
blocks.  But I dismissed the idea, because if somebody deletes
text at the beginning of a long line, and starts typing text at the
beginning of the line, all fragments get retransmitted on every single
keypress.  Plus, having two different standards is not good -- one
standard for Tier 1, a totally different standard for Tier 2.
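
To illustrate why I dismissed it, here is a rough sketch of the fragment
delta idea, assuming fixed 100-character blocks compared by position.
The names (split_blocks, changed_blocks) are made up for this example
and are not part of any spec.

```python
BLOCK_SIZE = 100


def split_blocks(text, size=BLOCK_SIZE):
    """Cut the message into consecutive fixed-size blocks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def changed_blocks(old, new, size=BLOCK_SIZE):
    """Indices of blocks in `new` that differ from the same block in `old`."""
    old_blocks = split_blocks(old, size)
    new_blocks = split_blocks(new, size)
    changed = []
    for i, block in enumerate(new_blocks):
        if i >= len(old_blocks) or old_blocks[i] != block:
            changed.append(i)
    return changed


# A long message (549 characters, so 6 blocks).
old = " ".join(f"word{i}" for i in range(80))

# Typing at the end only touches the last block...
typed_at_end = changed_blocks(old, old + "x")

# ...but typing at the beginning shifts every block.
typed_at_start = changed_blocks(old, "x" + old)

print(typed_at_end)
print(typed_at_start)
```

Typing at the end changes one block, while a single character typed at
the start changes all of them, which is the worst case described above.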


>> - Transform system: It will make Tier 1 compatible with receiving Tier
>> 2 communications, simply by ignoring the Tier 2 specific transforms
>> (edit codes).
>
> It looks like this is probably what's needed.

Now that we agree, I will continue to keep a transform method (whether
it's XML based or some other method -- I actually investigated the ISO6429
method too, before finding that using a tiny 4-code subset of ISO6429 was
more complicated than using an XML transform method).

Technically, to simplify Tier 1 even further (for certain kinds of
software implementations), I could even merge the "Backspace" and
"Delete" codes into one erase code.  A backspace would simply execute
the equivalent of a delete code with a repositioned cursor.  Or the
converse -- a delete would execute the equivalent of a backspace code
with a repositioned cursor.  It may make Method #3 of section 3.9
"Methods of Detecting Message Edits In a Client" ever so slightly
more complicated, with a little extra math needed to emulate the other
when one is not supported.

The question is... what is simpler?  Keeping both Backspace/Delete,
versus merging Backspace/Delete into one erase code?  There would be
absolutely no code size difference in my implementation either way,
because removing one or the other adds some extra lines of code to
the remaining one to calculate the new position of the cursor to emulate
the removed code.   Maybe an easier-to-read spec is more important?  Or will
people begin asking "Where is the backspace code?" (then I start
having to explain that backspace can be done using the delete code
with a repositioned cursor).
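
For the sake of discussion, a small sketch of what "backspace as a
delete with a repositioned cursor" amounts to, assuming a plain string
buffer and an integer cursor position.  The function names and the
(text, cursor) return shape are only illustrative, not how the spec
encodes these codes.

```python
def delete_at(text, pos, count=1):
    """Forward delete: remove `count` characters starting at `pos`.

    Returns the new text and the cursor position (unchanged by a delete).
    """
    return text[:pos] + text[pos + count:], pos


def backspace_at(text, pos, count=1):
    """Backspace emulated as a delete with a repositioned cursor.

    The only extra math is moving the cursor left first, clamped so it
    never goes before the start of the text.
    """
    count = min(count, pos)
    return delete_at(text, pos - count, count)


print(delete_at("Hello", 0))
print(backspace_at("Hello", 5))
print(backspace_at("Hello", 3, 2))
print(backspace_at("Hello", 0))
```

The converse (delete emulated with a backspace code) is the same few
lines with the cursor moved right instead of left.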

Discussion welcome about spec simplification!
Spec editing progress: Currently, I'm focusing on "remove-the-fluff"
first, but soon I'll get to modifying and simplifying the
actual protocol (if needed, and where possible).

Mark Rejhon
