Thank you for your comments! I will address the most important one first, because it deserves its own reply:

> This is somewhat counter-intuitive, as if network latency is
> consistent, the edits will arrive at a similar sort of rate to that at
> which they were transmitted and if it's not consistent, the edits are
> unlikely to arrive in a timely manner. I have a suspicion that what's
> happening here isn't the use of the delays that make the transmission
> delays less noticeable, but rather that the act of rendering the new
> text slowly is itself masking network latency. I wonder if there's a

That's not the issue. Our open-source software now includes all of the following 'experimental' adjustments:

- Higher and lower intervals.
- Displaying text instantly as we receive it.
- Displaying text in a time-smoothed manner.
- Delay codes (Natural Typing).

We found that the last bullet was vastly superior to all the other options. (See below for a description of our experiences.)

We currently plan to keep delay codes in the second, greatly simplified draft of our specification. Before then, we will also release open-source code so you can test the various approaches we have tried, all of which are already enabled (including artificial delays).

We have found many other ways to greatly simplify the standard without removing the delay code feature completely. Let us release some source code and the demo of our software first -- before we consider other approaches (which may include turning delay codes into a private extension to a public standard, since we already have two software packages that will keep delay codes).

Many deaf people are used to noticing typing variances. People type "What? Are you nuts?" differently than "How now, brown cow?". Hurried and erratic typing versus slow and relaxed typing. There are a lot of subtleties in typing that can only be transmitted with delay codes. Tired people make more errors, excited people often type fast, and relaxed people often type more slowly.
When a deafie is familiar with talking to the same person over real time text for a long time (such as via a Text Telephone or TTY -- see Wikipedia -- Plus, it was found that it makes it easy to distinguish copy & pastes from natural Internet bursting (e.g. a mobile connection with highly variable ping, or a long XMPP interval). The look of the typing actually conveys a small percentage of the 'emotion', which adds further to the context (sarcasm versus genuineness, etc). It turns real time text into a high-def experience for some of us.

Here are the findings:

-- Lower intervals: This works wonderfully on LAN and fast connections. We even tried extremely low transmission intervals such as 5 milliseconds, to make the typing look 'natural'. However, Google Talk started to drop XMPP packets once fast typists were sending about 10-12 XMPP packets per second. (A typist typing 120 WPM makes about 10 keypresses per second.) If we raise the interval to 50ms or 100ms, we're still sending 10 XMPP packets per second, but the bursty look starts to become marginally noticeable to the target audience of the specification. XMPP servers started to work more or less reliably beginning at around 300ms (3 XMPP packets per second). But at this point, real time text quality started to degrade significantly, degrading further at 1000ms and becoming unusable at a 3000ms interval. Server-unfriendly, user-unfriendly.

-- Displaying text instantly as we receive it: This leads to a bursty look. The bursty look was noticeable even down to 100ms for most typists, and even at 50ms for fast 100 WPM typists (such as me). Trying to simulate natural typing through short intervals is not practical, and it's not very friendly to XMPP servers if we send 10 XMPP packets per second. User-unfriendly. Also, short intervals get bunched up anyway over congested, satellite, mobile, and dial-up connections, so a 100ms interval may look like a 500ms interval because the delivery of messages is 'clumped' together. Also, at longer intervals (i.e.
2000ms and up), it becomes hard to tell typing apart from copy & pastes.

-- Displaying text in a time-smoothed manner: Artificial delays inserted between characters actually look pretty good, albeit somewhat unnatural. The delay is calculated by counting the number of characters (and/or the number of backspaces and cursor movements), dividing the interval by that count, and using the result as the smoothing value. However, time-smoothed text masks out the 'emotion' in the typing. And copy & pastes can look funny unless there's extra complexity in the client to distinguish sudden-output text from non-sudden-output text. Also, when ping becomes variable (random congestion, mobile connections with fluctuating reception, etc -- we tested laptop tethered connections too), time-smoothed display looks somewhat erratic and even more distractingly unnatural. Also, I was surprised to find that good-looking time-smoothing can be more complex than delay codes (assuming we continue to use an 'edit code' or 'control code' based system of real time text), because we still need non-blocking methods of delay such as timers or multithreading.

-- Delay codes: This was the eureka moment. Delay codes made the typing look like local typing, and it looked exactly the same regardless of whether the interval was 100ms or 3,000ms. Typing looked equally natural over high-speed, satellite, and dial-up connections. It looked the same at 1,000ms ping as at 5ms ping. It even looked natural over a highly congested dial-up connection!! (Ever tried SSH while doing an FTP transfer over dial-up Internet?) It was much easier to tell the person's original typing 'emotion'. It was easy to tell copy and pastes apart.

To explain this more easily, let's think of VoIP: VoIP is essentially a series of packets of small recorded snippets of voice.
Likewise, real time text with delay codes (natural typing) is a series of packets of small recorded snippets of typing (including the original key press delays, cursor movements, etc). To use an overused cliche, it turns real time text into a "high-definition experience". I plan to release the open source code on SourceForge or Google Code.

We came to the conclusion that for many of us, delay codes are a critical inclusion in the spec, so the spec must at least be compatible with a private delay codes extension (that also works properly with other edits, including deletes and pastes). But for the second draft, we plan to keep delay codes included at least until everyone has tried them out (or at least seen a video of side-by-side demos -- we plan to make some).

We already have two software packages, including the open source code, which I now plan to release under a permissive open source license (such as Apache 2.0) to help accelerate adoption. Any remaining complexity in the specification is also compensated for by the good-will release of permissive open source code. The open source license we plan to use permits use in both open-source and commercial/proprietary projects. This will maximize adoption amongst our peers.

Our timeline is to release the open source software sometime before the end of March. We will then submit an updated specification right after the source code is released. No doubt, between now and then, I'll be picking up your comments and feedback about specific things (e.g. the various excellent comments on simplifying the standard, including those that have already been made).

Regards,
Mark Rejhon
