On Thu, Jun 28, 2012 at 12:18 PM, Kurt Zeilenga <[email protected]> wrote:

> On Jun 27, 2012, at 10:26 AM, Mark Rejhon wrote:
>
> On Wed, Jun 27, 2012 at 5:15 AM, Edward Tie <[email protected]> wrote:
>
>> And Chinese, Thai and Japanese languages?
>
>
> I did some tests with several languages, including Arabic, and it all
> works over XEP-0301.
>
> Did you test bi-directional text?
> I wonder if there are any special considerations here....
>

It works.
No special considerations.
The indexes and positions of action elements (for insertions and deletions)
are relative to the string itself, not its visually displayed
representation.  Therefore, XEP-0301 needs to have no sense of the
'direction' of text.

I also tested copying and pasting extremely weird Unicode art that had
embedded Unicode direction-change control codes.  I tested editing, and
the sender and recipient ends stayed in sync, with identical text on both
sides in real time.  (The same applies to Arabic.)

The <rtt/> action elements are equivalent to editing an array of individual
Unicode code points.
The rest is just terminology: "code points", "32-bit Unicode characters",
"UTF-32", or "a single complete character encoded in UTF-8" (not an
individual byte).  We've settled on the term "code points" to make
implementors pay close attention to Unicode subtleties, because
"character" is misinterpreted very frequently.  (Is a char a byte?  Is a
char 16-bit?  Is a char a full Unicode code point?)
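
To illustrate why the choice of unit matters, here is a minimal Python
sketch (my own example string, not taken from the spec) that counts the same
text three ways.  Python's str happens to index by code point, which is the
unit the action elements use:

```python
# Latin 'a', Hebrew alef (U+05D0), and an emoji outside the BMP (U+1F600).
text = "a\u05d0\U0001F600"

# Python str is a sequence of code points, so len() counts code points.
code_points = len(text)

# What an implementation with 16-bit "chars" would count (UTF-16 code units).
utf16_units = len(text.encode("utf-16-le")) // 2

# What an implementation with byte-sized "chars" would count (UTF-8 bytes).
utf8_bytes = len(text.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)  # 3 4 7
```

An implementation that indexes by UTF-16 units or UTF-8 bytes would need to
convert positions to code points before applying an action element.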

It's the GUI's responsibility to decide how to display the resulting
Unicode string, including its direction and any embedded
text-direction-change codes in a bidirectional string.  (i.e. when the
string is displayed through the usual OS GUI calls, incrementing indexes
typically shows up as rightwards movement in an LTR segment, and leftwards
movement in an RTL segment.)
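
As a rough illustration of that split, the sketch below (an assumed example
string, using Python's standard unicodedata module) walks a mixed string by
increasing index and looks up each code point's bidi class.  Direction is a
property the renderer derives from the code points; the indexes themselves
just count upward in storage order:

```python
import unicodedata

# Logical (storage) order: Latin "ab", then Hebrew alef and bet.
message = "ab\u05d0\u05d1"

# Bidi class of each code point, visited in increasing index order.
# "L" is left-to-right, "R" is right-to-left.
bidi_classes = [unicodedata.bidirectional(ch) for ch in message]

print(bidi_classes)  # ['L', 'L', 'R', 'R']
```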

I even removed the words "left" and "right" from section 4.5:
http://xmpp.org/extensions/xep-0301.html#summary_of_action_elements

For example, for the backspace action element, instead of "Remove n
characters to the left of position p in message." I now say "Remove n
characters before position p in message."  (corrected terminology)
This ensures the language applies independently of the directionality of
the text.
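
A minimal sketch of what "before position p" means in practice, assuming
the message is held as a Python str (a sequence of code points).  The helper
names insert_text and erase_before are hypothetical, made up for this
example, and are not part of XEP-0301:

```python
def insert_text(message: str, p: int, text: str) -> str:
    """Hypothetical helper: insert text at code point index p."""
    return message[:p] + text + message[p:]


def erase_before(message: str, p: int, n: int) -> str:
    """Hypothetical helper: remove n code points before position p."""
    start = max(p - n, 0)  # never reach past the start of the message
    return message[:start] + message[p:]


# Mixed-direction message: Latin "abc" followed by Hebrew alef, bet, gimel.
message = "abc\u05d0\u05d1\u05d2"

# Remove the 2 code points before index 5 (alef and bet), wherever they
# happen to be drawn on screen.
after_erase = erase_before(message, 5, 2)

# Insert at index 0, the logical start of the message.
after_insert = insert_text(after_erase, 0, "X")
```

Neither helper knows or cares which way the text is rendered, so both ends
stay in sync as long as they apply the same operations to the same indexes.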

Thanks!
Mark Rejhon
