On Thu, Jun 28, 2012 at 12:18 PM, Kurt Zeilenga <[email protected]> wrote:

> On Jun 27, 2012, at 10:26 AM, Mark Rejhon wrote:
>
> > On Wed, Jun 27, 2012 at 5:15 AM, Edward Tie <[email protected]> wrote:
> >
> >> And Chinese, Thai and Japanese languages?
> >
> > I did some tests with several languages, including Arabic, and it all
> > works over XEP-0301.
>
> did you test bi-directional text?
> I wonder if there are any special considerations here....

It works. No special considerations.

The indexes and positions of action elements (for insertions and deletions) are relative to the string itself, not to its visually displayed representation. Therefore, XEP-0301 needs no sense of the 'direction' of text.

I also tested copying and pasting extremely weird Unicode art that had embedded Unicode direction-change control codes. I tested editing, and the sender and recipient ends stayed in sync, with identical text on both sides in real time. (This is applicable to Arabic.)

The <rtt/> action elements are equivalent to editing an array of individual Unicode code points. The difference is only terminological: "code points", "32-bit Unicode characters", "UTF-32", or "a single complete character encoded in UTF-8" (not an individual byte). We've settled on the term "code points" to make implementors pay close attention to Unicode subtleties, because "character" is misinterpreted very frequently. (Is a char a byte? Is a char 16 bits? Is a char a full Unicode code point?)

It's the GUI's responsibility to decide how to display the resulting Unicode string, including its direction and any embedded text-direction-change codes in a bidirectional string. (That is, when the string is displayed through the usual OS GUI calls, incrementing indexes typically show up as rightwards movement in an LTR segment and leftwards movement in an RTL segment.)

I even removed the words "left" and "right" from section 4.5:
http://xmpp.org/extensions/xep-0301.html#summary_of_action_elements

For example, for the backspace action element, instead of
"Remove n characters to the left of position p in message."
I now say
"Remove n characters before position p in message."
(corrected terminology)

This ensures the wording applies independently of the directionality of the text.

Thanks!
Mark Rejhon
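
To illustrate the point about logical (not visual) positions, here is a minimal sketch in Python. It is not taken from the XEP or from any implementation: the function names `insert_text` and `erase_before` are hypothetical, no <rtt/> XML is parsed, and the default values for `p` and `n` are assumptions made for this sketch. Python 3 strings are indexed by code point, so a plain `str` can stand in for the "array of code points".

```python
# Sketch only: editing a message by code-point position, with no notion
# of text direction. Function names and defaults are hypothetical.

def insert_text(message, text, p=None):
    """Insert text at code-point position p (assumed default: end of message)."""
    if p is None:
        p = len(message)
    return message[:p] + text + message[p:]


def erase_before(message, p=None, n=1):
    """Remove n code points before position p (assumed defaults: end, 1)."""
    if p is None:
        p = len(message)
    start = max(p - n, 0)
    return message[:start] + message[p:]


# Mixed-direction example: Latin (LTR), then Arabic (RTL).
msg = insert_text("", "abc ")                             # 4 code points
msg = insert_text(msg, "\u0645\u0631\u062d\u0628\u0627")  # 5 Arabic code points
# Insert one non-BMP code point (an emoji) at logical position 4.
msg = insert_text(msg, "\U0001F600", 4)
# "Backspace" twice at the logical end of the string. On screen the
# removed letters are at the visual left of the Arabic run, but the
# index arithmetic is the same as for LTR text.
msg = erase_before(msg, p=len(msg), n=2)

# msg now holds 8 code points: "abc ", the emoji, and 3 Arabic letters.
```

A sender and a recipient that apply the same sequence of such edits end up with identical strings; how the result is laid out on screen is left entirely to the GUI.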
