RE: 3 big bidi bugs

Jonathan Rosenne Wed, 29 May 2002 12:35:35 -0700

I don't think anything to do with 5 levels of embedding or overrides can
be considered a big bug.


Jony


> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED]] On Behalf Of Bernard Miller
> Sent: Wednesday, May 29, 2002 6:57 PM
> To: [EMAIL PROTECTED]
> Subject: 3 big bidi bugs 
> 
> 
> 
> This letter describes 3 major technical problems with the 
> current Unicode bidirectional algorithm as described in UAX 
> #9, version 3.20. Problems 1 and 3 have security 
> implications. Other problems with the whole Unicode 
> bidirectional encoding approach, and their solutions, are 
> discussed in the recently updated Bytext FAQ and 
> documentation (www.bytext.org).
> 
> (1)  Line width dependent mangling, general case:
> Step L2 of UAX #9 indicates that a line that resolves into a 
> sequence of characters with homogeneous embedding levels will 
> ALWAYS be displayed right to left, regardless of what the 
> embedding level is.
> 
> So, for example, a line with the L1 resolved embedding levels of:
> 2222222222222222222222222 will display right to left
> 3333333333333333333333333 will display right to left
> 4444444444444444444444444 will display right to left
> etc.
> 
> Likewise:
> in 3333333333333333333333331, the 3's will display left to right
> in 5555555555555555555555551, the 5's will display left to right
> etc.
> 
> This behavior directly contradicts the writer's intentions. It
> means that different Unicode-compliant applications will display
> the same characters in a different order (depending on available
> line width). Examples of how this is bad are given in
> question 12 of the Bytext FAQ (www.bytext.org/faq#12). This
> can be fixed by rewording step L2 such that a reversal
> happens from the highest embedding level to each lower
> contiguous embedding level, regardless of whether the embedding
> level is represented by a character on the line, until the
> embedding level of 1 is reached (or, as an optimization,
> until the first odd embedding level equal to or lower than
> the lowest embedding level represented by a character on the line).
> 
> (2)  Line width dependent mangling, spelling conventions for 
> quotes: What is the purpose of step X10 if not to allow 
> something like LEFT DOUBLE QUOTATION MARK to be used as if it 
> was an OPEN DOUBLE QUOTATION MARK? One simply puts an 
> embedding inside a quotation, such as “<RLE>quotation<PDF>”. 
> The problem with this is that it only works if the quotation 
> begins and ends on the same line. Examples of how the text is 
> mangled when the quotation spans multiple lines are given in 
> question 13 of the Bytext FAQ (www.bytext.org/faq#13). This 
> cannot really be fixed with minor changes other than to 
> notify users that the whole left=open, right=closed idea may 
> not work as such when the default automatic line breaking is 
> used. Users should not rely on any spelling conventions that 
> do not bypass the effects of step X10 and mirroring; how
> this can be done is described in the Bytext documentation.
> 
> (3)  Mirroring ambiguities:
> What if eor = sor?
> 
> text:                         R RLO whatever PDF N LRO whatever PDF
> embedding level at step X9:   1     3      3     1     2      2
> directional type at step X10: R     R      R     ?     L      L
> 
> The above example should be in a monospace font. The original 
> is at www.bytext.org/faq#12. Step X10 is ambiguous about whether 
> the "N" should be L or R. This means that if N has the 
> mirrored property, some implementations might display the 
> mirrored form, others the non-mirrored form, and others might 
> result in an error. This can be fixed by deciding on a single 
> form for such cases. Also, the
> statement: "for two adjacent runs, the eor of the first run 
> is the same as the sor of the second" needs to be removed 
> because it is not true.
> 
> Bernard
> ---
> Bernard Rafael Miller, email: [EMAIL PROTECTED] 
> Format enabling simplified 8 bit regexes of UCS characters: 
> www.bytext.org
> ---
> 
> 
> 
>
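
The rewording of step L2 proposed in point (1) of the quoted message can be sketched roughly as follows. This is only an illustration in Python, not text from UAX #9 and not taken from any implementation. The names reorder_line and display are made up for the example, and the input is assumed to be the list of already resolved embedding levels for the characters of one line.

```python
def reorder_line(levels):
    """Return the visual order of one line as a list of logical indices.

    Sketch of the proposed rewording: reverse at every level from the
    highest one on the line down to level 1, whether or not a character
    on the line actually carries that level.
    """
    order = list(range(len(levels)))
    highest = max(levels, default=0)
    for level in range(highest, 0, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                # Find the end of this maximal run of levels >= level.
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                # Reverse the run in place.
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


def display(text, levels):
    """Hypothetical helper: lay out text according to reorder_line."""
    return "".join(text[i] for i in reorder_line(levels))


if __name__ == "__main__":
    print(display("abc", [2, 2, 2]))
    print(display("abc", [3, 3, 3]))
    print(display("abcd", [3, 3, 3, 1]))
```

Under this wording a line made up only of level-2 characters stays in logical order, a line of level-3 characters is reversed, and a trailing level-1 character does not change the direction in which the level-3 characters read.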
