On Tue, 1 Apr 2014 12:51:11 +0700 James Clark <[email protected]> wrote:
> Suppose I have a paragraph (uppercase = RTL):
>
> CARROT IS car\u00ADrot IN ENGLISH
>
> and the paragraph gets broken at the soft hyphen.
>
> Is the correct ordering for the first line
>
> car- SI TORRAC
>
> or
>
> -car SI TORRAC
>
> ? I did not succeed in deducing the answer from UAX#9. Soft hyphen
> has bidi class BN, which means it gets removed in stage X9, and so,
> if I have understood correctly, doesn't have a defined embedding
> level.
>
> I'm guessing the correct ordering is the first one, but I don't trust
> my instincts here. (In particular, I wondered whether this was
> analogous to the case where rule L1 resets embedding levels so that
> trailing whitespace is at the visual end of the line.)

There is no conformance requirement on the location of the soft hyphen.
Indeed, there is no requirement on whether it is rendered at all (TUS
Section 16.2).

As the treatment of the soft hyphen is language-dependent even in
unidirectional text, I am afraid the treatment is down to good taste and
the language(s) involved. (E.g., is this Arabic text effectively
embedding English text within an overall Thai context?) As U+2010
HYPHEN would result in text like 'car-', in an English-influenced
context I would also go with 'car-'.

Richard.

