I think this calls for an implementation note on UAX#9 along these lines.
-------------------------
During line breaking, if a line is broken at the location of a SHY, the text around the line break may change. A common case is the replacement of the invisible SHY by a visible HYPHEN, but see Section x.x in the Unicode Standard.

For the purposes of the Bidi Algorithm, apply steps .. to .. after any substitutions have been made, using the directional classes for the substituted characters, instead of a single BN for the SHY character.

<example>

Note, no special action need be taken for a SHY character in the middle of a line, unless it is rendered as a visible glyph in a "show hidden character" mode. In the latter case, the recommendation would be to treat the visible symbol substituted for the SHY as having bidi class ON.
------------------------

I am not sure whether -car CBA or car- CBA is the right answer, nor whether the substitution will always be limited to the preceding line. (Old orthography German had Bäc<SHY>ker turning into Bäk-|ker, where I've used | to show the line ending.) Those are details that the UBA should be ignorant of. The important thing is that the array of bidi directional classes is not constrained to contain a single entry for BN at the location of the original SHY.

If "car- CBA" is the right answer then the substitution would have to be HYPHEN plus LRM to get this to come out right, but that would be under the control of the line-breaking conventions, and not legislated by the UBA.
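To make the point about the class array concrete, here is a minimal sketch in Python. The helper name line_for_bidi and the choice of U+2010 as the default replacement are my assumptions for illustration, not anything UAX#9 specifies; it only covers the simple case where the substitution sits at the end of the preceding line, not the Bäk-|ker kind of respelling.

```python
import unicodedata

SHY = "\u00AD"     # SOFT HYPHEN, Bidi_Class BN
HYPHEN = "\u2010"  # HYPHEN, Bidi_Class ON
LRM = "\u200E"     # LEFT-TO-RIGHT MARK, Bidi_Class L


def line_for_bidi(line, replacement=HYPHEN):
    """Hypothetical helper: substitute for a SHY at the end of a broken
    line, then return the text and the bidi classes the UBA would see."""
    if line.endswith(SHY):
        # The line was broken at the SHY, so the line breaker's
        # substitution replaces it before any bidi processing.
        line = line[:-1] + replacement
    return line, [unicodedata.bidirectional(ch) for ch in line]


# Hebrew alef, bet, gimel, a space, then "car" broken at a SHY.
broken = "\u05D0\u05D1\u05D2 car" + SHY
_, plain = line_for_bidi(broken)
_, with_lrm = line_for_bidi(broken, HYPHEN + LRM)
```

With HYPHEN alone the last entry of the array is ON rather than BN; with HYPHEN plus LRM the array grows by one entry (ON followed by L), which is the case the note has to allow for.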

A./

On 4/1/2014 1:31 PM, Whistler, Ken wrote:

Richard Wordingham noted:

> As U+2010 HYPHEN would result in text like 'car-', in an English
> influenced context I would also go with 'car-'.

That's always a possibility, I suppose, but I'm not sure what
"English influenced context" means here.

The examples I just gave were for an RTL paragraph context.
In an LTR paragraph context, the same input would end up in
a very different order:

Trace: Entering br_UBA_ReverseLevels [L2]
Current State: 19
  Text:        05D0 05D1 05D2 0020 0063 0061 0072 002D
  Bidi_Class:     R    R    R    L    L    L    L    L
  Levels:         1    1    1    0    0    0    0    0
  Runs:        <L-----------------------------------L>
  Order:      [2 1 0 3 4 5 6 7]

And you get the display:

CBA car-
--------->

As opposed to:

-car CBA
<---------

In either case, the hyphen-minus (or hyphen), ends up at the *end of the line*.
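The two orders can be reproduced with a small sketch of the rule L2 reversal step in Python. The function name reorder_visual is hypothetical, and the RTL levels used in the usage note are my own assumption about how the same text resolves in an RTL paragraph; they are not copied from a trace.

```python
def reorder_visual(levels):
    """Sketch of UBA rule L2: return logical indices in visual order."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    # With no odd level present, the loop below does nothing.
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                # Find the end of the contiguous run at this level or
                # higher, then reverse that run in place.
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


ltr_order = reorder_visual([1, 1, 1, 0, 0, 0, 0, 0])
rtl_order = reorder_visual([1, 1, 1, 1, 2, 2, 2, 1])
```

For the LTR levels in the trace this gives [2, 1, 0, 3, 4, 5, 6, 7], matching the Order line. Assuming the RTL paragraph resolves to levels 1 1 1 1 2 2 2 1, it gives [7, 4, 5, 6, 3, 2, 1, 0], which puts the hyphen-minus first in visual order, as in "-car CBA".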

My take is that *if* I am going to insert a visible glyph at the point of the
SHY, it would probably be best to insert it at the actual line break at the
end of the line, to be in the same position as an explicit hyphen-minus with
the same line break.

--Ken



_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode
