https://issues.apache.org/ooo/show_bug.cgi?id=81029

Gerald Bettrdige <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #10 from Gerald Bettrdige <[email protected]> ---
Created attachment 82075
  --> https://issues.apache.org/ooo/attachment.cgi?id=82075&action=edit
3 screenshots defective displays described in comment

This bug can be a problem: I came across it when I wanted to make a t with a
tilde to imitate a Latin abbreviation. So I’ve been investigating exactly how
it happens (under Windows 7, oo 4.0.1) in the hope that the evidence might
allow someone to correct the code.

It happens when you add the combining diacritic (hereafter CoD) with Insert >
Special Character. You can clear it at once by selecting the messy bit (the
font window then goes blank) and re-selecting the font.
It doesn’t happen if a) you type the CoD directly (of which more later) or b)
cut-and-paste it from a plain text box, as opposed to a rich text box, which
carries font info with it. So you can cut-and-paste from Notepad but not from
Wordpad.
Whether the font replaces the char with an existing glyph, or displays a
composite glyph, seems to make no difference. A good test pair is t tilde,
which doesn’t exist as a separate glyph, and t caron, which always displays as
ť (U+0165), which has to be a substitute glyph.
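
To see which of the two test cases has a precomposed character, here is a
minimal Python sketch using the standard unicodedata module. It only shows
Unicode canonical composition (NFC); that the font substitution in oo follows
the same mapping is an assumption, and the helper names compose and describe
are made up for illustration.

```python
import unicodedata


def compose(base: str, mark: str) -> str:
    """Return the NFC (canonically composed) form of base + combining mark."""
    return unicodedata.normalize("NFC", base + mark)


def describe(text: str) -> list[str]:
    """List each code point of text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# t + U+0303 COMBINING TILDE: Unicode has no precomposed t-tilde,
# so the sequence stays as two code points after NFC.
t_tilde = compose("t", "\u0303")

# t + U+030C COMBINING CARON: NFC folds this into the single code point U+0165.
t_caron = compose("t", "\u030C")

print(describe(t_tilde))
print(describe(t_caron))
```

So t tilde can only ever be drawn as a composite, while t caron has a ready-made
code point that a renderer may choose to substitute.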
If you deliberately make the CoD a different font from the letter – say Arial
over Times Roman – and use insert special character, you get the composite
character in Arial displaced over the unmodified character in Times Roman. Try
this with t dot under by adding the CoD U+0323, then you can be sure the CoD
isn’t hidden.
Now try the same thing but cut-and-paste the CoD from Wordpad. Use Times 60pt
in oo, and Arial 28 pt in red in Wordpad. You will find the same behaviour, but
the red enables you to see clearly what comes from what. (Under Windows 7,
Google Chrome seems to do odd things to the clipboard, so if it is running,
Wordpad behaves like Notepad, with no format info.) However, while the mess
made by insert special character can be cleared by selecting it and reselecting
the font, that doesn’t work for a CoD cut-and-pasted from Wordpad.
Finally here is something really odd. Take a line where you have an unmodified
char with the modified one beside it. At the beginning of the line insert a
char like t̃ and you will find that the unmodified original letter now gets
modified! But it only does this on the first mess, not subsequent ones. All
this is shown in the attached Latin screenshot; notice on the second line ṭ has been
added as a single character U+1E6D, but on the third as t plus U+0323, and only
the latter affects the following mess. In LatinScreenshot2.jpg two of the lines
have been selected and the font redefined, which tidies up everything except
the mess imported from Wordpad.
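
The two ways of entering ṭ can be told apart at the code point level. The
following Python sketch (standard library only; the variable and function
names are illustrative, not anything from oo) shows that the two forms are
canonically equivalent but stored differently, which may be why only the
t plus U+0323 form interacts with the following text.

```python
import unicodedata

precomposed = "\u1E6D"   # single code point, as on the second line
sequence = "t\u0323"     # t followed by COMBINING DOT BELOW, as on the third


def code_points(text: str) -> list[str]:
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The raw strings differ, but normalisation maps each onto the other.
same_raw = precomposed == sequence
same_after_nfc = unicodedata.normalize("NFC", sequence) == precomposed
same_after_nfd = unicodedata.normalize("NFD", precomposed) == sequence

# A non-zero canonical combining class marks U+0323 as a combining character.
is_combining = unicodedata.combining("\u0323") != 0

print(code_points(precomposed), code_points(sequence))
print(same_raw, same_after_nfc, same_after_nfd, is_combining)
```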
So what happens if the font replaces the preceding character, but what you are
adding is not a CoD? This behaviour is inherent in Korean fonts. There are
relatively few letters in the Korean alphabet, but the letters of a syllable
have to be displayed as a single glyph. A syllable consists of an initial
consonant (which might be silent), a vowel, and an optional final consonant.
Unicode defines the separate letters as Hangul Jamo, and the combined glyphs
(11,172 of them) as Hangul syllables. There’s a simple formula to get the
combined glyph from the letters. To try this you need to set the language in
the little window at the bottom to Korean, and choose Gungsuh or another Korean
font as the Asian font. Then set the font to Gungsuh 40 pt. You need to know
that h is U+1112, a is U+1161 and final n is U+11AB. The combined glyph han is
U+D55C (in Hangul Ha).
If you insert the characters in order you will find the same sort of behaviour
as with CoDs. If you insert a new han at the beginning of a line which already
contains a badly displayed han, you will find it works correctly, but it
doesn’t correct the existing mess. If you insert the combined glyph directly as
U+D55C, twice, you can see the correct spacing. The attached Asian screenshot
shows, on successive lines: h; a; n; a messy ha; a messy han; two combined
glyphs correctly spaced; and finally a messy han with one inserted before which
has come out correctly.

There are two ways of not having to suffer this. One is to use a custom
keyboard layout, but for that you will need a program to do it. I use and am
very happy with KbdEdit, but there are doubtless others. I’ve based the layout
on the Canadian Multilingual layout provided by Windows, which has dead keys
for all the common accents (so you can type Pinyin directly, for example,
without customising anything). But you can’t type CoDs without customising.
The trick is to edit the dead keys so that not only does DK+space give the
accent alone, but DK+= gives the associated CoD.
The second is to use a program to insert hex codes. I’ve cobbled one together
in C# which, when it loses the focus, transfers the character of the hex code
to the clipboard, so Ctrl+V will insert it. It works with surrogates too. The
program is presented as a tiny window which stays on top. It would be nice if
this were included in oo, but that’s a different topic.
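
The C# program itself isn't attached, so the following is only a Python sketch
of the conversion step such a tool needs, not the actual implementation. The
function names hex_to_char and utf16_units are hypothetical, the clipboard and
window handling are left out, and treating "works with surrogates" as "accepts
either a code point above U+FFFF or a surrogate pair" is an assumption.

```python
def hex_to_char(text: str) -> str:
    """Turn '0323', 'U+1F600' or a surrogate pair 'D83D DE00' into a character."""
    cleaned = text.replace("U+", " ").replace("u+", " ")
    values = [int(part, 16) for part in cleaned.split()]
    # Two values forming a high/low surrogate pair: combine into one code point.
    if (len(values) == 2
            and 0xD800 <= values[0] <= 0xDBFF
            and 0xDC00 <= values[1] <= 0xDFFF):
        return chr(0x10000 + ((values[0] - 0xD800) << 10) + (values[1] - 0xDC00))
    if len(values) != 1:
        raise ValueError("expected one hex code or one surrogate pair")
    if 0xD800 <= values[0] <= 0xDFFF:
        raise ValueError("lone surrogate is not a character")
    return chr(values[0])


def utf16_units(ch: str) -> list[int]:
    """Return the UTF-16 code units a Windows clipboard would hold for ch."""
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]
    cp -= 0x10000
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


dot_below = hex_to_char("0323")
print([f"{unit:04X}" for unit in utf16_units(hex_to_char("U+1F600"))])
```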

