WJCarpenter wrote:
> 
> >> Can somebody please explain the role of GR_Graphics::remapGlyph()?
> >> It converts zero-width characters into "degree" symbols.  This is
> >> the cause of Bug 1518.  Why do we do this?
> 
> Sorry, I neglected to address the part about bug 1518 in my last
> response.
> 
> remapGlyph() is not the cause of the problem in bug 1518.  It is the
> cause of the *described symptoms* of bug 1518, but the actual bug lies
> elsewhere.  If you have a look at the sample file that is attached
> with the bug (that is, look at it in a text editor), you'll see some
> extra junk after each of the characters that is followed by a degree
> symbol in AbiWord.  It's the extra junk that is being rendered as a
> degree symbol by remapGlyph(), after AbiWord has rendered the 16-bit
> characters in question (though it renders them with the wrong
> diacritical markings in most cases).
>
> I don't know how the sample document was created, but if I carefully
> cut-and-paste copies of the correctly-displayed glyphs and save the
> file (using the 4 Jun nightly bidi Windows build from the web site),
> the original stuff in the sample document still has the extra junk,
> and the pasted copies don't have the extra junk (they also displayed
> properly).

Hehe, contrary to popular belief, not all characters in Unicode
are written as a single codepoint.  Vietnamese is the prime example.
This extra junk is the tone marks, and they are essential.
Vietnamese has 6 vowels and 6 tones making 36 needed characters
not including consonants!  Too many for 8 bits.  That's where
combining characters come in...
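
To make the combining-character point concrete, here is a small Python
sketch (not AbiWord code).  It assumes, for illustration only, that the
keyboard emitted a base vowel followed by a combining tone mark; the
helper name describe() and the sample letter are my own choices.

```python
import unicodedata

def describe(text):
    """Return (codepoint, Unicode name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]

# Assumed input: e-circumflex followed by COMBINING DOT BELOW (the "junk").
typed = "\u00ea\u0323"

# NFC normalization folds the pair into one precomposed codepoint.
precomposed = unicodedata.normalize("NFC", typed)

for row in describe(typed):
    print(row)
print(len(typed), "codepoints typed;", len(precomposed), "after NFC")
```

A nonzero combining class marks a character that attaches to the one
before it and takes no width of its own, so a renderer has to draw it on
the preceding letter rather than substitute a placeholder glyph.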

> If you open the sample document with MSWindows Notepad or MSWindows
> Wordpad, you'll see little hollow rectangles in the same places that
> AbiWord puts the 0xB0 degree symbol, presumably for the same reason.
> If you open the document with MSWord2000, it will ask you what
> encoding you want.  If you pick UTF-8, things look fine (I guess this
> is why the bug report's comment about pasting to MSWord seemed OK).  If
> you pick a different encoding, things look various shades of not
> fine.

The file was created by typing into AbiWord with a Vietnamese keyboard
on Windows 2000.  It is UTF-8.  MSWord and Abi also support Windows
code page 1258 Vietnamese and VISCII encodings.  Abi also supports
TCVN which also preserves the "junk" (:
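
As a rough illustration of how the same text looks in two of these
encodings, here is a Python sketch using the standard "utf-8" and
"cp1258" codecs.  The sample string is an assumption (e-circumflex plus
a combining dot below), and fits_cp1258() is a hypothetical helper, not
anything from AbiWord.

```python
# Assumed sample: base vowel followed by a combining tone mark.
decomposed = "\u00ea\u0323"
# The same letter as a single precomposed codepoint.
precomposed = "\u1ec7"

utf8_bytes = decomposed.encode("utf-8")
cp1258_bytes = decomposed.encode("cp1258")

def fits_cp1258(text):
    """Report whether every character in text has a slot in code page 1258."""
    try:
        text.encode("cp1258")
        return True
    except UnicodeEncodeError:
        return False

print(len(utf8_bytes), "bytes in UTF-8;", len(cp1258_bytes), "bytes in cp1258")
print("precomposed form fits cp1258:", fits_cp1258(precomposed))
```

In this sketch the tone mark is a separate unit in both encodings, so an
importer that drops or mis-renders the second unit loses the tone.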

> So, in summary, I think the root cause of bug 1518 is in whatever
> thing created the sample document attached to the bug, or possibly the
> importer code is not doing the right thing.

I've provided more technical details in another post I composed offline.

Andrew.

-- 
http://linguaphile.sourceforge.net
