https://bugs.documentfoundation.org/show_bug.cgi?id=163814

--- Comment #3 from Ming Hua <[email protected]> ---
Hi Mike,

(In reply to Mike Kaganski from comment #2)
> (that *may* be fixed anytime soon, *iif* the interested people
> knowing the language would engage)
I am actually interested in engaging, but my programming skills are a bit
lacking (I have never programmed in C++), and I don't have much time for Open
Source right now (for me it's purely a hobby).

So bear with me while I try to explain the details and the difficulty of
fixing this bug.  I hope more non-CJK developers can understand this issue
better and share their insights.

The simplified-to-traditional Chinese conversion, even at the character level
and not taking terms (it's actually closer to words and phrases, instead of
terms in special areas, but I digress) into consideration, is not a simple
matter.

When the mainland Chinese government carried out the simplification in the
1950s, there were many cases where multiple traditional characters were
simplified to a single character.  In the example reported here, both "術"
(U+8853) [1] and "朮" (U+672E) [2] are simplified/standardized as "术"
(U+672F) [3], and U+8853 is actually much more commonly used than U+672E, the
word "艺术/藝術" (meaning art/artwork) being an example.

For some reason (my guess is the similarity of glyph shape), LibreOffice (or
the conversion table LO uses) chose U+672E instead of U+8853 for the reverse
one-to-many conversion, and so ends up with the wrong character most of the
time.
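
To make the one-to-many problem concrete for people who don't read Chinese,
here is a rough Python sketch.  It is NOT how LibreOffice implements the
conversion; the table names, the function name and the "白术" entry (a herb
name that, as far as I know, is written with U+672E in traditional Chinese)
are my own illustrative assumptions.  The idea is to default each simplified
character to its most common traditional form and to list the rarer cases as
phrase-level exceptions that are matched first.

```python
# Hypothetical tables for illustration only; not LibreOffice's real data.
# Character level: map each simplified character to its MOST COMMON
# traditional form (U+672F -> U+8853, not U+672E).
CHAR_TABLE = {"术": "術", "艺": "藝"}

# Phrase level: exceptions where the less common traditional form is the
# right one.  "白术" is an assumed example, not taken from the bug report.
PHRASE_TABLE = {"白术": "白朮"}

MAX_PHRASE_LEN = max(len(p) for p in PHRASE_TABLE)


def to_traditional(text: str) -> str:
    """Convert simplified to traditional, trying phrases before characters."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest phrase first, down to length 2.
        for n in range(MAX_PHRASE_LEN, 1, -1):
            chunk = text[i:i + n]
            if chunk in PHRASE_TABLE:
                out.append(PHRASE_TABLE[chunk])
                i += n
                break
        else:
            # No phrase matched: fall back to the per-character default,
            # leaving unknown characters unchanged.
            out.append(CHAR_TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

A pure character table has to pick one target for U+672F, so it is wrong for
one of the two groups of words no matter which target it picks; the phrase
table is what lets both come out right.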

With Mike's pointer, I can write a patch fixing the specific mis-conversion
reported here, but there are probably dozens, if not hundreds, of similar ones
still in LO.  Even Microsoft Word doesn't do a very good job at
simplified-to-traditional conversion, which is why I recommended a dedicated
tool in my earlier reply.

(The following links are all in Chinese)
1. https://zi.tools/zi/%E8%A1%93
2. https://zi.tools/zi/%E6%9C%AE
3. https://zi.tools/zi/%E6%9C%AF
