https://bugs.documentfoundation.org/show_bug.cgi?id=106755

Khaled Hosny <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected],
                   |                            |[email protected]
          Component|graphics stack              |Linguistic

--- Comment #3 from Khaled Hosny <[email protected]> ---
After a bit of debugging I think I found the root cause. Characters from the
Combining Diacritical Marks block are classified as ScriptType::LATIN when
they should be ScriptType::WEAK. This happens in
i18npool/source/breakiterator/breakiteratorImpl.cxx:

BreakIteratorImpl::getScriptClass() calls
getCompatibilityScriptClassByBlock(), which in turn checks the scriptList
array. That array assigns all blocks from UBLOCK_BASIC_LATIN to
UBLOCK_ARMENIAN to ScriptType::LATIN, and this range includes
UBLOCK_COMBINING_DIACRITICAL_MARKS.

This was introduced in
https://gerrit.libreoffice.org/gitweb?p=core.git;a=commitdiff;h=bf355b97ee4b53e38975f0e1847eda5b3e05f920
to fix bug 38095. I don’t know what the old classification was and whether it
indeed classified combining marks as Latin, but either way it does not make
much sense.

If I modify scriptList to skip UBLOCK_COMBINING_DIACRITICAL_MARKS, then I get
correct rendering here.
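
To illustrate the idea, here is a minimal, self-contained sketch. It is not the
actual LibreOffice code: the ScriptType enum, the ScriptRange struct, the two
tables and classify() are hypothetical stand-ins, the tables use code point
ranges rather than ICU block constants, and the fallback to WEAK for unlisted
characters is an assumption. It only shows how one contiguous range from Basic
Latin to Armenian swallows U+0300..U+036F, and how splitting the range around
that block changes the result.

```cpp
#include <vector>

// Illustrative stand-in for the ScriptType values named above.
enum class ScriptType { Latin, Weak };

// Hypothetical table entry: an inclusive code point range and its script type.
struct ScriptRange {
    char32_t first;
    char32_t last;
    ScriptType type;
};

// Models the behaviour described above: a single range from Basic Latin
// (U+0000) to the end of Armenian (U+058F), which also covers
// Combining Diacritical Marks (U+0300..U+036F).
const std::vector<ScriptRange> kOriginalList = {
    {0x0000, 0x058F, ScriptType::Latin},
};

// Models the proposed change: the same range, split so that
// Combining Diacritical Marks is skipped.
const std::vector<ScriptRange> kPatchedList = {
    {0x0000, 0x02FF, ScriptType::Latin},  // Basic Latin .. Spacing Modifier Letters
    {0x0370, 0x058F, ScriptType::Latin},  // Greek and Coptic .. Armenian
};

// Returns the type of the first matching range.
// Assumption: characters not covered by any range are treated as Weak.
ScriptType classify(const std::vector<ScriptRange>& list, char32_t ch) {
    for (const ScriptRange& range : list) {
        if (ch >= range.first && ch <= range.last) {
            return range.type;
        }
    }
    return ScriptType::Weak;
}
```

With these tables, U+0301 COMBINING ACUTE ACCENT is Latin under kOriginalList
and Weak under kPatchedList, while base letters such as U+0061 stay Latin in
both.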
