[Libreoffice-bugs] [Bug 104195] Hunspell can't handle specific character in Guarani ' g̃'

bugzilla-daemon Mon, 28 Nov 2016 00:40:55 -0800

https://bugs.documentfoundation.org/show_bug.cgi?id=104195


László Németh <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #3 from László Németh <[email protected]> ---
The command-line Hunspell tokenizes words differently from the LibreOffice
break iterator. Hunspell in LibreOffice can handle such combined Unicode
characters well; you only need to use UTF-8 encoded aff and dic files:

------ gug.aff ------
SET UTF-8 
.....

# for suggestions with correct combined diacritics:

MAP 2
MAP aá
MAP g(g̃)


------ gug.dic ------
100000
ág̃a

(If both precomposed and combined diacritics are common for the given language,
you need the canonical form 


See also the hunspell(4) manual page, for example:

       Use parenthesized groups for character sequences (e.g. for composed
       Unicode characters):

              MAP 3
              MAP ß(ss)  (character sequence)
              MAP ﬁ(fi)  ("fi" compatibility characters for Unicode fi ligature)
              MAP (ọ́)o   (composed Unicode character: ó with bottom dot)
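
As a rough illustration of why g̃ needs the parenthesized group in the MAP
line while á does not, the following Python sketch (not part of Hunspell or
LibreOffice, only the standard unicodedata module) lists the code points of
both letters after NFC normalization. Unicode has a precomposed á but no
precomposed g with tilde, so g̃ always remains a two-character sequence. The
helper name describe is made up for this example.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


g_tilde = "g\u0303"  # g followed by COMBINING TILDE
a_acute = "a\u0301"  # a followed by COMBINING ACUTE ACCENT

# NFC composes a sequence only where a precomposed code point exists.
g_nfc = unicodedata.normalize("NFC", g_tilde)
a_nfc = unicodedata.normalize("NFC", a_acute)

print(describe(g_nfc))  # still two entries: the letter g and the combining tilde
print(describe(a_nfc))  # one entry: the precomposed letter a with acute
```

A similar check on the words of a dic file would show whether its entries mix
precomposed and combined forms of the same letter.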

-- 
You are receiving this mail because:
You are the assignee for the bug.

