Hi, I tried to download OOo 3.1, but after searching from one page to another I could only find OOo 3.0 to download. After installing OOo 3.0, I tried to use the temporary Unicode normalisation in Hunspell, but it did not work. My input conversion table is shown below.
ICONV 7
ICONV ọ ọ
ICONV ọ̀ ọ̀
ICONV ọ́ ọ́
ICONV ṣ ṣ
ICONV ẹ̀ ẹ̀
ICONV ẹ́ ẹ́
ICONV ẹ ẹ

The characters in the second column were written in this sequence: the letter first, followed by the tone mark ( ) and then the underdot (.) last. The characters in the third column were written in this sequence: the letter first, followed by the underdot (.) and then the tone mark ( ). OpenOffice Writer did not recognise the characters as the same.

Regards,
Bolaji

--- On Mon, 1/5/09, Németh László <[email protected]> wrote:

From: Németh László <[email protected]>
Subject: Re: [lingu-dev] Unicode normalisation
To: [email protected]
Date: Monday, January 5, 2009, 7:10 AM

Hi,

Really, this is not only a spell checking problem. OpenOffice.org has problems with both the visual and the functional equivalence of Unicode characters. For example, here is the result of the Find all ä operation on ÄÄää, i.e. on the "A U+0308 (COMBINING DIAERESIS) Ä a U+0308 ä" character sequence:

http://www.flickr.com/photos/85171...@n00/3170574450/

It would be fine to solve this problem in future OpenOffice.org versions by automatic Unicode normalization, and also by OpenType support.

Hunspell 1.2.x (I hope it will be in OOo 3.1) has a temporary solution for Unicode normalization (canonical and compatibility), the optional input/output conversion:

ICONV 4
ICONV Ä Ä
ICONV ä ä
ICONV 가 ᄀ ᅡ
ICONV fi fi

The first three conversions are canonical normalizations: two compositions and a Hangul decomposition. The conversion of the fi ligature is a compatibility normalization (but spell checking of words with f-ligatures needs fixed word breaking in OOo, too).

Conversion of the spell checking suggestions to the original composed form:

OCONV 2
OCONV ᄀ ᅡ 가
OCONV fi fi

(Special spell checking requirements need special solutions. For example, German typography uses f-ligatures only within words, but not at compound word boundaries, so the previous OCONV fi fi conversion is not right for German. A redundant dictionary with non-suggested decomposed forms, and dictionary words with ligatures, helps to check the correct typography of a German text:

--- affix file ---
NOSUGGEST *
REP 2
REP fi fi
REP fi fi
--- dictionary file ---
finden/*
finden
)

Hyphenation of both composed and decomposed characters is possible in OOo by using redundant hyphenation patterns. Compatibility-equivalent ligatures can be handled by non-standard hyphenation (alternations):

fi1/f=i,1,1

For thesauri, a temporary solution is to use redundant items or references:

finden->finden

The upcoming Hunspell-based stemming in the OOo thesaurus can also handle the normalization problem temporarily. ICONV input conversion or explicit stems (

--- dic file ---
finden st:finden
)

can give the normalized stems to the thesaurus component.

Maybe a new Hunspell tool could help spelling dictionary developers by automatically generating the ICONV normalization table.

Regards,
László

2009/1/5 Stephan Bergmann <[email protected]>:
> On 01/02/09 09:51, F Wolff wrote:
>>
>> Hallo all
>>
>> We recently had a discussion on a list for African localisation about
>> the utility of having Unicode normalisation automatically done in
>> Hunspell, so that creators of spell checkers wouldn't need to worry
>> about that.
>>
>> Is this a feature that would be useful to more people? Is there
>> something generic in OOo that handles normalisation issues for other
>> purposes? (searching, thesaurus, indexes, etc.) I can think of many
>> places where it could be relevant.
>>
>> I'm curious to hear what other people think.
>
> I brought this up years ago as point 4 of
> <http://www.openoffice.org/servlets/ReadMsg?list=dev&msgNo=7099>, but
> nothing became of it back then...
>
> -Stephan
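
The Yoruba table at the top of this message maps one mark order to the other. A rough Python sketch of how such a table could be generated is given below. It is not an existing Hunspell tool: the helper names `equivalent_spellings` and `iconv_lines` are made up for illustration, and the code assumes each entry is a single base letter followed by combining marks. It lists every canonically equivalent spelling of a cluster (marks in a different order, or partly precomposed) and maps each one to the NFC form. Whether the Hunspell bundled with a given OOo release actually applies ICONV is a separate question that this sketch does not address.

```python
import itertools
import unicodedata


def equivalent_spellings(cluster):
    """Canonically equivalent spellings of one base+marks cluster.

    Assumption: `cluster` is a single base letter with its combining marks.
    The NFC form itself is left out, since it is the conversion target.
    """
    target = unicodedata.normalize("NFC", cluster)
    nfd = unicodedata.normalize("NFD", cluster)
    base, marks = nfd[0], nfd[1:]
    found = set()
    for order in itertools.permutations(marks):
        # Try each mark order fully decomposed (k = 0) and with the first
        # k marks precomposed into the base letter where Unicode allows it.
        for k in range(len(order) + 1):
            head = unicodedata.normalize("NFC", base + "".join(order[:k]))
            candidate = head + "".join(order[k:])
            # Keep only spellings that really are canonically equivalent.
            if unicodedata.normalize("NFC", candidate) == target:
                found.add(candidate)
    found.discard(target)
    return sorted(found)


def iconv_lines(clusters):
    """Build ICONV lines (count header first) mapping variants to NFC."""
    pairs = []
    for cluster in clusters:
        target = unicodedata.normalize("NFC", cluster)
        pairs.extend((v, target) for v in equivalent_spellings(cluster))
    return [f"ICONV {len(pairs)}"] + [f"ICONV {s} {t}" for s, t in pairs]


if __name__ == "__main__":
    # o/e with underdot, with and without grave/acute, plus s with underdot.
    yoruba = ["\u1ecd", "\u1ecd\u0300", "\u1ecd\u0301", "\u1e63",
              "\u1eb9\u0300", "\u1eb9\u0301", "\u1eb9"]
    print("\n".join(iconv_lines(yoruba)))
```

The dictionary words would then have to be stored in NFC as well, so that the converted input matches them.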
