Hi Javier,

2008/1/17, Javier SOLA <[EMAIL PROTECTED]>:
> Hi Németh,
>
> Thanks!
>
> Should I file a bug report in OpenOffice or in Hunspell?

OpenOffice.

> We are still working with 2.1, because 2.2 and 2.3 had important issues
> for Khmer. In 2.1 we can do spelling of Khmer without any problem. We
> separate words with ZWSP, because graphically the words need to be
> together; the space is used to mark a stop in the speech, equivalent to
> the comma in English.
>
> We do have the problem of Hunspell suggesting SPACE as a separator,
> instead of ZWSP, but this is a different issue.
>
> We have tested the latest builds of 2.4, and now ZWSP is not
> interpreted as a word separator. If we separate words with ZWSP, they
> are still considered as one word (as if the character was a ZWJ or
> ZWNJ), and considered as misspelled. If we separate them with SPACE,
> then it works. This is just a long shot, but... could it be that when
> you added support for ZWJ (200D) and ZWNJ (200C) as characters that can
> be placed inside a word, the ZWSP (200B) went into the same block of
> characters that can be used inside words?

Hunspell doesn't break the text in OpenOffice.org. OOo uses the IBM ICU
library for this task:

http://wiki.services.openoffice.org/wiki/ICU

It seems that updating IBM ICU in OpenOffice.org has caused your problem.
Maybe the new ICU files have overwritten the good syntax definitions of
ZWSP tokenization. We need a new l10n OpenOffice.org issue with a
detailed bug report.

Cheers,
László

> Cheers,
>
> Javier
>
> Németh László wrote
> > Hi Javier,
> >
> > I believe that, so far, I have only received a report about the
> > tokenization problems of the command line version of Hunspell. I will
> > add the requested ZWSP suggestion instead of space, which is related to
> > OpenOffice.org, but I don't know of ZWSP problems in OOo 2.4, if they
> > exist. Could you send me a more detailed bug report with the new
> > Hunspell 1.2.2 beta, maybe with tests compiled without the HAVE_ICONV
> > macro of config.h? I'd like to fix your problem in 1.2.2, and integrate
> > it with OpenOffice.org as soon as possible.
> >
> > Thanks in advance,
> > László
> >
> >
> > 2008/1/16, Javier SOLA <[EMAIL PROTECTED]>:
> >
> >> Hi Nemeth,
> >>
> >> In relation to the issue of using /u2xxx characters in Hunspell, I
> >> wanted to ask you if there is any more information or development on it.
> >> Any chances that it can be fixed in 2.4 (or have a patch that we can use).
> >>
> >> For Khmer we need to use ZWSP as word separator (words are written one
> >> after the other without separation), and the spellchecker so far does
> >> not work in 2.4 (it did in prior versions).
> >>
> >> I would be grateful for any information.
> >>
> >> Cheers,
> >>
> >> Javier
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
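
For readers following the thread, the behaviour Javier describes can be illustrated with a small Python sketch. This is not the ICU break iterator and not OpenOffice.org or Hunspell code; the function `tokenize` and its `zwsp_is_separator` flag are hypothetical names made up for this illustration, and the ASCII words are placeholders for Khmer text. It only shows why the same text yields two words when ZWSP (U+200B) is treated as a separator, and a single (misspelled) word when ZWSP is treated like ZWJ or ZWNJ, that is, as a word-internal character.

```python
# Illustrative sketch only: NOT how ICU or OpenOffice.org tokenize text.
ZWSP = "\u200b"  # ZERO WIDTH SPACE: used as the word separator in Khmer text
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER: allowed inside a word
ZWJ = "\u200d"   # ZERO WIDTH JOINER: allowed inside a word


def tokenize(text, zwsp_is_separator=True):
    """Split text into words on whitespace and, optionally, on ZWSP.

    ZWJ and ZWNJ are never separators here, so they stay inside words.
    The flag is a hypothetical switch that mimics the two behaviours
    discussed in the thread.
    """
    separators = {" ", "\t", "\n"}
    if zwsp_is_separator:
        separators.add(ZWSP)

    tokens = []
    current = []
    for ch in text:
        if ch in separators:
            # A separator closes the current word, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


sample = "foo" + ZWSP + "bar"
words_ok = tokenize(sample, zwsp_is_separator=True)       # two words
words_broken = tokenize(sample, zwsp_is_separator=False)  # one long "word"
```

With the flag on, `sample` splits into `foo` and `bar`, which a spellchecker can look up separately. With the flag off, the checker would receive one token that still contains the ZWSP, which matches the "considered as one word, and considered as misspelled" symptom reported for the 2.4 builds.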
