[l10n-dev] Breakiterator override probem

2008-11-05 Thread Alan Yaniger
Hi list-members, For Hebrew text, I would like to override the BreakIteratorImpl::endOfScript() function. I tried: - writing a Breakiterator_he class (with hxx and cxx files), - adding it to the SLOFILES section of makefile.mk, - adding it to the instances array in registerservices.cxx -

Re: [l10n-dev] Breakiterator override probem

2008-11-05 Thread Eike Rathke
Hi Alan, On Wednesday, 2008-11-05 11:03:52 +0200, Alan Yaniger wrote: > For Hebrew text, I would like to override the BreakIteratorImpl::endOfScript() function. I tried: - writing a Breakiterator_he class (with hxx and cxx files), - I added it to the SLOFILES section of makefile.mk, - I

Re: [l10n-dev] Breakiterator override probem

2008-11-05 Thread Mathias Bauer
Hi Alan, Alan Yaniger wrote: > Hi list-members, For Hebrew text, I would like to override the BreakIteratorImpl::endOfScript() function. I tried: - writing a Breakiterator_he class (with hxx and cxx files), - I added it to the SLOFILES section of makefile.mk, - I added it to the

Re: [l10n-dev] Breakiterator override probem

2008-11-05 Thread Karl Hong
Hi Alan, The ScriptType breakiterator is not controlled by language, but by the Unicode script type definition. Unlike the character/word/sentence/line breakiterators, which can be customized by language, there is only one script type breakiterator for all languages. What would you like to do with

Re: [l10n-dev] Breakiterator override probem

2008-11-05 Thread Alan Yaniger
Hi Karl, I'm trying to address issue 51772. In Hebrew, a single quote is used within a word to indicate the sound j, and a double quote is used within a word to mark an acronym. At present, both are treated as word breaks during spellchecking, because their script type is not COMPLEX, but LATIN. endOfScript()

Re: [l10n-dev] Breakiterator override probem

2008-11-05 Thread Karl Hong
Hi Eike, There is only one published breakiterator for applications to call. It is implemented in breakiteratorImp.cxx, in which only the 4 breakiterators mentioned call getLocaleSpecificBreakIterator to get a language-specific breakiterator. Logically ScriptType breakiterator could not be

Re: [l10n-dev] Breakiterator override probem

2008-11-05 Thread Karl Hong
Hi Alan, I would suggest you write a rule in data/dict_word.txt, something like hebrew_letter+ quotation_mark hebrew_letter+; it means a Hebrew word is one or more Hebrew letters, followed by a quotation mark, followed by one or more Hebrew letters. For rule syntax, check ICU user guide
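
To make the shape of the suggested rule concrete, here is a minimal sketch in Python. It is not ICU rule syntax and not the contents of data/dict_word.txt; it only mimics the idea with a regular expression. The character ranges, the names HEBREW_LETTER, QUOTATION_MARK and hebrew_words, and the choice to make the quoted part optional (so that plain words still match) are all assumptions made for illustration.

```python
import re

# Assumption: "Hebrew letter" means alef..tav (U+05D0..U+05EA).
HEBREW_LETTER = "[\u05D0-\u05EA]"
# Assumption: ASCII single/double quote, plus Hebrew geresh/gershayim.
QUOTATION_MARK = "[\"'\u05F3\u05F4]"

# One or more Hebrew letters, optionally followed by a quotation mark
# and one or more further Hebrew letters. The optional group is an
# addition in this sketch so that ordinary words are also matched.
WORD = re.compile(
    f"{HEBREW_LETTER}+(?:{QUOTATION_MARK}{HEBREW_LETTER}+)?"
)

def hebrew_words(text):
    """Return Hebrew words, keeping an internal quote inside the word."""
    return WORD.findall(text)

# An acronym written with an internal double quote stays one word.
acronym = "\u05E6\u05D4\"\u05DC"
print(hebrew_words(acronym) == [acronym])
```

A quotation mark that is not followed by a Hebrew letter is left outside the word, so a closing quote after a word still ends it.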