On Thu, 19 Nov 2009 01:08:38 +0100 Alexander Prinsier <aphe...@mailhaven.com> wrote:
> On 11/18/2009 10:30 PM, Stevan Bajić wrote:
> >> So you mean, you can break cyrillic/slavic at spaces too like Western
> >> languages? So then it'll work? You just break everything you know at
> >> spaces, and what you don't know, like Chinese, at UTF32 code points.
> >>
> > No. I did not say that. You said:
> > Western languages -> spaces
> > Non-Western languages -> bytes
> >
> > And Cyrillic is a NON-WESTERN language. So the rule you mentioned is wrong.
> > Cyrillic languages should break at space as well.
>
> Yeah that's what I said first, but it would be better to split cyrillic
> languages at spaces as well.
>
In terms of word boundaries, Cyrillic is the same as any Western language.
Cyrillic is letter-oriented as well. All normal Latin letters can be
translated into Cyrillic:

abcdefghijklmnopqrstuvwxyz -> абцдефгхијклмнопqрстувwxyз
ABCDEFGHIJKLMNOPQRSTUVWXYZ -> АБЦДЕФГХИЈКЛМНОПQРСТУВWXYЗ

Nothing special. It's not a symbol-based script like those of the Asian
languages. So the word boundary rules for Western languages apply 1 to 1 to
Cyrillic languages.

> So we can keep a list of 'space separated
> languages'?
>
I don't know if that is a wise thing to do. It would be better if we could
use a library that handles this issue for us.

> >> There's also IBM's ITU (open source library) if you need something heavier.
> >>
> > You are writing here to a IBM Business Partner. But ITU? Never heard of it
> > in relation to open source library. I know ICU. Did you mean that?
>
> Whoops, yeah I meant ICU :) Heard the 'ITU' term all day today so I got
> confused ;)
>
:)

> Alexander
>
Steve
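
A minimal sketch of the rule discussed in this thread: split at whitespace for letter-oriented scripts (Latin, Cyrillic, ...) and emit one token per code point for scripts written without spaces. This is illustrative Python only, not DSPAM's tokenizer. The names `tokenize`, `is_unspaced` and `UNSPACED_RANGES` are hypothetical, and the code point ranges are an assumed, deliberately incomplete list. A library such as ICU would likely be the more robust route.

```python
# Assumed, non-exhaustive ranges of scripts written without spaces
# between words. Anything outside these ranges is split at whitespace.
UNSPACED_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
)


def is_unspaced(ch):
    """Return True if ch falls in one of the assumed unspaced ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNSPACED_RANGES)


def tokenize(text):
    """Split text into tokens.

    Whitespace ends a token. A character from an unspaced script ends
    the current token and becomes a token of its own.
    """
    tokens = []
    word = []

    def flush():
        # Emit the word collected so far, if any.
        if word:
            tokens.append("".join(word))
            word.clear()

    for ch in text:
        if is_unspaced(ch):
            flush()
            tokens.append(ch)
        elif ch.isspace():
            flush()
        else:
            word.append(ch)
    flush()
    return tokens


if __name__ == "__main__":
    print(tokenize("Hello привет мир 你好"))
```

With this rule no list of 'space separated languages' is needed: Cyrillic is handled by the whitespace branch exactly like Latin text, and only the unspaced ranges need to be listed. Punctuation stays attached to the neighbouring word in this sketch.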
_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel