On 11/18/2009 10:30 PM, Stevan Bajić wrote:
>> So you mean, you can break cyrillic/slavic at spaces too like Western
>> languages? So then it'll work? You just break everything you know at
>> spaces, and what you don't know, like Chinese, at UTF32 code points.
>>
> No. I did not say that. You said:
> Western languages -> spaces
> Non-Western languages -> bytes
>
> And Cyrillic is a NON-WESTERN language. So the rule you mentioned is wrong.
> Cyrillic languages should break at space as well.

Yeah, that's what I said first, but it would be better to split Cyrillic languages at spaces as well. So we can keep a list of 'space-separated languages'?

>> There's also IBM's ITU (open source library) if you need something heavier.
>>
> You are writing here to a IBM Business Partner. But ITU? Never heard of it in
> relation to open source library. I know ICU. Did you mean that?

Whoops, yeah, I meant ICU :) I heard the term 'ITU' all day today, so I got confused ;)

Alexander

_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel
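
To make the 'space-separated languages' idea concrete, here is a rough sketch in Python. It is only an illustration of the rule discussed in this thread, not DSPAM's tokenizer: the names SPACE_SEPARATED_SCRIPTS, is_space_separated and tokenize are made up for this example, the script list is an assumption, and guessing the script from the Unicode character name is a shortcut rather than a proper script lookup.

```python
import unicodedata

# Hypothetical list of scripts whose words are delimited by spaces.
# Extend as needed; this is an assumption, not an exhaustive list.
SPACE_SEPARATED_SCRIPTS = ("LATIN", "CYRILLIC", "GREEK")


def is_space_separated(ch: str) -> bool:
    """Return True if ch should stay attached to the surrounding word."""
    if not ch.isalpha():
        # Digits and punctuation are kept inside the current word.
        return True
    # Shortcut: guess the script from the Unicode character name,
    # e.g. "CYRILLIC SMALL LETTER EM" or "CJK UNIFIED IDEOGRAPH-5783".
    return unicodedata.name(ch, "").startswith(SPACE_SEPARATED_SCRIPTS)


def tokenize(text: str) -> list[str]:
    """Split known scripts at whitespace, everything else per code point."""
    tokens = []
    for chunk in text.split():
        word = ""
        for ch in chunk:  # a Python str iterates over code points
            if is_space_separated(ch):
                word += ch
            else:
                # Unknown script: flush the pending word, then emit
                # the character as a token of its own.
                if word:
                    tokens.append(word)
                    word = ""
                tokens.append(ch)
        if word:
            tokens.append(word)
    return tokens


print(tokenize("hello мир"))         # ['hello', 'мир']
print(tokenize("spam 垃圾邮件 mail"))  # ['spam', '垃', '圾', '邮', '件', 'mail']
```

With this, Latin and Cyrillic text both break at spaces, and anything not on the list (Chinese in the example) falls back to one token per code point. For proper word boundaries, ICU would be the heavier option mentioned above.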