Re: [Dspam-devel] Quick comment on non Western languages

Alexander Prinsier Wed, 18 Nov 2009 16:11:42 -0800

On 11/18/2009 10:30 PM, Stevan Bajić wrote:
>> So you mean, you can break cyrillic/slavic at spaces too like Western
>> languages? So then it'll work? You just break everything you know at
>> spaces, and what you don't know, like Chinese, at UTF32 code points.
>>
> No. I did not say that. You said:
> Western languages ->  spaces
> Non-Western languages ->  bytes
>
> And Cyrillic is a NON-WESTERN language. So the rule you mentioned is wrong. 
> Cyrillic languages should break at space as well.


Yeah, that's what I said first, but it would be better to split Cyrillic 
languages at spaces as well. So we could keep a list of 'space-separated 
languages'?
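
To make the idea concrete, here is a rough sketch in Python (not DSPAM's actual tokenizer, just an illustration). It assumes the list is keyed on script rather than language, and that the script can be guessed from the first word of the Unicode character name. The names SPACE_SEPARATED_SCRIPTS, is_space_separated and tokenize are made up for this example, and the three listed scripts are only a starting point.

```python
import unicodedata

# Hypothetical list of scripts whose words are delimited by spaces.
# Only an illustrative starting point; a real list would be longer.
SPACE_SEPARATED_SCRIPTS = ("LATIN", "CYRILLIC", "GREEK")


def is_space_separated(ch):
    """Return True if ch should stay attached to the current word."""
    if not ch.isalpha():
        # Digits and punctuation stay attached to the surrounding word.
        return True
    # Assumption: the Unicode name starts with the script name,
    # e.g. "CYRILLIC SMALL LETTER ES" or "CJK UNIFIED IDEOGRAPH-4E2D".
    return unicodedata.name(ch, "").startswith(SPACE_SEPARATED_SCRIPTS)


def tokenize(text):
    """Split known scripts at whitespace, everything else per code point."""
    tokens = []
    word = []
    for ch in text:  # iterating a str yields one code point at a time
        if ch.isspace():
            if word:
                tokens.append("".join(word))
                word = []
        elif is_space_separated(ch):
            word.append(ch)
        else:
            # Unknown script: flush the pending word, emit the code point alone.
            if word:
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
    if word:
        tokens.append("".join(word))
    return tokens


print(tokenize("buy спам now"))
print(tokenize("中文 spam"))
```

With this, Cyrillic text breaks at spaces just like Western text, while Chinese falls through to one token per code point. Anything not on the list gets the per-code-point treatment, so the list has to be maintained by hand.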

>> There's also IBM's ITU (open source library) if you need something heavier.
>>
> You are writing here to an IBM Business Partner. But ITU? Never heard of it in 
> relation to an open source library. I know ICU. Did you mean that?

Whoops, yeah I meant ICU :) Heard the 'ITU' term all day today so I got 
confused ;)

Alexander

_______________________________________________
Dspam-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-devel