Hi Javier,

2008/1/17, Javier SOLA <[EMAIL PROTECTED]>:
> Out of curiosity, and off topic for this list. We are developing a
> localization editor for XLIFF files, and we are trying to integrate
> Hunspell. Do we need to do our own tokenization (for ZWSP)?

I have just checked, and Hunspell handles ZWSP correctly:

echo xxx$(echo -ne '\x0B\x20' | iconv -f utf-16 -t utf-8)xxx | hunspell -d en_US
Hunspell 1.2.2b
& xxx 4 0: xx, xix, x xx, xx x
& xxx 4 6: xx, xix, x xx, xx x

You can use Hunspell tokenization via its pipe interface or parser library.

Cheers,
László

>
> Cheers,
>
> Javier

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
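
For an editor that reads these result lines from the pipe, a minimal sketch of pulling out each flagged word and its offset might look like the following. The helper name parse_misses is hypothetical, and the sample input is simply the output quoted in the message, so the snippet runs without Hunspell installed. It assumes the ispell-style layout "& word count offset: suggestions" and ignores every other line type.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hunspell): read ispell-style result
# lines on stdin and print "word<TAB>offset" for each "&" line, i.e. each
# misspelled word that came back with suggestions.
parse_misses() {
    awk '$1 == "&" {
        off = $4
        sub(/:$/, "", off)   # the offset field ends with a colon, e.g. "6:"
        print $2 "\t" off
    }'
}

# Sample input copied from the message; in real use the left-hand side of
# the pipe would be the hunspell command itself.
printf '%s\n' \
    'Hunspell 1.2.2b' \
    '& xxx 4 0: xx, xix, x xx, xx x' \
    '& xxx 4 6: xx, xix, x xx, xx x' | parse_misses
```

Note that the second word is reported at offset 6 although it is the fifth character of the input. ZWSP (U+200B) is three bytes in UTF-8 (E2 80 8B), so 3 + 3 = 6, which suggests, going only by this one output, that the offsets count bytes rather than characters. An editor mapping offsets back to its text buffer would probably need to account for that.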
