Hi Javier,

2008/1/17, Javier SOLA <[EMAIL PROTECTED]>:
> Out of curiosity, and off topic for this list: we are developing a
> localization editor for XLIFF files, and we are trying to integrate
> Hunspell. Do we need to do our own tokenization (for ZWSP)?

I have just checked: Hunspell handles ZWSP (zero width space, U+200B)
correctly and treats it as a word boundary:

echo xxx$(echo -ne '\x0B\x20' | iconv -f utf-16le -t utf-8)xxx | hunspell -d en_US
Hunspell 1.2.2b
& xxx 4 0: xx, xix, x xx, xx x
& xxx 4 6: xx, xix, x xx, xx x
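
The second offset is worth a note. ZWSP takes three bytes in UTF-8, so
the second "xxx" starts at byte 6 but at character 4. The reported 6
therefore looks like a byte offset; that reading is inferred from the
output above and has not been checked against the Hunspell documentation.
A minimal Python sketch of the arithmetic (the variable names are only
illustrative):

```python
# Rebuild the test string: "xxx", a zero width space (U+200B), then "xxx".
text = "xxx\u200bxxx"
encoded = text.encode("utf-8")

# The two input bytes 0x0B 0x20 are U+200B when read as little-endian UTF-16.
zwsp = b"\x0b\x20".decode("utf-16-le")

# Start of the second "xxx", counted in characters and in UTF-8 bytes.
char_offset = text.index("xxx", 1)
byte_offset = encoded.index(b"xxx", 1)

print(char_offset, byte_offset)
```

An editor that keeps its text as Unicode strings would probably need to
convert such offsets back to character positions before highlighting.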

You can use Hunspell tokenization via its pipe interface or parser library.
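
As a rough illustration of the pipe interface, here is a Python sketch
that sends one line of text to "hunspell -a" and reads back the
misspelled words. It assumes a hunspell binary on the PATH and an
installed en_US dictionary. The names parse_line and check are made up
for this example, and the parser only understands the "&" and "#" result
lines of the ispell-style output; everything else (the version banner,
"*" for correct words, blank lines) is skipped. Input lines that begin
with one of the pipe mode's command characters are not handled here.

```python
import subprocess


def parse_line(line):
    """Parse one '&' or '#' result line; return None for anything else."""
    if line.startswith("& "):
        # Format: & word count offset: suggestion, suggestion, ...
        head, _, tail = line.partition(": ")
        _, word, _count, offset = head.split(" ")
        return word, int(offset), tail.split(", ")
    if line.startswith("# "):
        # Format: # word offset   (misspelled, no suggestions)
        _, word, offset = line.split(" ")
        return word, int(offset), []
    return None


def check(text, dictionary="en_US"):
    """Run one line of text through `hunspell -a` and collect the misses."""
    proc = subprocess.run(
        ["hunspell", "-a", "-i", "utf-8", "-d", dictionary],
        input=text + "\n",
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    results = []
    for line in proc.stdout.splitlines():
        parsed = parse_line(line)
        if parsed is not None:
            results.append(parsed)
    return results


# Example call (needs hunspell installed):
#     check("xxx\u200bxxx")
```

Each result is a (word, offset, suggestions) tuple, so the editor gets
the token boundaries from Hunspell and does not need its own ZWSP
handling for this step.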

Cheers,
László

>
> Cheers,
>
> Javier
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>