The tokenizer assumes it can always split on whitespace, so this will not
work without modifying the tokenizer code.

You could hack around it by replacing all whitespace with a special
character in your training and test data.
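
A rough sketch of that hack, in case it helps. It uses only plain Java and no
OpenNLP calls; the class name WhitespaceMasker, the methods mask/unmask and
the placeholder character U+2423 are made up for illustration, and it assumes
the placeholder never occurs in your real data. Run mask() over every
training and test line before it reaches the tokenizer, and unmask() over the
tokens that come back.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP.
final class WhitespaceMasker {
    // Assumption: this character (OPEN BOX) never appears in the real data.
    static final char PLACEHOLDER = '\u2423';

    private static final Pattern WHITESPACE = Pattern.compile("\\s");

    // Replace each whitespace character in a single line with the placeholder,
    // so only the <SPLIT> markers are left to mark token boundaries.
    static String mask(String line) {
        return WHITESPACE.matcher(line).replaceAll(String.valueOf(PLACEHOLDER));
    }

    // Turn placeholders in a token back into plain spaces. The original kind
    // of whitespace (tab, non-breaking space, ...) is not recovered.
    static String unmask(String token) {
        return token.replace(PLACEHOLDER, ' ');
    }

    public static void main(String[] args) {
        String trainingLine = "New York<SPLIT>is big";
        System.out.println(mask(trainingLine));
    }
}
```

Note that mask() also replaces line breaks, so feed it one line at a time.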

For which language do you need that?

Jörn

On Sat, Feb 11, 2012 at 6:46 PM, Lee Hinman <[email protected]> wrote:

> Hey Guys,
>
> I'm trying to train a tokenizer that ignores spaces and only uses <SPLIT>
> to determine where to split. I wasn't able to find anything in the
> javadocs, is this possible with OpenNLP? If so, could someone point me in
> the right direction regarding it?
>
> - Lee Hinman
