On 09/02/12 09:21, Joern Kottmann wrote:
> Try our command line tokenizer, it will output whitespace-tokenized text.

So the command line tool does not use the English maxent tokenizer? Or is the tool adding the spaces to whatever the pre-trained tokenizer returns?
Jim