Dear all

I have not yet traced the whole process, but because of some unexpected
doccat results I took a brief look at the code.

Can you confirm that DoccatTrainerTool tokenizes on whitespace (when
creating the DocumentSample objects), while DoccatTool uses the SimpleTokenizer?

This should not be the case. Both should use the same tokenizer,
namely the whitespace tokenizer!

If not, which one is used?
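
To illustrate why such a mismatch would matter, here is a minimal,
self-contained sketch. It does not call OpenNLP: the class and method
names are hypothetical, and the character-class splitter is only my
approximation of what a simple tokenizer does (start a new token whenever
the character class changes). The point is just that the same text yields
different tokens, so features seen at training time would not match the
ones seen at categorization time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not OpenNLP code.
public class TokenizerMismatch {

    // Splits on runs of whitespace only.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // 0 = whitespace, 1 = letter, 2 = digit, 3 = anything else.
    private static int charClass(char c) {
        if (Character.isWhitespace(c)) return 0;
        if (Character.isLetter(c)) return 1;
        if (Character.isDigit(c)) return 2;
        return 3;
    }

    // Assumed approximation of a "simple" tokenizer: a new token starts
    // whenever the character class changes; whitespace is dropped.
    static List<String> characterClassTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1;
        int prev = 0;
        for (int i = 0; i < text.length(); i++) {
            int cls = charClass(text.charAt(i));
            if (cls != prev) {
                if (start >= 0 && prev != 0) {
                    tokens.add(text.substring(start, i));
                }
                start = i;
            }
            prev = cls;
        }
        if (start >= 0 && prev != 0) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "e-mail re: v2.0";
        System.out.println(whitespaceTokenize(text));
        System.out.println(characterClassTokenize(text));
    }
}
```

For this input the whitespace version gives 3 tokens ("e-mail", "re:",
"v2.0"), while the character-class version gives 9 ("e", "-", "mail",
"re", ":", "v", "2", ".", "0").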

Best regards

/Nicolas
