When I use "trainclassifier", I am able to run the 20 Newsgroups example
just fine. I'm also able to train on my own data, up to around 10M
training documents.

Once I have more training data than that, I find that "trainclassifier"
succeeds and "testclassifier" fails. I have no idea whether this is a
training problem or a testing problem. The errors reported by
"testclassifier" are at http://pastebin.com/YKqbjAQH . I suspect that I am
training on too much data and need to increase the minDf, but I don't
see a way to do that with "trainclassifier".

While looking around for a fix, I read that "trainclassifier" is the
old way, and that "trainnb" fixed some unusual back-end errors (which
I suspect is what I'm getting). What is the difference between the two?
Is there any reason for me to start figuring out how to use "trainnb"?
