I was trying the Naive Bayes classifier, via the build-asf-email.sh file I committed the other day, on a data set with fairly significant variation in the number of messages per training label. I am noticing (still need to validate more) that the label with the fewest examples often dominates the results. This seems counterintuitive to me: I would have expected the largest set to dominate. If I even out the number of items per label, then I get reasonable results. Any thoughts on what I am seeing? If you are interested, I can share the details of the runs.
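
To be concrete about what "evening out" means here, a rough Python sketch follows. It is illustrative only and is not what build-asf-email.sh does: the function name `balance_by_label` and the (label, message) pair format are assumptions made up for the example. It downsamples every label to the size of the smallest one.

```python
import random
from collections import defaultdict


def balance_by_label(examples, seed=0):
    """Downsample so every label keeps as many examples as the rarest label.

    `examples` is an iterable of (label, message) pairs; this input format
    is a hypothetical one chosen for the sketch.
    """
    # Group the messages under their training label.
    by_label = defaultdict(list)
    for label, message in examples:
        by_label[label].append(message)
    if not by_label:
        return []

    # The rarest label sets the per-label quota.
    smallest = min(len(messages) for messages in by_label.values())

    # A seeded generator keeps the sample reproducible between runs.
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        for message in rng.sample(by_label[label], smallest):
            balanced.append((label, message))
    return balanced
```

Downsampling throws away data from the larger labels, so it is only a quick way to check whether the skew is what causes the odd results.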
-Grant
