Hi,

I am working on a PhD project that involves developing a variation of Multinomial Naive Bayes (MNB) classification. I want to compare my approach with Thunderbird's junk mail filter. More precisely, I want to compare the bias that results from which part of the document collection the algorithm uses for training.

I know it uses a "train on error" policy, but I can't find any documentation of it. For example, does the policy apply only to token training, or to the class prior distributions as well? In other words, is the spam/ham ratio calculated over the whole document collection, or only over the messages that were actually trained on?
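To make the two alternatives concrete, here is a minimal Python sketch of what I mean. It is not Thunderbird's code: the function names, the toy data, and the exact form of the "train on error" rule (train only on messages the filter got wrong) are my own assumptions for illustration.

```python
from collections import Counter


def train_on_error(labels, predictions):
    """Indices of messages a train-on-error policy would train on.

    Assumption: a message is trained only when the filter's prediction
    disagrees with the true label.
    """
    return [i for i, (y, p) in enumerate(zip(labels, predictions)) if y != p]


def class_priors(labels):
    """Maximum-likelihood class priors estimated from a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


# Toy collection: 8 ham, 2 spam. The filter misclassified the first ham
# message and both spam messages.
labels = ["ham"] * 8 + ["spam"] * 2
predictions = ["spam"] + ["ham"] * 7 + ["ham"] * 2

trained = train_on_error(labels, predictions)

# Alternative 1: priors over the whole document collection.
priors_all = class_priors(labels)

# Alternative 2: priors over only the messages that were trained on.
priors_trained = class_priors([labels[i] for i in trained])

print(priors_all)
print(priors_trained)
```

In this toy case the whole collection is mostly ham, while the trained subset is mostly spam, so the two choices give quite different priors. That difference is the bias I would like to measure.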

Best,
Pavel
_______________________________________________
dev-documentation mailing list
dev-documentation@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-documentation