Hi,
I am working on a PhD project that involves developing a variation on
MNB (Multinomial Naive Bayes) classification. I want to compare my
approach to Thunderbird's junk mail filter. More precisely, I want to
compare the bias that arises from which part of the document collection
the algorithm uses for training.
I know the filter uses a "train on error" policy, but I can't find any
documentation for it. For example, does the policy apply only to token
training, or to the class prior distributions as well? That is, is the
spam/ham ratio calculated over the whole document collection, or only
over the documents that were actually used for training?
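
To make the question concrete, here is a minimal Python sketch of the
two alternatives I have in mind. It is a toy illustration of my own, not
Thunderbird's code: the function names, the Laplace smoothing, and the
tie-breaking toward ham are all assumptions. The classifier itself uses
the trained-only prior, which is also just one of the two possibilities
I am asking about.

```python
import math
from collections import Counter

CLASSES = ("ham", "spam")

def train_on_error(docs, alpha=1.0):
    """Toy train-on-error loop for a multinomial Naive Bayes filter.

    docs is a list of (tokens, label) pairs, label in CLASSES.
    Returns (seen, trained): per-class counts of all documents
    encountered and of the documents actually used for training.
    """
    token_counts = {c: Counter() for c in CLASSES}
    vocab = set()
    seen = Counter()     # every document, trained on or not
    trained = Counter()  # only the misclassified (trained) documents

    def classify(tokens):
        # Assumption: the prior comes from the trained documents only.
        total = sum(trained.values())
        best, best_score = None, -math.inf
        for c in CLASSES:
            score = math.log((trained[c] + 1) / (total + len(CLASSES)))
            denom = sum(token_counts[c].values()) + alpha * (len(vocab) + 1)
            for t in tokens:
                score += math.log((token_counts[c][t] + alpha) / denom)
            if score > best_score:  # ties go to the first class, ham
                best, best_score = c, score
        return best

    for tokens, label in docs:
        seen[label] += 1
        if classify(tokens) != label:
            # Train only when the current model gets the document wrong.
            token_counts[label].update(tokens)
            vocab.update(tokens)
            trained[label] += 1
    return seen, trained

def spam_ratio(counts):
    """Unsmoothed spam fraction for a per-class document counter."""
    total = sum(counts.values())
    return counts["spam"] / total if total else 0.5

docs = [
    (["cheap", "pills"], "spam"),
    (["cheap", "pills"], "spam"),
    (["meeting", "notes"], "ham"),
    (["meeting", "notes"], "ham"),
    (["cheap", "pills"], "spam"),
]
seen, trained = train_on_error(docs)
print("spam ratio over whole collection:", spam_ratio(seen))
print("spam ratio over trained documents:", spam_ratio(trained))
```

Even on a tiny collection like this one the two ratios come out
different, because only the misclassified messages are counted in the
second case. That difference is the bias I would like to quantify.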
Best,
Pavel
_______________________________________________
dev-documentation mailing list
dev-documentation@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-documentation