Hi,

In our experiments, we use Multinomial Naive Bayes (MNB). The traditional MNB assumes term-frequency (TF) weights for the words.

The documentation at http://scikit-learn.org/stable/modules/naive_bayes.html describes Multinomial Naive Bayes as a model "... where the data are typically represented as word vector counts, although tf-idf vectors are also known to work well in practice". The "word vector counts" are plain TF, which is the well-known case. Our question concerns the "tf-idf vectors".

When tf-idf input is used, does the implementation follow the approach of Rennie et al., "Tackling the Poor Assumptions of Naive Bayes Text Classification"? The documentation does not cite that paper for this case.
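To make the question concrete, here is a minimal sketch of how we understand the textbook MNB estimate. It is our own illustration in plain Python, not scikit-learn's code and not the transforms proposed by Rennie et al.; the names `fit_mnb`, `predict_mnb` and the toy data are hypothetical. It shows that the smoothed per-class estimate only sums feature weights, so nothing in the arithmetic requires integer counts. Whether scikit-learn treats tf-idf input in this plain way is exactly what we are asking.

```python
import math


def fit_mnb(X, y, alpha=1.0):
    """Fit a textbook multinomial NB on rows of non-negative feature weights.

    X may hold raw counts or fractional weights such as tf-idf values.
    alpha is the additive (Laplace/Lidstone) smoothing constant.
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    log_prior, log_prob = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Sum the weight of each feature over the class, plus smoothing.
        totals = [sum(r[j] for r in rows) + alpha for j in range(n_features)]
        norm = sum(totals)
        log_prob[c] = [math.log(t / norm) for t in totals]
    return classes, log_prior, log_prob


def predict_mnb(model, x):
    """Return the class with the highest joint log-likelihood for row x."""
    classes, log_prior, log_prob = model
    scores = {
        c: log_prior[c] + sum(w * lp for w, lp in zip(x, log_prob[c]))
        for c in classes
    }
    return max(scores, key=scores.get)


y = ["a", "a", "b", "b"]

# Hypothetical toy data: integer term counts.
X_counts = [[2, 1, 0], [3, 0, 0], [0, 1, 2], [0, 0, 3]]
count_model = fit_mnb(X_counts, y)

# Hypothetical toy data: fractional, tf-idf-like weights for the same documents.
X_weights = [[0.8, 0.6, 0.0], [1.0, 0.0, 0.0], [0.0, 0.6, 0.8], [0.0, 0.0, 1.0]]
weight_model = fit_mnb(X_weights, y)

pred_counts = predict_mnb(count_model, [1, 0, 0])
pred_weights = predict_mnb(weight_model, [0.0, 0.2, 0.9])
```

In this sketch the same fitting code handles both inputs; the paper we mention goes further and changes the model itself, which is why we would like to know which variant the documentation sentence refers to.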
Best, -- Marcin M.
_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
