On 11/04/2016 05:45 AM, Marcin Mirończuk wrote:
Hi,
In our experiments we use Multinomial Naive Bayes (MNB). Traditional MNB assumes term-frequency (TF) weights for the words. The documentation at http://scikit-learn.org/stable/modules/naive_bayes.html describes Multinomial Naive Bayes as a model "... where the data are typically represented as word vector counts, although tf-idf vectors are also known to work well in practice". The "word vector counts" are TF, which is well known. Our problem is with the "tf-idf vectors". For the tf-idf case, does scikit-learn implement the approach of J. D. M. Rennie et al., "Tackling the Poor Assumptions of Naive Bayes Text Classifiers"? The documentation does not cite this solution.
No, I think that paper implements something slightly different. The documentation says that you can apply the TfidfVectorizer instead of CountVectorizer and it can still work.
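As a rough illustration of what the documentation means, here is a minimal sketch that feeds MultinomialNB once with raw counts and once with tf-idf weights. The toy corpus, labels, and variable names are made up for this example, and this is only the plain vectorizer swap, not the method from the Rennie et al. paper.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus and labels, invented purely for illustration.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "project meeting notes attached",
]
labels = ["spam", "spam", "ham", "ham"]

# Same classifier, two different feature representations.
count_nb = make_pipeline(CountVectorizer(), MultinomialNB())  # raw term counts (TF)
tfidf_nb = make_pipeline(TfidfVectorizer(), MultinomialNB())  # tf-idf weights

count_nb.fit(docs, labels)
tfidf_nb.fit(docs, labels)

new_docs = ["buy cheap pills", "monday project meeting"]
count_pred = count_nb.predict(new_docs)
tfidf_pred = tfidf_nb.predict(new_docs)
print("counts:", list(count_pred))
print("tf-idf:", list(tfidf_pred))
```

MultinomialNB simply treats the tf-idf values as fractional counts, so nothing else in the pipeline has to change. Whether tf-idf actually helps is an empirical question for your data; cross-validation on both pipelines is the usual way to check.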
