Hello, I'm currently comparing Mahout classification algorithms that can be used for text classification. I checked [1], but many of them have open issues, so I'm not sure which of them are in working condition and properly supported by Mahout. From what I've found so far, SVM is preferred for text classification because of its ability to work with high-dimensional feature spaces. I've also gone through MAHOUT-334 and this recent mail thread [2].
According to the wiki, Naive Bayes seems to be a reliable candidate for a classification task. Could someone please provide more details on this and on the suitability of Naive Bayes for text classification?

Thanks,

[1] https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
[2] http://mail-archives.apache.org/mod_mbox/mahout-user/201312.mbox/%3ccao+e6vdke1pjli4wtdcg8x-h4gyb3bpe8bgriwx2j6j7ist...@mail.gmail.com%3e

--
M.P. Tharindu Rusira Kumara
Department of Computer Science and Engineering,
University of Moratuwa, Sri Lanka.
+94757033733
www.tharindu-rusira.blogspot.com
