I've seen a lot of discussion about implementing the algorithm mentioned above (SVM); however, I couldn't find any live examples designed for multi-class classification tasks in which each document has to be assigned to one of 10000+ categories. It seems impossible to me.
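For reference, one common way to stretch a binary SVM to many categories is one-vs-rest: train one linear classifier per category and pick the category whose classifier scores highest. Below is a minimal pure-Python sketch of that idea, assuming a Pegasos-style subgradient update on the hinge loss. It is not LibSVM's solver or its multi-class scheme, and the function names (featurize, train_one_vs_rest, predict), the toy documents, and the hyperparameters are all made up for illustration.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words term counts for one document (hypothetical helper)."""
    return Counter(text.lower().split())


def dot(weights, feats):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(weights.get(tok, 0.0) * val for tok, val in feats.items())


def train_one_vs_rest(docs, labels, epochs=20, lam=0.01):
    """Train one linear hinge-loss classifier per category.

    Illustrative Pegasos-style updates, no bias term; the epochs and lam
    values are arbitrary assumptions, not tuned settings.
    """
    feats = [featurize(d) for d in docs]
    models = {}
    for cat in sorted(set(labels)):
        w = {}
        t = 0
        for _ in range(epochs):
            for x, lab in zip(feats, labels):
                t += 1
                eta = 1.0 / (lam * t)           # decaying step size
                y = 1.0 if lab == cat else -1.0  # this category vs. the rest
                margin = y * dot(w, x)
                shrink = 1.0 - 1.0 / t           # L2 regularization step
                for tok in w:
                    w[tok] *= shrink
                if margin < 1.0:                 # hinge loss is active
                    for tok, val in x.items():
                        w[tok] = w.get(tok, 0.0) + eta * y * val
        models[cat] = w
    return models


def predict(models, text):
    """Return the category whose classifier gives the highest score."""
    x = featurize(text)
    return max(models, key=lambda cat: dot(models[cat], x))


# Toy training set, invented for this sketch.
DOCS = [
    "stock market shares fall",
    "bank raises interest rates",
    "team wins football match",
    "player scores late goal",
    "new phone chip released",
    "software update fixes bug",
]
LABELS = ["finance", "finance", "sports", "sports", "tech", "tech"]

models = train_one_vs_rest(DOCS, LABELS)
print(predict(models, "football team scores"))
```

Note that this scheme keeps one weight vector per category, so training time, memory, and scoring cost all grow linearly with the number of categories. That is the part that gets expensive at 10000+ categories.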
Ashish-12 wrote:
>
> Hi Chee Wu,
>
> If you're looking for a Java-based solution, you might find it worthwhile
> to look at LibSVM. You can use this open source package to train a Support
> Vector Machine based classifier, which can then be used to classify the
> documents that Nutch crawls for you. In general, more the number of
> training documents, better the accuracy. Keep in mind that training
> documents must be carefully hand-picked, to minimize false classification.
> You can use LibSVM for 2-class as well as multi-class classification tasks.
>
> --
> Regards....
> ~ Ashish Saharia ~
>
> -----Original Message-----
> From: chee wu [mailto:[EMAIL PROTECTED]
> Sent: Sunday, February 04, 2007 7:29 PM
> To: [email protected]
> Subject: Any successful experiences for text classification ?
>
> Hi,
> I am trying to divide all the web pages crawled to predefined
> categories,does anybody have successfully fulfilled classification based
> on Nutch? I did find some threads talking about this,but none of them are
> clear enough. Below are some possible solutions mentioned in the past
> threads :
> 1. Using SVM-Light, but it seems a C based program ?
> 2. Can I fulfill this based on Carrot2?
> 3. Other open source software packages like Rainbow or IBM UIMA ?
> I want to do a deeper research on the three options above,which one should
> I study first? Any other hints or experiences also are welcome!
>
> Thanks
> -Chee
>

--
View this message in context: http://www.nabble.com/Any-successful-experiences-for--text-classification---tf3169828.html#a8802930
Sent from the Nutch - User mailing list archive at Nabble.com.
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
