I've seen a lot of discussion about implementing the above-mentioned
algorithm (SVM); however, I couldn't find any live examples designed for
multi-class classification tasks in which each document has to be assigned
to one of 10,000+ categories.
It seems impossible to me.


Ashish-12 wrote:
> 
> Hi Chee Wu, 
> 
> If you're looking for a Java-based solution, you might find it worthwhile
> to look at LibSVM. You can use this open source package to train a Support
> Vector Machine based classifier, which can then be used to classify the
> documents that Nutch crawls for you. In general, the more training
> documents you have, the better the accuracy. Keep in mind that training
> documents must be carefully hand-picked to minimize misclassification.
> You can use LibSVM for 2-class as well as multi-class classification tasks.
> 
> --
> 
> Regards....
> 
> ~ Ashish Saharia ~
> 
> 
> 
> -----Original Message-----
> From: chee wu [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, February 04, 2007 7:29 PM
> To: [email protected]
> Subject: Any successful experiences for text classification ?
> 
> Hi,
>   I am trying to sort all the crawled web pages into predefined
> categories. Has anybody successfully implemented classification based
> on Nutch? I did find some threads talking about this, but none of them are
> clear enough. Below are some possible solutions mentioned in past
> threads:
>   1. Using SVM-Light, but it seems to be a C-based program?
>   2. Can I do this with Carrot2?
>   3. Other open source software packages like Rainbow or IBM UIMA?
> I want to look more deeply into the three options above. Which one should
> I study first? Any other hints or experiences are also welcome!
> 
> Thanks
> -Chee
>  
> 
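
To make the multi-class setup discussed above a bit more concrete, here is a minimal sketch of one-vs-rest classification with linear hinge-loss (SVM-style) models over bag-of-words features. It is not LibSVM's API or its solver (LibSVM handles multi-class internally with a one-against-one scheme); the function names, the toy documents, and the Pegasos-style update are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def featurize(text):
    """Bag-of-words term counts (lower-cased, whitespace-split)."""
    return Counter(text.lower().split())


def train_one_vs_rest(docs, labels, epochs=50, lam=0.01, seed=0):
    """Train one linear hinge-loss model per category (Pegasos-style SGD).

    Hypothetical helper: returns {category: {token: weight}}.
    """
    rng = random.Random(seed)
    feats = [featurize(d) for d in docs]
    models = {}
    for cat in sorted(set(labels)):
        w = {}  # sparse weight vector for "cat vs. everything else"
        t = 0
        order = list(range(len(docs)))
        for _ in range(epochs):
            rng.shuffle(order)
            for i in order:
                t += 1
                eta = 1.0 / (lam * t)  # decaying step size
                y = 1.0 if labels[i] == cat else -1.0
                margin = y * sum(w.get(tok, 0.0) * c
                                 for tok, c in feats[i].items())
                # L2 regularization: shrink every weight by (1 - 1/t).
                scale = 1.0 - 1.0 / t
                for tok in w:
                    w[tok] *= scale
                # Hinge-loss subgradient step for margin violators only.
                if margin < 1.0:
                    for tok, c in feats[i].items():
                        w[tok] = w.get(tok, 0.0) + eta * y * c
        models[cat] = w
    return models


def predict(models, text):
    """Return the category whose model scores the document highest."""
    x = featurize(text)

    def score(w):
        return sum(w.get(tok, 0.0) * c for tok, c in x.items())

    return max(sorted(models), key=lambda cat: score(models[cat]))


# Toy, hand-made training set (illustrative only).
DOCS = [
    "football match goal score",
    "tennis match player score",
    "stock market price shares",
    "bank market interest price",
    "python code compiler bug",
    "java code server bug",
]
LABELS = ["sports", "sports", "finance", "finance", "tech", "tech"]

if __name__ == "__main__":
    models = train_one_vs_rest(DOCS, LABELS)
    print(predict(models, "the match score was close"))
```

With one-vs-rest, training and prediction cost grow linearly with the number of categories, since there is one model per category. That is the main thing to budget for when the category count reaches 10,000+.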

-- 
View this message in context: 
http://www.nabble.com/Any-successful-experiences-for--text-classification---tf3169828.html#a8802930
Sent from the Nutch - User mailing list archive at Nabble.com.


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general