I am trying to classify web pages in Chinese. Thank you, Ashish. Yes, LibSVM might satisfy my requirement. I am also considering LingPipe:
http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html
It seems LingPipe can categorize Chinese documents without word segmentation. Has anyone tried this? I plan to write an index filter based on LingPipe.
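
To illustrate why classification without word segmentation can work for Chinese, here is a minimal sketch of a character n-gram classifier. It is not LingPipe's API or its actual model: the class name `CharNGramClassifier`, the helper `char_ngrams`, the add-one smoothing, and the tiny training set are all assumptions made up for this example. The idea is only that overlapping character bigrams can stand in for words, so no segmenter is required.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams; no word segmentation is used."""
    text = "".join(text.split())  # drop whitespace between characters
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class CharNGramClassifier:
    """Hypothetical naive Bayes classifier over character n-grams."""

    def __init__(self, n=2):
        self.n = n
        self.counts = defaultdict(Counter)  # category -> n-gram counts
        self.doc_counts = Counter()         # category -> number of documents
        self.vocab = set()                  # all n-grams seen in training

    def train(self, category, text):
        grams = char_ngrams(text, self.n)
        self.counts[category].update(grams)
        self.doc_counts[category] += 1
        self.vocab.update(grams)

    def classify(self, text):
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, counter in self.counts.items():
            total = sum(counter.values())
            # Log prior plus add-one smoothed log likelihood of each n-gram.
            score = math.log(self.doc_counts[category] / total_docs)
            for gram in grams:
                score += math.log((counter[gram] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best, best_score = category, score
        return best


# Made-up training data, for illustration only.
clf = CharNGramClassifier(n=2)
clf.train("sports", "足球比赛今晚开始")
clf.train("sports", "篮球比赛结果公布")
clf.train("finance", "股票市场今日上涨")
clf.train("finance", "银行利率再次下调")

print(clf.classify("今晚的足球比赛"))
```

In a Nutch setup, something like this would presumably run inside the index filter on the parsed page text, with the predicted category stored as a field; that wiring is not shown here.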

----- Original Message -----
From: "Shay Lawless" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, February 05, 2007 5:43 PM
Subject: Re: Any successful experiences for text classification ?

> Hi Chee,
>
> Are you looking to perform this classification on a collection of local
> documents or on a collection of web pages?
>
> Shay
>
> On 04/02/07, chee wu <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>> I am trying to divide all the crawled web pages into predefined
>> categories. Has anybody successfully done classification based
>> on Nutch? I did find some threads talking about this, but none of them are
>> clear enough. Below are some possible solutions mentioned in past
>> threads:
>> 1. Using SVM-Light, but it seems to be a C-based program?
>> 2. Can I do this based on Carrot2?
>> 3. Other open source software packages like Rainbow or IBM UIMA?
>> I want to do deeper research on the three options above. Which one should
>> I study first? Any other hints or experiences are also welcome!
>>
>> Thanks
>> -Chee
>>
>
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
