Hi Chee Wu: I think Carrot2 is the easiest way to reach your goal, but its performance is not very good. The other important thing is that you should input data with as little spam as possible, or you will get useless results.
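
To give a rough idea of what sorting pages into predefined categories involves, here is a minimal sketch of a multinomial naive Bayes text classifier in plain Python. It does not use Nutch, Carrot2, SVM-Light, Rainbow, or UIMA; the category names, the training snippets, and the function names (`tokenize`, `train`, `classify`) are all made up for illustration. It also hints at why spam matters: every training page adds to the word counts, so junk pages skew the result.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep runs of ASCII letters only.
    # Real crawled pages would need HTML stripping first (not shown here).
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Build word and document counts from (category, text) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    vocab = set()
    for category, text in labeled_docs:
        tokens = tokenize(text)
        word_counts[category].update(tokens)
        doc_counts[category] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def classify(model, text):
    """Return the category with the highest log-probability for the text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    best_category, best_score = None, float("-inf")
    for category, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][category]
        total_words = sum(counts.values())
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(n_docs / total_docs)
        for token in tokens:
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical hand-labeled examples; a real setup needs far more per category.
TRAINING = [
    ("sports", "football match goal team score league"),
    ("sports", "tennis player wins the final match"),
    ("tech", "open source search engine crawler index"),
    ("tech", "java software release fixes crawler bug"),
]

model = train(TRAINING)
print(classify(model, "the team won the football league"))
print(classify(model, "new release of the search crawler"))
```

With real crawl data you would need a much larger labeled set per category, and probably a proper library rather than this toy version.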
On 2/4/07, chee wu <[EMAIL PROTECTED]> wrote:
Hi,

I am trying to classify all the crawled web pages into predefined categories. Has anybody successfully implemented classification on top of Nutch? I did find some threads discussing this, but none of them is clear enough. Below are some possible solutions mentioned in past threads:

1. Using SVM-Light, but it seems to be a C-based program?
2. Can I do this with Carrot2?
3. Other open source software packages like Rainbow or IBM UIMA?

I want to research the three options above in more depth. Which one should I study first? Any other hints or experiences are also welcome!

Thanks
-Chee
-- www.babatu.com
