Hi,

I am trying to classify all the crawled web pages into predefined categories. Has anybody successfully implemented classification on top of Nutch? I found some threads discussing this, but none of them is clear enough. Below are some possible solutions mentioned in past threads:

1. Using SVM-Light, but it seems to be a C-based program?
2. Can I do this with Carrot2?
3. Other open source software packages like Rainbow or IBM UIMA?

I want to research the three options above in more depth. Which one should I study first? Any other hints or experiences are also welcome!

Thanks,
-Chee
