I have been dying to put a Nutch project online. This is my first one (it isn't really a product or anything, just a hobby project).
Basically, I have a site where users can post links, and I also process a list of RSS feeds for articles and add them to the database. I ran a crawl on that set of links and came up with this search piece. I was kind of unimpressed with the number of links in the Nutch crawl: the seed was about 15,000 links, and with Nutch I ended up with about 25,000 new links. I thought I could get a lot more. But one key piece is that Nutch crawled the content, which is awesome.

Try it out (remember, just a hobby site):
http://www.botspiritcompany.com/botlist/spring/search/global.html

--
Berlin Brown
http://www.newspiritcompany.com - newspirit technologies
