Hi everyone, I'm new to Nutch and I would appreciate some advice...
I want to use Nutch to crawl URLs and categorize them. I already have a running Hadoop cluster with Hadoop 1.0.3 and HBase 0.94.2, and I saw that Nutch 2.1 with Gora supports HBase as a backend. I would like to start by running a basic crawler with these installations on a standalone machine and, once I get the hang of it, deploy it on the cluster or set it up on another cluster.

Does anyone have good advice on installation and setup? Has anyone used Nutch for website categorization? Is version 2.1 compatible with HBase 0.94.x (or rather, is Gora compatible with it)?

Any help would be greatly appreciated.

Thanks,
Amit

