So which (stable) version of Nutch and which architecture would best fit my cluster?
Is there a quick (simplified) deployment if I already have a running cluster and I don't want to change its existing data or configuration?

Thanks.

On Fri, Feb 15, 2013 at 12:42 AM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Amit,
>
> On Thu, Feb 14, 2013 at 6:24 AM, Amit Sela <[email protected]> wrote:
>
> > I already have a running Hadoop cluster with Hadoop 1.0.3 and HBase 0.94.2,
> > and I saw that Nutch 2.1 with Gora supports HBase as backend.
>
> First thing's first. We cannot guarantee that Gora and subsequently Nutch
> will work with the newer HBase 0.94.X branch.
> You could try it out and get back to us, but the advice would be that it is
> most likely incompatible.
>
> > I would like to start by running a basic crawler with this installations on
> > a standalone machine and after I get the hang of it deploy it on the
> > cluster / set up on another cluster.
> >
> > Anyone has a good advise for installation / setup ?
>
> http://wiki.apache.org/nutch/#Other_Tutorial.28s.29
>
> > Anyone used Nutch for website categorization ?
>
> You can find some info on suggestions from this thread
> http://www.mail-archive.com/[email protected]/msg08066.html

