Hi Kaz,

Thanks for the heads up, please see
https://issues.apache.org/jira/browse/GORA-201 and
http://s.apache.org/WbG

Glad you got it sorted

Lewis

On Wed, Feb 6, 2013 at 7:42 PM, k4200 <k4...@kazu.tv> wrote:

> Hi Lewis,
>
> There seems to be a bug in the HBase 0.90.4 library, which comes with
> Nutch. I replaced hbase-0.90.4.jar with hbase-0.90.6-cdh3u5.jar and
> the problem was resolved.
>
> Regards,
> Kaz
>
> 2013/2/7 Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>:
> > Please let us know how you get on as we can add this to the 2.x errors
> > section of the wiki.
> > Thanks and good luck with the problem.
> > Lewis
> >
> > On Wed, Feb 6, 2013 at 4:45 PM, k4200 <k4...@kazu.tv> wrote:
> >
> >> Hi Lewis,
> >>
> >> Thanks for your reply.
> >>
> >> 2013/2/7 Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>:
> >> > Hi,
> >> >
> >> > On Wednesday, February 6, 2013, k4200 <k4...@kazu.tv> wrote:
> >> >> Q1. My first question is how to fix this issue? Do I need any other
> >> >> settings for Nutch to utilize an HBase cluster correctly?
> >> >
> >> > In short, I would personally shoot over the hbase lists. As you
> >> > mention, the ZK connections have been increased but you are still
> >> > experiencing similar results. Did you mention which HBase dist you
> >> > are using?
> >>
> >> Sorry. I should have mentioned this in the previous email.
> >> I use CDH3 Update 5 on CentOS 6.3, so HBase 0.90.6 with some patches.
> >> I'll ask the HBase list as well.
> >>
> >> >> Q2. The second question is about Nutch and Hadoop. I didn't install
> >> >> Hadoop Job Tracker and Task Tracker because HBase itself doesn't need
> >> >> them according to a SO question [2], but does Nutch need them for
> >> >> some types of jobs?
> >> >
> >> > No, running Hadoop in pseudo or distrib mode is not a prerequisite for
> >> > running Nutch successfully, but it can be extremely helpful, not only
> >> > because you get the web app navigation over job control. In the
> >> > instance that Nutch is being run without Hadoop JT and TT (e.g. local
> >> > mode) it simply relies upon the hadoop library pulled via Ivy.
> >>
> >> Thanks for the clarification. I'll run JT and TT.
> >>
> >> >
> >> >> I looked for some documents or diagrams that describe
> >> >> the overall architecture of Nutch with Gora and HBase, but couldn't
> >> >> find a good one.
> >> >>
> >> >
> >> > Mmm. What exactly are you looking for here? We have various articles
> >> > here [0] which explain quite a bit to get you started. Inevitably
> >> > there is no substitute better than looking into the code and
> >> > unfortunately we don't have any diagrams as such.
> >> > One resource which may be of interest (regarding the Gora API and
> >> > relevant layers) can be found in last year's GSoC project reports [1].
> >> > There are some Gora architecture class diagrams available there,
> >> > however I warn that (latterly) they introduce the Gora Web Services
> >> > API which was written into the current 0.3 development code.
> >>
> >> Thanks for the pointers. And, you're right. I'll look into the code,
> >> too.
> >>
> >> Thanks,
> >> Kaz
> >>
> >> >
> >> > hth somewhat though.
> >> > Lewis
> >> >
> >> > [0] http://wiki.apache.org/nutch/#Nutch_2.x
> >> > [1] http://svn.apache.org/repos/asf/gora/committers/reporting/
> >> >
> >
> > --
> > *Lewis*
>

--
*Lewis*