One more question: if I get an error like this, which configuration file should I look at first?

$ bin/nutch crawl urls -depth 3 -topN 5
Exception in thread "main" org.apache.gora.util.GoraException: java.lang.IllegalArgumentException: Not a host:port pair: �[email protected],34547,1370285799327
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
        at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
        at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.lang.IllegalArgumentException: Not a host:port pair: �[email protected],34547,1370285799327
        at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:60)
        at org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:63)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:354)
        at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
        at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:108)
        at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
        ... 8 more


On Mon, Jun 3, 2013 at 3:31 PM, Tejas Patil <[email protected]> wrote:

> HBase 0.90.6 is fine. I use that and didn't face any problems.
>
>
> On Mon, Jun 3, 2013 at 11:57 AM, Yves S. Garret <[email protected]> wrote:
>
> > Positive, I have HBase 0.90.6 running at the moment. Or would I need
> > to revert to an earlier build?
> >
> >
> > On Mon, Jun 3, 2013 at 3:47 AM, Ferdy Galema <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > The following line still looks like you're trying to connect to a newer
> > > version of HBase, instead of the supported 0.90.X. Are you absolutely sure
> > > you are running on 0.90? And not 0.92, 0.94, 0.95?
> > >
> > > GeneratorJob: org.apache.gora.util.GoraException:
> > > java.lang.IllegalArgumentException: Not a host:port pair:
> > > � [email protected],51982,1369874616660
> > >
> > >
> > > On Fri, May 31, 2013 at 12:57 AM, Lewis John Mcgibbney <[email protected]> wrote:
> > >
> > > > In all honesty I would make sure that you have a local and up-to-date
> > > > nutch-$version.job file generated and try it out in runtime/local before
> > > > using the job in /runtime/deploy on your cluster.
> > > > You will know if it is good to go or not.
> > > > When you are ready to deploy it to your cluster setup (e.g. once you're
> > > > satisfied that it works on a test/subset crawl), just make it available to
> > > > your Hadoop JobTracker classpath.
> > > >
> > > >
> > > > On Thu, May 30, 2013 at 3:48 PM, Yves S. Garret <[email protected]> wrote:
> > > >
> > > > > I have $HADOOP_INSTALL in my path, would this be enough Lewis? Or
> > > > > would I need to copy around some jar files?
> > > > >
> > > > >
> > > > > On Thu, May 30, 2013 at 6:35 PM, Lewis John Mcgibbney <[email protected]> wrote:
> > > > >
> > > > > > Make sure that everything is compiled and you are running from runtime or
> > > > > > with the Jar in hadoop
> > > > > >
> > > > > >
> > > > > > On Thu, May 30, 2013 at 3:00 PM, Yves S. Garret <[email protected]> wrote:
> > > > > >
> > > > > > > Here is my hbase-site.xml:
> > > > > > > http://bin.cakephp.org/view/2054577438
> > > > > > >
> > > > > > > I've set this property as well.
> > > > > > >
> > > > > > >
> > > > > > > On Thu, May 30, 2013 at 5:57 PM, Shah, Nishant <[email protected]> wrote:
> > > > > > >
> > > > > > > > What about your storage.data.store.class property in nutch-site.xml? I
> > > > > > > > think you have to change the value to use HBase. For me it is
> > > > > > > > org.apache.gora.hbase.store.HBaseStore.
> > > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Yves S. Garret [mailto:[email protected]]
> > > > > > > > Sent: Thursday, May 30, 2013 2:52 PM
> > > > > > > > To: [email protected]
> > > > > > > > Subject: Re: How to setup HBase as backend
> > > > > > > >
> > > > > > > > Yes. For the moment, for simplicity's sake, I have everything going to /tmp.
> > > > > > > >
> > > > > > > > hbase(main):004:0> scan 'test'
> > > > > > > > ROW                COLUMN+CELL
> > > > > > > >
> > > > > > > > 0 row(s) in 0.2370 seconds
> > > > > > > >
> > > > > > > > I _should_ have a table "webpage" being created when I run Nutch.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, May 30, 2013 at 5:23 PM, Shah, Nishant <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Is your HBase running?
> > > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Yves S. Garret [mailto:[email protected]]
> > > > > > > > > Sent: Thursday, May 30, 2013 2:18 PM
> > > > > > > > > To: [email protected]
> > > > > > > > > Subject: Re: How to setup HBase as backend
> > > > > > > > >
> > > > > > > > > Even when I do bin/nutch generate, this is what I get:
> > > > > > > > > http://bin.cakephp.org/view/1815127825
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, May 30, 2013 at 5:14 PM, Yves S. Garret <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Ok, similar issue:
> > > > > > > > > > http://bin.cakephp.org/view/180499048
> > > > > > > > > >
> > > > > > > > > > I've left the defaults for config as they were, except this is in
> > > > > > > > > > gora.properties in Apache Nutch.
> > > > > > > > > >
> > > > > > > > > > gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, May 29, 2013 at 7:40 PM, Lewis John Mcgibbney <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > >> Yes, as Tejas mentioned, he runs fine with 0.90.6. API changes make
> > > > > > > > > >> more recent HBase versions incompatible.
> > > > > > > > > >> We will be upgrading HBase API usage in Gora within the current
> > > > > > > > > >> development drive.
> > > > > > > > > >> Lewis
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> On Wed, May 29, 2013 at 4:36 PM, Yves S. Garret <[email protected]> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Would HBase 0.90.X and Nutch 2.1 work?
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > >> > On Wed, May 29, 2013 at 5:05 PM, Lewis John Mcgibbney <[email protected]> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > This is incompatible.
> > > > > > > > > >> > >
> > > > > > > > > >> > >
> > > > > > > > > >> > > On Wed, May 29, 2013 at 1:59 PM, Yves S. Garret <[email protected]> wrote:
> > > > > > > > > >> > >
> > > > > > > > > >> > > > Hi all, I'm using HBase 0.94.7 and Nutch 2.1.
> > > > > > > > > >> > > >
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > On Wed, May 29, 2013 at 4:55 PM, Adriana Farina <[email protected]> wrote:
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > > Hi Yves,
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > as Tejas said, your issue is almost certainly due to a compatibility
> > > > > > > > > >> > > > > problem between the version of Nutch and the one of HBase.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > I had the same problem and in my case it was due to the HBase version.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > I use Nutch 2.1 with HBase 0.90.4 and it works fine.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > 2013/5/29 Yves S. Garret <[email protected]>
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > > Hi, I'm trying to run Nutch this time around with HBase in the
> > > > > > > > > >> > > > > > background, as opposed to MySQL.
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > In the past, I followed this tutorial:
> > > > > > > > > >> > > > > > http://nlp.solutions.asia/?p=180
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > This was all well and good, but now that I have my HBase, I'd like
> > > > > > > > > >> > > > > > to use that. I left the configuration of Nutch as it was and
> > > > > > > > > >> > > > > > proceeded to crawl nutch.apache.org. I got this error:
> > > > > > > > > >> > > > > > http://bin.cakephp.org/view/1301117746
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > What am I doing wrong?
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > At the moment, I'm reading through this, trying to get my stack to
> > > > > > > > > >> > > > > > work, will write back if I make any progress:
> > > > > > > > > >> > > > > > http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > --
> > > > > > > > > >> > > > > Adriana Farina
> > > > > > > > > >> > >
> > > > > > > > > >> > > --
> > > > > > > > > >> > > *Lewis*
> > > > > > > > > >>
> > > > > > > > > >> --
> > > > > > > > > >> *Lewis*
> > > > > >
> > > > > > --
> > > > > > *Lewis*
> > > >
> > > > --
> > > > *Lewis*
> > >
> > > --
> > > *Ferdy Galema*
> > > Kalooga Development
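
For reference, here is a minimal sketch of the Nutch-side setting Nishant mentioned in the quoted thread, assuming a stock Nutch 2.1 layout where nutch-site.xml sits in the conf/ directory. The property name and value are the ones quoted in the thread; the comment wording is my own. The companion line in gora.properties is the one already quoted: gora.datastore.default=org.apache.gora.hbase.store.HBaseStore.

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml (file location is an assumption). -->
<configuration>
  <property>
    <!-- Selects the Gora backend that Nutch uses for its web table. -->
    <name>storage.data.store.class</name>
    <!-- Should match gora.datastore.default in gora.properties. -->
    <value>org.apache.gora.hbase.store.HBaseStore</value>
  </property>
</configuration>
```

Going by Ferdy's and Lewis's replies, these settings only select HBase as the backend. They would not resolve a mismatch between the HBase client bundled with Nutch and a newer HBase server, so the version of the HBase instance actually running is still worth checking alongside the config files.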

