Yes, with HBase. Here is the error:

13/03/29 16:33:29 INFO zookeeper.ZooKeeper: Session: 0x13d7770d67d005f closed
13/03/29 16:33:29 ERROR crawl.WebTableReader: WebTableReader: 
java.lang.NullPointerException
        at org.apache.gora.hbase.store.HBaseStore.addFields(HBaseStore.java:398)
        at org.apache.gora.hbase.store.HBaseStore.execute(HBaseStore.java:360)
        at org.apache.nutch.crawl.WebTableReader.read(WebTableReader.java:234)
        at org.apache.nutch.crawl.WebTableReader.run(WebTableReader.java:476)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.WebTableReader.main(WebTableReader.java:412)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

 
If I revert to the previous release, it works fine.

Thanks.
Alex.

 

 

-----Original Message-----
From: Lewis John Mcgibbney <[email protected]>
To: user <[email protected]>
Sent: Fri, Mar 29, 2013 4:30 pm
Subject: Re: error using generate in 2.x


Hi Alex,
With HBase also?
There 'was' a bug in the gora-cassandra module for this command + params;
however, I thought it had been addressed and therefore resolved it.
Lewis


On Fri, Mar 29, 2013 at 4:00 PM, <[email protected]> wrote:

> Hi,
>
> It seems that trunk has a few bugs. I found out that readdb -url urlname
> also gives errors.
>
> Thanks.
> Alex.
>
>
>
>
>
>
>
> -----Original Message-----
> From: kaveh minooie <[email protected]>
> To: user <[email protected]>
> Sent: Fri, Mar 29, 2013 1:53 pm
> Subject: Re: error using generate in 2.x
>
>
> Hi Lewis,
>
> The mapping file that I am using is the one that comes with Nutch, and I
> haven't touched it. This message in the log is caused by using
> -crawlId on the command line. For example, this log was the result of
> this command:
>
> bin/nutch generate -topN 1000 -crawlId t1
>
> which causes Nutch (or, I guess, technically Gora) to use the table
> name 't1_webpage'. Though I have to say that I don't understand the
> rationale behind the code generating a warning like this (I know
> it is not actually a warning, just that the way the message has been
> phrased makes it look like one) for something that should be a
> routine operation for someone like me, who is crawling (I mean hoping
> to, because it is not working right now) thousands of websites and needs
> to maintain multiple crawldbs (or their equivalent in Gora, webpage tables)
> for different groups of websites.
>
>
> Now, that being said, it has nothing to do with the problem that I am
> having. It is the same when I omit the -crawlId parameter (forcing it
> to use the default name webpage), and more importantly it is new. I
> haven't had this problem before; it just started happening 2 days ago
> when I pulled the latest commits to the 2.x branch.
>
>
> On 03/29/2013 09:50 AM, Lewis John Mcgibbney wrote:
> > Hi Kaveh,
> > Firstly, as logged below, Gora attempts to associate your HBase table
> > configuration with specified tables (from within gora-hbase-mapping.xml);
> > however, it seems that your case satisfies the condition "if
> > (!tableName.equals(tableNameFromMapping))", meaning that the table name is
> > not equal to the value of the table name attribute or that this value is
> > null.
> > This is allowed, but I am interested to find out what the mapping file
> > looks like... the entire file is not required, just the <class name="value"
> > snippet if this is possible.
> > I am not using the gora-hbase module and haven't ever seen anyone come
> > across this problem before.
> > Thanks
> > Lewis
> >
> > On Thursday, March 28, 2013, kaveh minooie <[email protected]> wrote:
> >
> >> 2013-03-28 11:06:25,158 INFO  store.HBaseStore - Keyclass and nameclass match but mismatching table names  mappingfile schema is 'webpage' vs actual schema 't1_webpage' , assuming they are the same.
> >
>
> --
> Kaveh Minooie
>
>
>


-- 
*Lewis*

 
