Update: the Hadoop and HBase jar versions were wrong. After updating the jars in
the 'lib/' directory and rebuilding, it now throws:

org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException:
org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column
family mtdt: does not exist in region crawl,,1264048608430 in table {NAME =>
'crawl', FAMILIES => [{NAME => 'bas', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'cnt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'cnttyp', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'fchi', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'fcht', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'hdrs', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'ilnk', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'modt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'mtdt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'olnk', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'prsstt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'prtstt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'prvfch', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'prvsig', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'repr', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'rtrs', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'scr', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'sig', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'stt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'ttl', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}, {NAME => 'txt', COMPRESSION => 'NONE', VERSIONS =>
'3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}]}
    at
org.apache.hadoop.hbase.regionserver.HRegion.checkFamily(HRegion.java:2381)
    at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1241)
    at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1208)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1834)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
    at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
    at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:995)
    at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers$2.doCall(HConnectionManager.java:1193)
    at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1115)
    at
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1201)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:605)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:470)
    at org.apache.nutch.crawl.Injector$UrlMapper.map(Injector.java:92)
    at org.apache.nutch.crawl.Injector$UrlMapper.map(Injector.java:62)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

It's weird, because the error message itself shows that the column family 'mtdt' does exist.
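One way to double-check what the region server actually sees (as opposed to what the exception dump prints) is to ask the HBase shell for the live schema. This is just a sketch, assuming `hbase` is on the PATH and the cluster from the log above is running; 'mtdt' should appear in the FAMILIES list if the schema the server holds matches the one in the error message.

```shell
# Dump the live schema of the 'crawl' table non-interactively.
# If 'mtdt' is missing here, the table was created against a different
# schema than the one the Injector expects (e.g. before the jar swap).
echo "describe 'crawl'" | hbase shell

# Grep for the family in question to make the check explicit.
echo "describe 'crawl'" | hbase shell | grep "NAME => 'mtdt'"
```

If the family really is present, a stale client-side region cache after swapping jars is another thing worth ruling out, e.g. by disabling and re-enabling the table or restarting the region server before re-running the inject.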


On Tue, Jan 12, 2010 at 3:43 PM, xiao yang <yangxiao9...@gmail.com> wrote:

> Hi, Doğacan
>
> I have checked out Nutchbase from
> http://svn.apache.org/repos/asf/lucene/nutch/branches/nutchbase/
> My Hbase version is 0.20.2.
>
> createtable succeeded, but inject doesn't work.
>
> $bin/nutch createtable crawl
>
> Here is the status of Hbase:
> hbase(main):014:0> list
> 10/01/12 15:37:43 DEBUG client.HConnectionManager$TableServers: Cache hit
> for row <> in tableName .META.: location server 10.214.10.146:34592,
> location region name .META.,,1
> crawl
>
> 1 row(s) in 0.0110 seconds
>
> $bin/nutch inject crawl urls
> Injector: starting
> Injector: urlDir: urls
> Injecting new users failed!
>
> Here is the log:
>
> 2010-01-12 15:38:57,515 WARN  mapred.LocalJobRunner - job_local_0001
> java.lang.reflect.UndeclaredThrowableException
>     at $Proxy0.getRegionInfo(Unknown Source)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:874)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:515)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:565)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:524)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:565)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:528)
>     at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
>     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:123)
>     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:101)
>     at org.apache.nutch.crawl.Injector$UrlMapper.setup(Injector.java:102)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:518)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>     at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
> java.lang.NullPointerException
>     at java.lang.Class.searchMethods(Class.java:2646)
>     at java.lang.Class.getMethod0(Class.java:2670)
>     at java.lang.Class.getMethod(Class.java:1603)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:643)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:720)
>     at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:329)
>     ... 17 more
> 2010-01-12 15:38:57,806 WARN  crawl.Injector - Injecting new users failed!
>
> What's the problem?
> Thanks!
> Xiao
>
> 2009/8/17 Doğacan Güney (JIRA) <j...@apache.org>:
>
> >
> >    [
> https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743919#action_12743919]
> >
> > Doğacan Güney commented on NUTCH-650:
> > -------------------------------------
> >
> > I just committed code to branch nutchbase. The scoring API did not turn
> out as clean as I expected but I decided to put in what I have. Also, I made
> some changes so that web UI also works.
> >
> > I am leaving this issue open because I will add documentation tomorrow.
> Meanwhile,
> >
> > To download:
> >
> >  svn co http://svn.apache.org/repos/asf/lucene/nutch/branches/nutchbase
> >
> > Usage:
> >
> > After starting hbase 0.20 (checkout rev. 804408 from hbase branch 0.20),
> create a webtable with
> >
> >  bin/nutch createtable webtable
> >
> > After that, usage is similar.
> >
> >  bin/nutch inject webtable url_dir # inject urls
> >
> > for as many cycles as you want;
> >    bin/nutch generate webtable #-topN N works
> >    bin/nutch fetch webtable # -threads N works
> >    bin/nutch parse webtable
> >    bin/nutch updatetable webtable
> >
> >  bin/nutch index <index> webtable
> > or
> >  bin/nutch solrindex <solr url> webtable
> >
> > To use solr, use this schema file
> > http://www.ceng.metu.edu.tr/~e1345172/schema.xml
> >
> >
> > Again, a note of warning: This is extremely new code. I hope people will
> test and use it but there is no guarantee that it will work :)
> >
> >
> >> Hbase Integration
> >> -----------------
> >>
> >>                 Key: NUTCH-650
> >>                 URL: https://issues.apache.org/jira/browse/NUTCH-650
> >>             Project: Nutch
> >>          Issue Type: New Feature
> >>    Affects Versions: 1.0.0
> >>            Reporter: Doğacan Güney
> >>            Assignee: Doğacan Güney
> >>             Fix For: 1.1
> >>
> >>         Attachments: hbase-integration_v1.patch, hbase_v2.patch,
> malformedurl.patch, meta.patch, meta2.patch, nofollow-hbase.patch,
> nutch-habase.patch, searching.diff, slash.patch
> >>
> >>
> >> This issue will track nutch/hbase integration
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
> >
> >
>
>
