Hello,
I am trying to run my first crawl with Nutch 2.2.1, using HBase 0.90.6 as
the datastore. The crawl fails with the error below. Please help!
2013-09-06 10:01:51,527 INFO crawl.InjectorJob - InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
2013-09-06 10:01:51,600 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-09-06 10:01:51,840 WARN snappy.LoadSnappy - Snappy native library not loaded
2013-09-06 10:01:53,870 INFO mapreduce.GoraRecordWriter - gora.buffer.write.limit = 10000
2013-09-06 10:01:54,515 INFO regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
2013-09-06 10:01:54,659 WARN mapred.FileOutputCommitter - Output path is null in cleanup
2013-09-06 10:01:54,964 INFO crawl.InjectorJob - InjectorJob: total number of urls rejected by filters: 0
2013-09-06 10:01:54,965 INFO crawl.InjectorJob - InjectorJob: total number of urls injected after normalization and filtering: 1
2013-09-06 10:01:54,974 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2013-09-06 10:01:54,975 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2013-09-06 10:01:54,975 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2013-09-06 10:01:57,009 INFO mapreduce.GoraRecordReader - gora.buffer.read.limit = 10000
2013-09-06 10:01:57,800 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2013-09-06 10:01:57,800 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2013-09-06 10:01:57,800 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2013-09-06 10:01:57,925 WARN mapred.FileOutputCommitter - Output path is null in cleanup
2013-09-06 10:01:57,927 WARN mapred.LocalJobRunner - job_local627799575_0002
java.lang.Exception: java.lang.NoSuchMethodError: org.apache.gora.hbase.store.HBaseStore.newPersistent()Lorg/apache/gora/persistency/impl/PersistentBase;
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NoSuchMethodError: org.apache.gora.hbase.store.HBaseStore.newPersistent()Lorg/apache/gora/persistency/impl/PersistentBase;
    at org.apache.gora.hbase.store.HBaseStore.newInstance(HBaseStore.java:519)
    at org.apache.gora.hbase.query.HBaseResult.readNext(HBaseResult.java:49)
    at org.apache.gora.hbase.query.HBaseScannerResult.nextInner(HBaseScannerResult.java:54)
    at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
    at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:111)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:531)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)