Hello, everybody.

I am seeing rather strange behavior with Nutch 2.3: even the initial Inject job
fails with the exception shown in the log below.
The whole Hadoop infrastructure is up and running:
root@5e7ca0b0c19d:~# jps
2810 NutchServer
1071 SecondaryNameNode
99 QuorumPeerMain
1694 ResourceManager
4598 Jps
795 NameNode
2243 HMaster
2376 HRegionServer
2669 ThriftServer
1789 NodeManager
913 DataNode

Nutch itself appears to be configured correctly, because with the same
configuration I was previously able to crawl some pages and see the data in Solr.
If I understand correctly, one of the goals of InjectorJob is to create the
'webpage' table in HBase. However, the HBase shell shows 0 tables.

Do you have any idea what is wrong here and what should be done to fix it?

2015-04-29 13:23:58,978 INFO  crawl.InjectorJob - InjectorJob: starting at 
2015-04-29 13:23:58
2015-04-29 13:23:58,979 INFO  crawl.InjectorJob - InjectorJob: Injecting 
urlDir: ram.txt
2015-04-29 13:24:01,434 ERROR store.HBaseStore - 
org.apache.hadoop.hbase.TableExistsException: webpage
2015-04-29 13:24:01,434 ERROR store.HBaseStore - 
[Ljava.lang.StackTraceElement;@6a19905e
2015-04-29 13:24:01,454 INFO  crawl.InjectorJob - InjectorJob: Using class 
org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
2015-04-29 13:24:01,520 WARN  util.NativeCodeLoader - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2015-04-29 13:24:01,607 WARN  snappy.LoadSnappy - Snappy native library not 
loaded
2015-04-29 13:24:02,501 ERROR store.HBaseStore - 
org.apache.hadoop.hbase.TableExistsException: webpage
2015-04-29 13:24:02,501 ERROR store.HBaseStore - 
[Ljava.lang.StackTraceElement;@523b3317
2015-04-29 13:24:02,813 INFO  regex.RegexURLNormalizer - can't find rules for 
scope 'inject', using default
2015-04-29 13:24:02,986 WARN  
client.HConnectionManager$HConnectionImplementation - Encountered problems when 
prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
table: webpage, row=webpage,,99999999999999
        at 
org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:151)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1059)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1121)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
        at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:251)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:155)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:129)
        at 
org.apache.gora.hbase.store.HBaseTableConnection$1.<init>(HBaseTableConnection.java:87)
        at 
org.apache.gora.hbase.store.HBaseTableConnection.getTable(HBaseTableConnection.java:87)
        at 
org.apache.gora.hbase.store.HBaseTableConnection.put(HBaseTableConnection.java:186)
        at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:260)
        at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:79)
        at 
org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
        at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
        at 
org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at 
org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:188)
        at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:82)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-04-29 13:24:02,996 ERROR store.HBaseStore - webpage
2015-04-29 13:24:02,996 ERROR store.HBaseStore - 
[Ljava.lang.StackTraceElement;@f757c05
2015-04-29 13:24:03,009 WARN  mapred.FileOutputCommitter - Output path is null 
in cleanup
2015-04-29 13:24:03,073 INFO  crawl.InjectorJob - InjectorJob: total number of 
urls rejected by filters: 0
2015-04-29 13:24:03,073 INFO  crawl.InjectorJob - InjectorJob: total number of 
urls injected after normalization and filtering: 1
2015-04-29 13:24:03,075 INFO  crawl.InjectorJob - Injector: finished at 
2015-04-29 13:24:03, elapsed: 00:00:04
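
What confuses me in the log above is that HBase reports both
TableExistsException and TableNotFoundException for the same 'webpage' table.
To make that easier to see, here is a minimal sketch that counts the exception
class names in a log excerpt. The function name summarize_exceptions, the
regular expression, and the shortened SAMPLE text are only illustrative
assumptions; none of this is part of Nutch, Gora, or HBase.

```python
import re
from collections import Counter

# Matches fully qualified Java class names ending in "Exception" or "Error",
# e.g. org.apache.hadoop.hbase.TableExistsException
EXCEPTION_RE = re.compile(r"\b((?:[a-z_]\w*\.)+[A-Z]\w*(?:Exception|Error))\b")


def summarize_exceptions(log_text):
    """Count how often each exception class name appears in the log text."""
    return Counter(m.group(1) for m in EXCEPTION_RE.finditer(log_text))


# Shortened excerpt of the log above (illustrative only).
SAMPLE = "\n".join([
    "2015-04-29 13:24:01,434 ERROR store.HBaseStore -",
    "org.apache.hadoop.hbase.TableExistsException: webpage",
    "2015-04-29 13:24:02,501 ERROR store.HBaseStore -",
    "org.apache.hadoop.hbase.TableExistsException: webpage",
    "org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for",
    "table: webpage, row=webpage,,99999999999999",
])

if __name__ == "__main__":
    # Print one line per exception class, most frequent first.
    for name, count in summarize_exceptions(SAMPLE).most_common():
        print(count, name)
```

The same function can be pointed at a complete log file by reading it into a
string and passing that string in place of SAMPLE.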

Alexander Baranov
