Hi, I have successfully set up Nutch 2.x with hbase-0.90.6, and my jobs are running fine. But there is one issue I need your help with.
Earlier I was using Cassandra with Nutch 2.x, and the data from all my jobs went to the 'webpage' keyspace. With hbase-0.90.6, however, I can see that two tables are created: 'webpage', which always has 0 rows, and 'crawlId_webpage', which has some data.

When I run my solrIndexJob, no documents are added. I think this is because there is no parsed text in the 'crawlId_webpage' table for my crawled pages. I can also verify this in my ParseFilter plugin: when I do Utf8 text = page.getText(); the text is always null, and that's why I think solrIndexJob is not inserting any documents into Solr.

So what should I do here? Why is there no text in the HBase table? And why are two tables created, 'webpage' and 'crawlId_webpage'?

Thanks, guys, for the help and support.
Tony

