Hi all,
I'm using Nutch 2.3 in order to crawl some websites and index them into
ElasticSearch 1.3.
My problem is with the storage backend. I have tested:
- HBase 0.94.14: Works well. However, my fully distributed cluster
uses HBase 1.0.0, which I can't use because Nutch is not compatible with that
version.
- Accumulo 1.6 on Cloudera: Nutch reports no errors, but nothing is written
to the database, so I'm not sure it actually works.
- Cassandra 2.0.2: The keyspace and tables are created correctly, but the
crawl does not succeed, and I think the problem is on the Nutch side.
So, what can I do now?
I have a fully distributed cluster running Hadoop 2.6.0 with HBase 1.0.0 and
Accumulo 1.6. I am not sure that Nutch 2.3 works in this environment,
so I would appreciate any advice you may have.
Thank you for reading.
Sincerely