We are using Nutch to crawl web sites, and it stores the documents in HBase.
Nutch uses SolrJ to send documents to be indexed. We also have Hadoop in our
ecosystem. I think there should be an implementation in SolrJ that sends
documents (via CloudSolrServer or something similar) as MapReduce jobs.
Is there any implementation of this, or is it not a good idea?
