We are using Nutch to crawl web sites, and it stores documents in HBase.
Nutch uses SolrJ to send documents to Solr for indexing. We have Hadoop in
our ecosystem as well. I think there should be an implementation in SolrJ
that sends documents (via CloudSolrServer or something like that) as
MapReduce jobs.
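[For context, the pattern being proposed — reduce tasks collecting crawled documents into batches and pushing each batch to SolrCloud — can be sketched without a new SolrJ API. The sketch below is plain Java and deliberately stubs out SolrJ: the batch size and document strings are illustrative assumptions, and the comment marks where a real SolrJ 4.x CloudSolrServer.add(...) call would go in an actual reducer.]

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: reduce-side bulk indexing, as one might wire it into a Hadoop reducer.
// The String documents are stand-ins for SolrJ's SolrInputDocument objects.
public class BulkIndexSketch {

    // Split the documents a reducer has collected into fixed-size batches,
    // so each batch can be sent to SolrCloud with a single add(batch) call
    // instead of one round trip per document.
    static List<List<String>> toBatches(List<String> docs, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                    docs.subList(i, Math.min(i + batchSize, docs.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 25; i++) docs.add("doc-" + i);

        for (List<String> batch : toBatches(docs, 10)) {
            // With SolrJ 4.x, this is where the reducer would do roughly:
            //   CloudSolrServer server = new CloudSolrServer(zkHost); // zkHost is an assumption
            //   server.setDefaultCollection("collection1");
            //   server.add(convertToSolrInputDocuments(batch)); // hypothetical helper
            System.out.println("would index batch of " + batch.size());
        }
    }
}
```

[Batching is the main thing a MapReduce indexing job buys here; whether that justifies the operational cost of Hadoop is exactly the point debated below.]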
Why is it better to require another large software system (Hadoop), when it
works fine without it?
That just sounds like more stuff to configure, misconfigure, and cause problems
with indexing.
wunder
On Jul 5, 2013, at 4:48 AM, Furkan KAMACI wrote:
We are using Nutch to crawl web sites and
… to argue for Solr to be made simpler and easier to use!
-- Jack Krupansky
-----Original Message-----
From: Walter Underwood
Sent: Friday, July 05, 2013 12:11 PM
To: solr-user@lucene.apache.org
Subject: Re: Sending Documents via SolrServer as MapReduce Jobs at Solrj
Why is it better to require another large software system (Hadoop), when it
works fine without it?
That just sounds like more stuff to configure, misconfigure, and cause
problems with indexing.