Re: Solr and hadoop

2014-09-25 Thread Michael Della Bitta
Yes, there's SolrInputDocumentWritable and MapReduceIndexerTool, plus the Morphline stuff (check out https://github.com/markrmiller/solr-map-reduce-example).

Re: Solr and hadoop

2014-09-25 Thread Tom Chen
I'm aware of the MapReduceIndexerTool (MRIT). That might solve the indexing part -- the OutputFormat part. But what I asked about is making Solr index data available to Hadoop MapReduce -- using Solr as a data store, like what HDFS provides. With a Solr InputFormat, we can make

Re: Solr and hadoop

2014-09-25 Thread Joel Bernstein
Hi Tom, I am not aware of a Solr InputFormat implementation yet. The /export handler, which outputs entire sorted result sets, was designed to support these types of bulk export operations efficiently. I think a Solr InputFormat would be an excellent project to begin working on. Also SOLR-6526 is
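As a rough illustration of the bulk export mentioned above, the sketch below builds a request URL for the /export handler. The helper name `build_export_url`, the host, the collection name, and the field names are all hypothetical; the handler expects `q`, `sort`, and `fl` parameters, and the sort and export fields generally need docValues enabled. This is only a sketch of the request shape, not a Solr InputFormat.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, sort, fields):
    """Build a URL for Solr's /export handler (hypothetical helper).

    base_url   -- e.g. "http://localhost:8983/solr" (assumed host and port)
    collection -- collection or core name (assumed)
    query      -- value for the q parameter
    sort       -- value for the sort parameter, e.g. "id asc"
    fields     -- list of field names to export via the fl parameter
    """
    params = {"q": query, "sort": sort, "fl": ",".join(fields)}
    return f"{base_url.rstrip('/')}/{collection}/export?{urlencode(params)}"


# Example values are placeholders, not taken from the thread.
url = build_export_url(
    "http://localhost:8983/solr", "collection1", "*:*", "id asc", ["id", "price_i"]
)
print(url)
```

A MapReduce input split could, in principle, issue one such request per shard, but that design is an assumption and not something the thread confirms.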

Re: Solr with Hadoop

2013-07-18 Thread Matt Lieber
Rajesh, If you need integration between Solr and Hadoop or NoSQL, I would recommend using a commercial distribution. I think most are free to use as long as you don't require support. I inquired about the Cloudera Search capability, but it seems that so far it is just preliminary:

RE: Solr with Hadoop

2013-07-18 Thread Saikat Kanjilal
have more questions. Regards From: mlie...@impetus.com To: solr-user@lucene.apache.org Subject: Re: Solr with Hadoop Date: Thu, 18 Jul 2013 15:41:36 + Rajesh, If you need integration between Solr and Hadoop or NoSQL, I would recommend using a commercial distribution. I

Re: solr with hadoop

2010-07-06 Thread Jason Rutherglen
If you do distributed indexing correctly, what about updating the documents and what about replicating them correctly? Yes, you can do that and it'll work great. On Mon, Jul 5, 2010 at 7:42 AM, MitchK mitc...@web.de wrote: I need to revive this discussion... If you do distributed indexing

Re: solr with hadoop

2010-07-05 Thread MitchK
I need to revive this discussion... If you do distributed indexing correctly, what about updating the documents and what about replicating them correctly? Does this work? Or wasn't this an issue? Kind regards - Mitch -- View this message in context:

Re: solr with hadoop

2010-06-23 Thread Otis Gospodnetic
From: Jon Baer jonb...@gmail.com To: solr-user@lucene.apache.org Sent: Tue, June 22, 2010 12:47:14 PM Subject: Re: solr with hadoop I was playing around w/ Sqoop the other day, it's a simple Cloudera tool for imports (mysql - hdfs) @ http://www.cloudera.com/developers/downloads/sqoop

Re: solr with hadoop

2010-06-22 Thread Neeb
Hi, We currently have a master-slave setup for solr with two slave servers. We are using Solrj (stream-update-solr-server) to index master slave, which takes 6 hours to index around 15 million documents. I would like to explore hadoop, in particular for the indexing job, using a mapreduce approach.

Re: solr with hadoop

2010-06-22 Thread Marc Sturlese
I think a good solution could be to use hadoop with SOLR-1301 to build solr shards and then use solr distributed search against these shards (you will have to copy to local from HDFS to search against them) -- View this message in context:
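The distributed search step suggested above can be sketched as a single query that lists the shard cores in Solr's `shards` request parameter. The helper name `build_distributed_query_url` and all host and core names are hypothetical; this assumes the shard indexes built in Hadoop have already been copied from HDFS to local disk and loaded into running Solr cores.

```python
from urllib.parse import urlencode


def build_distributed_query_url(coordinator_core_url, shard_cores, query):
    """Build a /select URL that fans a query out over several shards.

    coordinator_core_url -- core that receives the request (assumed URL)
    shard_cores          -- list of "host:port/solr/core" strings (assumed)
    query                -- value for the q parameter
    """
    params = {"q": query, "shards": ",".join(shard_cores)}
    return f"{coordinator_core_url.rstrip('/')}/select?{urlencode(params)}"


# Placeholder hosts and core names, not taken from the thread.
shards = ["host1:8983/solr/shard1", "host2:8983/solr/shard2"]
query_url = build_distributed_query_url(
    "http://host1:8983/solr/shard1", shards, "title:hadoop"
)
print(query_url)
```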

Re: solr with hadoop

2010-06-22 Thread MitchK
Message From: Stu Hood stuh...@webmail.us To: solr-user@lucene.apache.org Sent: Monday, January 7, 2008 7:14:20 PM Subject: Re: solr with hadoop As Mike suggested, we use Hadoop to organize our data en route to Solr. Hadoop allows us to load balance the indexing stage, and then we use

Re: solr with hadoop

2010-06-22 Thread Jon Baer
IndexWriter.addAllIndexes or do you do that outside Hadoop? Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Stu Hood stuh...@webmail.us To: solr-user@lucene.apache.org Sent: Monday, January 7, 2008 7:14:20 PM Subject: Re: solr with hadoop

Re: solr with hadoop

2008-01-07 Thread Stu Hood
Klaas [EMAIL PROTECTED] Sent: Friday, January 4, 2008 3:04pm To: solr-user@lucene.apache.org Subject: Re: solr with hadoop On 4-Jan-08, at 11:37 AM, Evgeniy Strokin wrote: I have a huge index base (about 110 million documents, 100 fields each). But the size of the index base is reasonable, it's about

Re: solr with hadoop

2008-01-07 Thread Otis Gospodnetic
-user@lucene.apache.org Sent: Monday, January 7, 2008 7:14:20 PM Subject: Re: solr with hadoop As Mike suggested, we use Hadoop to organize our data en route to Solr. Hadoop allows us to load balance the indexing stage, and then we use the raw Lucene IndexWriter.addAllIndexes method to merge
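One way to picture the load-balanced indexing stage described above is to split documents into partitions, with each partition indexed by a separate task and the resulting indexes merged afterwards. The sketch below shows only the partitioning step. Hash partitioning on a document `id` with CRC32 is an assumption made for illustration, not something the poster describes, and the function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(doc_id, num_partitions):
    """Map a document id to a partition number in [0, num_partitions)."""
    # CRC32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(doc_id.encode("utf-8")) % num_partitions


def partition_documents(docs, num_partitions):
    """Group documents by partition so each group can be indexed separately."""
    parts = defaultdict(list)
    for doc in docs:
        parts[partition_for(doc["id"], num_partitions)].append(doc)
    return dict(parts)


# Toy documents; a real job would read these from its input source.
docs = [{"id": f"doc-{i}", "title": f"Title {i}"} for i in range(10)]
partitions = partition_documents(docs, 3)
for number, group in sorted(partitions.items()):
    print(number, [d["id"] for d in group])
```

Each partition would then be written to its own index by a separate task; merging those indexes is a later step that this sketch does not cover.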

Re: solr with hadoop

2008-01-04 Thread Mike Klaas
On 4-Jan-08, at 11:37 AM, Evgeniy Strokin wrote: I have a huge index base (about 110 million documents, 100 fields each). But the size of the index base is reasonable, it's about 70 GB. All I need is to increase performance, since some queries, which match a large number of documents, are running

Re: solr with hadoop

2008-01-04 Thread Ryan McKinley
Mike Klaas wrote: On 4-Jan-08, at 11:37 AM, Evgeniy Strokin wrote: I have a huge index base (about 110 million documents, 100 fields each). But the size of the index base is reasonable, it's about 70 GB. All I need is to increase performance, since some queries, which match a large number of documents,