I have been researching how to build a Lucene/Solr index using Hadoop, but there is not much information around, so it would be great to see some code! (I am having problems especially with duplicate detection.) Thanks

Shalin Shekhar Mangar wrote:
>
> On Mon, Mar 2, 2009 at 11:24 PM, Ning Li <[email protected]> wrote:
>
>> Hi,
>>
>> I wonder if there is interest in a contrib module that builds Solr
>> index using Hadoop MapReduce?
>>
>
> Absolutely!
>
>> It is different from the Solr support in Nutch. The Solr support in
>> Nutch sends a document to a Solr server in a reduce task. Here, I aim
>> at building/updating Solr index within map/reduce tasks. Also, it
>> achieves better parallelism when the number of map tasks is greater
>> than the number of reduce tasks, which is usually the case.
>>
>> I worked out a very simple initial version. But I want to check if
>> there is any interest before proceeding. If so, I'll open a Jira
>> issue.
>>
>
> +1
>
> Please do. It'd be great to see this in Solr.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
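
On the duplicate detection point, here is a minimal sketch of one common approach, not the contrib module proposed above and not tied to any Hadoop or Solr API. It assumes each document carries a unique id (playing the role of Solr's uniqueKey) and a timestamp, groups documents by id the way a MapReduce shuffle would, and keeps only the newest one per id before anything is indexed. The class DedupSketch, the Doc type, and all method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {

    // Hypothetical document type; "id" stands in for the index's unique key.
    public static final class Doc {
        public final String id;
        public final long timestamp;
        public final String body;

        public Doc(String id, long timestamp, String body) {
            this.id = id;
            this.timestamp = timestamp;
            this.body = body;
        }
    }

    // "Map" side: key every document by its id so that duplicates end up
    // in the same group. In a real job the framework's shuffle does this.
    public static Map<String, List<Doc>> groupById(List<Doc> input) {
        Map<String, List<Doc>> groups = new LinkedHashMap<>();
        for (Doc d : input) {
            groups.computeIfAbsent(d.id, k -> new ArrayList<>()).add(d);
        }
        return groups;
    }

    // "Reduce" side: for each id, keep only the document with the highest
    // timestamp. The survivors are what would be added to the index.
    public static List<Doc> keepNewest(Map<String, List<Doc>> groups) {
        List<Doc> survivors = new ArrayList<>();
        for (List<Doc> group : groups.values()) {
            Doc newest = group.get(0);
            for (Doc d : group) {
                if (d.timestamp > newest.timestamp) {
                    newest = d;
                }
            }
            survivors.add(newest);
        }
        return survivors;
    }

    public static List<Doc> dedup(List<Doc> input) {
        return keepNewest(groupById(input));
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
            new Doc("a", 1L, "first version"),
            new Doc("b", 1L, "only version"),
            new Doc("a", 2L, "second version"));
        for (Doc d : dedup(docs)) {
            System.out.println(d.id + " -> " + d.body);
        }
    }
}
```

If duplicates are defined by content rather than by id, the same grouping works with a hash of the normalized body as the key. Note that this only removes duplicates within one batch; documents already present in an existing index would still need a delete-by-id or an update step.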
