[jira] Commented: (SOLR-1045) Build Solr index using Hadoop MapReduce

Lance Norskog (JIRA) Mon, 24 Aug 2009 14:04:25 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747070#action_12747070
 ]


Lance Norskog commented on SOLR-1045:
-------------------------------------

Map/Reduce would also be useful in the DataImportHandler (DIH). We're talking 
about parallelizing analysis stacks that require a lot of CPU. I would rather 
push this sort of work (Solr Cell, for example) out into the DIH. The DIH 
declaration language could offer something like Ant's parallelization 
directives.
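
To illustrate what parallelizing a CPU-heavy analysis step could look like 
inside a single JVM, here is a minimal sketch using a plain Java thread pool. 
It is not DIH or Solr Cell code: the class ParallelAnalysisSketch and the 
methods analyze() and analyzeAll() are hypothetical names, and analyze() is 
only a trivial stand-in for a real analysis stack.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: not part of Solr or the DataImportHandler.
class ParallelAnalysisSketch {

    // Stand-in for a CPU-heavy analysis stack (assumption: the real work
    // would be text extraction or tokenization, not a simple lowercase).
    static String analyze(String rawDocument) {
        return rawDocument.trim().toLowerCase(Locale.ROOT);
    }

    // Runs analyze() over all documents on a fixed-size thread pool and
    // returns the results in the same order as the input list.
    static List<String> analyzeAll(List<String> rawDocuments, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (final String doc : rawDocuments) {
                Callable<String> task = () -> analyze(doc);
                pending.add(pool.submit(task));
            }
            List<String> analyzed = new ArrayList<>();
            for (Future<String> result : pending) {
                // get() blocks until that document's analysis has finished.
                analyzed.add(result.get());
            }
            return analyzed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        List<String> docs = Arrays.asList("  Solr Cell ", "Hadoop MapReduce");
        System.out.println(analyzeAll(docs, threads));
    }
}
```

A declarative form in the DIH configuration would presumably hide this pool 
management behind an attribute or element, much as Ant hides it behind its 
parallel task.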

At this level of multi-threaded sophistication, Solr really wants to be an OSGi 
application instead of a custom-built mini application server.

> Build Solr index using Hadoop MapReduce
> ---------------------------------------
>
>                 Key: SOLR-1045
>                 URL: https://issues.apache.org/jira/browse/SOLR-1045
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ning Li
>         Attachments: SOLR-1045.0.patch
>
>
> The goal is a contrib module that builds Solr index using Hadoop MapReduce.
> It is different from the Solr support in Nutch. The Solr support in Nutch 
> sends a document to a Solr server in a reduce task. Here, the goal is to 
> build/update Solr index within map/reduce tasks. Also, it achieves better 
> parallelism when the number of map tasks is greater than the number of reduce 
> tasks, which is usually the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
