[jira] Commented: (SOLR-1045) Build Solr index using Hadoop MapReduce

Ning Li (JIRA) Wed, 04 Mar 2009 07:56:20 -0800

    [ https://issues.apache.org/jira/browse/SOLR-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678763#action_12678763 ]


Ning Li commented on SOLR-1045:
-------------------------------

Shalin and Yonik, thanks for the comments on the two features. But what is a 
Solr index? I thought it was everything in the data directory, not just the 
Lucene index in the data/index directory, no? If that's the case:
  - On writing a Solr index to a RAM directory: I'm aware of the directory 
factory, but it only covers the directory of the Lucene index.
  - On merging multiple Solr indexes: besides merging the Lucene indexes, it 
also means somehow "merging" the other data in the data directory (e.g. 
"merging" by rebuilding the spell check index).

Am I correct?
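
To make the distinction above concrete, here is a minimal Java sketch (not 
taken from the attached patch) that lists what sits in a data directory besides 
the Lucene index. Only the data/index location comes from the discussion above; 
the class name, the nonLuceneEntries helper, and the "spellchecker" directory 
name are assumptions made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SolrDataDirSketch {
    // From the discussion: the Lucene index lives in data/index.
    static final String LUCENE_INDEX_DIR = "index";

    /**
     * Returns the sorted names of all entries directly under dataDir
     * except the Lucene index directory. These are the pieces a
     * Lucene-only merge or RAM directory would not cover.
     */
    static List<String> nonLuceneEntries(Path dataDir) throws IOException {
        List<String> others = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dataDir)) {
            for (Path child : children) {
                String name = child.getFileName().toString();
                if (!name.equals(LUCENE_INDEX_DIR)) {
                    others.add(name);
                }
            }
        }
        Collections.sort(others);
        return others;
    }

    public static void main(String[] args) throws IOException {
        // Throwaway layout: data/index plus a hypothetical data/spellchecker.
        Path dataDir = Files.createTempDirectory("solr-data");
        Files.createDirectory(dataDir.resolve("index"));
        Files.createDirectory(dataDir.resolve("spellchecker"));
        System.out.println(nonLuceneEntries(dataDir));
    }
}
```

Anything this helper returns would need its own handling when a "Solr index" is 
written or merged, for example by rebuilding it from the merged Lucene index.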

> Build Solr index using Hadoop MapReduce
> ---------------------------------------
>
>                 Key: SOLR-1045
>                 URL: https://issues.apache.org/jira/browse/SOLR-1045
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ning Li
>         Attachments: SOLR-1045.0.patch
>
>
> The goal is a contrib module that builds a Solr index using Hadoop MapReduce.
> It is different from the Solr support in Nutch. The Solr support in Nutch 
> sends a document to a Solr server in a reduce task. Here, the goal is to 
> build/update the Solr index within map/reduce tasks. This approach also 
> achieves better parallelism when the number of map tasks is greater than the 
> number of reduce tasks, which is usually the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
