[ 
https://issues.apache.org/jira/browse/LUCENE-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073462#comment-13073462
 ] 

Eks Dev commented on LUCENE-1879:
---------------------------------

I guess the user mentioned in the comment above was me. I am commenting here 
just to add an interesting use case that this issue would solve perfectly.

Imagine a Solr master-slave setup where each full document contains CONTENT 
and ID fields and the collection holds 200 million+ documents. On the master, 
the ID field must be indexed in order to process delete/update commands. On 
the slaves, we do not need lookups on ID and would like to keep the 
TermsDictionary small rather than blowing it up with 200 million+ unique ID 
terms (ouch, that is a lot compared to the 5 million unique terms in CONTENT, 
with or without pulsing).

With this issue, this could be achieved natively by modifying the Solr 
UpdateHandler not to transfer the "ID-Index" to slaves at all.

There are other ways to fix it, but this would be the best. (I am currently 
investigating an option to transfer the full index on update but filter out 
the TermsDictionary at the IndexReader level, so that it remains on disk but 
this part never gets accessed on slaves. I do not know yet whether this is 
possible in general, e.g. when an FST-based term dictionary is already built; 
with a prefix-compressed TermDict it would be doable.)
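
To make the reader-level filtering idea concrete, here is a minimal sketch in 
plain Java. It is a toy model only and does not use Lucene's API: TermSource, 
FullIndex and FieldHidingView are hypothetical names invented for this 
illustration. The wrapper delegates every lookup to the full index except for 
the hidden field, which it answers without touching the underlying terms.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class FieldFilterSketch {

    /** Hypothetical stand-in for a reader exposing one terms dictionary per field. */
    public interface TermSource {
        Set<String> fields();
        SortedSet<String> terms(String field);
    }

    /** Toy model of the full index, as it would exist on both master and slave. */
    public static final class FullIndex implements TermSource {
        private final Map<String, SortedSet<String>> termsByField = new HashMap<>();

        public void add(String field, String term) {
            termsByField.computeIfAbsent(field, f -> new TreeSet<>()).add(term);
        }

        @Override
        public Set<String> fields() {
            return termsByField.keySet();
        }

        @Override
        public SortedSet<String> terms(String field) {
            return termsByField.getOrDefault(field, Collections.emptySortedSet());
        }
    }

    /** Slave-side view: delegates everything except the hidden fields. */
    public static final class FieldHidingView implements TermSource {
        private final TermSource delegate;
        private final Set<String> hidden;

        public FieldHidingView(TermSource delegate, Set<String> hidden) {
            this.delegate = delegate;
            this.hidden = hidden;
        }

        @Override
        public Set<String> fields() {
            Set<String> visible = new TreeSet<>(delegate.fields());
            visible.removeAll(hidden);
            return visible;
        }

        @Override
        public SortedSet<String> terms(String field) {
            // A hidden field is answered here, so the delegate's terms
            // for that field are never looked up.
            if (hidden.contains(field)) {
                return Collections.emptySortedSet();
            }
            return delegate.terms(field);
        }
    }

    public static void main(String[] args) {
        FullIndex master = new FullIndex();
        master.add("ID", "doc-1");
        master.add("ID", "doc-2");
        master.add("CONTENT", "lucene");

        TermSource slave = new FieldHidingView(master, Collections.singleton("ID"));
        System.out.println(master.terms("ID").size()); // 2
        System.out.println(slave.terms("ID").size());  // 0
        System.out.println(slave.fields());            // [CONTENT]
    }
}
```

The sketch only shows the access pattern. Whether a real term dictionary 
implementation can skip loading the hidden field's terms into memory is 
exactly the open question raised above.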

> Parallel incremental indexing
> -----------------------------
>
>                 Key: LUCENE-1879
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1879
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Michael Busch
>            Assignee: Michael Busch
>             Fix For: 4.0
>
>         Attachments: parallel_incremental_indexing.tar
>
>
> A new feature that allows building parallel indexes and keeping them in sync 
> on a docID level, independent of the choice of the MergePolicy/MergeScheduler.
> Find details on the wiki page for this feature:
> http://wiki.apache.org/lucene-java/ParallelIncrementalIndexing 
> Discussion on java-dev:
> http://markmail.org/thread/ql3oxzkob7aqf3jd

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
