[
https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845057#action_12845057
]
Michael McCandless commented on LUCENE-2312:
--------------------------------------------
Michael, are you also going to [first] tackle truly separating the RAM segments?
I think we need this first ...
bq. Mike, Why does DocFieldConsumers have DocFieldConsumer one and two? How is
this class used? Thanks.
This is so we can make a "tee" in the indexing chain. Here's the default chain
(copied from the comment in DW, i.e. DocumentsWriter):
{code}
DocConsumer / DocConsumerPerThread
  --> code: DocFieldProcessor / DocFieldProcessorPerThread
    --> DocFieldConsumer / DocFieldConsumerPerThread / DocFieldConsumerPerField
      --> code: DocFieldConsumers / DocFieldConsumersPerThread / DocFieldConsumersPerField
        --> code: DocInverter / DocInverterPerThread / DocInverterPerField
          --> InvertedDocConsumer / InvertedDocConsumerPerThread / InvertedDocConsumerPerField
            --> code: TermsHash / TermsHashPerThread / TermsHashPerField
              --> TermsHashConsumer / TermsHashConsumerPerThread / TermsHashConsumerPerField
                --> code: FreqProxTermsWriter / FreqProxTermsWriterPerThread / FreqProxTermsWriterPerField
                --> code: TermVectorsTermsWriter / TermVectorsTermsWriterPerThread / TermVectorsTermsWriterPerField
          --> InvertedDocEndConsumer / InvertedDocConsumerPerThread / InvertedDocConsumerPerField
            --> code: NormsWriter / NormsWriterPerThread / NormsWriterPerField
        --> code: StoredFieldsWriter / StoredFieldsWriterPerThread / StoredFieldsWriterPerField
{code}
The tee is so the doc fields can go to both DocInverter (for creating postings &
term vectors) and to the stored fields writer.
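
To make the tee idea concrete, here is a minimal sketch in Java. It is not the actual DocFieldConsumers code: FieldConsumer, TeeFieldConsumer and RecordingConsumer are hypothetical, simplified names invented for illustration, and the real classes also carry per-thread and per-field state. It only shows the shape: one consumer holds two downstream consumers and forwards every field to both.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for one stage of the indexing chain.
interface FieldConsumer {
    void processField(String name, String value);
}

// The "tee": forwards each field to consumer one, then to consumer two.
// In the real chain the two branches would be the inverter and the
// stored fields writer; here they are just two arbitrary consumers.
final class TeeFieldConsumer implements FieldConsumer {
    private final FieldConsumer one;
    private final FieldConsumer two;

    TeeFieldConsumer(FieldConsumer one, FieldConsumer two) {
        this.one = one;
        this.two = two;
    }

    @Override
    public void processField(String name, String value) {
        one.processField(name, value);
        two.processField(name, value);
    }
}

// Illustration-only consumer that records every field it receives.
final class RecordingConsumer implements FieldConsumer {
    final List<String> seen = new ArrayList<>();

    @Override
    public void processField(String name, String value) {
        seen.add(name + "=" + value);
    }
}
```

With two RecordingConsumer instances behind a TeeFieldConsumer, a single processField call shows up in both lists, which is the behavior the chain relies on to feed inversion and stored fields from the same document pass.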
> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
> Key: LUCENE-2312
> URL: https://issues.apache.org/jira/browse/LUCENE-2312
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Search
> Affects Versions: 3.0.1
> Reporter: Jason Rutherglen
> Assignee: Michael Busch
> Fix For: 3.1
>
>
> In order to offer users near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable.
> Today's Lucene-based NRT systems must incur the cost of merging
> segments, which can slow indexing.
> Michael Busch has good suggestions regarding how to handle deletes using max
> doc ids.
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here:
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915