[jira] Commented: (LUCENE-2312) Search on IndexWriter's RAM Buffer

Michael McCandless (JIRA) Thu, 18 Mar 2010 02:18:55 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846804#action_12846804
 ]


Michael McCandless commented on LUCENE-2312:
--------------------------------------------

{quote}
Can't we simply throw away the doc writer after a
successful segment flush (the IRs would refer to it, however
once they're closed, the DW would close as well)?
{quote}

I think that should be our first approach.  It means no pooling whatsoever.  
And it means that an app that doesn't aggressively close its old NRT readers 
will consume more RAM.

Though... the NRT readers will be able to search an active DW, right?  I.e., 
it's only once that DW needs to flush that the NRT readers would be tying up 
the RAM.

So, when a flush happens, existing NRT readers will hold a reference to that 
now-flushed DW, but when they reopen they will cut over to the on-disk segment.
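
A minimal sketch of that lifecycle, assuming plain reference counting: the 
writer holds one reference to the DW and each open NRT reader holds another, 
so the RAM is released only after the flush has succeeded and the last reader 
has closed.  RamDocWriter and NrtReader are hypothetical names used for 
illustration; they are not Lucene classes, and this is not a statement of how 
the feature is or will be implemented.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DocWriterRefCountSketch {

    /** Hypothetical stand-in for an in-RAM doc writer (DW); not a Lucene class. */
    static final class RamDocWriter {
        // Starts at 1: the reference held by the index writer itself.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean released = false;

        /** Pins the DW; fails if its RAM has already been released. */
        void incRef() {
            while (true) {
                int current = refCount.get();
                if (current <= 0) {
                    throw new IllegalStateException("doc writer already released");
                }
                if (refCount.compareAndSet(current, current + 1)) {
                    return;
                }
            }
        }

        /** Drops one reference; the last one out releases the RAM. */
        void decRef() {
            if (refCount.decrementAndGet() == 0) {
                released = true; // a real DW would free its buffers here
            }
        }

        boolean isReleased() {
            return released;
        }
    }

    /** Hypothetical NRT reader that pins the DW it searches until closed. */
    static final class NrtReader implements AutoCloseable {
        private final RamDocWriter dw;
        private boolean closed = false;

        NrtReader(RamDocWriter dw) {
            dw.incRef();
            this.dw = dw;
        }

        @Override
        public synchronized void close() {
            if (!closed) {
                closed = true;
                dw.decRef();
            }
        }
    }

    public static void main(String[] args) {
        RamDocWriter dw = new RamDocWriter();
        NrtReader reader = new NrtReader(dw);  // reader searches the live DW
        dw.decRef();                           // flush succeeded: writer drops its reference
        System.out.println(dw.isReleased());   // false: the open reader still pins the RAM
        reader.close();                        // app closes its old NRT reader
        System.out.println(dw.isReleased());   // true: the last reference is gone
    }
}
```

With no pooling, a fresh RamDocWriter would simply be created after each 
flush.  The cost is the one noted above: a reader that is never closed keeps 
its flushed DW's RAM alive indefinitely.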

I think this will be an OK limitation in practice.  Once NRT readers can search 
a live (still being written) DW, flushing of a DW will be a relatively rare 
event (unlike today where we must flush every time an NRT reader is opened).

> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
>                 Key: LUCENE-2312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2312
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 3.0.1
>            Reporter: Jason Rutherglen
>            Assignee: Michael Busch
>             Fix For: 3.1
>
>
> In order to offer users near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable. 
> Today's Lucene-based NRT systems must incur the cost of merging
> segments, which can slow indexing. 
> Michael Busch has good suggestions regarding how to handle deletes using max 
> doc ids.  
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here: 
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915
