[jira] Commented: (LUCENE-1313) Realtime Search

Jason Rutherglen (JIRA) Tue, 26 May 2009 09:44:11 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713097#action_12713097
 ]


Jason Rutherglen commented on LUCENE-1313:
------------------------------------------

I started on the pooled ram model after the last patch because
it is cleaner. Bytes are allocated up to the given limit set by
IW.setRAMBufferSizeMB. As mentioned below, we may want to add a
setting for the max ram temporarily used.

I'm reusing DocumentsWriter.numBytesAlloc/numBytesUsed and
created a RAMPolicy that manages ramDirBytesAlloc and
ramDirBytes. Each time a merge is scheduled, RAMPolicy
allocates sizeof(segments) and the segmentsAlloc is stored in
OneMerge. Once the merge completes or fails, ramDirBytesAlloc
is adjusted by the difference between the actual bytes used and
OM.ramDirAlloc. This way RAMPolicy always has the most accurate
ramDir allocation, and we properly adjust the amount of ram
consumed. This works well with our concurrent merging model,
where we can't predict when a merge will complete.
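
The reservation accounting described above could be sketched roughly
as follows. This is only an illustration, not the code in the patch:
the class RamPolicySketch and the methods reserveForMerge and
mergeFinished are hypothetical names, and only the ramDirBytesAlloc
counter comes from the comment. The sketch also assumes that a failed
merge is reported with zero actual bytes used, so its reservation is
simply released.

```java
// Hypothetical sketch of the merge reservation accounting; not the patch itself.
final class RamPolicySketch {
    private final long maxBytes;    // assumed budget, e.g. derived from IW.setRAMBufferSizeMB
    private long ramDirBytesAlloc;  // bytes currently reserved or used in the ram dir

    RamPolicySketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /**
     * Reserve the estimated size of the segments being merged.
     * Returns the reserved amount (the value the comment stores on OneMerge),
     * or -1 if the reservation does not fit in the budget.
     */
    synchronized long reserveForMerge(long estimatedSegmentBytes) {
        if (ramDirBytesAlloc + estimatedSegmentBytes > maxBytes) {
            return -1;
        }
        ramDirBytesAlloc += estimatedSegmentBytes;
        return estimatedSegmentBytes;
    }

    /**
     * Called when a merge completes or fails: swap the estimate for the
     * bytes actually used, so the counter tracks real consumption.
     */
    synchronized void mergeFinished(long reservedBytes, long actualBytesUsed) {
        ramDirBytesAlloc += actualBytesUsed - reservedBytes;
    }

    synchronized long ramDirBytesAlloc() {
        return ramDirBytesAlloc;
    }
}
```

Because the adjustment happens only when the merge reports back, the
counter stays conservative while merges are in flight, regardless of
the order in which concurrent merges finish.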

{quote}One challenge we face is ensuring that while we are
flushing all ram segments to disk, we don't block the
getReader() turnaround. IE we can't make getReader() do that
flush synchronously....perhaps we "merge RAM segments to disk" a
bit early, eg once RAM consumed is > 90% of the total RAM
buffer{quote}

Are you talking about the synchronization in IW.doFlushInternal,
which would block getReader while a segment is written to disk?
Our default RAMPolicy should be one where we always flush the
ram buffer to the ramdir. Basically there must always be room in
the ram dir for the ram buffer: ramdir + (rambuf * 2) < maxSize.
Or do we assume that it's ok for ramUsed to temporarily exceed
ramMax by a given percent (110%, which would be an option in
RAMPolicy) while ramBuf is being flushed to ramDir?
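
To make the alternatives concrete, here is a small sketch of the
three checks being discussed: the strict headroom rule, the temporary
overage option, and the early "merge to disk at 90%" trigger from the
quote. All names (RamHeadroomSketch and its methods) are hypothetical
and do not appear in the patch, and treating ramUsed as ramDir plus
ramBuf is an assumption made for the example.

```java
// Hypothetical helper illustrating the headroom rules; names are not from the patch.
final class RamHeadroomSketch {

    /** Strict rule from the comment: ramdir + (rambuf * 2) < maxSize. */
    static boolean hasRoomForFlush(long ramDirBytes, long ramBufBytes, long maxBytes) {
        return ramDirBytes + 2 * ramBufBytes < maxBytes;
    }

    /**
     * Relaxed rule: allow total use to exceed the limit temporarily,
     * e.g. overageFactor = 1.10 for the 110% mentioned in the comment.
     * Assumes ramUsed = ramDir + ramBuf.
     */
    static boolean withinTemporaryLimit(long ramDirBytes, long ramBufBytes,
                                        long maxBytes, double overageFactor) {
        return ramDirBytes + ramBufBytes <= (long) (maxBytes * overageFactor);
    }

    /** Early trigger from the quoted suggestion, e.g. fraction = 0.9. */
    static boolean shouldMergeToDiskEarly(long ramUsedBytes, long maxBytes, double fraction) {
        return ramUsedBytes > (long) (maxBytes * fraction);
    }
}
```

With a 1000 byte budget, for example, hasRoomForFlush(500, 200, 1000)
holds, while hasRoomForFlush(600, 200, 1000) does not, because the
strict rule needs room for twice the buffer.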

We may want to make some assumptions about usage of getReader
when flushToRam=true (i.e. getReader is called often enough that
the ram buffer is usually less than half of the ram used) so
that we can get a version of this functionality out the door,
then iterate as we gather feedback from users.

I'll include the comments in the next patch. 

> Realtime Search
> ---------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, 
> LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, 
> LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, 
> LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, 
> lucene-1313.patch
>
>
> Enable near realtime search in Lucene without external
> dependencies. When RAM NRT is enabled, the implementation adds a
> RAMDirectory to IndexWriter. Flushes go to the ramdir unless
> there is no available space. Merges are completed in the ram
> dir until there is no more available ram. 
> IW.optimize and IW.commit flush the ramdir to the primary
> directory, all other operations try to keep segments in ram
> until there is no more space.
