[ 
https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703686#action_12703686
 ] 

Michael McCandless commented on LUCENE-1313:
--------------------------------------------

Yonik raised a good question on LUCENE-1618: what gains do we really expect 
to see from using RAMDir for the tiny, recently flushed segments?

It would be nice if we could approximately measure this before putting more 
work into this issue -- if the gains are not "decent" this optimization may not 
be worthwhile.

Of course, we are talking about 100s of milliseconds for the turnaround time to 
add docs & open an NRT reader, so if the time for writing/opening many tiny 
files in RAMDir vs FSDir differs by, say, 10s of msecs, then we should pursue 
this.  We should also consider that the IO system may very well be quite busy 
(doing merge(s), backups, etc.), and that would make creating tiny files on 
disk even slower.
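
A rough way to get a first number for the comparison above is sketched below. 
It uses only the JDK, not Lucene: plain temp files stand in for FSDir and heap 
byte arrays stand in for RAMDir. The class name, method names, file naming and 
the count/size values are all made up for illustration. It does no fsync and 
no JIT warmup and runs on an idle disk, so treat the result as a loose 
estimate at best.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical micro-measurement; not part of Lucene or of any patch on this issue.
class TinyFileTiming {

    // Creates `count` files of `size` bytes in a fresh temp directory and
    // returns the elapsed nanoseconds for the writes only (cleanup is untimed).
    static long timeDiskWrites(int count, int size) throws IOException {
        Path dir = Files.createTempDirectory("tiny-seg");
        byte[] payload = new byte[size];
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            Files.write(dir.resolve("_" + i + ".dat"), payload);
        }
        long elapsed = System.nanoTime() - start;
        for (int i = 0; i < count; i++) {
            Files.delete(dir.resolve("_" + i + ".dat"));
        }
        Files.delete(dir);
        return elapsed;
    }

    // Writes the same payloads into heap buffers instead of files and
    // returns the elapsed nanoseconds.
    static long timeHeapWrites(int count, int size) {
        byte[] payload = new byte[size];
        List<byte[]> files = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            ByteArrayOutputStream out = new ByteArrayOutputStream(size);
            out.write(payload, 0, payload.length);
            files.add(out.toByteArray());
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int count = 200;  // assumed number of tiny files
        int size = 512;   // assumed bytes per file
        System.out.println("disk: " + timeDiskWrites(count, size) / 1_000_000.0 + " ms");
        System.out.println("heap: " + timeHeapWrites(count, size) / 1_000_000.0 + " ms");
    }
}
```

Running it again while a large copy or merge is in progress on the same disk 
would give a feel for the busy-IO case mentioned above.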

A simpler optimization might be to allow using CFS for tiny segments (even when 
CFS is turned off), but build the CFS in RAM (i.e., write the tiny files first 
to RAMFiles, then make the CFS file on disk).  That might get most of the gains, 
since the FSDir sees only one file created per tiny segment, not N.
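
To make the "build in RAM, write one file" idea concrete, here is a minimal 
sketch using only the JDK. It is not Lucene's actual CFS format or writer: the 
layout (entry count, then name, length and bytes per entry), the class name 
and the method names are assumptions made up for this example, and the file 
names in main are only illustrative.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; the on-disk layout here is NOT Lucene's CFS layout.
class RamBuiltCompoundFile {

    // Packs all in-memory files into a single on-disk file, so the file
    // system sees one create per segment instead of one per sub-file.
    static void writeCompound(Map<String, byte[]> ramFiles, Path target) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(target)))) {
            out.writeInt(ramFiles.size());
            for (Map.Entry<String, byte[]> e : ramFiles.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeInt(e.getValue().length);
                out.write(e.getValue());
            }
        }
    }

    // Reads the packed file back into a name -> bytes map, in written order.
    static Map<String, byte[]> readCompound(Path source) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(source)))) {
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                String name = in.readUTF();
                byte[] data = new byte[in.readInt()];
                in.readFully(data);
                files.put(name, data);
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // The tiny per-segment files live only on the heap at first.
        Map<String, byte[]> ram = new LinkedHashMap<>();
        ram.put("_0.fnm", new byte[] {1, 2});
        ram.put("_0.frq", new byte[] {3, 4, 5});
        ram.put("_0.tis", new byte[] {6});

        Path cfs = Files.createTempFile("_0", ".cfs");
        writeCompound(ram, cfs);  // the only file written to disk
        System.out.println(readCompound(cfs).keySet());
        Files.delete(cfs);
    }
}
```

The point of the sketch is only the IO pattern: N buffers are filled in memory 
and the directory on disk gets a single sequential write.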

> Realtime Search
> ---------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, 
> LUCENE-1313.patch, LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, 
> lucene-1313.patch, lucene-1313.patch
>
>
> Realtime search with transactional semantics.  
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory 
> enables replication.  It is difficult to replicate using other methods 
> because while the document may easily be serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and 
> searching concurrently.

