[ https://issues.apache.org/jira/browse/LUCENE-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1516:
---------------------------------------

    Attachment: magnetic.png
                ssd.png

OK, I ran a basic initial test of the latency when opening a near real-time reader.

Using contrib/benchmark, I index Wikipedia docs as usual, but I added a NearRealTimeReaderTask, which runs a background thread that, once every N seconds (I used 3), gets a new reader from the writer. It then does a simple search for the term "1" in the body, sorted by the docdate field. I measured the milliseconds taken to reopen and to run the search, and plotted those as a function of index size.

I ran two tests. The first (attached as ssd.png) stores the index on an Intel X25M solid-state disk; the second (attached as magnetic.png) stores it on a WD Velociraptor.

Notes:

  * The reopen time is ~700 msec in both cases, and doesn't change much as the index grows (which is nice).

  * It is quite noisy, likely due to merges committing.

  * I logged (but did not graph) the flush time vs. the actual reopen time, and it's the flush time that dominates. This is good because with a slower indexing rate, this flush time would go way down. My guess is that flushing is CPU bound, not IO bound.

  * The search time ramps up linearly (expected), and also shows spikes due to merging. There's one massive spike at the end of the SSD run that's odd (it did not correspond to reopening after a merge, though perhaps it happened during a merge).

  * This is a somewhat overly stressful test because I'm indexing docs at full speed, whereas I'd expect that for the typical near real-time search app, the docs would usually trickle in more slowly and a reopen would happen after just a few docs.

  * SSD and magnetic look pretty darn similar, though magnetic shows more noise and is maybe more affected by merges.

  * I'm doing no deletions in this test, but the typical near real-time app presumably would.
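For readers who want to reproduce this kind of measurement, the loop described above could be sketched roughly as follows. This is not the NearRealTimeReaderTask from the patch; it is a minimal stand-alone sketch using only the JDK. The names ReopenLatencyProbe, measureOnce and start are hypothetical, the `reopen` supplier stands in for getting a new reader from the writer, and the `search` consumer stands in for the sorted term search.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: times a "reopen" step and a "search" step separately. */
final class ReopenLatencyProbe {

    /** One measurement: milliseconds spent reopening and searching. */
    static final class Sample {
        final long reopenMillis;
        final long searchMillis;

        Sample(long reopenMillis, long searchMillis) {
            this.reopenMillis = reopenMillis;
            this.searchMillis = searchMillis;
        }
    }

    // Synchronized because samples are added from the background thread.
    private final List<Sample> samples =
        Collections.synchronizedList(new ArrayList<>());

    /**
     * Runs one reopen followed by one search and records both timings.
     * `reopen` is an assumption standing in for obtaining a fresh reader;
     * `search` is an assumption standing in for the query against it.
     */
    <R> Sample measureOnce(Supplier<R> reopen, Consumer<R> search) {
        long t0 = System.nanoTime();
        R reader = reopen.get();
        long t1 = System.nanoTime();
        search.accept(reader);
        long t2 = System.nanoTime();

        Sample s = new Sample(
            TimeUnit.NANOSECONDS.toMillis(t1 - t0),
            TimeUnit.NANOSECONDS.toMillis(t2 - t1));
        samples.add(s);
        return s;
    }

    /**
     * Starts a background thread that measures once every periodSeconds.
     * The caller must shut the returned executor down when indexing ends.
     * Note: if a run throws, the executor suppresses all later runs.
     */
    <R> ScheduledExecutorService start(
            Supplier<R> reopen, Consumer<R> search, long periodSeconds) {
        ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(
            () -> measureOnce(reopen, search),
            periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return executor;
    }

    /** Returns a snapshot copy of the samples collected so far. */
    List<Sample> samples() {
        synchronized (samples) {
            return new ArrayList<>(samples);
        }
    }
}
```

Timing the two steps separately is what lets the reopen cost and the search cost be plotted as separate curves against index size. Splitting the reopen time further into flush time and the rest would need hooks inside the writer, which this sketch does not attempt.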
> Integrate IndexReader with IndexWriter
> ---------------------------------------
>
>                 Key: LUCENE-1516
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1516
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
> LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
> LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
> LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
> LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
> LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
> LUCENE-1516.patch, magnetic.png, ssd.png
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The current problem is that an IndexReader and an IndexWriter cannot be
> open at the same time and perform updates, as they both require a write
> lock to the index. While methods such as IW.deleteDocuments enable
> deleting from IW, methods such as IR.deleteDocument(int doc) and
> norms updating are not available from IW. This limits the
> capabilities of performing updates to the index dynamically or in
> realtime without closing the IW and opening an IR, deleting or
> updating norms, flushing, then opening the IW again, a process which
> can be detrimental to realtime updates.
> This patch will expose an IndexWriter.getReader method that returns
> the currently flushed state of the index as a class that implements
> IndexReader. The new IR implementation will differ from existing IR
> implementations such as MultiSegmentReader in that flushing will
> synchronize updates with IW, in part by sharing the write lock. All
> methods of IR will be usable, including reopen and clone.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.