[ https://issues.apache.org/jira/browse/LUCENE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625117#action_12625117 ]
Matthew Mastracci commented on LUCENE-753:
------------------------------------------

Michael,

bq. Are you really sure you're not accidentally closing the searcher before calling Searcher.docFreqs? Are you calling docFreqs directly from your app?

Our IndexReaders are actually managed in a shared pool (currently 8 IndexReaders, shared round-robin style as requests come in). We have some custom reference-counting logic that's supposed to keep the readers alive as long as somebody has them open. As new index snapshots come in, the IndexReaders are re-opened, and reference counts ensure that any old index readers still in use are kept alive until the searchers are done with them. I'm guessing we have an error in our reference-counting logic that just doesn't show up under MMapDirectory (as you mentioned, close() is a no-op there).

We're calling docFreqs directly from our app. I'm guessing that it just happens to be the most likely call to be made right after we roll to a new index snapshot.

I don't have hard performance numbers right now, but we were having a hard time saturating I/O or CPU with FSDirectory; the locking was basically killing us. When we switched to MMapDirectory and turned on compound files, our performance jumped at least 2x. The preliminary results I'm seeing with NIOFSDirectory seem to indicate that it's slightly faster than MMapDirectory.

I'll try setting our app back to using the old FSDirectory and see if the exceptions still occur. I'll also try to fiddle with our unit tests to make sure we're correctly ref-counting all of our index readers.

BTW, I ran a quick FSDirectory/MMapDirectory/NIOFSDirectory shootout. It uses a parallel benchmark that roughly models what our real-life workload is like. I ran the benchmark once through to warm the disk cache, then got the following.
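The snapshot-rolling scheme described above can be sketched as a small reference-counted wrapper. This is a hedged illustration, not the poster's actual code: `RefCountedReader` and `snapshotName` are hypothetical stand-ins for a pooled Lucene IndexReader. The pool holds one reference of its own; each searcher checkout takes another, and the underlying reader is closed only when the last holder releases it after the pool has rolled to a new snapshot.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the reference-counting scheme described in the comment.
// RefCountedReader is a hypothetical stand-in for a pooled IndexReader.
final class RefCountedReader {
    private final AtomicInteger refs = new AtomicInteger(1); // the pool's own reference
    private volatile boolean closed = false;
    final String snapshotName; // stand-in for the wrapped reader/snapshot

    RefCountedReader(String snapshotName) { this.snapshotName = snapshotName; }

    // Called when a searcher checks the reader out of the pool.
    RefCountedReader acquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) throw new IllegalStateException("reader already closed");
            if (refs.compareAndSet(n, n + 1)) return this;
        }
    }

    // Called by searchers when done, and once by the pool when it rolls
    // to a new snapshot. The last release actually closes the reader.
    void release() {
        if (refs.decrementAndGet() == 0) closed = true; // real code: reader.close()
    }

    boolean isClosed() { return closed; }
}
```

A bug of the kind suspected here (a missing `acquire` or an extra `release`) would close the reader while a searcher still holds it, which FSDirectory surfaces as an exception on the next read but MMapDirectory silently tolerates, since its close() is a no-op.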
The numbers are fairly stable across various runs once the disk caches are warm:

FS:    33644ms
MMap:  28616ms
NIOFS: 33189ms

I'm a bit surprised at the results myself, but I've spent a bit of time tuning the indexes to maximize concurrency. I'll double-check that the benchmark is correctly running all of the tests.

The benchmark effectively runs 10-20 queries in parallel at a time, then waits for all queries to complete. It does this end-to-end for a number of different query batches, then totals up the time to complete each batch.

> Use NIO positional read to avoid synchronization in FSIndexInput
> ----------------------------------------------------------------
>
>                 Key: LUCENE-753
>                 URL: https://issues.apache.org/jira/browse/LUCENE-753
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>            Reporter: Yonik Seeley
>            Assignee: Michael McCandless
>             Fix For: 2.4
>
>         Attachments: FileReadTest.java, FileReadTest.java, FileReadTest.java,
> FileReadTest.java, FileReadTest.java, FileReadTest.java, FileReadTest.java,
> FSDirectoryPool.patch, FSIndexInput.patch, FSIndexInput.patch,
> LUCENE-753.patch, LUCENE-753.patch, lucene-753.patch, lucene-753.patch
>
> As suggested by Doug, we could use NIO pread to avoid synchronization on the underlying file.
> This could mitigate any MT performance drop caused by reducing the number of files in the index format.

--
This message is automatically generated by JIRA.
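The positional read the issue proposes can be sketched with `FileChannel.read(ByteBuffer, long)`, which reads at an explicit offset without moving the channel's file pointer, so concurrent readers can share one descriptor with no synchronization (unlike `RandomAccessFile.seek` followed by `read`). This is an illustrative sketch of the mechanism, not the actual FSIndexInput patch; the `PositionalRead` class and `readAt` helper are invented for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative sketch of NIO pread: FileChannel.read(dst, position) is
// stateless with respect to the channel's file pointer, so multiple
// threads can call it on one shared channel without locking.
final class PositionalRead {
    static byte[] readAt(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            // buf.position() tracks how much we've read so far
            int n = ch.read(buf, pos + buf.position());
            if (n < 0) throw new IOException("unexpected EOF at " + (pos + buf.position()));
        }
        return buf.array();
    }
}
```

By contrast, the pre-patch FSIndexInput had to synchronize a seek-then-read pair on the shared descriptor, which is the lock contention the comment describes as "basically killing us" under parallel load.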