[ https://issues.apache.org/jira/browse/LUCENE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625117#action_12625117 ]
Matthew Mastracci commented on LUCENE-753:
------------------------------------------

Michael,

bq. Are you really sure you're not accidentally closing the searcher before calling Searcher.docFreqs? Are you calling docFreqs directly from your app?

Our IndexReaders are actually managed in a shared pool (currently 8 IndexReaders, shared round-robin style as requests come in). We have some custom reference-counting logic that's supposed to keep the readers alive as long as somebody has them open. As new index snapshots come in, the IndexReaders are re-opened, and reference counts ensure that any old index readers still in use are kept alive until the searchers are done with them. I'm guessing we have an error in our reference-counting logic that just doesn't show up under MMapDirectory (as you mentioned, close() is a no-op there).

We're calling docFreqs directly from our app. I'm guessing that it just happens to be the most likely call to be made right after we roll to a new index snapshot.

I don't have hard performance numbers right now, but we were having a hard time saturating I/O or CPU with FSDirectory; the locking was basically killing us. When we switched to MMapDirectory and turned on compound files, our performance jumped at least 2x. The preliminary results I'm seeing with NIOFSDirectory seem to indicate that it's slightly faster than MMapDirectory.

I'll try setting our app back to using the old FSDirectory and see if the exceptions still occur. I'll also try to fiddle with our unit tests to make sure we're correctly ref-counting all of our index readers.

BTW, I ran a quick FSDirectory/MMapDirectory/NIOFSDirectory shootout. It uses a parallel benchmark that roughly models what our real-life workload is like. I ran the benchmark once through to warm the disk cache, then got the following.
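The snapshot-rolling scheme described above can be sketched as a small reference-counted wrapper. This is a hedged illustration, not the poster's actual code: `RefCountedReader` and `snapshotName` are hypothetical stand-ins for a pooled Lucene IndexReader. The pool holds one reference of its own; each searcher checkout takes another, and the underlying reader is closed only when the last holder releases it after the pool has rolled to a new snapshot.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the reference-counting scheme described in the comment.
// RefCountedReader is a hypothetical stand-in for a pooled IndexReader.
final class RefCountedReader {
    private final AtomicInteger refs = new AtomicInteger(1); // the pool's own reference
    private volatile boolean closed = false;
    final String snapshotName; // stand-in for the wrapped reader/snapshot

    RefCountedReader(String snapshotName) { this.snapshotName = snapshotName; }

    // Called when a searcher checks the reader out of the pool.
    RefCountedReader acquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) throw new IllegalStateException("reader already closed");
            if (refs.compareAndSet(n, n + 1)) return this;
        }
    }

    // Called by searchers when done, and once by the pool when it rolls
    // to a new snapshot. The last release actually closes the reader.
    void release() {
        if (refs.decrementAndGet() == 0) closed = true; // real code: reader.close()
    }

    boolean isClosed() { return closed; }
}
```

A bug of the kind suspected here (a missing `acquire` or an extra `release`) would close the reader while a searcher still holds it, which FSDirectory surfaces as an exception on the next read but MMapDirectory silently tolerates, since its close() is a no-op.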
The numbers are fairly stable across various runs once the disk caches are warm:

FS:    33644ms
MMap:  28616ms
NIOFS: 33189ms

I'm a bit surprised at the results myself, but I've spent a bit of time tuning the indexes to maximize concurrency. I'll double-check that the benchmark is correctly running all of the tests.

The benchmark effectively runs 10-20 queries in parallel at a time, then waits for all queries to complete. It does this end-to-end for a number of different query batches, then totals up the time to complete each batch.

> Use NIO positional read to avoid synchronization in FSIndexInput
> ----------------------------------------------------------------
>
>                 Key: LUCENE-753
>                 URL: https://issues.apache.org/jira/browse/LUCENE-753
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>            Reporter: Yonik Seeley
>            Assignee: Michael McCandless
>             Fix For: 2.4
>
>         Attachments: FileReadTest.java, FileReadTest.java, FileReadTest.java,
> FileReadTest.java, FileReadTest.java, FileReadTest.java, FileReadTest.java,
> FSDirectoryPool.patch, FSIndexInput.patch, FSIndexInput.patch,
> LUCENE-753.patch, LUCENE-753.patch, lucene-753.patch, lucene-753.patch
>
> As suggested by Doug, we could use NIO pread to avoid synchronization on the underlying file.
> This could mitigate any MT performance drop caused by reducing the number of files in the index format.

--
This message is automatically generated by JIRA.
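The positional read the issue proposes can be sketched with `FileChannel.read(ByteBuffer, long)`, which reads at an explicit offset without moving the channel's file pointer, so concurrent readers can share one descriptor with no synchronization (unlike `RandomAccessFile.seek` followed by `read`). This is an illustrative sketch of the mechanism, not the actual FSIndexInput patch; the `PositionalRead` class and `readAt` helper are invented for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative sketch of NIO pread: FileChannel.read(dst, position) is
// stateless with respect to the channel's file pointer, so multiple
// threads can call it on one shared channel without locking.
final class PositionalRead {
    static byte[] readAt(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            // buf.position() tracks how much we've read so far
            int n = ch.read(buf, pos + buf.position());
            if (n < 0) throw new IOException("unexpected EOF at " + (pos + buf.position()));
        }
        return buf.array();
    }
}
```

By contrast, the pre-patch FSIndexInput had to synchronize a seek-then-read pair on the shared descriptor, which is the lock contention the comment describes as "basically killing us" under parallel load.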