[ https://issues.apache.org/jira/browse/LUCENE-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790673#action_12790673 ]
Michael McCandless commented on LUCENE-2120:
--------------------------------------------

Thanks for the details John!

{quote}
bq: Why does Zoie even retain 3 readers? Why not keep only the current one?

1 mem reader for when the disk batch, 1 mem reader for the time disk reader indexes, 1 disk reader
{quote}

Hmm -- is this what the {{private static int MAX_READER_GENERATION = 3;}} in ThrottledLuceneNRTDataConsumer (link above) is doing? From that code, it looks like it just always retains the last 3 reopened readers... it sounds like the logic to keep the 2 mem readers & 1 disk reader is elsewhere (in addition) in Zoie? Or maybe this logic is what's doing that? (Not yet familiar enough w/ Zoie...).

bq. By default, it only runs with Medline data. You don't need both. perf/settings/index.properties->data.type dictates which to use, file->medline, wiki->wikipedia

OK I'll try to use only Wikipedia -- I've already slurped that down. It'd be great to get a ContentSource in contrib/benchmark that can produce docs from medline...

{quote}
You should use the branch: BR_DELETE_OPT

It has the optimization you suggested on handling deleted docs, e.g. should not check for each hit candidate with IntSetAccelerator.

Also, I have added a DataConsumer to handle delayed reopen for NRT case. You see the file handle leakage quickly with it: see perf/conf/zoie.properties to turn on ThrottledLuceneNRTDataConsumer. On my mac, I use lsof to see the file handle count.
{quote}

OK, will do.

So (on this branch) you resolve the deleted docs in the BG, so that, once finished, the reader no longer double-checks hits for deletions? That sounds like a good improvement...

It's sort of a "warm in the background" tradeoff, ie, give me my reader very quickly, even if the first searches against it must run a bit slower since they double check deletions, until the warming is done.... vs Lucene which forcefully "warms" (making reopen time longer) before returning the reader to you.
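
To make the "always retains the last 3 reopened readers" question concrete, here is a minimal Java sketch of that kind of retention policy. It is not taken from Zoie or Lucene: {{ReaderGenerations}} and its methods are hypothetical names, and {{java.io.Closeable}} stands in for a real reader. The point it illustrates is that every retained generation keeps its files open, so a reader that falls out of the window has to be released or its file handles accumulate.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch (not Zoie's or Lucene's code): keeps the last N
 * reopened readers and closes any reader that falls out of that window.
 */
public class ReaderGenerations<R extends Closeable> {
    private final int maxGenerations;
    private final Deque<R> readers = new ArrayDeque<>();

    public ReaderGenerations(int maxGenerations) {
        if (maxGenerations < 1) {
            throw new IllegalArgumentException("maxGenerations must be >= 1");
        }
        this.maxGenerations = maxGenerations;
    }

    /** Registers a newly reopened reader and closes the oldest ones beyond the limit. */
    public synchronized void add(R reader) throws IOException {
        readers.addLast(reader);
        while (readers.size() > maxGenerations) {
            // Skipping this close() is one way file handles could leak.
            readers.removeFirst().close();
        }
    }

    /** Returns the most recently added reader, or null if none was added. */
    public synchronized R current() {
        return readers.peekLast();
    }

    /** Number of readers currently held open by this window. */
    public synchronized int retained() {
        return readers.size();
    }
}
```

With a real Lucene {{IndexReader}}, searches may still be running against a reader at the moment it is evicted, so releasing it through reference counting ({{incRef}}/{{decRef}}) would be safer than closing it outright; the sketch ignores that for brevity.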
> Possible file handle leak in near real-time reader
> --------------------------------------------------
>
>                 Key: LUCENE-2120
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2120
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>   Affects Versions: 3.1
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 3.1
>
>
> Spinoff of LUCENE-1526: Jake/John hit file descriptor exhaustion when testing
> NRT.
> I've tried to repro this, stress testing NRT, saturating reopens, indexing,
> searching, but haven't found any issue.
> Let's try to get to the bottom of it, here...