[ https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845036#action_12845036 ]
Jason Rutherglen commented on LUCENE-2312: ------------------------------------------ A few notes so far: * IW flush could become thread dependent (eg, it'll only flush for the current doc writer) or maybe it should flush all doc writers? Close will shut down and flush all doc writers. * A new term will first check the hash table for existence (as currently), if it's not in the term hash table only then will it be added to the btree (btw, binary search is O(log N) on average?) This way we're avoiding the somewhat costlier btree existence check per token. * The algorithm for flushing doc writers based on RAM consumption can simply be, on exceed, flush the doc writer consuming the most RAM? * I gutted the PerThread classes, then realized, it's all too intertwined. I'd rather get *something* working, than spend an excessive amount of time rearranging code that already works. > Search on IndexWriter's RAM Buffer > ---------------------------------- > > Key: LUCENE-2312 > URL: https://issues.apache.org/jira/browse/LUCENE-2312 > Project: Lucene - Java > Issue Type: New Feature > Components: Search > Affects Versions: 3.0.1 > Reporter: Jason Rutherglen > Assignee: Michael Busch > Fix For: 3.0.2 > > > In order to offer user's near realtime search, without incurring > an indexing performance penalty, we can implement search on > IndexWriter's RAM buffer. This is the buffer that is filled in > RAM as documents are indexed. Currently the RAM buffer is > flushed to the underlying directory (usually disk) before being > made searchable. > Todays Lucene based NRT systems must incur the cost of merging > segments, which can slow indexing. > Michael Busch has good suggestions regarding how to handle deletes using max > doc ids. > https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923 > The area that isn't fully fleshed out is the terms dictionary, > which needs to be sorted prior to queries executing. Currently > IW implements a specialized hash table. Michael B has a > suggestion here: > https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org