[ https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844891#action_12844891 ]
Jason Rutherglen commented on LUCENE-2312:
------------------------------------------

From LUCENE-2293:

{quote}(b-tree, or, simply sort-on-demand the first time a query needs it, though that cost increases the larger your RAM segments get, ie, not incremental to the # docs you just added){quote}

For the terms dictionary, perhaps a terms array is kept in sorted order (this could be a RawPostingList[], or an array of objects with pointers to a RawPostingList and some helper methods like getTerm and compareTo). We then binary search and insert new RawPostingLists/terms into the array. We *could* implement a 2 dimensional array, allowing us to make a per reader copy of the 1st dimension of the array. This would maintain transactional consistency (ie, a reader's array isn't changing as a term enum is traversing in another thread).

{quote}Also, we have to solve what happens to a reader using a RAM segment that's been flushed. Perhaps we don't reuse RAM at that point, ie, rely on GC to reclaim once all readers using that RAM segment have closed.{quote}

I don't think we have a choice here?

> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
>                 Key: LUCENE-2312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2312
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 3.0.1
>            Reporter: Jason Rutherglen
>             Fix For: 3.0.2
>
>
> In order to offer users near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable.
> Today's Lucene based NRT systems must incur the cost of merging
> segments, which can slow indexing.
> Michael Busch has good suggestions regarding how to handle deletes using max
> doc ids.
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here:
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915
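
To make the sorted terms array idea from the comment concrete, here is a rough Java sketch of a 2 dimensional terms array with binary search inserts and per reader copies of the 1st dimension. It is only an illustration under stated assumptions, not the patch for this issue: plain Strings stand in for RawPostingList entries, the class and method names (SortedTermsSketch, snapshot, insert, contains) are hypothetical, a single writer is assumed, and the writer replaces a block with a fresh copy instead of mutating it in place, which is one possible way to keep a reader's view stable.

```java
import java.util.Arrays;

/**
 * Sketch only: a sorted terms dictionary stored as a two-dimensional array.
 * Strings stand in for RawPostingList entries; all names are hypothetical.
 */
final class SortedTermsSketch {
    private final int maxBlockSize;
    // 1st dimension: blocks in term order. A published block is never mutated.
    private volatile String[][] blocks = new String[0][];

    SortedTermsSketch(int maxBlockSize) {
        if (maxBlockSize < 2) {
            throw new IllegalArgumentException("maxBlockSize must be >= 2");
        }
        this.maxBlockSize = maxBlockSize;
    }

    /** Reader side: a copy of the 1st dimension, unaffected by later inserts. */
    String[][] snapshot() {
        return blocks.clone();
    }

    /** Writer side: insert a term in sorted position; returns false if already present. */
    synchronized boolean insert(String term) {
        String[][] cur = blocks;
        if (cur.length == 0) {
            blocks = new String[][] { { term } };
            return true;
        }
        int b = findBlock(cur, term);
        String[] block = cur[b];
        int pos = Arrays.binarySearch(block, term);
        if (pos >= 0) {
            return false;
        }
        int ins = -(pos + 1);
        // Build a new block with the term inserted; the old block stays intact.
        String[] grown = new String[block.length + 1];
        System.arraycopy(block, 0, grown, 0, ins);
        grown[ins] = term;
        System.arraycopy(block, ins, grown, ins + 1, block.length - ins);

        String[][] next;
        if (grown.length <= maxBlockSize) {
            next = cur.clone();
            next[b] = grown;
        } else {
            // Split an oversized block in two so each insert copies a bounded amount.
            int mid = grown.length / 2;
            next = new String[cur.length + 1][];
            System.arraycopy(cur, 0, next, 0, b);
            next[b] = Arrays.copyOfRange(grown, 0, mid);
            next[b + 1] = Arrays.copyOfRange(grown, mid, grown.length);
            System.arraycopy(cur, b + 1, next, b + 2, cur.length - b - 1);
        }
        blocks = next;
        return true;
    }

    /** Binary search on each block's first term: last block starting at or before term, else block 0. */
    private static int findBlock(String[][] bs, String term) {
        int lo = 0, hi = bs.length - 1, ans = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (bs[mid][0].compareTo(term) <= 0) {
                ans = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return ans;
    }

    /** Look up a term in a reader's snapshot. */
    static boolean contains(String[][] snap, String term) {
        if (snap.length == 0) {
            return false;
        }
        return Arrays.binarySearch(snap[findBlock(snap, term)], term) >= 0;
    }
}
```

A reader would call snapshot() once when it opens and run its term enum against that array. Later inserts only replace blocks in the writer's current 1st dimension, so the reader's view does not change underneath it. Old blocks are reclaimed by GC once no snapshot references them.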