[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1483:
---------------------------------------

    Attachment: LUCENE-1483.patch

Attached initial patch (derived from one of the earlier patches). A lot of work remains. TestSort (and likely others) fail.

{quote}
> Thats were I don't follow though - its not ords in the queue right? Its
> ScoreDocs. Thats whats getting me at the moment.
{quote}

Exactly. So I built a first cut at the alternative "copy value" approach, where the comparator (a new FieldComparator abstract class) is responsible for holding the values it needs for docs inserted into the queue.

I also added TopFieldValueDocCollector (extends DocCollector) and ByValueFieldSortedHitQueue (extends PriorityQueue), which interacts with the FieldComparators. (We can change these names...) I updated IndexSearcher to use this new queue for field sorting.

This patch only handles SortField.{DOC,SCORE,INT} for now, but I think the approach shows surprising early promise: I'm seeing a sizable performance gain for the "sort by int field" case (13.76 sec vs 17.95 sec for 300 queries getting the top 100 hits from 1M results), which is about 23% less time. I verified that for the test sort alg (above) it's producing the right results (at least the top 40 docs match).

I didn't expect such a performance gain (I was hoping for not much performance loss, actually). I think it may be that although the initial value copy adds some cost, the within-queue comparisons are then faster because you don't have to dereference back into the FieldCache array. It seems we keep accidentally discovering performance gains here :)

If we go forward with this approach, I think it would mean deprecating FieldSortedHitQueue and ScoreDocComparator, because I think there's no back-compatible way to migrate forward.

I also like that this approach means we only need an iterator interface to FieldCache values (for LUCENE-831).

Mark, can you look this over and see if it makes sense, and maybe try to tackle the other sort types?
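To make the "copy value" idea above concrete, here is a minimal, self-contained sketch. It is not the code from the patch: SlotComparator, IntSlotComparator, Entry and topN are hypothetical names, a plain int[] stands in for the FieldCache array, and java.util.PriorityQueue stands in for Lucene's own PriorityQueue. It only assumes an ascending sort on an int field with ties broken by lower doc id.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Illustrative only: all names here are hypothetical, not the patch's API.
class CopyValueSortSketch {

    /** Owns a copy of the sort value for every doc currently held in the queue. */
    static abstract class SlotComparator {
        abstract void copy(int slot, int doc);         // copy doc's value into a slot
        abstract int compare(int slotA, int slotB);    // compare two held values
        abstract int compareToDoc(int slot, int doc);  // held value vs. a candidate doc
    }

    static final class IntSlotComparator extends SlotComparator {
        private final int[] fieldValues; // stand-in for a FieldCache int array
        private final int[] slots;       // copied values, one per queue entry

        IntSlotComparator(int[] fieldValues, int numSlots) {
            this.fieldValues = fieldValues;
            this.slots = new int[numSlots];
        }
        @Override void copy(int slot, int doc) { slots[slot] = fieldValues[doc]; }
        @Override int compare(int a, int b) { return Integer.compare(slots[a], slots[b]); }
        @Override int compareToDoc(int slot, int doc) {
            return Integer.compare(slots[slot], fieldValues[doc]);
        }
    }

    /** Queue entry: a slot index into the comparator plus the doc id it holds. */
    static final class Entry {
        final int slot;
        int doc;
        Entry(int slot, int doc) { this.slot = slot; this.doc = doc; }
    }

    /** Returns the ids of the n docs with the smallest values, best first. */
    static int[] topN(int[] fieldValues, int n) {
        if (n <= 0) return new int[0];
        final SlotComparator cmp = new IntSlotComparator(fieldValues, n);
        // Head of the queue is the worst entry: largest value, then largest doc id.
        PriorityQueue<Entry> queue = new PriorityQueue<>(n, (a, b) -> {
            int c = cmp.compare(b.slot, a.slot); // uses copied values only
            return c != 0 ? c : Integer.compare(b.doc, a.doc);
        });
        for (int doc = 0; doc < fieldValues.length; doc++) {
            if (queue.size() < n) {
                int slot = queue.size();
                cmp.copy(slot, doc);
                queue.add(new Entry(slot, doc));
            } else if (cmp.compareToDoc(queue.peek().slot, doc) > 0) {
                // Candidate beats the current worst entry: reuse that entry's slot.
                Entry bottom = queue.poll();
                cmp.copy(bottom.slot, doc);
                bottom.doc = doc;
                queue.add(bottom);
            }
        }
        int[] result = new int[queue.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = queue.poll().doc; // worst comes out first, so fill from the end
        }
        return result;
    }

    public static void main(String[] args) {
        // Docs 3, 1 and 0 hold the three smallest values (1, 3 and 5).
        System.out.println(Arrays.toString(topN(new int[] {5, 3, 9, 1, 7}, 3))); // [3, 1, 0]
    }
}
```

In this sketch the queue's own comparisons go through compare(slotA, slotB), which reads only the comparator's slots array; the per-document array is touched once per candidate (in compareToDoc) and once more when a doc is actually inserted (in copy).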
String will be the most interesting but I think very doable.

> Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing
> for individual segment reloading on reopen.