[ https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744833#action_12744833 ]
Tim Smith commented on LUCENE-1821:
-----------------------------------

bq. The goal is to move all caches to the segment level in Lucene - we don't want to encourage users to cache per multi-reader by providing API help to do so.

I agree that this is the goal, and that using per segment caches should be the encouraged route for field caching needs. I plan to update the vast majority of the caches I use to be loaded on a per segment basis once I switch to 2.9, to take advantage of this.
But it should still be possible for advanced users to do caching on the multireader level. This may require porting for subsequent versions of Lucene (as I'm seeing I will have to for 2.9), however this should remain possible.

bq. If you need index wide stats, you use the Weight.

I'm currently using Weight to get this cache on the multireader level, however with 2.9 I will have to jump through some more hoops in order to be able to use this cache on each sub reader's scorer.

bq. You are trying to use the internal ids externally

All my usage of "internal" docids occurs inside Weight, Scorer, and HitCollector implementations. I don't see how this is really "external", as it is using published interfaces. It's just that the interpretation of these interfaces changed for 2.9 (I have no problem with this as long as I can port from 2.4 with minimal to moderate effort). The only reason they were able to change is that no implementations provided by vanilla Lucene or in contrib required the "holistic" view of the index.

bq. The FieldCache is the caching mechanism that Lucene supports with internal ids - and it supports it per segment.

The FieldCache mechanism did not meet all my needs with regards to schema/retention policy/etc, so I have been doing caching in my own code base for quite some time. While the FieldCache usage should be encouraged, it should not be required of advanced users.
It should be acceptable for advanced users to feel some pain on upgrading, but there should be a rather clear path for doing so (without a loss of functionality, and ideally without requiring custom patches on top of a released version of Lucene).

bq. Sorting is internal.

While sorting is provided by Lucene APIs, there is nothing (and should be nothing) stopping someone from performing sorting on their own terms via the Collector interface and their own priority queues/API.

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>
> Now that searching is done on a per segment basis, there is no way for a
> Scorer to know the "actual" doc id for the documents it matches (only the
> relative doc offset into the segment).
> If using caches in your scorer that are based on the "entire" index (all
> segments), there is now no way to index into them properly from inside a
> Scorer, because the scorer is not passed the needed offset to calculate the
> "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> The abstract Weight class should have a constructor that takes this offset, as
> well as a method to get the offset.
> All Weights that have "sub" weights must pass this offset down to created
> "sub" weights.
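
To make the offset arithmetic in the issue description concrete, here is a minimal sketch in plain Java. It does not use any Lucene classes, and the names (DocBaseSketch, docBases, lookup) are hypothetical, made up for illustration only. It assumes the usual multi-reader numbering, where each sub reader's starting offset is the sum of maxDoc over the sub readers before it, and shows how a scorer that was handed that offset could index into a cache built over the entire index.

```java
import java.util.Arrays;

/** Hypothetical illustration only; none of these names exist in Lucene. */
public class DocBaseSketch {

    /** Computes each segment's starting index-wide docid from the per-segment maxDoc values. */
    static int[] docBases(int[] segmentMaxDocs) {
        int[] bases = new int[segmentMaxDocs.length];
        int next = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            bases[i] = next;            // first "real" docid of segment i
            next += segmentMaxDocs[i];  // the next segment starts after this one
        }
        return bases;
    }

    /** Reads an index-wide cache using a segment-relative docid plus that segment's offset. */
    static float lookup(float[] indexWideCache, int docBase, int segmentDoc) {
        return indexWideCache[docBase + segmentDoc];
    }

    public static void main(String[] args) {
        int[] maxDocs = {3, 2, 4};                 // three segments, nine docs in total
        int[] bases = docBases(maxDocs);
        float[] cache = new float[9];              // stand-in for a cache over the whole index
        for (int i = 0; i < cache.length; i++) {
            cache[i] = i * 10f;
        }
        System.out.println(Arrays.toString(bases));      // prints [0, 3, 5]
        System.out.println(lookup(cache, bases[1], 1));  // segment 1, doc 1 is real doc 4: prints 40.0
    }
}
```

Without the offset, a scorer for the second segment would read cache[1] instead of cache[4], which is the failure the issue describes.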