[ 
https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744833#action_12744833
 ] 

Tim Smith commented on LUCENE-1821:
-----------------------------------

bq. The goal is to move all caches to the segment level in Lucene - we don't 
want to encourage users to cache per multi-reader by providing API help to do 
so.
I agree that this is the goal, and that per-segment caches should be the 
encouraged route for field caching needs. 
I plan to update the vast majority of the caches I use to be loaded on a 
per-segment basis once I switch to 2.9, to take advantage of this.
But it should still be possible for advanced users to do caching at the 
multireader level. This may require porting for subsequent versions of Lucene 
(as I'm seeing I will have to for 2.9), but it should remain possible.

bq. If you need index wide stats, you use the Weight.
I'm currently using Weight to get this cache at the multireader level; however, 
with 2.9 I will have to jump through some more hoops in order to use this 
cache in each sub reader's scorer.
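
As a rough illustration of the hoop in question: a cache built over the whole 
index is indexed by the index-wide doc id, while a per-segment scorer only sees 
segment-relative ids, so the scorer needs the segment's starting offset (doc 
base). The sketch below is plain Java, not Lucene API; DocBaseSketch, docBases 
and globalDocId are hypothetical names, and the segment sizes are made up.

```java
// Hypothetical sketch, not Lucene API: all names here are illustrative only.
final class DocBaseSketch {

    /** Starting offset (doc base) of each segment, given each segment's maxDoc. */
    static int[] docBases(int[] segmentMaxDocs) {
        int[] bases = new int[segmentMaxDocs.length];
        int next = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            bases[i] = next;            // first index-wide id owned by segment i
            next += segmentMaxDocs[i];
        }
        return bases;
    }

    /** Index-wide doc id for a segment-relative doc id. */
    static int globalDocId(int docBase, int segmentDocId) {
        return docBase + segmentDocId;
    }

    public static void main(String[] args) {
        // Assumed layout: three segments holding 5, 3 and 4 documents.
        int[] bases = docBases(new int[] {5, 3, 4});

        // An index-wide cache with one slot per document across all segments.
        float[] indexWideCache = new float[12];

        // A scorer working on the third segment sees doc 1; without the doc
        // base it cannot tell which slot of the index-wide cache that is.
        int slot = globalDocId(bases[2], 1);
        indexWideCache[slot] = 2.5f;
        System.out.println("slot=" + slot + " value=" + indexWideCache[slot]);
    }
}
```

The addition itself is trivial; the difficulty described in this issue is that 
the scorer is never handed the doc base in the first place.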

bq. You are trying to use the internal ids externally
All my usage of "internal" docids occurs inside Weight, Scorer, and 
HitCollector implementations. I don't see how this is really "external", as it 
is using published interfaces. It's just that the interpretation of these 
interfaces changed for 2.9 (I have no problem with this as long as I can port 
from 2.4 with minimal to moderate effort). The only reason they were able to 
change was that no implementations provided by vanilla Lucene or in contrib 
required the "holistic" view of the index.

bq. The FieldCache is the caching mechanism that Lucene supports with internal 
ids - and it supports it per segment.
The FieldCache mechanism did not meet all my needs with regard to 
schema/retention policy/etc., so I have been doing caching in my own code base 
for quite some time. While FieldCache usage should be encouraged, it should 
not be required of advanced users. It should be acceptable for advanced users 
to feel some pain on upgrading, but there should be a reasonably clear path for 
doing so (without a loss of functionality, and ideally without requiring custom 
patches on top of a released version of Lucene).

bq. Sorting is internal.
While sorting is provided by Lucene APIs, there is nothing (and should be 
nothing) stopping someone from performing sorting on their own terms via the 
Collector interface and their own priority queues/API.
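
A minimal sketch of what such custom sorting might look like, assuming a sort 
key cached per index-wide doc id. TopNByKey is a hypothetical class, not part 
of Lucene; its setNextSegment(docBase) only mirrors the role of the docBase 
argument that Collector.setNextReader receives in 2.9, and the key array is 
illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: this class only mimics the shape of a collector.
final class TopNByKey {
    private final int n;
    private final long[] sortKeys;            // assumed cache, indexed by index-wide doc id
    private final PriorityQueue<Integer> queue;
    private int docBase;

    TopNByKey(int n, long[] sortKeys) {
        this.n = n;
        this.sortKeys = sortKeys;
        // Min-heap on the sort key, so the weakest retained hit is evicted first.
        this.queue = new PriorityQueue<>((a, b) -> Long.compare(sortKeys[a], sortKeys[b]));
    }

    /** Called when collection moves on to the next segment. */
    void setNextSegment(int docBase) {
        this.docBase = docBase;
    }

    /** Called once per matching document, with a segment-relative id. */
    void collect(int segmentDocId) {
        queue.add(docBase + segmentDocId);    // store the index-wide id
        if (queue.size() > n) {
            queue.poll();                     // drop the hit with the smallest key
        }
    }

    /** Retained index-wide doc ids, largest sort key first. */
    List<Integer> topDocs() {
        List<Integer> out = new ArrayList<>(queue);
        out.sort((a, b) -> Long.compare(sortKeys[b], sortKeys[a]));
        return out;
    }
}
```

Because the queue holds index-wide ids, the sort keys can come from any cache 
the application maintains, which is the kind of use the comment above has in 
mind.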



> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>
> Now that searching is done on a per-segment basis, there is no way for a 
> Scorer to know the "actual" doc id for the documents it matches (only the 
> relative doc offset into the segment).
> If using caches in your scorer that are based on the "entire" index (all 
> segments), there is now no way to index into them properly from inside a 
> Scorer because the scorer is not passed the needed offset to calculate the 
> "real" docid
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> Abstract Weight class should have a constructor that takes this offset as 
> well as a method to get the offset
> All Weights that have "sub" weights must pass this offset down to created 
> "sub" weights
