[jira] Commented: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"

Mark Miller (JIRA) Tue, 18 Aug 2009 19:27:41 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744847#action_12744847
 ]


Mark Miller commented on LUCENE-1821:
-------------------------------------

The "internal" vs "external" distinction uses confusing, made-up terms - my 
fault really.

When I think of using the ids 'internally' I'm thinking that you are taking the 
index reader and making no assumptions. You just use the single reader and its 
id space. You can use those ids to get values, and you can map from those ids 
to values.

The assumption being made here is that you can load up ords for every doc and 
that these ords will be comparable across the whole index: any two documents 
with the same value for a field map to the same ord, whichever segment they 
live in. Nothing in the API promised that to my knowledge - it just happened 
to be a happy side effect.
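
To illustrate why per-segment ords are not comparable, here is a minimal 
sketch in plain Java. It does not use any Lucene API; the class SegmentOrds 
and the method ords() are hypothetical names, and each "segment" is assumed 
to be just an array of field values indexed by doc id.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class SegmentOrds {
    /**
     * Hypothetical helper: gives each doc the rank of its value among the
     * distinct, sorted values of ONE id space (one segment or one top reader).
     */
    static int[] ords(String[] values) {
        String[] sorted = new TreeSet<>(Arrays.asList(values)).toArray(new String[0]);
        int[] ords = new int[values.length];
        for (int doc = 0; doc < values.length; doc++) {
            // Every value is present in 'sorted', so the search result is its rank.
            ords[doc] = Arrays.binarySearch(sorted, values[doc]);
        }
        return ords;
    }

    public static void main(String[] args) {
        String[] seg0 = {"apple", "cherry"};
        String[] seg1 = {"banana", "cherry"};
        String[] whole = {"apple", "cherry", "banana", "cherry"};

        // Computed per segment, "apple" and "banana" both get ord 0.
        System.out.println(Arrays.toString(ords(seg0)));
        System.out.println(Arrays.toString(ords(seg1)));
        // Computed over one id space, equal values share an ord and
        // different values never do.
        System.out.println(Arrays.toString(ords(whole)));
    }
}
```

Comparing ords from seg0 against ords from seg1 is only meaningful after 
mapping back to the values themselves, or after loading the ords from a 
single reader that spans both segments.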

bq. While sorting is provided by lucene APIs, there is nothing (and should be 
nothing) stopping someone from performing sorting on their own terms via the 
Collector interface and their own priority queues/API
 
Indeed - just like there is nothing stopping you from continuing to use a 
MultiReader for this functionality.

What I mean by "sorting is internal" is that we specifically support comparing 
ords/values across readers. In other cases, I think we would prefer that you 
don't count on ids coming from either the top reader or a sub reader. We don't 
promise one way or another. We just give you a reader and say work with this 
reader.

Experts can generally work around that if they need to - Solr does a bit of 
this - or you can choose to continue using MultiReaders.
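
As a rough sketch of what working around it involves: a top-level id is the 
segment-relative id plus that segment's starting offset (its doc base). The 
code below is plain Java with hypothetical names (DocBases, docBases(), 
topLevelId()) and assumes you know each sub reader's maxDoc in order. In 2.9 
the Collector API already supplies a docBase alongside each reader via 
setNextReader(), so this arithmetic is only needed where that value is not 
available.

```java
import java.util.Arrays;

public class DocBases {
    /** Starting offset of each segment in the concatenated (top-level) id space. */
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int base = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = base;      // the first doc of segment i lives here
            base += maxDocs[i];   // the next segment starts after this one
        }
        return bases;
    }

    /** Rebases a segment-relative doc id onto the top-level id space. */
    static int topLevelId(int docBase, int segmentDoc) {
        return docBase + segmentDoc;
    }

    public static void main(String[] args) {
        int[] maxDocs = {5, 3, 4};          // assumed sizes of three segments
        int[] bases = docBases(maxDocs);
        System.out.println(Arrays.toString(bases));
        // Doc 2 of the second segment, expressed as a top-level id.
        System.out.println(topLevelId(bases[1], 2));
    }
}
```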

I'm not saying we should make it impossible for you to do this - but I don't 
think we should open a path for scorers to reconstruct multi-reader virtual 
ids. I don't think a Scorer should know or care what type of IndexReader it is 
working with.

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>
> Now that searching is done on a per segment basis, there is no way for a 
> Scorer to know the "actual" doc id for the documents it matches (only the 
> relative doc offset into the segment)
> If using caches in your scorer that are based on the "entire" index (all 
> segments), there is now no way to index into them properly from inside a 
> Scorer because the scorer is not passed the needed offset to calculate the 
> "real" docid
> Suggest having the Weight.scorer() method also take an integer for the doc offset
> Abstract Weight class should have a constructor that takes this offset as 
> well as a method to get the offset
> All Weights that have "sub" weights must pass this offset down to created 
> "sub" weights

