[
https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744847#action_12744847
]
Mark Miller commented on LUCENE-1821:
-------------------------------------
The "internal" vs "external" distinction relies on confusing, made-up terms - my
fault really.
When I think of using the ids 'internally' I'm thinking that you are taking the
index reader and making no assumptions. You just use the single reader and its
id space. You can use those ids to get values, and you can map from those ids
to values.
The assumption being made here is that you can load up ords for every doc and
that these ords will be comparable in a way that every document id across the
whole index maps to the same ord if it has the same value for a field. Nothing
in the API promised that to my knowledge - it just happened to be a happy side
effect.
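To make the ord point concrete, here is a toy sketch in plain Java. SegmentOrds is a hypothetical stand-in that I made up for illustration, not a Lucene class, and it assumes one string value per doc. It only shows that ords computed independently per segment need not agree for the same value:

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical per-segment ordinal table; an illustration only, not Lucene code. */
class SegmentOrds {
    final String[] lookup; // sorted unique values in this segment; array index = ord
    final int[] order;     // order[doc] = ord of the value held by that segment-relative doc

    SegmentOrds(String[] valuesByDoc) {
        // Ords are assigned from this segment's own sorted value set only.
        lookup = new TreeSet<>(Arrays.asList(valuesByDoc)).toArray(new String[0]);
        order = new int[valuesByDoc.length];
        for (int doc = 0; doc < valuesByDoc.length; doc++) {
            order[doc] = Arrays.binarySearch(lookup, valuesByDoc[doc]);
        }
    }

    int ord(int doc) { return order[doc]; }

    String value(int doc) { return lookup[order[doc]]; }
}

class OrdDemo {
    public static void main(String[] args) {
        SegmentOrds a = new SegmentOrds(new String[] {"apple", "banana"});
        SegmentOrds b = new SegmentOrds(new String[] {"banana", "cherry"});

        // Same value, different ords: "banana" is ord 1 in segment a but ord 0 in segment b.
        System.out.println(a.value(1) + " -> ord " + a.ord(1));
        System.out.println(b.value(0) + " -> ord " + b.ord(0));

        // Comparing across segments is only safe on the values themselves.
        System.out.println(a.value(1).equals(b.value(0)));
    }
}
```

With a single reader over the whole index there is only one such table, which is why the ords happened to line up before.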
bq. While sorting is provided by lucene APIs, there is nothing (and should be
nothing) stopping someone from performing sorting on their own terms via the
Collector interface and their own priority queues/API
Indeed - just like there is nothing stopping you from continuing to use a
MultiReader for this functionality.
What I mean by sorting being "internal" is that we specifically support
comparing ords/values across readers. In other cases, I think we would prefer
that you don't count on ids coming from either the top reader or a sub reader.
We don't promise one way or the other. We just give you a reader and say work
with this reader.
Experts can generally work around that if they need to - Solr does a bit of
this - or you can choose to continue using MultiReaders.
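For anyone who does need top-level ids, the arithmetic is just "segment start offset plus segment-relative id". Below is a rough sketch in plain Java, loosely modeled on the doc base a collector is handed when it moves to the next segment. DocBaseDemo and RebasingCollector are hypothetical names of mine, and the per-segment sizes are made-up numbers:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: rebasing segment-relative doc ids to top-level ids. */
class DocBaseDemo {

    /** Start offset of each segment, given each segment's doc count in index order. */
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int base = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = base;     // segment i starts where the previous ones end
            base += maxDocs[i];
        }
        return bases;
    }

    /** Hypothetical collector that remembers the current segment's offset. */
    static class RebasingCollector {
        private int docBase;
        final List<Integer> topLevelIds = new ArrayList<>();

        /** Called once before each segment is searched. */
        void setNextReader(int docBase) { this.docBase = docBase; }

        /** Called with a segment-relative doc id for every hit. */
        void collect(int segmentDoc) { topLevelIds.add(docBase + segmentDoc); }
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {3, 5, 2}); // offsets 0, 3, 8

        RebasingCollector collector = new RebasingCollector();
        collector.setNextReader(bases[1]);
        collector.collect(2);                        // top-level id 3 + 2 = 5
        collector.setNextReader(bases[2]);
        collector.collect(0);                        // top-level id 8 + 0 = 8

        System.out.println(collector.topLevelIds);
    }
}
```

The offset lives in the collecting code here, so nothing below it has to know whether it is looking at a segment or the whole index.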
I'm not saying we should make it impossible for you to do this - but I don't
think we should open a path for scorers to reconstruct multi-reader virtual
ids. I don't think a Scorer should know or care what type of IndexReader it is
working with.
> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
> Key: LUCENE-1821
> URL: https://issues.apache.org/jira/browse/LUCENE-1821
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Affects Versions: 2.9
> Reporter: Tim Smith
>
> Now that searching is done on a per segment basis, there is no way for a
> Scorer to know the "actual" doc id for the documents it matches (only the
> relative doc offset into the segment)
> If using caches in your scorer that are based on the "entire" index (all
> segments), there is now no way to index into them properly from inside a
> Scorer because the scorer is not passed the needed offset to calculate the
> "real" docid
> Suggest having the Weight.scorer() method also take an integer for the doc offset
> Abstract Weight class should have a constructor that takes this offset as
> well as a method to get the offset
> All Weights that have "sub" weights must pass this offset down to created
> "sub" weights