[
https://issues.apache.org/jira/browse/LUCENE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated LUCENE-7500:
---------------------------------
Attachment: LUCENE_7500_Remove_LeafReader_fields.patch
LUCENE_7500_avoid_leafReader_fields.patch
I updated both patches. I expanded the "avoid" patch to also avoid calling
{{MultiFields.getFields(IndexReader)}} where
{{MultiFields.getTerms(IndexReader,String)}} is an obvious replacement. I also
updated a few places in Solr that could use the pre-existing instance of
MultiFields indirectly via Solr's SlowCompositeReaderWrapper.
The main "remove" patch has additional changes:
* Enhanced {{MultiFields.getTerms(IndexReader,String)}} to do its job directly
instead of being implemented as {{getFields(r).terms(field)}}. In addition to
avoiding getFields, this should be faster for any callers as there's no
iteration of FieldInfos for each leaf going on.
* {{WeightedSpanTermExtractor}} has an inner class DelegatingLeafReader that
explicitly overrode getFieldInfos to throw an UnsupportedOperationException.
This is problematic (it caused a TestSurroundQueryParser failure); I think it's
fine to let it delegate. The failure was identified before this round of
changes, and the exact circumstances will probably not recur now that
SrndTermQuery's call to MultiFields.getTerms no longer relies on getFields and
thus no longer examines FieldInfos. Nonetheless, I think simply delegating
getFieldInfos is best.
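As a rough illustration of the first bullet (this is not code from either
patch), the sketch below contrasts a lookup that first builds a merged view of
every field with one that asks each leaf for a single field directly. The names
{{DirectTermsLookupSketch}}, {{Leaf}}, {{termsViaAllFields}} and
{{termsDirect}} are hypothetical stand-ins that only model the idea with plain
collections; they are not Lucene classes, and real {{Terms}} merging is more
involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class DirectTermsLookupSketch {

    /** Hypothetical stand-in for one index segment; NOT Lucene's LeafReader. */
    public static final class Leaf {
        private final Map<String, List<String>> termsByField;
        /** Counts how often this leaf's field list was enumerated. */
        public int fieldScans = 0;

        public Leaf(Map<String, List<String>> termsByField) {
            this.termsByField = termsByField;
        }

        /** Direct per-field lookup; returns null when the leaf lacks the field. */
        public List<String> terms(String field) {
            return termsByField.get(field);
        }

        /** Enumerates all field names, as a merged all-fields view has to. */
        public Iterable<String> fieldNames() {
            fieldScans++;
            return termsByField.keySet();
        }
    }

    /** Models getFields(r).terms(field): merge every field, then pick one. */
    public static List<String> termsViaAllFields(List<Leaf> leaves, String field) {
        Map<String, TreeSet<String>> merged = new TreeMap<>();
        for (Leaf leaf : leaves) {
            for (String name : leaf.fieldNames()) {
                merged.computeIfAbsent(name, k -> new TreeSet<>()).addAll(leaf.terms(name));
            }
        }
        TreeSet<String> terms = merged.get(field);
        return terms == null ? null : new ArrayList<>(terms);
    }

    /** Models the direct approach: ask each leaf only for the requested field. */
    public static List<String> termsDirect(List<Leaf> leaves, String field) {
        TreeSet<String> merged = null;
        for (Leaf leaf : leaves) {
            List<String> terms = leaf.terms(field);
            if (terms == null) {
                continue; // this leaf has no such field
            }
            if (merged == null) {
                merged = new TreeSet<>();
            }
            merged.addAll(terms);
        }
        return merged == null ? null : new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<Leaf> leaves = List.of(
            new Leaf(Map.of("body", List.of("lucene", "solr"))),
            new Leaf(Map.of("body", List.of("apache", "lucene"),
                            "title", List.of("jira"))));
        System.out.println(termsDirect(leaves, "body"));
        System.out.println(termsViaAllFields(leaves, "body"));
    }
}
```

Both methods return the same merged terms for a field, but only
{{termsViaAllFields}} enumerates each leaf's field list, which is the per-leaf
iteration the direct implementation is meant to avoid.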
> Nuke Fields.java in lieu of LeafReader.getTerms(fieldName)
> ----------------------------------------------------------
>
> Key: LUCENE-7500
> URL: https://issues.apache.org/jira/browse/LUCENE-7500
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: master (7.0)
>
> Attachments: LUCENE_7500_avoid_leafReader_fields.patch,
> LUCENE_7500_avoid_leafReader_fields.patch,
> LUCENE_7500_Remove_LeafReader_fields.patch,
> LUCENE_7500_Remove_LeafReader_fields.patch
>
>
> {{Fields}} seems like a pointless intermediary between the {{LeafReader}} and
> {{Terms}}. Why not have {{LeafReader.getTerms(fieldName)}} instead? One loses
> the ability to get the count and iterate over indexed fields, but it's not
> clear what real use-cases are for that and such rare needs could figure that
> out with FieldInfos.
> [~mikemccand] pointed out that we'd probably need to re-introduce a
> {{TermVectors}} class since TV's are row-oriented not column-oriented. IMO
> they should be column-oriented but that'd be a separate issue.
> _(p.s. I'm lacking time to do this w/i the next couple months so if someone
> else wants to tackle it then great)_