[ https://issues.apache.org/jira/browse/LUCENE-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-6766:
---------------------------------
Attachment: LUCENE-6766.patch
Here is a first prototype that:
- moves sorting logic from misc to core
- removes SortingMergePolicy
- adds an "indexSort" parameter to IndexWriterConfig and SegmentInfo, with
null meaning that the index order is unspecified
- SimpleTextCodec (de)serializes this indexSort parameter; other codecs
ignore it for now
- slightly refactors the doc ID remapping logic in IndexWriter for the case
where deletions arrive while some segments are being merged
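To make the last point concrete, here is a minimal sketch of what remapping doc IDs means once documents are reordered by a sort. It is not the code from the patch: DocMapSketch, sortedDocMap and remapDeletes are hypothetical names, and the sort key is assumed to be a single long value per document.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch; these are not Lucene classes and not the patch's code. */
final class DocMapSketch {

  /**
   * Returns oldToNew, where oldToNew[oldDocId] is the doc ID the document
   * gets after sorting by sortValues in ascending order.
   */
  static int[] sortedDocMap(long[] sortValues) {
    Integer[] newToOld = new Integer[sortValues.length];
    for (int i = 0; i < newToOld.length; i++) {
      newToOld[i] = i;
    }
    // Arrays.sort on object arrays is stable, so ties keep their original order.
    Arrays.sort(newToOld, (a, b) -> Long.compare(sortValues[a], sortValues[b]));

    int[] oldToNew = new int[sortValues.length];
    for (int newDoc = 0; newDoc < newToOld.length; newDoc++) {
      oldToNew[newToOld[newDoc]] = newDoc;
    }
    return oldToNew;
  }

  /** Translates deletions recorded against old doc IDs into the sorted doc ID space. */
  static BitSet remapDeletes(BitSet deletedOld, int[] oldToNew) {
    BitSet deletedNew = new BitSet(oldToNew.length);
    for (int old = deletedOld.nextSetBit(0); old >= 0; old = deletedOld.nextSetBit(old + 1)) {
      deletedNew.set(oldToNew[old]);
    }
    return deletedNew;
  }
}
```

With sort values {30, 10, 20} the map is {2, 0, 1}, so a deletion that arrives for old doc 0 during the merge has to be applied to doc 2 of the sorted segment.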
Open question: how should we serialize the SortField objects? Should we have a
fixed list of supported SortField parameters or should we allow SortField
parameters to serialize themselves?
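To illustrate the first option (a fixed list of supported parameters), here is a rough sketch of a text encoding. Everything in it is an assumption made for illustration: SortSpecSketch, FieldSpec, the three types and the "field:TYPE:asc|desc" format are invented here and are not what the patch or SimpleTextCodec does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a fixed-list text encoding for an index sort; not a Lucene API. */
final class SortSpecSketch {

  /** The fixed list of supported sort types (an assumption for illustration). */
  enum Type { LONG, DOUBLE, STRING }

  /** One sort criterion: field name, value type and direction. */
  static final class FieldSpec {
    final String field;
    final Type type;
    final boolean reverse;

    FieldSpec(String field, Type type, boolean reverse) {
      // Keep the format unambiguous by rejecting the two separator characters.
      if (field.isEmpty() || field.indexOf(':') >= 0 || field.indexOf(',') >= 0) {
        throw new IllegalArgumentException("unsupported field name: " + field);
      }
      this.field = field;
      this.type = type;
      this.reverse = reverse;
    }
  }

  /** Encodes a sort; null (index order unspecified) becomes the empty string. */
  static String encode(List<FieldSpec> sort) {
    if (sort == null) {
      return "";
    }
    if (sort.isEmpty()) {
      throw new IllegalArgumentException("a sort needs at least one field");
    }
    StringBuilder sb = new StringBuilder();
    for (FieldSpec f : sort) {
      if (sb.length() > 0) {
        sb.append(',');
      }
      sb.append(f.field).append(':').append(f.type.name())
          .append(':').append(f.reverse ? "desc" : "asc");
    }
    return sb.toString();
  }

  /** Decodes a sort; the empty string maps back to null. Unknown types are rejected. */
  static List<FieldSpec> decode(String encoded) {
    if (encoded.isEmpty()) {
      return null;
    }
    List<FieldSpec> sort = new ArrayList<>();
    for (String part : encoded.split(",")) {
      String[] p = part.split(":");
      if (p.length != 3 || !(p[2].equals("asc") || p[2].equals("desc"))) {
        throw new IllegalArgumentException("malformed sort entry: " + part);
      }
      // Type.valueOf throws IllegalArgumentException for anything outside the fixed list.
      sort.add(new FieldSpec(p[0], Type.valueOf(p[1]), p[2].equals("desc")));
    }
    return sort;
  }
}
```

The trade-off is visible in decode: a fixed list keeps the reader simple and lets it reject unknown types up front, but a custom SortField cannot be represented at all, which is what the second option would address.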
There are lots of things we could do on the search side, but for now I'd like
to focus on the indexing side and making sure the sort order of segments is
easily accessible.
> Make index sorting a first-class citizen
> ----------------------------------------
>
> Key: LUCENE-6766
> URL: https://issues.apache.org/jira/browse/LUCENE-6766
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-6766.patch
>
>
> Today index sorting is a very expert feature. You need to use a custom merge
> policy, custom collectors, etc. I would like to explore making it a
> first-class citizen so that:
> - the sort order could be configured on IndexWriterConfig
> - segments would record the sort order that was used to write them
> - IndexSearcher could automatically early terminate when computing top docs
> on a sort order that is a prefix of the sort order of a segment (and if the
> user is not interested in totalHits).
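The prefix condition in the last bullet of the quoted description can be sketched as follows. This is only an illustration of the check, not IndexSearcher code: SortPrefixSketch and canEarlyTerminate are hypothetical names, and each sort criterion is assumed to be reduced to a comparable string such as "timestamp:desc".

```java
import java.util.List;

/** Hypothetical sketch of the early-termination precondition; not a Lucene API. */
final class SortPrefixSketch {

  /**
   * Returns true when collection on a segment could stop after the first top-N
   * hits: the query sort must be a non-empty prefix of the segment's sort, and
   * the caller must not need an exact total hit count.
   *
   * @param segmentSort the sort recorded for the segment, or null if unspecified
   */
  static boolean canEarlyTerminate(
      List<String> querySort, List<String> segmentSort, boolean needsTotalHits) {
    if (needsTotalHits || segmentSort == null) {
      return false;
    }
    if (querySort.isEmpty() || querySort.size() > segmentSort.size()) {
      return false;
    }
    // Prefix check: the leading criteria must match in order, direction included.
    return segmentSort.subList(0, querySort.size()).equals(querySort);
  }
}
```

For a segment sorted by ["timestamp:desc", "id:asc"], a query sorting on ["timestamp:desc"] qualifies, while ["id:asc"] or ["timestamp:asc"] does not.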