[jira] [Updated] (LUCENE-6766) Make index sorting a first-class citizen

2016-05-10 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-6766:
---
Attachment: LUCENE-6766.patch

I think this is ready ... here's the current patch against master.

I still need to run "first do no harm" indexing performance tests to
make sure there is not too much of a hit when indexing without an
index sort.

I don't plan to rush this in for 6.1 ... I'll commit to master, and
after we release 6.1 (Soon I think?: so many geo improvements!), I
plan to backport for 6.2.


> Make index sorting a first-class citizen
> 
>
> Key: LUCENE-6766
> URL: https://issues.apache.org/jira/browse/LUCENE-6766
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-6766.patch, LUCENE-6766.patch, LUCENE-6766.patch
>
>
> Today index sorting is a very expert feature. You need to use a custom merge 
> policy, custom collectors, etc. I would like to explore making it a 
> first-class citizen so that:
>  - the sort order could be configured on IndexWriterConfig
>  - segments would record the sort order that was used to write them
>  - IndexSearcher could automatically early terminate when computing top docs 
> on a sort order that is a prefix of the sort order of a segment (and if the 
> user is not interested in totalHits).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6766) Make index sorting a first-class citizen

2016-05-06 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-6766:
---
Attachment: LUCENE-6766.patch

Here's the current patch (generated from {{diffSources.py}})...

> Make index sorting a first-class citizen
> 
>
> Key: LUCENE-6766
> URL: https://issues.apache.org/jira/browse/LUCENE-6766
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-6766.patch, LUCENE-6766.patch
>
>
> Today index sorting is a very expert feature. You need to use a custom merge 
> policy, custom collectors, etc. I would like to explore making it a 
> first-class citizen so that:
>  - the sort order could be configured on IndexWriterConfig
>  - segments would record the sort order that was used to write them
>  - IndexSearcher could automatically early terminate when computing top docs 
> on a sort order that is a prefix of the sort order of a segment (and if the 
> user is not interested in totalHits).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6766) Make index sorting a first-class citizen

2015-09-25 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6766:
-
Attachment: LUCENE-6766.patch

Here is a first prototype that:
 - moves sorting logic from misc to core
 - removes SortingMergePolicy
 - adds an "indexSort" parameter to IndexWriterConfig and SegmentInfo, with 
null meaning that the index order is unspecified
 - SimpleTextCodec (de)serializes this indexOrder parameter, other codecs 
ignore it for now
 - refactors a bit the doc ID remapping logic in IndexWriter when there have 
been deletions while some segments were being merged

Open question: how should we serialize the SortField objects? Should we have a 
fixed list of supported SortField parameters or should we allow SortField 
parameters to serialize themselves?

There are lots of things we could do on the search side, but for now I'd like 
to focus on the indexing side and making sure the sort order of segments is 
easily accessible.

> Make index sorting a first-class citizen
> 
>
> Key: LUCENE-6766
> URL: https://issues.apache.org/jira/browse/LUCENE-6766
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-6766.patch
>
>
> Today index sorting is a very expert feature. You need to use a custom merge 
> policy, custom collectors, etc. I would like to explore making it a 
> first-class citizen so that:
>  - the sort order could be configured on IndexWriterConfig
>  - segments would record the sort order that was used to write them
>  - IndexSearcher could automatically early terminate when computing top docs 
> on a sort order that is a prefix of the sort order of a segment (and if the 
> user is not interested in totalHits).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org