[ 
https://issues.apache.org/jira/browse/LUCENE-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591699#comment-13591699
 ] 

Shai Erera commented on LUCENE-3918:
------------------------------------

bq. Then lets do this! (the right way, in memory during indexing before it hits 
the disk, not re-ordering existing on-disk segments after the fact)

This issue is about porting the previous IndexSorter implementation to the trunk 
API. The previous one offered a one-time sort of an index, and so does this one. 
While that doesn't mean we shouldn't explore alternatives, I find it much 
lower-hanging fruit than LUCENE-4752, especially as no one has assigned that 
issue to themselves yet, nor does it look like any progress has been made. If 
LUCENE-4752 eventually sees the light of day, I don't mind if we nuke 
IndexSorter completely (in favor of a SortingCodec, I guess?), but until then, 
I think that offering users *A* way to sort their index is valuable too.

Also, it's not clear to me at the moment (but I admit I haven't thought about 
it much) how you can sort documents during indexing while the values to be 
sorted by may still be unknown. I.e., what if your sort key is a 
NumericDocValues field which the Codec hasn't seen yet? How should it write 
posting lists, stored fields etc.? Does this mean the Codec must cache the 
entire to-be-written segment in RAM? That would consume much more RAM than the 
approach in this issue ...
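
To make the ordering problem concrete, here is a minimal sketch in plain Java of how a one-time sort could derive a document permutation once every sort value is known. It is only an illustration under stated assumptions: the class DocIdPermutation, the method oldToNew and the long[] of per-document sort values are hypothetical names, not Lucene APIs and not the code in the attached patches.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of Lucene or of the LUCENE-3918 patch.
public class DocIdPermutation {

    /**
     * Given one sort value per existing doc ID, returns a map where
     * oldToNew[oldDocId] is the doc ID that document gets after sorting.
     * Every value must already be known, which is why this fits an offline,
     * one-time sort over an existing index.
     */
    static int[] oldToNew(long[] sortValues) {
        int n = sortValues.length;

        // newToOld[newDocId] = oldDocId, starting from the identity order.
        Integer[] newToOld = new Integer[n];
        for (int i = 0; i < n; i++) {
            newToOld[i] = i;
        }

        // Arrays.sort on object arrays is stable, so documents with equal
        // sort values keep their original relative order.
        Arrays.sort(newToOld, (a, b) -> Long.compare(sortValues[a], sortValues[b]));

        // Invert the permutation so it can be looked up by old doc ID.
        int[] oldToNew = new int[n];
        for (int newId = 0; newId < n; newId++) {
            oldToNew[newToOld[newId]] = newId;
        }
        return oldToNew;
    }
}
```

For sort values {30, 10, 20, 10} this yields the map {3, 0, 2, 1}. The point of the sketch is that the map cannot be built until the last document's value has been seen, so anything written in sorted order (postings, stored fields) has to wait until then.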

I think that online sorting is much more powerful than a one-time sort, but 
there's work to do to make it happen, and to make it efficient. Therefore, 
until then, I think that we should proceed with this offline sorting strategy, 
which is better than nothing.
                
> Port index sorter to trunk APIs
> -------------------------------
>
>                 Key: LUCENE-3918
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3918
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: modules/other
>    Affects Versions: 4.0-ALPHA
>            Reporter: Robert Muir
>             Fix For: 4.2, 5.0
>
>         Attachments: LUCENE-3918.patch, LUCENE-3918.patch, LUCENE-3918.patch
>
>
> LUCENE-2482 added an IndexSorter to 3.x, but we need to port this
> functionality to 4.0 apis.

