[ https://issues.apache.org/jira/browse/LUCENE-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-7396:
---------------------------------
    Attachment: LUCENE-7396.patch

Here is a patch that uses a different approach. Flush passes a special 
implementation of a PointsReader that allows points to be reordered, so that 
each codec can sort the points in whatever order it needs. Compared to the 
previous patch, this one is no longer specific to a codec, and it also works 
in the multi-dimensional case. I got the following flush times (as reported by 
the IndexWriter log) with a 1GB buffer:

||Flush time (ms)||master||patch||
|IndexAndSearchOpenStreetMaps1D (1 dim)|31089|18954 ({color:green}-39.0%{color})|
|IndexAndSearchOpenStreetMaps (2 dims)|123461|85235 ({color:green}-30.1%{color})|

This looks encouraging, especially given that it also uses less memory than the 
current approach. However, the patch is a bit disappointing in that it has two 
completely different implementations of the tree writing, depending on whether 
the input can be reordered or not. I'll look into whether I can clean this up 
a bit.
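
To illustrate the reordering idea, here is a minimal sketch of what a 
reorderable view over buffered 1D points could look like, and how a consumer 
could sort it using nothing but comparisons and swaps. All names below 
(ReorderablePoints, BufferedPoints, PointSorter) are hypothetical and are not 
taken from the patch; real points are packed byte arrays rather than longs, and 
heap sort is used here only because it needs no extra memory.

```java
/** Hypothetical view over buffered points whose entries can be swapped in place. */
interface ReorderablePoints {
    int size();
    long valueAt(int i);     // 1D value of the i-th point (assumption: a long)
    int docAt(int i);        // doc ID of the i-th point
    void swap(int i, int j); // exchange points i and j, keeping value and doc together
}

/** Hypothetical in-memory buffer, standing in for what flush would pass to the codec. */
final class BufferedPoints implements ReorderablePoints {
    private final long[] values;
    private final int[] docs;

    BufferedPoints(long[] values, int[] docs) {
        if (values.length != docs.length) {
            throw new IllegalArgumentException("values and docs must have the same length");
        }
        this.values = values;
        this.docs = docs;
    }

    @Override public int size() { return values.length; }
    @Override public long valueAt(int i) { return values[i]; }
    @Override public int docAt(int i) { return docs[i]; }

    @Override public void swap(int i, int j) {
        long v = values[i]; values[i] = values[j]; values[j] = v;
        int d = docs[i]; docs[i] = docs[j]; docs[j] = d;
    }
}

/** Codec side: sorts through the interface only, so no copy of the points is made. */
final class PointSorter {
    private PointSorter() {}

    /** In-place heap sort by value, with ties broken by doc ID. */
    static void sort(ReorderablePoints p) {
        int n = p.size();
        // Build a max-heap over all points.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(p, i, n);
        }
        // Repeatedly move the current maximum to the end of the unsorted range.
        for (int end = n - 1; end > 0; end--) {
            p.swap(0, end);
            siftDown(p, 0, end);
        }
    }

    private static void siftDown(ReorderablePoints p, int root, int end) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= end) {
                return;
            }
            // Pick the larger of the two children.
            if (child + 1 < end && compare(p, child, child + 1) < 0) {
                child++;
            }
            if (compare(p, root, child) >= 0) {
                return;
            }
            p.swap(root, child);
            root = child;
        }
    }

    private static int compare(ReorderablePoints p, int i, int j) {
        int c = Long.compare(p.valueAt(i), p.valueAt(j));
        return c != 0 ? c : Integer.compare(p.docAt(i), p.docAt(j));
    }
}
```

Because the sorter only sees the interface, the same buffer could be handed to 
any codec, and a multi-dimensional codec could apply a different comparison 
(for example, one dimension at a time) over the same swap primitive.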

> Speed up flush of 1-dimension points
> ------------------------------------
>
>                 Key: LUCENE-7396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7396.patch, LUCENE-7396.patch
>
>
> 1D points already have an optimized merge implementation which works when 
> points come in order. So maybe we could make IndexWriter's PointValuesWriter 
> sort before feeding the PointsFormat and somehow propagate the information to 
> the PointsFormat?
> The benefit is that flushing could directly stream points to disk with little 
> memory usage.



