[ https://issues.apache.org/jira/browse/LUCENE-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-7396:
---------------------------------
Attachment: LUCENE-7396.patch
Here is a patch that takes a different approach. Flush passes a special
implementation of a PointsReader that allows points to be reordered, so that
codecs can sort points in whatever order they need. Compared to the previous
patch, this approach is no longer specific to a single codec, and it also works
in the multi-dimensional case. I got the following flush times (as reported by
the IndexWriter log) with a 1GB buffer:
||Flush time (ms)||master||patch||
|IndexAndSearchOpenStreetMaps1D (1 dim)|31089|18954 ({color:green}-39.0%{color})|
|IndexAndSearchOpenStreetMaps (2 dims)|123461|85235 ({color:green}-30.1%{color})|
This looks encouraging, especially given that it also uses less memory than the
current approach. However, the patch is a bit disappointing in that the code
that writes the tree has a completely different implementation depending on
whether the input can be reordered. I'll look into whether I can clean this up
a bit.
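
As a rough illustration of the reordering idea described above (this is not
the code in the attached patch), here is a minimal Java sketch. The names
ReorderablePoints, BufferedPoints and sortByDim are hypothetical and are not
Lucene APIs. Points are simplified to long values, whereas Lucene stores
packed byte[] values. The point it tries to show is that a consumer can sort
the buffered points in place using only comparisons and swaps, so no second
copy of the data is needed.

```java
import java.util.Arrays;

/** Hypothetical sketch of reorderable buffered points; not Lucene's API or the patch's code. */
public class ReorderablePointsSketch {

    /** Assumed interface: a view over buffered points that a consumer may permute in place. */
    interface ReorderablePoints {
        int size();
        long value(int i, int dim);
        int docID(int i);
        void swap(int i, int j);
    }

    /** Buffered points held in parallel arrays: numDims values per point, plus one docID. */
    static final class BufferedPoints implements ReorderablePoints {
        final int numDims;
        final long[] values;
        final int[] docIDs;

        BufferedPoints(int numDims, long[] values, int[] docIDs) {
            this.numDims = numDims;
            this.values = values;
            this.docIDs = docIDs;
        }

        public int size() { return docIDs.length; }
        public long value(int i, int dim) { return values[i * numDims + dim]; }
        public int docID(int i) { return docIDs[i]; }

        public void swap(int i, int j) {
            int d = docIDs[i]; docIDs[i] = docIDs[j]; docIDs[j] = d;
            for (int k = 0; k < numDims; k++) {
                long v = values[i * numDims + k];
                values[i * numDims + k] = values[j * numDims + k];
                values[j * numDims + k] = v;
            }
        }
    }

    /** Orders points by the given dimension, breaking ties by docID. */
    static int compare(ReorderablePoints p, int dim, int i, int j) {
        int c = Long.compare(p.value(i, dim), p.value(j, dim));
        return c != 0 ? c : Integer.compare(p.docID(i), p.docID(j));
    }

    /** In-place heap sort that touches the points only through compare and swap. */
    static void sortByDim(ReorderablePoints p, int dim) {
        int n = p.size();
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(p, dim, i, n);
        }
        for (int end = n - 1; end > 0; end--) {
            p.swap(0, end);          // move the current maximum to its final slot
            siftDown(p, dim, 0, end);
        }
    }

    private static void siftDown(ReorderablePoints p, int dim, int root, int end) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= end) return;
            if (child + 1 < end && compare(p, dim, child, child + 1) < 0) child++;
            if (compare(p, dim, root, child) >= 0) return;
            p.swap(root, child);
            root = child;
        }
    }

    public static void main(String[] args) {
        // Four 2-dimensional points, laid out as (dim0, dim1) per point.
        BufferedPoints points = new BufferedPoints(
            2, new long[] {5, 1, 2, 9, 7, 3, 2, 4}, new int[] {0, 1, 2, 3});
        sortByDim(points, 0);
        System.out.println(Arrays.toString(points.docIDs));
        sortByDim(points, 1);
        System.out.println(Arrays.toString(points.docIDs));
    }
}
```

In the 1D case, a codec could call sortByDim(points, 0) once and then stream
the points to disk in order. In the multi-dimensional case it could re-sort
sub-ranges by whichever dimension it splits on; the sketch above sorts the
whole buffer only.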
> Speed up flush of 1-dimension points
> ------------------------------------
>
> Key: LUCENE-7396
> URL: https://issues.apache.org/jira/browse/LUCENE-7396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7396.patch, LUCENE-7396.patch
>
>
> 1D points already have an optimized merge implementation which works when
> points come in order. So maybe we could make IndexWriter's PointValuesWriter
> sort before feeding the PointsFormat and somehow propagate the information to
> the PointsFormat?
> The benefit is that flushing could directly stream points to disk with little
> memory usage.