[
https://issues.apache.org/jira/browse/LUCENE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763375#comment-16763375
]
Ignacio Vera commented on LUCENE-8687:
--------------------------------------
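I have opened a PR for this approach. The performance test shows a clear
improvement in indexing throughput: index time drops by 12-17% and force merge
time by 12-23%, while index size and reader heap are unchanged: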
||Approach||Index time (sec): Dev||Index time (sec): Base||Index time: Diff||Force merge time (sec): Dev||Force merge time (sec): Base||Force merge time: Diff||Index size (GB): Dev||Index size (GB): Base||Index size: Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff||
|points|148.4s|178.9s|-17%|73.6s|92.2s|-20%|0.55|0.55| 0%|1.57|1.57| 0%|
|shapes|246.6s|283.7s|-13%|148.5s|168.2s|-12%|1.29|1.29| 0%|1.61|1.61| 0%|
|geo3d|180.9s|204.8s|-12%|79.8s|103.7s|-23%|0.75|0.75| 0%|1.58|1.58| 0%|
> Optimise radix partitioning for points on heap
> ----------------------------------------------
>
> Key: LUCENE-8687
> URL: https://issues.apache.org/jira/browse/LUCENE-8687
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ignacio Vera
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> LUCENE-8673 introduced radix partitioning for merging segments. It currently
> works the same way whether the data is offline or on heap. When the data is
> on heap, it makes sense not to keep multiple copies but to always perform the
> partitioning in the same object, similar to what is done with
> `MutablePointValues`.
> This would also allow holding more points in memory.
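
The in-place idea described above can be illustrated with a minimal sketch. This
is not the code from the PR: `HeapPointBuffer` and all of its methods are
hypothetical names chosen for illustration, and the sketch assumes fixed-width
points packed into a single `byte[]` whose dimension values compare as unsigned
big-endian bytes. It partitions a range around `mid` on one dimension with an
MSD radix select that only swaps points inside the shared array, so no second
copy of the data is created.

```java
/** Hypothetical on-heap buffer: every point is packed into one shared byte[]. */
final class HeapPointBuffer {
  private final byte[] block;       // numPoints * bytesPerPoint bytes
  private final int bytesPerDim;
  private final int bytesPerPoint;
  private final byte[] scratch;     // one point's worth of space, reused by swap()

  HeapPointBuffer(byte[] block, int numDims, int bytesPerDim) {
    this.block = block;
    this.bytesPerDim = bytesPerDim;
    this.bytesPerPoint = numDims * bytesPerDim;
    this.scratch = new byte[bytesPerPoint];
  }

  /** Unsigned k-th byte of dimension dim of point i. */
  int byteAt(int i, int dim, int k) {
    return block[i * bytesPerPoint + dim * bytesPerDim + k] & 0xff;
  }

  /** Dimension value of point i decoded as an unsigned big-endian number. */
  long value(int i, int dim) {
    long v = 0;
    for (int k = 0; k < bytesPerDim; k++) {
      v = (v << 8) | byteAt(i, dim, k);
    }
    return v;
  }

  /** Swaps two whole points inside the shared array. */
  void swap(int i, int j) {
    if (i == j) return;
    int a = i * bytesPerPoint;
    int b = j * bytesPerPoint;
    System.arraycopy(block, a, scratch, 0, bytesPerPoint);
    System.arraycopy(block, b, block, a, bytesPerPoint);
    System.arraycopy(scratch, 0, block, b, bytesPerPoint);
  }

  /** Reorders [from, to) so that points in [from, mid) are <= points in [mid, to) on dim. */
  void partition(int from, int to, int mid, int dim) {
    select(from, to, mid, dim, 0);
  }

  private void select(int from, int to, int mid, int dim, int k) {
    if (to - from <= 1 || k == bytesPerDim) return;
    // Histogram of the k-th byte, then turn the counts into bucket boundaries.
    int[] end = new int[256];
    for (int i = from; i < to; i++) end[byteAt(i, dim, k)]++;
    int[] next = new int[256];
    int sum = from;
    for (int b = 0; b < 256; b++) {
      next[b] = sum;
      sum += end[b];
      end[b] = sum;
    }
    int[] start = next.clone();
    // Move every point into its bucket using swaps only.
    for (int b = 0; b < 256; b++) {
      while (next[b] < end[b]) {
        int v = byteAt(next[b], dim, k);
        if (v == b) {
          next[b]++;
        } else {
          swap(next[b], next[v]);
          next[v]++;
        }
      }
    }
    // Only the bucket that contains mid needs refining on the next byte.
    for (int b = 0; b < 256; b++) {
      if (mid >= start[b] && mid < end[b]) {
        select(start[b], end[b], mid, dim, k + 1);
        return;
      }
    }
  }
}
```

In this sketch the only extra memory is one point of scratch space plus small
per-level histograms, which is the property the issue is after when it says
that more points could be held in memory.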