[ 
https://issues.apache.org/jira/browse/LUCENE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763375#comment-16763375
 ] 

Ignacio Vera commented on LUCENE-8687:
--------------------------------------

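I have opened a PR for this approach. Performance tests show a clear gain in 
indexing throughput: index time drops by 12-17% and force merge time by 
12-23%, while index size and reader heap are unchanged: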

||Approach||Index time (sec): Dev||Index time (sec): Base||Index time: Diff||Force merge time (sec): Dev||Force merge time (sec): Base||Force merge time: Diff||Index size (GB): Dev||Index size (GB): Base||Index size: Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff||
|points|148.4s|178.9s|-17%|73.6s|92.2s|-20%|0.55|0.55| 0%|1.57|1.57| 0%|
|shapes|246.6s|283.7s|-13%|148.5s|168.2s|-12%|1.29|1.29| 0%|1.61|1.61| 0%|
|geo3d|180.9s|204.8s|-12%|79.8s|103.7s|-23%|0.75|0.75| 0%|1.58|1.58| 0%|

> Optimise radix partitioning for points on heap
> ----------------------------------------------
>
>                 Key: LUCENE-8687
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8687
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> LUCENE-8673 introduced radix partitioning for merging segments. It currently 
> works the same way whether the data is offline or on heap. When the data is 
> on heap, it makes sense not to keep multiple copies but to always perform 
> the partitioning in the same object, similar to what is done with 
> `MutablePointValues`. 
> This would also allow holding more points in memory.
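
To illustrate the idea of partitioning on-heap points in the same object 
rather than copying them, here is a minimal sketch. It is not the code from 
the PR or from Lucene: `HeapPoints` and `partition` are hypothetical names, 
values are assumed to be non-negative ints, and it uses a simple binary radix 
exchange (one bit at a time) instead of Lucene's byte-wise radix selection.

```java
import java.util.Arrays;

public class InPlaceRadixPartitionSketch {

  /** Hypothetical on-heap buffer: one flat int[] holding numDims values per point. */
  public static final class HeapPoints {
    public final int[] values;
    public final int numDims;

    public HeapPoints(int[] values, int numDims) {
      this.values = values;
      this.numDims = numDims;
    }

    public int count() {
      return values.length / numDims;
    }

    public int get(int i, int dim) {
      return values[i * numDims + dim];
    }

    /** Swaps two whole points inside the same array; no second copy is made. */
    public void swap(int i, int j) {
      for (int d = 0; d < numDims; d++) {
        int tmp = values[i * numDims + d];
        values[i * numDims + d] = values[j * numDims + d];
        values[j * numDims + d] = tmp;
      }
    }
  }

  /**
   * Reorders points [from, to) in place so that every point in [from, k) has a
   * value on dimension dim that is <= the value of every point in [k, to).
   * Assumption: all values are non-negative, so bit 31 is always 0.
   */
  public static void partition(HeapPoints points, int dim, int from, int to, int k) {
    int lo = from;
    int hi = to;
    // Invariant: all points in [lo, hi) share the bits above the current one.
    for (int bit = 30; bit >= 0 && hi - lo > 1; bit--) {
      int mask = 1 << bit;
      int i = lo;
      int j = hi - 1;
      while (i <= j) {
        if ((points.get(i, dim) & mask) == 0) {
          i++; // bit is 0: point already sits in the lower part
        } else {
          points.swap(i, j); // bit is 1: move point to the upper part
          j--;
        }
      }
      // i is now the first index whose value has the bit set.
      if (k < i) {
        hi = i; // the split position lies among the values with bit 0
      } else {
        lo = i; // the split position lies among the values with bit 1
      }
    }
  }

  public static void main(String[] args) {
    // Five 2D points stored as (x, y) pairs; partition on x around index 2.
    HeapPoints points = new HeapPoints(new int[] {5, 1, 3, 9, 8, 2, 1, 7, 6, 4}, 2);
    partition(points, 0, 0, points.count(), 2);
    System.out.println(Arrays.toString(points.values));
  }
}
```

Because the points are only swapped within one array, the memory needed is 
the buffer itself, which is what would leave room to hold more points on heap.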



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

