[ https://issues.apache.org/jira/browse/LUCENE-8590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710326#comment-16710326 ]
Simon Willnauer commented on LUCENE-8590:
-----------------------------------------
Copying this from the pull request for reference:
I ran a very simple measurement on top of `BufferedUpdates` adding 10k random
updates with a constant seed. Here are the results:
||Setup || master || patch || Percentage||
|RamBytesUsed Numeric/SameField/RandomValue/SameDocUpTo | 2783514 bytes | 549930 bytes | 19%|
|RamBytesUsed Numeric/SameField/RandomValue/RandomDocUpTo | 2783514 bytes | 593294 bytes | 21%|
|RamBytesUsed Numeric/SameField/SameValue | 2783514 bytes | 469546 bytes | 16%|
|RamBytesUsed Numeric/SameField/SameValue/RandomDocUpTo | 2783514 bytes | 512910 bytes | 18%|
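
For reference, the Percentage column appears to be the patch size as a share of the master size, truncated to a whole percent. A minimal sketch of that arithmetic, assuming integer truncation (the class and method names here are hypothetical and not part of the patch):

```java
public class PercentOfMaster {
    // Patch size as a whole-number percentage of master size.
    // Integer division truncates toward zero (assumption about how the table was built).
    static long percent(long patchBytes, long masterBytes) {
        return patchBytes * 100 / masterBytes;
    }

    public static void main(String[] args) {
        long master = 2783514L;                              // master column, same in every row
        long[] patch = {549930L, 593294L, 469546L, 512910L}; // patch column, in table order
        for (long p : patch) {
            System.out.println(p + " bytes -> " + percent(p, master) + "%");
        }
    }
}
```

Read this way, the patch uses roughly one fifth of the memory that master does for the same 10k updates.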
> Optimize DocValues update datastructures
> ----------------------------------------
>
> Key: LUCENE-8590
> URL: https://issues.apache.org/jira/browse/LUCENE-8590
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Simon Willnauer
> Assignee: Simon Willnauer
> Priority: Major
> Fix For: master (8.0), 7.7
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Today we are using a LinkedHashMap to buffer doc-values updates in
> BufferedUpdates. On the one hand this uses an Object-based data structure,
> and on the other it requires re-encoding the data into a more compact
> representation once the BufferedUpdates are frozen. This change uses a
> more compact representation for the updates already in the BufferedUpdates,
> in a parallel-array-like data structure that can be reused in
> FrozenBufferedDeletes. It also adds a much simpler-to-use API to consume
> the updates and allows for internal memory optimization for common-case
> updates.
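
To illustrate the general idea of a parallel-array layout described above, here is a minimal sketch. It is not the code from the patch: `ParallelUpdateBuffer` and all of its members are hypothetical names, and terms are kept as plain `String`s for brevity, whereas a real implementation would likely pack them into a byte-oriented structure as well. The point is only that each update occupies one slot across several primitive arrays instead of one map entry plus one update object.

```java
import java.util.Arrays;

/** Hypothetical illustration of a parallel-array update buffer; not Lucene's actual class. */
public class ParallelUpdateBuffer {
    // Slot i of each array together describes update i, in insertion order.
    private String[] terms = new String[4];
    private long[] values = new long[4];
    private int[] docUpTos = new int[4];
    private int size;

    /** Appends one numeric update; all arrays grow together when full. */
    public void add(String term, long value, int docUpTo) {
        if (size == terms.length) {
            int newLength = size * 2;
            terms = Arrays.copyOf(terms, newLength);
            values = Arrays.copyOf(values, newLength);
            docUpTos = Arrays.copyOf(docUpTos, newLength);
        }
        terms[size] = term;
        values[size] = value;
        docUpTos[size] = docUpTo;
        size++;
    }

    public int size() { return size; }

    public String term(int i) { check(i); return terms[i]; }

    public long value(int i) { check(i); return values[i]; }

    public int docUpTo(int i) { check(i); return docUpTos[i]; }

    // Reject indexes that point into unused capacity.
    private void check(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i + ", size " + size);
        }
    }

    public static void main(String[] args) {
        ParallelUpdateBuffer buffer = new ParallelUpdateBuffer();
        buffer.add("id:1", 42L, 10);
        buffer.add("id:2", 7L, 11);
        // Consumers walk the slots by index; no per-update object is allocated.
        for (int i = 0; i < buffer.size(); i++) {
            System.out.println(buffer.term(i) + " -> " + buffer.value(i)
                + " (docUpTo=" + buffer.docUpTo(i) + ")");
        }
    }
}
```

Because the values and docUpTo entries live in primitive arrays, a layout like this avoids per-entry object headers and boxing, and the same arrays can in principle be handed over when the buffer is frozen rather than re-encoded.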
--
This message was sent by Atlassian JIRA (v7.6.3#76005)