[ 
https://issues.apache.org/jira/browse/LUCENE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15689447#comment-15689447
 ] 

Toke Eskildsen commented on LUCENE-7521:
----------------------------------------

I was involved in the original PackedInts implementation, where I did quite a 
bit of performance testing of the two different approaches: Optimal memory 
packing (Packed64) and word-aligned packing (Packed64SingleBlock). They were 
named differently back then, but the principles and the performance-relevant code 
parts were about the same. The JIRA is LUCENE-1990. The conclusion then was 
that aligned won in a few cases but added quite a lot of complexity, so it was 
scrapped.

Two years later the aligned version was re-introduced in LUCENE-4062. Again 
there was some performance testing. Performance characteristics differed 
depending on CPU structure and in-memory array size (cache utilization really). 
Overall it seemed that aligned packing was faster, but not by much on the i7 
(desktop & Xeon). 

One important observation from the JIRA is that only the BPVs (Bits Per Value) 
3, 5, 6, 7, 9, 10, 12 and 21 differ in representation (and get/set 
algorithm) between packed and aligned. There are some poor graphs from an old 
comparison of those values on http://ekot.dk/misc/packedints/padding.html where 
contiguous=packed and padding=aligned. This was for a small (10M values, AFAIR) 
set. Note how the performance difference between the implementations varies a 
lot, depending on CPU type.
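
To make the difference between the two layouts concrete, here is a minimal 
sketch of the get/set logic for each. It is not the actual Packed64 or 
Packed64SingleBlock code: the class and method names are made up for 
illustration. differingBpvs() additionally assumes that the aligned format 
only uses the largest BPV for each values-per-word count (64 / n); under that 
assumption, the BPVs that leave padding bits in a 64-bit word come out as 
exactly the list above.

```java
import java.util.TreeSet;

// Illustrative sketch only; hypothetical names, not Lucene's implementation.
final class PackingSketch {

    private static long mask(int bpv) {
        return bpv == 64 ? -1L : (1L << bpv) - 1;
    }

    // Packed (contiguous): value i starts at bit i * bpv and may straddle two words.
    static long getPacked(long[] blocks, int bpv, int index) {
        long bitPos = (long) index * bpv;
        int word = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[word] >>> shift;
        if (shift + bpv > 64) {
            // The upper bits of the value live at the bottom of the next word.
            value |= blocks[word + 1] << (64 - shift);
        }
        return value & mask(bpv);
    }

    static void setPacked(long[] blocks, int bpv, int index, long value) {
        long bitPos = (long) index * bpv;
        int word = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long m = mask(bpv);
        long v = value & m;
        blocks[word] = (blocks[word] & ~(m << shift)) | (v << shift);
        if (shift + bpv > 64) {
            int fit = 64 - shift; // number of bits that fit in the first word
            blocks[word + 1] = (blocks[word + 1] & ~(m >>> fit)) | (v >>> fit);
        }
    }

    // Aligned (padded): each word holds 64 / bpv whole values; leftover bits are unused.
    static long getAligned(long[] blocks, int bpv, int index) {
        int perWord = 64 / bpv;
        int shift = (index % perWord) * bpv;
        return (blocks[index / perWord] >>> shift) & mask(bpv);
    }

    static void setAligned(long[] blocks, int bpv, int index, long value) {
        int perWord = 64 / bpv;
        int word = index / perWord;
        int shift = (index % perWord) * bpv;
        long m = mask(bpv);
        blocks[word] = (blocks[word] & ~(m << shift)) | ((value & m) << shift);
    }

    // Assumption: aligned packing uses bpv = 64 / n for n values per word.
    // The layouts differ only when that bpv does not divide 64 evenly.
    static TreeSet<Integer> differingBpvs() {
        TreeSet<Integer> result = new TreeSet<>();
        for (int perWord = 1; perWord <= 64; perWord++) {
            int bpv = 64 / perWord;
            if (64 % bpv != 0) {
                result.add(bpv);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(differingBpvs()); // prints [3, 5, 6, 7, 9, 10, 12, 21]
    }
}
```

The aligned getter never touches a second word, which is where its speed 
advantage comes from; the price is the padding bits and a second code path. 
For BPVs that divide 64 evenly (1, 2, 4, 8, 16, 32) both getters read the 
same bits, so there is nothing to choose between them.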

Long story longer, I still favour having only 1 underlying format ("optimal" 
packed): Too little gain in too few cases for a high code complexity cost with 
aligned. On a related note, a high-quality micro-benchmark for structures like 
these would be great.

> Simplify PackedInts
> -------------------
>
>                 Key: LUCENE-7521
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7521
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7521.patch
>
>
> We have a lot of specialization in PackedInts about how to keep packed arrays 
> of longs in memory. However, most use-cases have slowly moved to DirectWriter 
> and DirectMonotonicWriter and most specializations we have are barely used 
> for performance-sensitive operations, so I'd like to clean this up a bit.


