[
https://issues.apache.org/jira/browse/LUCENE-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017193#comment-14017193
]
Robert Muir commented on LUCENE-5731:
-------------------------------------
Related: the in-RAM code is also complex, and has tons of generated code.
For the postings lists, Mike and I experimented with a much simpler approach
here: https://github.com/rmuir/lucene-solr/tree/packypack
It gives speedups (especially for positions with higher bits per value, bpv)
with 800 lines of total code
(https://github.com/rmuir/lucene-solr/blob/packypack/lucene/core/src/java/org/apache/lucene/util/lightpacked/SimplePackedInts.java)
versus the huge size bloat we have today. So I think the in-RAM code can use
a touch-up as well, but we don't need to tackle that here.
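
To illustrate what "packed ints" with a fixed bpv means, here is a minimal sketch of an in-RAM packed array backed by a long[]. This is NOT the SimplePackedInts class from the branch above; the class name TinyPackedInts and its methods are hypothetical and exist only for this example.

```java
/** Hypothetical sketch: fixed-width values packed back to back into a long[]. */
final class TinyPackedInts {
    private final long[] blocks;
    private final int bitsPerValue; // "bpv", must be in 1..64
    private final long mask;        // low bitsPerValue bits set

    TinyPackedInts(int size, int bitsPerValue) {
        if (size < 0 || bitsPerValue < 1 || bitsPerValue > 64) {
            throw new IllegalArgumentException("size >= 0 and 1 <= bpv <= 64 required");
        }
        this.bitsPerValue = bitsPerValue;
        this.mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
        long totalBits = (long) size * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two longs: take the remaining high bits from the next one.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        value &= mask;
        // Clear the target bits, then write the low part of the value.
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // Write the spilled high bits into the next long.
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                    | (value >>> bitsInFirst);
        }
    }
}
```

A generic get/set like this handles every bpv with one code path, which is the kind of trade-off against per-bpv generated code that the comment is about. Whether it is fast enough for a given use is something to measure, not assume.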
> split direct packed ints from in-ram ones
> -----------------------------------------
>
> Key: LUCENE-5731
> URL: https://issues.apache.org/jira/browse/LUCENE-5731
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Assignee: Robert Muir
>
> Currently there is an oversharing problem in packedints that imposes too many
> requirements on improving it:
> * every packed ints implementation must be loadable directly, or in RAM, or
> usable through an iterator.
> * things like file pointers are expected to be adjusted (this is especially
> stupid) in all cases
> * lots of unnecessary abstractions
> * versioning etc is complex
> None of this flexibility is needed or buys us anything, and it prevents
> performance improvements (e.g. I just want to add 3 bytes at the end of
> on-disk streams to reduce the number of ByteBuffer calls, and that's seriously
> impossible with the current situation).
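
The issue does not spell out how the 3 extra bytes would be used. One plausible reading, shown here only as a sketch under that assumption, is that trailing padding lets a reader decode any value with a single 4-byte ByteBuffer read, even the last one, without running past the end of the stream. The class PaddedPackedBuffer, its methods, and the little-endian layout are hypothetical choices for this example, not Lucene's on-disk format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch: bit-packed values followed by 3 padding bytes. */
final class PaddedPackedBuffer {
    static final int PADDING_BYTES = 3;

    /** Packs non-negative values of at most bitsPerValue bits (1..25) into a padded buffer. */
    static ByteBuffer pack(int[] values, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 25) {
            // With a bit offset of at most 7, up to 25 bits always fit in one 32-bit read.
            throw new IllegalArgumentException("this sketch supports 1 <= bpv <= 25");
        }
        int mask = (1 << bitsPerValue) - 1;
        long totalBits = (long) values.length * bitsPerValue;
        int dataBytes = (int) ((totalBits + 7) >>> 3);
        ByteBuffer buf = ByteBuffer.allocate(dataBytes + PADDING_BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bitsPerValue;
            int bytePos = (int) (bitPos >>> 3);
            int shift = (int) (bitPos & 7);
            // OR the value into the 4 bytes starting at bytePos; the padding keeps this in bounds.
            buf.putInt(bytePos, buf.getInt(bytePos) | ((values[i] & mask) << shift));
        }
        return buf;
    }

    /** Decodes one value with exactly one ByteBuffer call. */
    static int get(ByteBuffer buf, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int word = buf.getInt((int) (bitPos >>> 3));
        return (word >>> (int) (bitPos & 7)) & ((1 << bitsPerValue) - 1);
    }
}
```

Without the padding, the read for the final value could extend up to 3 bytes beyond the data, so a reader would need a bounds check or a byte-by-byte fallback for the tail.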