[ 
https://issues.apache.org/jira/browse/LUCENE-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017193#comment-14017193
 ] 

Robert Muir commented on LUCENE-5731:
-------------------------------------

Related: the in-ram stuff is also complex, and has tons of generated code.

For the postings lists, Mike and I experimented with a much simpler approach 
here: https://github.com/rmuir/lucene-solr/tree/packypack

It gives speedups (especially for positions with higher bits per value, bpv) 
with 800 lines of total code 
(https://github.com/rmuir/lucene-solr/blob/packypack/lucene/core/src/java/org/apache/lucene/util/lightpacked/SimplePackedInts.java), 
versus the huge size bloat we have today. So I think the in-ram stuff can use 
a touchup as well, but we don't need to tackle that here.
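
To make "simple in-ram packed ints" concrete, here is a minimal sketch of 
fixed-width bit packing into a long[]. It is only an illustration: the class 
name PackedSketch and its methods are hypothetical, and this is not the code 
from the packypack branch or from SimplePackedInts.java.

```java
/** Hypothetical sketch of fixed-width packing; not Lucene's implementation. */
final class PackedSketch {
  private final long[] blocks;
  private final int bitsPerValue;
  private final long mask;

  PackedSketch(int valueCount, int bitsPerValue) {
    if (valueCount < 0 || bitsPerValue < 1 || bitsPerValue > 64) {
      throw new IllegalArgumentException("bad valueCount or bitsPerValue");
    }
    this.bitsPerValue = bitsPerValue;
    this.mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
    long totalBits = (long) valueCount * bitsPerValue;
    this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
  }

  /** Stores the low bitsPerValue bits of value at the given index. */
  void set(int index, long value) {
    long bitPos = (long) index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long v = value & mask;
    blocks[block] = (blocks[block] & ~(mask << shift)) | (v << shift);
    int spill = shift + bitsPerValue - 64; // bits that overflow into the next block
    if (spill > 0) {
      int lowBits = bitsPerValue - spill; // bits that fit in the first block
      blocks[block + 1] = (blocks[block + 1] & ~(mask >>> lowBits)) | (v >>> lowBits);
    }
  }

  /** Reads the value at the given index. */
  long get(int index) {
    long bitPos = (long) index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long v = blocks[block] >>> shift;
    int spill = shift + bitsPerValue - 64;
    if (spill > 0) {
      v |= blocks[block + 1] << (bitsPerValue - spill);
    }
    return v & mask;
  }

  public static void main(String[] args) {
    PackedSketch packed = new PackedSketch(4, 13);
    packed.set(2, 4242);
    System.out.println(packed.get(2));
  }
}
```

One generic get/set pair like this covers every bpv, which is roughly the 
trade-off being discussed: less generated, per-bpv specialized code in 
exchange for a shift and a mask on each access.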

> split direct packed ints from in-ram ones
> -----------------------------------------
>
>                 Key: LUCENE-5731
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5731
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>
> Currently there is an oversharing problem in packedints that imposes too many 
> requirements on improving it:
> * every packed ints implementation must be able to be loaded directly, or 
> in ram, or iterated over.
> * things like filepointers are expected to be adjusted (this is especially 
> stupid) in all cases
> * lots of unnecessary abstractions
> * versioning etc. is complex
> None of this flexibility is needed or buys us anything, and it prevents 
> performance improvements (e.g. I just want to add 3 bytes at the end of 
> on-disk streams to reduce the number of bytebuffer calls, and that's 
> seriously impossible with the current situation).
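
One plausible reading of the "3 bytes at the end" remark, sketched below under 
stated assumptions: if values are at most 24 bits wide and the stream ends 
with 3 padding bytes, any value can be fetched with a single 4-byte 
ByteBuffer.getInt at its starting byte, because the read never runs past the 
end of the buffer. The issue does not spell out the actual layout, so the 
class PaddedDirectSketch, the big-endian bit order, and the 24-bit limit are 
all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a padded on-disk layout; not Lucene's format. */
final class PaddedDirectSketch {
  static final int PADDING_BYTES = 3; // lets the last value be read with one getInt

  /** Packs values MSB-first at bitsPerValue (1..24) bits each, plus padding. */
  static ByteBuffer pack(int[] values, int bitsPerValue) {
    if (bitsPerValue < 1 || bitsPerValue > 24) {
      throw new IllegalArgumentException("bitsPerValue must be in 1..24");
    }
    int dataBytes = (int) (((long) values.length * bitsPerValue + 7) >>> 3);
    byte[] bytes = new byte[dataBytes + PADDING_BYTES];
    long bitPos = 0;
    for (int v : values) {
      // Write one bit at a time for clarity; a real writer would batch this.
      for (int b = bitsPerValue - 1; b >= 0; b--, bitPos++) {
        if (((v >>> b) & 1) != 0) {
          bytes[(int) (bitPos >>> 3)] |= (byte) (0x80 >>> (int) (bitPos & 7));
        }
      }
    }
    return ByteBuffer.wrap(bytes); // big-endian by default
  }

  /** Reads value number index with exactly one ByteBuffer call. */
  static int get(ByteBuffer buf, int bitsPerValue, int index) {
    long bitPos = (long) index * bitsPerValue;
    int byteOffset = (int) (bitPos >>> 3);
    int bitOffset = (int) (bitPos & 7);
    // bitOffset + bitsPerValue <= 31, so the value always lies inside these 4 bytes.
    int raw = buf.getInt(byteOffset);
    return (raw >>> (32 - bitOffset - bitsPerValue)) & ((1 << bitsPerValue) - 1);
  }

  public static void main(String[] args) {
    ByteBuffer buf = pack(new int[] {5, 1000, 77}, 10);
    System.out.println(get(buf, 10, 1));
  }
}
```

Without the padding, a reader has to special-case values near the end of the 
stream (for example by falling back to byte-at-a-time reads), which is the 
kind of extra bytebuffer traffic the description complains about.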



