[ 
https://issues.apache.org/jira/browse/LUCENE-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311684#comment-17311684
 ] 

Michael Sokolov edited comment on LUCENE-9877 at 3/30/21, 5:38 PM:
-------------------------------------------------------------------

> I was able to setup ant/JDK8 and made sure "ant precommit" passed, but I 
>wasn't able to get the benchmarks running for it as of now.

 

Yeah, luceneutil benchmarks track lucene/main and aren't able to build 8.x any 
more. I thought about how we might support that, but it's probably a bunch of 
work (have to have different jvms around, support ant and gradle), and only 
occasionally needed.


was (Author: sokolov):
> I was able to setup ant/JDK8 and made sure "ant precommit" passed, but I 
>wasn't able to get the benchmarks running for it as of now.

 

Yeah, benchmarks track lucene/main and aren't able to build 8.x any more. I 
thought about how we might support that, but it's probably a bunch of work 
(have to have different jvms around, support ant and gradle), and only 
occasionally needed.

> Explore increasing the allowable exceptions in PForUtil
> -------------------------------------------------------
>
>                 Key: LUCENE-9877
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9877
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>             Fix For: 8.9
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Piggybacking a little off of the investigation I was doing over in 
> LUCENE-9850 I thought it might also be worth-while exploring the impact of 
> increasing the number of allowable exceptions in PForUtil. The aim of this 
> investigation is to see if we could reduce index size by allowing for more 
> exceptions without significant negative impact to performance.
> PForUtil currently allows for up to 3 exceptions, and it only uses 3 bits to 
> encode the number of exceptions (using the remaining 3 bits of the byte used 
> to also encode the number of bits-per-value, which requires 5 bits). Each 
> exception used is encoded with a two full bytes, using a maximum of 6 bytes 
> per block.
> It seems to me like 7 might be a more ideal number of exceptions if index 
> size is the driving motivation. My thought process is that, in the 
> worst-case, 7 exceptions would be used to save only a single bit-per-value in 
> the corresponding block. With 128 entries per block, this would save 16 
> bytes. So with 14 bytes used to encode the exception values (7 x 2 bytes per 
> exception), we would save a two bytes in total (just slightly better than 
> breaking even). If we need fewer than the 7 exceptions, or if we're able to 
> save more than 1 bit-per-value, it's all additional savings. I suppose the 
> question is what kind of performance hit we might observe due to decoding 
> more exceptions.
> Also note that 7 exceptions is the max we can encode with the 3 bits we 
> currently have available for the number of exceptions. So moving to 8 
> exceptions would not only take 16 bytes to encode the exceptions (if using 
> all of them), but we'd need one more byte per block to encode the exception 
> count. So in the worst case of using all 8 exceptions to save 1 bit per 
> value, we'd actually be worse off.
> I'll post some results here for discussion or at least for public record of 
> my work for future reference.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to