richardstartin opened a new pull request #24310: use bitmap appender to 
optimise bitmap incrementally and avoid binary…
URL: https://github.com/apache/spark/pull/24310
 
 
   … search
   
   ## What changes were proposed in this pull request?
   
   This PR modifies `HighlyCompressedMapStatus` to use a new feature in 
RoaringBitmap which buffers insertions to 16-bit containers and appends 
containers to the underlying bitmap as late as possible. This has two effects. 
First, the best container type is chosen incrementally, so there is no need to 
call `runOptimize`. Second, there are never binary searches in the high 16 bits 
to locate the container to add a bit to, which improves insertion performance. 
The insertion speedup is proportional to the number of empty blocks; the call 
to `runOptimize` is avoided in every case.
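
   To illustrate the idea only, here is a minimal, self-contained sketch of an 
appender. It is a toy model, not the RoaringBitmap API: the class 
`AppenderSketch`, its methods, and the simplified byte-size estimates used to 
pick a container type are all assumptions made for this example. It buffers 
bits for one 16-bit key at a time, chooses the smallest representation when 
the key changes, and appends the finished container at the end of the list, 
so no search over existing keys is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of an appender; hypothetical names, not the RoaringBitmap API. */
final class AppenderSketch {
    enum Kind { ARRAY, BITMAP, RUN }

    /** A finished container: its high 16 bits and the representation chosen for it. */
    static final class Container {
        final int key;
        final Kind kind;
        final int cardinality;

        Container(int key, Kind kind, int cardinality) {
            this.key = key;
            this.kind = kind;
            this.cardinality = cardinality;
        }
    }

    private final List<Container> containers = new ArrayList<>();
    private final long[] buffer = new long[1024]; // 65536 bits: one 16-bit container
    private int currentKey = -1;                  // high 16 bits currently buffered

    /** Values must arrive in ascending order of their (unsigned) high 16 bits. */
    void add(int value) {
        int key = value >>> 16;
        if (key != currentKey) {
            if (key < currentKey) {
                throw new IllegalArgumentException("keys must be appended in order");
            }
            flush();           // the previous container is complete: append it
            currentKey = key;  // the new key is always the last one, so no search
        }
        int low = value & 0xFFFF;
        buffer[low >>> 6] |= 1L << (low & 63);
    }

    /** Chooses a representation for the buffered bits and appends the container. */
    private void flush() {
        int cardinality = 0;
        int runs = 0;
        long prevTop = 0; // top bit of the previous word, to detect runs spanning words
        for (long w : buffer) {
            cardinality += Long.bitCount(w);
            // A run starts at every set bit whose predecessor is not set.
            runs += Long.bitCount(w & ~((w << 1) | prevTop));
            prevTop = w >>> 63;
        }
        if (cardinality == 0) {
            return; // nothing buffered
        }
        // Simplified size estimates in bytes (an assumption of this sketch).
        int arrayBytes = 2 * cardinality;
        int bitmapBytes = 8192;
        int runBytes = 2 + 4 * runs;
        Kind kind = Kind.BITMAP;
        int best = bitmapBytes;
        if (arrayBytes < best) {
            kind = Kind.ARRAY;
            best = arrayBytes;
        }
        if (runBytes < best) {
            kind = Kind.RUN;
        }
        containers.add(new Container(currentKey, kind, cardinality));
        Arrays.fill(buffer, 0L);
    }

    /** Flushes the last buffered container and returns all containers. */
    List<Container> get() {
        flush();
        return containers;
    }

    public static void main(String[] args) {
        AppenderSketch appender = new AppenderSketch();
        for (int i = 0; i < 10_000; i++) {
            appender.add(i); // one long run in key 0
        }
        for (Container c : appender.get()) {
            System.out.println(c.key + " " + c.kind + " " + c.cardinality);
        }
    }
}
```

Because each container's type is decided when it is flushed, a separate 
optimisation pass over the finished bitmap is unnecessary in this model. The 
price is that input must be ordered by key, which holds when block indices 
are visited in increasing order.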
   
   ## How was this patch tested?
   
   Existing unit tests still pass with this change. New tests were added to 
demonstrate that the new mechanism always builds a bitmap as compressed as one 
built by calling `runOptimize`, and to justify (in terms of bitmap size) the 
existing decision to represent empty blocks rather than full blocks.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
