[ 
https://issues.apache.org/jira/browse/CASSANDRA-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632257#comment-14632257
 ] 

Jonathan Ellis commented on CASSANDRA-9843:
-------------------------------------------

Rather than just adding an ARF per partition (the way we used to have a BF -- 
the difference is that BF is not useful for scans but this would be), we may be 
able to adapt this further by moving our index into the ARF.  Instead of just a 
bit indicating yes or no we could have the offset for the start of each range 
[that we do have data for] in the leaf.

(The "adaptive" in ARF means you can tune it to index hot parts of the data 
range in greater detail, without increasing the total memory used, at the cost 
of less detail for the cold ranges.  We could do this in Cassandra as well, 
writing updated ARF to a new file.  This could reduce the memory problems of 
pulling the indexes for very large partitions into memory.  However, the paper 
describes very good results even without adaptation, so this is not required 
for proof of concept.)

> Augment or replace partition index with adaptive range filters
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-9843
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9843
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: T Jake Luciani
>              Labels: performance
>
> Adaptive range filters are, in principle, bloom filters for range queries.  
> They provide a space-efficient way to avoid scanning a partition when we can 
> tell that we do not contain any data for the range requested.  Like BF, they 
> can return false positives but not false negatives.
> The implementation is of course totally different from BF.  ARF is a tree 
> where each leaf of the tree is a range of data and a bit, either on or off, 
> denoting whether we have *some* data in that range.
> ARF are described here: http://www.vldb.org/pvldb/vol6/p1714-kossmann.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to