[ 
https://issues.apache.org/jira/browse/CASSANDRA-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830317#comment-13830317
 ] 

Matt Abrams commented on CASSANDRA-5906:
----------------------------------------

When SP > 0 the algorithm uses a variant of a linear counter to get very 
accurate counts at small cardinalities.  At some threshold the algorithm 
switches from the linear counter to HLL.  A linear counter grows in size as a 
function of the number of inputs, whereas HLL's size is a function of the 
desired error rate.  We could (should?) tune the threshold so that the 
conversion happens earlier.  Currently the threshold is equal to 2^p * 0.75.
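
To make the threshold arithmetic concrete, here is a rough sketch (not the 
library's actual code).  The names sparse_threshold and representation, the 
tunable factor argument, and the exact comparison used at the boundary are 
assumptions made for illustration; only the 2^p * 0.75 formula comes from the 
comment above.

```python
def sparse_threshold(p: int, factor: float = 0.75) -> int:
    """Entry count at which the counter would convert to HLL.

    `factor` is a hypothetical tuning knob; 0.75 is the current value.
    """
    return int((1 << p) * factor)


def representation(distinct_entries: int, p: int, factor: float = 0.75) -> str:
    """Which representation would be in use for a given number of entries.

    Assumption: conversion happens once the entry count exceeds the threshold.
    """
    if distinct_entries > sparse_threshold(p, factor):
        return "hll"
    return "linear"


if __name__ == "__main__":
    p = 14  # 2^14 = 16384 registers
    print(sparse_threshold(p))             # current factor of 0.75
    print(sparse_threshold(p, 0.25))       # a lower factor converts earlier
    print(representation(5000, p))         # still the linear counter
    print(representation(5000, p, 0.25))   # already converted to HLL
```

With p = 14, lowering the factor from 0.75 to 0.25 moves the switch from 12288 
entries down to 4096, which is the kind of earlier conversion suggested above.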


> Avoid allocating over-large bloom filters
> -----------------------------------------
>
>                 Key: CASSANDRA-5906
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5906
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Yuki Morishita
>             Fix For: 2.1
>
>
> We conservatively estimate the number of partitions post-compaction to be the 
> total number of partitions pre-compaction.  That is, we assume the worst-case 
> scenario of no partition overlap at all.
> This can result in substantial memory wasted in sstables resulting from 
> highly overlapping compactions.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
