[ 
https://issues.apache.org/jira/browse/CASSANDRA-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833930#action_12833930
 ] 

Stu Hood commented on CASSANDRA-790:
------------------------------------

> please split the refactoring into a separate patch
Which refactoring? I don't think I did anything that wasn't necessary in order 
to cap the number of available buckets.

> BF constructors that do not chain is a design smell; one of them only being 
> called from tests is also a smell
The 'maxFalsePosProb' constructor has never been called anywhere but tests, but 
it was very elegant, and someone spent a lot of time on it, so I wasn't sure 
whether to remove it.

> low-level BF constructor taking hash & bucket counts, and then factories to 
> do the high level things
Agreed... that would be much better. I'll add factories with warnings.
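
For illustration, the shape being agreed on here might look roughly like the
sketch below. It is not taken from the attached patches: the class name
SketchBloomFilter, the factory forKeyCount, and the maxBuckets parameter are all
hypothetical, and the hashing logic is left out so that only the constructor and
factory layering is shown.

```java
import java.util.BitSet;

// Hypothetical sketch: names are illustrative and do not come from the patch.
final class SketchBloomFilter {
    final int hashCount;
    final int bucketCount;
    private final BitSet buckets;

    // Low-level constructor: the caller states both counts explicitly.
    SketchBloomFilter(int hashCount, int bucketCount) {
        if (hashCount < 1 || bucketCount < 1) {
            throw new IllegalArgumentException("hashCount and bucketCount must be positive");
        }
        this.hashCount = hashCount;
        this.bucketCount = bucketCount;
        this.buckets = new BitSet(bucketCount);
    }

    // Factory: derives the bucket count from an expected key count and a
    // buckets-per-key ratio, and warns on stderr when the request is capped.
    static SketchBloomFilter forKeyCount(long numKeys, int bucketsPerKey,
                                         int hashCount, int maxBuckets) {
        // The multiplication happens in a long, so it cannot wrap like an int would.
        long wanted = Math.max(1L, numKeys * bucketsPerKey);
        if (wanted > maxBuckets) {
            System.err.println("warning: wanted " + wanted
                    + " buckets, capping at " + maxBuckets);
        }
        return new SketchBloomFilter(hashCount, (int) Math.min(wanted, maxBuckets));
    }
}
```

With this layering there is a single real constructor, and every higher-level
entry point, including any used only by tests, goes through it.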

> SSTables limited to (2^31)/15 keys
> ----------------------------------
>
>                 Key: CASSANDRA-790
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-790
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.5, 0.6, 0.7
>            Reporter: Stu Hood
>            Priority: Blocker
>             Fix For: 0.5, 0.6, 0.7
>
>         Attachments: 
> 0001-Change-parameters-to-BloomCalculations-in-order-to-c.patch, 
> 0002-Add-timeouts-to-forceBlockingFlush-during-tests.patch
>
>
> The current BloomFilter implementation requires a BitSet of (bucket_count * 
> num_keys) in size, and that calculation is currently performed in an integer, 
> which causes overflow for around 140 million keys in one SSTable.
> Short term fix: perform the calculation in a long, and cap the value to the 
> maximum size of a BitSet.
> Long term fix: begin partitioning BitSets, perhaps using Linear Bloom Filters.
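
The overflow and the short term fix described above can be sketched as follows.
This is a minimal illustration, not the patch itself: the class BucketMath and
both method names are hypothetical, and 15 buckets per key is assumed from the
issue title.

```java
// Hypothetical helper: names are illustrative and do not come from the patch.
final class BucketMath {
    // The problematic form: the product is computed in an int, so it wraps
    // around once numKeys * bucketsPerKey exceeds Integer.MAX_VALUE.
    static int overflowingBucketCount(int numKeys, int bucketsPerKey) {
        return numKeys * bucketsPerKey;
    }

    // The short term fix: compute in a long, then cap at the largest size
    // an int-indexed BitSet can address.
    static int cappedBucketCount(long numKeys, int bucketsPerKey) {
        long wanted = numKeys * (long) bucketsPerKey;
        return (int) Math.min(wanted, Integer.MAX_VALUE);
    }
}
```

With 15 buckets per key, 143,165,576 keys still fit (2,147,483,640 buckets),
while one more key pushes the int product past 2^31 - 1 and it turns negative.
The capped version returns Integer.MAX_VALUE instead, at the cost of a higher
false positive rate for very large SSTables.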

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
