[ https://issues.apache.org/jira/browse/CASSANDRA-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385804#comment-14385804 ]

Benedict commented on CASSANDRA-9060:
-------------------------------------

bq. I think the immediate problem is that they are created to allow room for 
all keys in all anticompacted tables, whereas anticompactions process one table 
at a time

Thanks. You're right, and this is definitely something to fix in 2.1.
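
To put a number on the inflation: bloom filter size is linear in the expected 
key count, so sizing each per-table filter for the keys of all anticompacted 
tables multiplies it by the table count. A minimal sketch using the standard 
sizing formula (the helper below is illustrative, not Cassandra's 
BloomCalculations code):

{code:java}
// Minimal sketch, not Cassandra's BloomCalculations code: the standard
// formula for the number of bits m needed to hold n keys at false-positive
// rate p is m = -n * ln(p) / (ln 2)^2.
static long bloomFilterBits(long keys, double falsePositiveRate)
{
    return (long) Math.ceil(-keys * Math.log(falsePositiveRate) / (Math.log(2) * Math.log(2)));
}

// Because the bit count is linear in the key count, sizing each per-table
// filter for the keys of all N anticompacted tables makes every filter
// roughly N times larger than it needs to be, e.g. for 10 tables of 100M
// keys each at p = 0.01:
//   bloomFilterBits(1_000_000_000L, 0.01) ~= 10 * bloomFilterBits(100_000_000L, 0.01)
{code}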

In this instance we don't use HLL cardinality estimators but the index 
summary, which isn't probabilistic; it is, however, only accurate to a 
certain granularity. As a first patch, your approach reduces the problem to 
the one I initially assumed it was, i.e. a doubling of required space 
(instead of *N). With a small amount of TLC, though, the 
estimatedKeysForRanges() method could be modified to give a lower bound for 
the size of both resultant tables. At the moment it can significantly 
overestimate in some scenarios, and it also cannot easily estimate the 
cardinality of the negation of the range, so we would have to subtract the 
overestimate from the total, giving an underestimate, which is much worse.
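
To spell out why the subtraction hurts, a toy sketch (helper name assumed; 
not the real SSTableReader API):

{code:java}
// Toy sketch (name assumed; not the real SSTableReader API). The index
// summary can overestimate the keys covered by the repaired ranges, and the
// only figure we can derive for the complement is the exact total minus
// that overestimate, which therefore errs low by the same margin.
static long estimatedKeysOutsideRanges(long totalKeys, long estimatedKeysInRanges)
{
    return Math.max(0, totalKeys - estimatedKeysInRanges);
}
{code}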

Your patch looks to me to significantly improve the status quo, so I will 
commit it now, and we can follow up with a slightly improved patch, perhaps 
for 2.1.5.

> Anticompaction hangs on bloom filter bitset serialization 
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-9060
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9060
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Gustav Munkby
>            Assignee: Marcus Eriksson
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: 2.1-9060-simple.patch, trunk-9060.patch
>
>
> I tried running an incremental repair against a 15-node vnode cluster with 
> roughly 500GB of data running on 2.1.3-SNAPSHOT, without performing the 
> suggested migration steps. I manually chose a small range for the repair 
> (using --start/end-token). The actual repair took almost no time at all, but 
> the anticompactions took a lot of time (not surprisingly).
>
> Obviously, this might not be the ideal way to run incremental repairs, but I 
> wanted to look into what made the whole process so slow. The results were 
> rather surprising: the majority of the time was spent serializing bloom 
> filters.
>
> The reason seemed to be twofold. First, the bloom filters generated were 
> huge (probably because the original SSTables were large); with a proper 
> migration to incremental repairs, I'm guessing this would not happen. 
> Second, the bloom filters were being written to the output one byte at a 
> time (with quite a few type conversions along the way) to transform the 
> little-endian in-memory representation into the big-endian on-disk 
> representation.
>
> I have implemented a solution where big-endian is used in memory as well as 
> on disk, which obviously makes serialization and deserialization much, much 
> faster. This introduces a slight overhead when checking the bloom filter, 
> but I can't see how that would be problematic. An obvious alternative would 
> be to keep the current in-memory representation but perform the 
> serialization and deserialization through a byte array, doing the 
> byte-order swap there.
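
As a rough illustration of the byte-order point in the description above, 
here is a minimal sketch (simplified, and not the actual Cassandra 
serializer) contrasting a per-byte write loop with a single bulk write of 
the bitset:

{code:java}
import java.io.DataOutput;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Minimal sketch, not the actual Cassandra serializer: contrast writing a
// bloom filter's long[] bitset one byte at a time with a single bulk
// big-endian write of the whole array.
final class BitSetSerializationSketch
{
    // Slow path resembling the reported behaviour: one output call and a
    // shift/mask conversion per byte, most significant byte first.
    static void writeByteAtATime(long[] words, DataOutput out) throws IOException
    {
        for (long word : words)
            for (int shift = 56; shift >= 0; shift -= 8)
                out.write((int) (word >>> shift) & 0xFF);
    }

    // Bulk path: copy the words into a big-endian buffer and hand the whole
    // backing array to the output in one call, with no per-byte conversions.
    static void writeBulk(long[] words, DataOutput out) throws IOException
    {
        ByteBuffer buf = ByteBuffer.allocate(words.length * Long.BYTES)
                                   .order(ByteOrder.BIG_ENDIAN);
        buf.asLongBuffer().put(words);
        out.write(buf.array());
    }
}
{code}

Both paths produce the same on-disk bytes; the difference is purely the 
number of output calls and conversions per long.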


