[
https://issues.apache.org/jira/browse/CASSANDRA-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753627#comment-13753627
]
Chris Burroughs commented on CASSANDRA-5939:
--------------------------------------------
Ignoring anything I know about the code, as a user I would expect that if I
told Cassandra to use 2 gigabytes of cache it would use about 2 gigabytes,
either all on heap or mixed heap/native depending on which provider was chosen.
I would also expect that the completely on-heap one would be able to keep
somewhat fewer entries, but that both providers would be consistent in their
calculations.
In the first example ConcurrentLinkedHashCacheProvider says each row+overhead
is 2147398344/23217 = 92492 bytes while SerializingCacheProvider says
18417254/221709 = 83 bytes. While Java has overhead, it's not 1000x. I don't
have a jconsole screenshot but I'm pretty sure that *total* heap was < 2 GB
while ConcurrentLinkedHashCacheProvider was saying it was full. Even if
ConcurrentLinkedHashCacheProvider needed 1000x the memory, it should do so
consistently, whereas I was seeing one node with 10k entries while another
had over 400k.
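
For anyone who wants to re-check the arithmetic, here is a rough sketch of the
per-entry calculation. The size and entry figures are copied from the 1.2.8
runs in the issue description; the per-node entry counts are the rounded
values listed there. The variable and function names are made up for this
illustration and are not anything from the Cassandra code base.

```python
# Figures copied from the issue description (1.2.8 runs, 2048 MB row cache).
stats = {
    "ConcurrentLinkedHashCacheProvider": {"size": 2_147_398_344, "entries": 23_217},
    "SerializingCacheProvider": {"size": 18_417_254, "entries": 221_709},
}


def bytes_per_entry(size, entries):
    """Reported cache size divided by the reported number of entries."""
    return size / entries


per_entry = {
    name: bytes_per_entry(s["size"], s["entries"]) for name, s in stats.items()
}
for name, value in per_entry.items():
    print(f"{name}: {value:,.0f} bytes per entry")

# How many times larger the on-heap provider's per-entry figure is.
ratio = (
    per_entry["ConcurrentLinkedHashCacheProvider"]
    / per_entry["SerializingCacheProvider"]
)
print(f"per-entry ratio: {ratio:,.0f}x")

# Rounded entry counts for the five ConcurrentLinkedHashCacheProvider nodes.
node_entries = [12_000, 444_000, 10_000, 25_000, 25_000]
spread = max(node_entries) / min(node_entries)
print(f"largest/smallest node entry count: {spread:.1f}x")
```

The ratio between providers comes out a bit over 1000x, and the spread between
nodes running the same provider is itself more than an order of magnitude.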
> Cache Providers calculate very different row sizes
> --------------------------------------------------
>
> Key: CASSANDRA-5939
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5939
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: 1.2.8
> Reporter: Chris Burroughs
> Assignee: Vijay
>
> Took the same production node and bounced it 4 times comparing version and
> cache provider. ConcurrentLinkedHashCacheProvider and
> SerializingCacheProvider produce very different results resulting in an order
> of magnitude difference in rows cached. In all cases the row cache size was
> 2048 MB. Hit rate is provided for color, but entries & size are the
> important part.
> 1.2.8 ConcurrentLinkedHashCacheProvider:
> * entries: 23,217
> * hit rate: 43%
> * size: 2,147,398,344
> 1.2.8 about 20 minutes of SerializingCacheProvider:
> * entries: 221,709
> * hit rate: 68%
> * size: 18,417,254
> 1.2.5 ConcurrentLinkedHashCacheProvider:
> * entries: 25,967
> * hit rate: ~ 50%
> * size: 2,147,421,704
> 1.2.5 about 20 minutes of SerializingCacheProvider:
> * entries: 228,457
> * hit rate: ~ 70%
> * size: 19,070,315
> A related(?) problem is that the ConcurrentLinkedHashCacheProvider sizes seem
> to be highly variable. Digging up the values for 5 different nodes in the
> cluster using ConcurrentLinkedHashCacheProvider shows a wide variance in
> number of entries:
> * 12k
> * 444k
> * 10k
> * 25k
> * 25k
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira