[ 
https://issues.apache.org/jira/browse/CASSANDRA-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753627#comment-13753627
 ] 

Chris Burroughs commented on CASSANDRA-5939:
--------------------------------------------

Ignoring anything I know about the code, as a user I would expect that if I 
told Cassandra to use 2 gigabytes of cache it would use about two gigabytes, 
either all on heap or mixed heap/native depending on which provider was 
chosen.  I would also expect that the completely on-heap one would be able to 
keep somewhat fewer entries, but that both providers would be consistent in 
their calculations.  

In the first example, ConcurrentLinkedHashCacheProvider says each row+overhead 
is 2147398344/23217 = 92492 bytes, while SerializingCacheProvider says 
18417254/221709 = 83 bytes.  While Java has overhead, it's not 1000x.  I don't 
have a jconsole screenshot, but I'm pretty sure that *total* heap was < 2 GB 
while ConcurrentLinkedHashCacheProvider was saying it was full.  Even if 
ConcurrentLinkedHashCacheProvider needed 1000x the memory, it should do so 
consistently, whereas I was seeing one node with 10k entries while another 
had over 400k.
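
For anyone who wants to reproduce the per-entry arithmetic, here is a minimal 
Python sketch.  The figures are the ones reported in the issue description; 
the names `measurements` and `bytes_per_entry` are purely illustrative and are 
not part of Cassandra.

```python
# Reported cache size (bytes) and entry count for each run, copied from the
# issue description.  Nothing here is measured by this script.
measurements = {
    "1.2.8 ConcurrentLinkedHashCacheProvider": (2_147_398_344, 23_217),
    "1.2.8 SerializingCacheProvider": (18_417_254, 221_709),
    "1.2.5 ConcurrentLinkedHashCacheProvider": (2_147_421_704, 25_967),
    "1.2.5 SerializingCacheProvider": (19_070_315, 228_457),
}


def bytes_per_entry(size_bytes, entries):
    """Average reported size per cached row, overhead included."""
    return size_bytes / entries


per_entry = {
    name: bytes_per_entry(size, entries)
    for name, (size, entries) in measurements.items()
}

for name, value in per_entry.items():
    print(f"{name}: about {value:,.0f} bytes/entry")

# How many times larger the on-heap provider's per-entry figure is (1.2.8).
ratio_128 = (
    per_entry["1.2.8 ConcurrentLinkedHashCacheProvider"]
    / per_entry["1.2.8 SerializingCacheProvider"]
)
print(f"1.2.8 ratio: roughly {ratio_128:,.0f}x")
```

The ratio comes out at a bit over 1000x for both versions, which is the gap 
described above.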
                
> Cache Providers calculate very different row sizes
> --------------------------------------------------
>
>                 Key: CASSANDRA-5939
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5939
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 1.2.8
>            Reporter: Chris Burroughs
>            Assignee: Vijay
>
> Took the same production node and bounced it 4 times, comparing version and 
> cache provider.  ConcurrentLinkedHashCacheProvider and 
> SerializingCacheProvider produce very different results, leading to an order 
> of magnitude difference in rows cached.  In all cases the row cache size was 
> 2048 MB.  Hit rate is provided for color, but entries & size are the 
> important part.
> 1.2.8 ConcurrentLinkedHashCacheProvider:
>  * entries: 23,217
>  * hit rate: 43%
>  * size: 2,147,398,344
> 1.2.8 about 20 minutes of SerializingCacheProvider:
>  * entries: 221,709
>  * hit rate: 68%
>  * size: 18,417,254
> 1.2.5 ConcurrentLinkedHashCacheProvider:
>  * entries: 25,967
>  * hit rate: ~ 50%
>  * size:  2,147,421,704
> 1.2.5 about 20 minutes of SerializingCacheProvider:
>  * entries: 228,457
>  * hit rate: ~ 70%
>  * size: 19,070,315
> A related(?) problem is that the ConcurrentLinkedHashCacheProvider sizes seem 
> to be highly variable.  Digging up the values for 5 different nodes in the 
> cluster using ConcurrentLinkedHashCacheProvider shows a wide variance in 
> number of entries:
>  * 12k
>  * 444k
>  * 10k
>  * 25k
>  * 25k

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
