[
https://issues.apache.org/jira/browse/CASSANDRA-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754387#comment-13754387
]
Vijay commented on CASSANDRA-5939:
----------------------------------
{quote}
While java has overhead, it's not...
{quote}
Well, try the following code in CacheProviderTest:
{code}
@Test
public void testCompareSizes() throws IOException
{
    RowCacheKey key = new RowCacheKey(UUID.randomUUID(), ByteBufferUtil.bytes("test"));
    ColumnFamily cf = createCF();

    // On-heap sizes as reported by memorySize()
    System.out.println("size:" + (key.memorySize() + cf.memorySize()));
    System.out.println("key size:" + key.memorySize());
    System.out.println("value size:" + cf.memorySize());

    // Serialized size of the same ColumnFamily
    RowCacheSerializer serializer = new RowCacheSerializer();
    DataOutputBuffer out = new DataOutputBuffer();
    serializer.serialize(cf, out);
    System.out.println("ser size:" + out.getLength());

    // Round-trip check: the deserialized value must equal the original
    IRowCacheEntry cf2 = serializer.deserialize(new DataInputStream(new ByteArrayInputStream(out.getData())));
    Assert.assertEquals(cf, cf2);
}
{code}
Output (the value/CF overhead comes from memorySize(), which actually uses JAMM's measureDeep()):
{code}
size:74120
key size:48
value size:74072
ser size:66
{code}
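To put those numbers side by side, here is a minimal sketch of the ratio between the deep heap size and the serialized size of the same value. The class and method names are hypothetical and not part of Cassandra; the constants are copied from the output above.

```java
// Hypothetical helper, not part of Cassandra: compares the deep heap size
// reported by memorySize() with the serialized size of the same value.
public class CacheSizeRatio {
    /** Ratio of on-heap (deep) size to serialized size. */
    static double heapToSerializedRatio(long heapBytes, long serializedBytes) {
        if (serializedBytes <= 0) {
            throw new IllegalArgumentException("serializedBytes must be positive");
        }
        return (double) heapBytes / serializedBytes;
    }

    public static void main(String[] args) {
        long valueHeapBytes = 74072; // "value size" from the test output
        long serializedBytes = 66;   // "ser size" from the test output
        System.out.printf("heap/serialized ratio: %.1f%n",
                heapToSerializedRatio(valueHeapBytes, serializedBytes));
    }
}
```

For this test CF the deep measurement is roughly three orders of magnitude larger than the serialized form.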
I am just trying to figure out whether there is a bug I am missing or overlooking. I
agree that we need a configuration for the key size in the JVM heap to
contain OOMs etc.
We can use this ticket to solve that issue. I do understand that we have removed
CLHM in 2.0, so we can concentrate on getting a better configuration for SC.
> Cache Providers calculate very different row sizes
> --------------------------------------------------
>
> Key: CASSANDRA-5939
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5939
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: 1.2.8
> Reporter: Chris Burroughs
> Assignee: Vijay
>
> Took the same production node and bounced it 4 times comparing version and
> cache provider. ConcurrentLinkedHashCacheProvider and
> SerializingCacheProvider produce very different results resulting in an order
> of magnitude difference in rows cached. In all cases the row cache size was
> 2048 MB. Hit rate is provided for color, but entries & size are the
> important part.
> 1.2.8 ConcurrentLinkedHashCacheProvider:
> * entries: 23,217
> * hit rate: 43%
> * size: 2,147,398,344
> 1.2.8 about 20 minutes of SerializingCacheProvider:
> * entries: 221,709
> * hit rate: 68%
> * size: 18,417,254
> 1.2.5 ConcurrentLinkedHashCacheProvider:
> * entries: 25,967
> * hit rate: ~ 50%
> * size: 2,147,421,704
> 1.2.5 about 20 minutes of SerializingCacheProvider:
> * entries: 228,457
> * hit rate: ~ 70%
> * size: 19,070,315
> A related(?) problem is that the ConcurrentLinkedHashCacheProvider sizes seem
> to be highly variable. Digging up the values for 5 different nodes in the
> cluster using ConcurrentLinkedHashCacheProvider shows a wide variance in
> number of entries:
> * 12k
> * 444k
> * 10k
> * 25k
> * 25k
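The gap in the description is easier to see as an average size per cached entry. The following is a rough sketch, assuming the reported "size" is in bytes. The helper name is hypothetical and the figures are the 1.2.5 numbers quoted above.

```java
// Hypothetical helper, not part of Cassandra: average reported cache size
// per entry, using the figures from the issue description.
public class BytesPerEntry {
    /** Integer average of reported cache bytes per cached entry. */
    static long bytesPerEntry(long cacheSizeBytes, long entries) {
        if (entries <= 0) {
            throw new IllegalArgumentException("entries must be positive");
        }
        return cacheSizeBytes / entries;
    }

    public static void main(String[] args) {
        // 1.2.5 ConcurrentLinkedHashCacheProvider: size 2,147,421,704 with 25,967 entries
        System.out.println("CLHM bytes/entry: " + bytesPerEntry(2_147_421_704L, 25_967));
        // 1.2.5 SerializingCacheProvider: size 19,070,315 with 228,457 entries
        System.out.println("SC bytes/entry: " + bytesPerEntry(19_070_315L, 228_457));
    }
}
```

On these figures the two providers account for about 82 KB and about 83 bytes per entry respectively. The ConcurrentLinkedHashCacheProvider figure is in the same range as the deep value size measured in the comment above (74,072 bytes).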