[jira] [Updated] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6609:


Attachment: tmp2.patch

I'm attaching a patch that eliminates all garbage in this code path and also
improves performance by around 10-15%, by using a ThreadLocal long[].

In my book this is a suboptimal solution: it's not clear that the performance
boost will survive in a normally running system, where the ThreadLocal access
will probably be more costly. But as things stand this is the only way to
eliminate all garbage, and I think that is paramount for this method. It's
unlikely the ThreadLocal lookup will become dramatically more costly, so I
think this is the best bet until stack allocation can be improved.
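
For readers without the attachment, here is a minimal sketch of the general technique described above: reusing one long[] per thread instead of allocating a fresh one on every lookup. It is not the contents of tmp2.patch. The class name, method name, MAX_HASH_COUNT and the index formula are all assumptions made for illustration.

```java
final class ReusableBuckets {
    // Hypothetical upper bound on the number of hash functions; not from the patch.
    private static final int MAX_HASH_COUNT = 32;

    // One scratch array per thread, created lazily on first use.
    private static final ThreadLocal<long[]> SCRATCH = new ThreadLocal<long[]>() {
        @Override
        protected long[] initialValue() {
            return new long[MAX_HASH_COUNT];
        }
    };

    // Fills the per-thread array with hashCount bucket indexes in [0, numBits).
    // The returned array is shared by every call on this thread, so the caller
    // must finish reading it before calling again.
    static long[] fillBuckets(long hash1, long hash2, int hashCount, long numBits) {
        if (hashCount > MAX_HASH_COUNT) {
            throw new IllegalArgumentException("hashCount exceeds scratch array size");
        }
        long[] buckets = SCRATCH.get();
        long h = hash1;
        for (int i = 0; i < hashCount; i++) {
            buckets[i] = Math.abs(h % numBits); // assumed index formula
            h += hash2;
        }
        return buckets;
    }
}
```

The trade-off is the one noted above: the steady state allocates nothing, but every lookup pays for a ThreadLocal.get(), and only the first hashCount entries of the returned array are meaningful.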

 Reduce Bloom Filter Garbage Allocation
 --

 Key: CASSANDRA-6609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
 Attachments: tmp.diff, tmp2.patch


 Just spotted that we allocate potentially large amounts of garbage on bloom 
 filter lookups, since we allocate a new long[] for each hash() and to store 
 the bucket indexes we visit, in a manner that guarantees they are allocated 
 on heap. With a lot of sstables and many requests, this could easily be 
 hundreds of megabytes of young gen churn per second.
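
 A back-of-the-envelope sketch of how the churn estimate above could arise. Every input here is an assumption chosen for illustration, not a measurement: a 16-byte array header, a two-element long[] for the hash, 10 hash functions, 20 sstables consulted per read, and 100,000 reads per second.

```java
final class ChurnEstimate {
    // Assumed size of a long[]: 16-byte header plus 8 bytes per element.
    static long arrayBytes(int elements) {
        return 16L + 8L * elements;
    }

    // Bytes of short-lived arrays allocated per second across all lookups.
    static long bytesPerSecond(int hashCount, int sstablesPerRead, long readsPerSecond) {
        // One array for the hash (assumed long[2]) and one for the bucket indexes.
        long perLookup = arrayBytes(2) + arrayBytes(hashCount);
        return perLookup * sstablesPerRead * readsPerSecond;
    }

    public static void main(String[] args) {
        long rate = bytesPerSecond(10, 20, 100_000L);
        System.out.println(rate / 1_000_000 + " MB/s"); // prints "256 MB/s"
    }
}
```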



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-27 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6609:


Attachment: tmp3.patch

Uh, that is if it weren't for an awful fat finger error. Fixed, and also 
reintroduced a deoptimised public getHashBuckets for use by the unit tests 
(deoptimised because it permits far more hashes than we ever do for realz, so I 
skip the ThreadLocal so we don't have to allocate an array that large).
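
A rough sketch of the split described above, with a reusing fast path and an allocating entry point for tests. Only the name getHashBuckets comes from this comment. Its signature, the MAX_FAST_HASH_COUNT bound and the index formula are assumptions, and this is not the code in tmp3.patch.

```java
final class HashBuckets {
    // Hypothetical cap on hash count for the fast path; not from the patch.
    private static final int MAX_FAST_HASH_COUNT = 20;

    private static final ThreadLocal<long[]> REUSED = new ThreadLocal<long[]>() {
        @Override
        protected long[] initialValue() {
            return new long[MAX_FAST_HASH_COUNT];
        }
    };

    // Fast path: no allocation, but the result is only valid until the next
    // call on the same thread, and only the first hashCount entries are set.
    static long[] indexes(long hash1, long hash2, int hashCount, long max) {
        if (hashCount > MAX_FAST_HASH_COUNT) {
            throw new IllegalArgumentException("too many hashes for the fast path");
        }
        long[] out = REUSED.get();
        fill(out, hash1, hash2, hashCount, max);
        return out;
    }

    // Test-facing path: accepts any hashCount and allocates a fresh,
    // exactly sized array on each call, skipping the ThreadLocal.
    public static long[] getHashBuckets(long hash1, long hash2, int hashCount, long max) {
        long[] out = new long[hashCount];
        fill(out, hash1, hash2, hashCount, max);
        return out;
    }

    private static void fill(long[] out, long base, long inc, int count, long max) {
        for (int i = 0; i < count; i++) {
            out[i] = Math.abs(base % max); // assumed index formula
            base += inc;
        }
    }
}
```

Keeping the large hash counts on the allocating path means the per-thread array only needs to be as large as production lookups require.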

 Reduce Bloom Filter Garbage Allocation
 --

 Key: CASSANDRA-6609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
 Attachments: tmp.diff, tmp2.patch, tmp3.patch







[jira] [Updated] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation

2014-01-21 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6609:


Attachment: tmp.diff

I've attached a quick and simple patch that reduces garbage by a factor of 6,
but also slightly increases the cost of a bloom filter lookup (from roughly
600ns to 750ns). This is suboptimal, and I'm not necessarily suggesting we use
this patch as it stands. I attempted to coax the VM into allocating the arrays
on the stack, so we can still benefit from whatever loop unrolling optimisations
are kicking in with the original code, but I failed in my initial attempt. I
will have another look soon to see if we can get the best of both worlds.

 Reduce Bloom Filter Garbage Allocation
 --

 Key: CASSANDRA-6609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6609
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
 Attachments: tmp.diff




