[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-21 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285916#comment-14285916
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

If you turn on the OOME throwing in C*, I am +1.

I did a quick performance test with the cache and compared it to the
SerializingCache. I didn't test a scenario where it would be better/faster, but
the performance looked just as good. It was a very noisy test with different
results every time I restarted, so maybe not a great way to measure.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Robert Stupp
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch, tests.zip


 Currently SerializingCache is only partially off heap; keys are still stored
 in the JVM heap as ByteBuffers.
 * GC costs are higher for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better
 results, but this requires careful tuning.
 * Memory overhead for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off
 heap and use JNI to interact with the cache. We want to ensure that the new
 implementation matches the existing API (ICache), and the implementation
 needs to have safe memory access, low memory overhead and as few memcpys as
 possible.
 We might also want to make this cache configurable.





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-20 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284416#comment-14284416
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. I am +1 conditional on the library throwing OOME if the allocator fails.

Will add a configuration switch to explicitly enable this behavior.

bq. There are also still some internal properties inside OHC that don't have a
prefix.

Yes, debug-mode and disable-jemalloc did not have the prefix. Will be changed.

bq. I noticed you fixed some C* bugs ... need to be backported?

It's only the {{==}} to {{equals}} change in 
{{ColumnFamilyStore.cleanupCache}}. It's not necessary to fix it for older 
versions, since the {{UUID}} instance is taken from {{CFMetaData}} - so the 
{{==}} is (was) correct.

bq. Can you publish a new version to maven central so I can benchmark it vs the 
old cache implementation?

OHC 0.3 and 0.3.1 are on Maven Central.
Note: OHC 0.3.1 incorporates the changes above (it might not be found via Maven
Central search, but the artifacts are there).
The C* git branch has been updated to use 0.3.1.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-20 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284230#comment-14284230
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

I am +1 conditional on the library throwing OOME if the allocator fails. It
should be the caller of the library that decides how to handle the situation,
not the library, IMO.

There are also still some internal properties inside OHC that don't have a
prefix.

I noticed you fixed some C* bugs 
https://github.com/snazy/cassandra/compare/7438-pluggable#diff-98f5acb96aa6d684781936c141132e2aL1915
 
Do those fixes need to be backported?

Can you publish a new version to maven central so I can benchmark it vs the old 
cache implementation?



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-19 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14282467#comment-14282467
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

I think the best alternative for accessing malloc/free is probably {{Unsafe}}
with jemalloc in LD_PRELOAD. The native code of {{Unsafe.allocateMemory}} is
basically just a wrapper around {{malloc()}}/{{free()}}.
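
As a rough illustration (not the actual OHC code), allocating and freeing
off-heap memory through {{Unsafe}} looks roughly like this; with jemalloc in
LD_PRELOAD these calls end up in jemalloc's {{malloc()}}/{{free()}}:
{code}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public final class UnsafeAllocExample
{
    public static void main(String[] args) throws Exception
    {
        // grab the Unsafe singleton via reflection (there is no public accessor)
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        // allocate 64 bytes off heap - the native code essentially calls malloc()
        long address = unsafe.allocateMemory(64);
        try
        {
            unsafe.putLong(address, 42L);               // write into the allocation
            System.out.println(unsafe.getLong(address));
        }
        finally
        {
            unsafe.freeMemory(address);                 // essentially calls free()
        }
    }
}
{code}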

Updated the git branch with the following changes:
* update to OHC 0.3
* benchmark: add a new command line option to specify the key length (-kl)
* free capacity handling moved to the segments
* allow specifying the preferred memory allocation via the system property
org.caffinitas.ohc.allocator
* allow specifying defaults for OHCacheBuilder via system properties prefixed
with org.caffinitas.ohc.
* benchmark: make metrics local to the driver threads
* benchmark: disable the bucket histogram in stats by default

I did not change the default number of segments (2 * CPUs), but I thought
about it (since you observed that 256 segments on c3.8xlarge gives some
improvement). A naive approach of, say, 8 * CPUs feels too heavy for small
systems (with one socket) and might be too much outside of benchmarking. If
someone wants to get the most out of it in production and really hits the
number of segments, they can always configure it accordingly. WDYT?

Using jemalloc on Linux via LD_PRELOAD is probably the way to go in C* (since 
off-heap is also used elsewhere).
I think we should leave the OS allocator on OSX.
Don't know much about allocator performance on Windows.

For now I do not plan any new features for C* - so maybe we should start a
final review round?



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280945#comment-14280945
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

I ran the benchmark on the develop branch today using a c3.8xlarge and profiled
with flight recorder. There is definitely some contention on the lock in JNA. I
also see a little in AbstractQueuedSynchronizer from locking the segments,
along with some park/unpark activity.

I built jemalloc (-march=native --disable-fill --disable-stats). The Ubuntu
package compiles at -O2 instead of -O3. I am getting full utilization across 30
threads if I increase the number of segments to 256; otherwise it hovers around
2600% (with 30 threads). It cuts the number of instances of contention in the
profiler in half.

The workload settings you ran with resulted in a lot of cache (ohcache, not CPU 
cache) misses. I think a real workload where the cache is useful will have more 
hits.

One note about the benchmark: building the histogram of buckets is not a
lightweight operation. I think that should be off by default; I removed it for
my testing. Otherwise it looks OK. Using the Timer as shared state in a
micro-benchmark is probably not the way to go. I would have a timer per driver
thread and then aggregate.
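
A minimal sketch of that idea (hypothetical names, not the ohc-benchmark code):
each driver thread owns its own Timer, and results are only merged after the run:
{code}
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public final class PerThreadTimers
{
    public static void main(String[] args) throws Exception
    {
        MetricRegistry registry = new MetricRegistry();
        int driverThreads = 4;                          // assumed driver thread count
        Timer[] timers = new Timer[driverThreads];
        Thread[] threads = new Thread[driverThreads];

        for (int i = 0; i < driverThreads; i++)
        {
            // one timer per driver thread - no shared mutable state on the hot path
            Timer timer = registry.timer("reads-thread-" + i);
            timers[i] = timer;
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 100_000; n++)
                {
                    Timer.Context ctx = timer.time();
                    // ... perform one cache operation here ...
                    ctx.stop();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads)
            t.join();

        // aggregate only after the run, e.g. sum the counts and report the
        // per-thread latency snapshots
        long total = 0;
        for (Timer t : timers)
            total += t.getCount();
        System.out.println("total timed operations: " + total);
    }
}
{code}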

I am running 1-30 threads and it will take a few hours to finish. I am going to 
look into benchmarking inside C* and comparing the existing cache 
implementation to OHC now.

I used this which gave me mostly cache hits and filled up quite a bit of RAM. 
It takes a minute or two to fill the cache.
{noformat}
#!/bin/sh
LD_PRELOAD=~/jemalloc-3.6.0/lib/libjemalloc.so.1 \
java -Xmx8g -XX:+UnlockCommercialFeatures -XX:+FlightRecorder \
  -DDISABLE_JEMALLOC=true \
  -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=7091 \
  -Dcom.sun.management.jmxremote.local.only=false \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=ec2-54-172-234-230.compute-1.amazonaws.com \
  -jar ohc-benchmark/target/ohc-benchmark-0.3-SNAPSHOT.jar \
  -rkd 'gaussian(1..1500,2)' -wkd 'gaussian(1..1500,2)' -vs 'gaussian(1024..4096,2)' -r .9 -cap 320 \
  -d 120 -t 30 \
  -sc 256
{noformat}

256 segments, jemalloc LD_PRELOAD, -DDISABLE_JEMALLOC=true
{noformat}
 Reads : one/five/fifteen/mean:  2503894/2143858/2036336/2459949
 count:   295258886
 min/max/mean/stddev:     0.00047/ 0.76172/ 0.00652/ 0.03865
 75/95/98/99/999/median:  0.00439/ 0.00697/ 0.01147/ 0.03458/ 0.75864/ 0.00342
 Writes: one/five/fifteen/mean:  278134/238242/226326/273275
 count:    32800525
 min/max/mean/stddev:     0.00176/ 0.89665/ 0.00945/ 0.03986
 75/95/98/99/999/median:  0.00719/ 0.01180/ 0.01816/ 0.11640/ 0.89006/ 0.00556
{noformat}

256 segments, jemalloc via jna
{noformat}
 Reads : one/five/fifteen/mean:  2343872/1458688/1159829/2387622
 count:   286635526
 min/max/mean/stddev:     0.00054/ 0.97114/ 0.00756/ 0.04664
 75/95/98/99/999/median:  0.00435/ 0.00675/ 0.00985/ 0.05139/ 0.95959/ 0.00341
 Writes: one/five/fifteen/mean:  260376/162076/128883/265250
 count:    31843705
 min/max/mean/stddev:     0.00267/ 0.70586/ 0.01502/ 0.05161
 75/95/98/99/999/median:  0.01049/ 0.01695/ 0.04193/ 0.36639/ 0.70331/ 0.00859
{noformat}

default segments, jemalloc LD_PRELOAD, -DDISABLE_JEMALLOC=true
{noformat}
 Reads : one/five/fifteen/mean:  2148677/1630379/1448226/2202878
 count:   264549288
 min/max/mean/stddev:     0.00035/ 0.66081/ 0.00820/ 0.03519
 75/95/98/99/999/median:  0.00435/ 0.01247/ 0.05423/ 0.20834/ 0.65286/ 0.00323
 Writes: one/five/fifteen/mean:  238699/180945/160641/244767
 count:    29395103
 min/max/mean/stddev:     0.00172/ 0.39821/ 0.01120/ 0.03079
 75/95/98/99/999/median:  0.00805/ 0.02124/ 0.08665/ 0.18473/ 0.39776/ 0.00574
{noformat}


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-13 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275809#comment-14275809
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. Making freeCapacity a per-segment field: Then I'd prefer to reverse it -
i.e. add an allocatedBytes field to each segment. Operations would get the
(calculated) free capacity as a parameter and act on that value. Or were you
thinking about dividing the capacity by the number of segments and using that
as the max capacity for each segment?
The goal of splitting the locks and tables into segments is to eliminate any
globally shared cache lines that are written to by common operations on the
cache. Having every put modify the free capacity introduces a potential point
of contention. The only way to really understand the impact is to have a good
micro-benchmark you trust and try it both ways.

I think that you would split the capacity across the segments. Do exactly what 
you are doing now, but do the check inside the segment. Since puts are allowed 
to fail I don't think you have to do anything else.
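
A minimal sketch of that idea (hypothetical class and field names, not OHC's
actual code): each segment gets capacity/segmentCount as its own budget and
checks/evicts under its own lock, so no global counter is touched on the hot
path:
{code}
final class Segment
{
    private final long maxCapacity;      // total capacity / number of segments
    private long freeCapacity;           // only mutated while holding this segment's lock

    Segment(long maxCapacity)
    {
        this.maxCapacity = maxCapacity;
        this.freeCapacity = maxCapacity;
    }

    // returns false if the entry cannot be stored - puts are allowed to fail
    synchronized boolean put(long entryBytes)
    {
        if (entryBytes > maxCapacity)
            return false;                // entry larger than this segment's budget
        while (freeCapacity < entryBytes)
        {
            if (!evictOneEntry())        // evict LRU entries from this segment only
                return false;
        }
        freeCapacity -= entryBytes;
        // ... insert into this segment's off-heap table ...
        return true;
    }

    synchronized void remove(long entryBytes)
    {
        freeCapacity += entryBytes;
        // ... unlink and free the entry ...
    }

    private boolean evictOneEntry()
    {
        // ... remove this segment's least recently used entry, if any ...
        return false;                    // placeholder
    }
}
{code}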

bq. Regarding rehash/iterators: this could simply be worked around by counting
the number of active iterators and not rehashing while an iterator is active.
That's better than e.g. returning duplicate keys or omitting keys entirely, for
people relying on that functionality.
This is an OHC issue, not a C* issue. I think from C*'s perspective it can
occasionally be wrong and it doesn't matter, since it doesn't affect
correctness. Definitely worth documenting, though.

bq. I lean towards removing the new tables implementation in OHC. It has the
big drawback that it only allows a specific number of entries per bucket
(e.g. 8). But I'd like to defer that decision until after some tests on a NUMA
machine.
You are on to something in terms of making a faster hash table, but it doesn't
seem like a huge win given the short length of most chains (1 or 2) and the
overhead of the allocator, locking, etc. It would show up in a micro-benchmark,
but not in C*. I would like to stick with the linked implementation for C* for
now since it's easy to understand and I've looked at it a few times.
 
I think I already sent you a link to this 
https://www.cs.cmu.edu/~dga/papers/silt-sosp2011.pdf but there are a lot of 
ideas there for dense hash tables. You can chain together multiple buckets so 
the entries per bucket becomes a function of cache line size.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275047#comment-14275047
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Thanks for the review. Many useful hints in it :)

I'll reduce the configuration stuff in the C* integration and add a
default-by-system-property mechanism, both as suggested.

Making freeCapacity a per-segment field:
Then I'd prefer to reverse it - i.e. add an allocatedBytes field to each
segment. Operations would get the (calculated) free capacity as a parameter and
act on that value. Or were you thinking about dividing the capacity by the
number of segments and using that as the max capacity for each segment?

Regarding rehash/iterators: this could simply be worked around by counting the
number of active iterators and not rehashing while an iterator is active.
That's better than e.g. returning duplicate keys or omitting keys entirely, for
people relying on that functionality.

I just started JMH without any additional parameters. It's called during the
Maven test phase (unless you specify -DskipTests).

You're right. Murmur3 + UTF8 need more tests.

Didn't notice that fastutil is that fat. I've already replaced it with my own
implementation.

I lean towards removing the new tables implementation in OHC. It has the big
drawback that it only allows a specific number of entries per bucket (e.g. 8).
But I'd like to defer that decision until after some tests on a NUMA machine.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-12 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274005#comment-14274005
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

If you go all the way down the JMH rabbit hole you don't need to do any of your 
own timing and JMH will actually do some smart things to give you accurate 
timing and ameliorate the impact of non-scalable/expensive timing measurement. 
Metrics uses System.nanoTime() internally so it isn't really any better as far 
as I can tell. System.nanoTime() on Linux is pretty scalable 
http://shipilev.net/blog/2014/nanotrusting-nanotime/. When I tested it in JMH 
it actually seemed to be linearly scalable, but JMH will solve that for you 
even on platforms where nanoTime is finicky.
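
For illustration, a minimal JMH benchmark looks roughly like this (a generic
sketch against a ConcurrentHashMap, not the OHC benchmark itself); JMH handles
warmup, measurement and aggregation without any hand-rolled nanoTime() calls:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class MapGetBenchmark
{
    private final ConcurrentHashMap<Integer, byte[]> map = new ConcurrentHashMap<>();

    @Setup
    public void populate()
    {
        // pre-populate so the benchmark measures lookups, not misses
        for (int i = 0; i < 100_000; i++)
            map.put(i, new byte[1024]);
    }

    @Benchmark
    public byte[] get()
    {
        // JMH times this method itself - no manual timing code needed
        return map.get(ThreadLocalRandom.current().nextInt(100_000));
    }
}
{code}
Thread count, forks and iteration counts then become JMH command line options
on the generated benchmarks jar.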

The C* integration looks good. I'm glad it was easy. When it comes to exposing
configuration parameters, less is more.

The stress tool, when used without workload profiles, does some validation. It
checks that values are there and that the contents are correct.

I did not know about the JNA synchronized block. That is surprising, but I am
glad to hear it is getting fixed. For access to jemalloc I recommend using
Unsafe and LD_PRELOAD-ing jemalloc. I think that would be the recommended
approach and the one you should benchmark against, with JNA there as a
fallback. That gives you a JNI call for allocation/deallocation.

I am trying out the JMH benchmark and looking at the new linked implementation 
right now. How are you starting the JMH benchmark?




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-12 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274409#comment-14274409
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

I did another review. The additional test coverage looks great.

Don't throw Error; throw runtime exceptions on things like serialization
issues. The only place it makes sense to throw an Error is when allocating
memory fails. That would match the behavior of ByteBuffer.allocateDirect. I
don't see failure to allocate from the heap allocator as recoverable, even in
the context of the cache. IOError is thrown from one place in the entire JDK
(Console), so it's an odd choice.

freeCapacity should really be a field inside each segment, and full/not-full
and eviction decisions should be made inside each segment independently. In
practice inside C* it's probably fine as just an AtomicLong, but I want to see
OHC be all it can be.

The rehash test could validate the data after the rehash. It could also
validate the rehash under concurrent access, say by having a reader thread
randomly accessing already-inserted values. I can't tell if the crosscheck test
inserts enough values to trigger rehashing.

Inlining the murmur3 changes makes me a little uncomfortable. It's good to see
some test coverage comparing with another implementation, but it's over a small
set of data. It seems like the unsigned arithmetic necessary to perfectly mimic
the native version of murmur3 is missing?

Add 2-4 byte code points to the UTF-8 tests.

FastUtil is a 17 megabyte dependency all to get one array list.

The cross checking implementation is really nice.

Looking at the AbstractKeyIterator, I don't see how it can do the right thing
when a segment rehashes. It will point to a random spot in the segment after a
rehash, right? In practice maybe this doesn't matter, since the segments should
size up promptly and it's just an optimization that we dump this stuff at all.
I can understand what the current code does, so I lean towards keeping it.

There are a couple of places (serializeForPut, putInternal, maybe others) where
there are two exception handlers that each de-allocate the same piece of
memory. The deallocation could go in a finally block instead of the exception
handlers, since it always happens.
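
A sketch of the suggested pattern (hypothetical helper names standing in for
OHC's allocation/serialization code), assuming the temporary allocation is
always released regardless of outcome:
{code}
public final class FreeInFinallyExample
{
    // hypothetical stand-ins for the real allocation and serialization helpers
    static long allocate(long bytes)       { return 1L; }
    static void free(long address)         { /* release the off-heap memory */ }
    static void serialize(long address)    { /* may throw on a serialization issue */ }
    static void insertEntry(long address)  { /* may throw if the insert fails */ }

    static boolean put()
    {
        long tmp = allocate(128);
        try
        {
            serialize(tmp);       // previously: a catch block freed tmp here
            insertEntry(tmp);     // previously: another catch block freed tmp here
            return true;
        }
        finally
        {
            free(tmp);            // one deallocation site covering every exit path
        }
    }

    public static void main(String[] args)
    {
        System.out.println(put());
    }
}
{code}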




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-09 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14271745#comment-14271745
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Note: OHC now has cache-loader support (https://github.com/snazy/ohc/issues/3).
It could be an alternative to RowCacheSentinel.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-07 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267558#comment-14267558
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

BTW: Is there any single-node-cluster test that has been used to test the 'old'
row cache, or a test that runs against a single-node cluster and verifies the
data being written during a long run - i.e. several hours?



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265947#comment-14265947
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

The latest (just checked in) benchmark implementation gives much better
results. Using
{{com.codahale.metrics.Timer#time(java.util.concurrent.Callable<T>)}}
eliminates the use of {{System.nanoTime()}} or
{{ThreadMXBean.getCurrentThreadCpuTime()}} - it can directly use its internal
clock.
The benchmark {{java -jar ohc-benchmark/target/ohc-benchmark-0.2-SNAPSHOT.jar
-rkd 'gaussian(1..2000,2)' -wkd 'gaussian(1..2000,2)' -vs
'gaussian(1024..4096,2)' -r .9 -cap 16 -d 30 -t 30}} improved from 800k
reads to 3.3M reads per second (w/ 8 cores). So yes - the benchmark was
measuring its own mad code. Because of that, I edited my previous comment
containing the benchmark results, since those are invalid now.
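
For reference, the {{Timer#time(Callable)}} form wraps one operation per call,
roughly like this (hypothetical cache and key names):
{code}
import java.util.concurrent.ConcurrentHashMap;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public final class TimerCallableExample
{
    public static void main(String[] args) throws Exception
    {
        MetricRegistry registry = new MetricRegistry();
        Timer readTimer = registry.timer("reads");

        // stand-in for the cache under test
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("key", "value");

        // Timer#time(Callable<T>) measures the call using the Timer's own clock,
        // so the benchmark code never touches System.nanoTime() directly
        String value = readTimer.time(() -> cache.get("key"));

        System.out.println(value + " / timed ops: " + readTimer.getCount());
    }
}
{code}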

I've added a (still simple) JMH benchmark as a separate module. This one can
cause high system CPU usage at operation rates of 2M per second or more (8
cores). I think these rates are really fine.

Note: these rates cannot be achieved in production, since there you'll
obviously have to pay for (de)serialization, too.

So we want to address these topics as follow-ups:
* own off-heap allocator
* C* ability to access off-heap cached rows
* C* ability to serialize hot keys directly from off-heap (might be a minor win
since it's not triggered that often)
* per-table knob to control whether to add to the row cache on writes -- I
strongly believe that this is a useful feature (maybe LHF) on workloads where
read and written data work on different (row) keys
* investigate if the counter-cache can benefit
* investigate if the key-cache can benefit

bq. You could start with it outside and publish to maven central and if there
is an issue getting patches applied quickly we can always fork it in C*.
OK

bq. pluggable row cache
Then I'll start with that - just make the row cache pluggable and the
implementation configurable.

Note: JNA has a synchronized block that's executed at every call - version 
4.2.0 fixes this (don't know when it will be released).



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266809#comment-14266809
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

OHC works in Cassandra:
* unit tests pass ({{ant test}}, no difference against trunk)
* get and put verified in the debugger with a (simple) table
* row cache saving and loading work, too



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-02 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263083#comment-14263083
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

I went to run the benchmark myself and I noticed you used a uniform 
distribution for the keys. I don't think that makes sense for testing a cache 
where the primary benefit is going to be from cacheable access patterns. I 
would use extreme with .6 or .5 for the shape.

I am also confused by the benchmark implementation. There are threads 
generating the tasks and then handing them off to other threads for execution. 
This means the benchmark is measuring unrelated things like the performance of 
the queue used for receiving tasks and returning results as well as the general 
design of the harness. It makes me wonder if that is the source of the 
under-utilization issue.

I think this might work well as a JMH benchmark and the parameterization would 
make it easy to put together a full test matrix that anyone can run with one 
command.

I tried to run it and it seems to go for longer than expected. I specified -d 
300 and it is still going. The benchmark is doing work according to top.

I ran on a c3.8xlarge using the RightScale 14.1 base server template running
Ubuntu 14.04 and Oracle JDK 8u25; I got jemalloc from the libjemalloc1 package.
I cloned OHC today and, after running mvn package, ran the benchmark using
bq. java -jar ohc-benchmark/target/ohc-benchmark-0.2-SNAPSHOT.jar -rkd 'gaussian(1..2000,2)' -wkd 'gaussian(1..2000,2)' -vs 'gaussian(1024..4096,2)' -r .9 -cap 160 -d 300 -t 30 -dr 8




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-31 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262472#comment-14262472
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

I have an in-progress response to your earlier comment. I'll address the
benchmark here.
 
I wouldn't sweat allocator performance. Ultimately we will have to have our own 
if only to accurately enforce memory utilization (user asks for 200 megabytes, 
we use 400, not cool). I think the blueprint for how to do this already exists 
in something like memcached in terms of how to allocate and defragment. We just 
need to adapt it for our approach where it is a pool of independently locked 
hash tables.
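
For illustration only (not memcached's or OHC's actual code), the core of such
a slab-style allocator is a free list per fixed size class, so freed chunks are
reused instead of fragmenting the native heap:
{code}
import java.util.ArrayDeque;

public final class SlabSketch
{
    static final int[] SIZE_CLASSES = { 64, 128, 256, 512, 1024, 2048, 4096 };

    // one free list of reusable chunk addresses per size class (addresses faked here)
    final ArrayDeque<Long>[] freeLists;
    long nextFakeAddress = 1;

    @SuppressWarnings("unchecked")
    SlabSketch()
    {
        freeLists = new ArrayDeque[SIZE_CLASSES.length];
        for (int i = 0; i < freeLists.length; i++)
            freeLists[i] = new ArrayDeque<>();
    }

    static int classFor(int bytes)
    {
        for (int i = 0; i < SIZE_CLASSES.length; i++)
            if (bytes <= SIZE_CLASSES[i])
                return i;
        throw new IllegalArgumentException("request too large for the slab classes");
    }

    long allocate(int bytes)
    {
        Long reused = freeLists[classFor(bytes)].poll();
        return reused != null ? reused : nextFakeAddress++;   // real code would carve a new chunk
    }

    void free(long address, int bytes)
    {
        freeLists[classFor(bytes)].offer(address);            // chunk returns to its size class
    }

    public static void main(String[] args)
    {
        SlabSketch slabs = new SlabSketch();
        long a = slabs.allocate(100);                         // falls into the 128-byte class
        slabs.free(a, 100);
        System.out.println(a == slabs.allocate(100));         // true: the freed chunk is reused
    }
}
{code}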

The overhead of copying is where zero deserialization and ref-counting start to 
be a win since you don't have to copy at all. I wouldn't get worked up on 
optimizing for that yet since that requires upstream to be smarter about how it 
uses the cache. If upstream can parse the cache value and extract a subset 
without copying the entire thing it will handle larger values more gracefully. 
At some point upstream might also hold partial rows as well.

I would like to see the ability to spin all cores against the cache, at least 
for relatively small values. Not being able to do that is a little concerning. 
Are threads blocking inside the allocator? Do the utilization issues occur with 
large or small values?

I don't have a real baseline to tell whether these numbers are good or bad.
They sound okay, and as you say, you would expect the allocator to be one of
the slowest parts. I am not sure testing with 500 threads is realistic, since
threads have a pretty good chance of being descheduled while holding a lock,
and that isn't as likely to happen under real usage conditions. I would test
with, say, 30 threads on that hardware.

For say 16k values measuring scaling from 1-30 threads would give us an idea of 
how well things are going. That would also give you better feedback on whether 
different numbers of stripes help or not.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-31 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262487#comment-14262487
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. Whether to migrate the whole OHC code into the org.apache.cassandra
codebase (with the option to either turn it on or off).
I am open to either. I asked Benedict and he prefers having it inside C* so we
can patch it. The advantage of having it outside is that it might see use
elsewhere and get additional eyes/contributions. You could start with it
outside and publish to Maven Central, and if there is an issue getting patches
applied quickly we can always fork it in C*.

bq. Whether to implement a “pluggable row cache“ (to allow multiple
implementations)
I think that we aren't going to need multiple cache implementations in the long
run. It seems like we should be able to have one that can be configured to have
the desired behavior. Benedict doesn't feel strongly about it either. If Vijay
wants to continue working on another implementation, then we would want to keep
it pluggable the way it currently is.

It looks like the KeyCache and CounterCache both use a different implementation 
and not SerializingCache. I am not clear on why they don’t use serializing 
cache. It's worth evaluating why that is before converging on a single 
implementation.

bq. New per-table knob to enable whether to populate entries to the row cache
on reads+writes or just on reads (to target different workloads)
Sounds like it would be useful, but first we have to come up with someone
somewhere who says I want this, or a workload where this is the right call.
There may also be correctness issues to think about; see the next item.

bq. Rethink about whether to keep the current RowCacheSentinel implementation 
as is - if I understand it correctly, it just reduces the number of cache-put 
operations (cache hit on a sentinel performs a disk read). A compromise 
regarding additional serialization cost?
I think it is for correctness? 
https://issues.apache.org/jira/browse/CASSANDRA-3862
I'm still reading up on this.

bq. Improvement of key (de)serialization (saving the row cache to disk) - use 
direct I/O
There is some trickiness here because the AutoSavingCache breaks apart the keys 
to determine where the data goes.
bq. Optimizations of value deserialization effort - let C* directly access a 
cached row in off-heap memory instead of the deserialization (and on-heap 
object construction) overhead.
I think these two together would make a good follow up ticket. Another good 
follow up ticket would be addressing the allocator for performance and for 
fragmentation.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-23 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14257721#comment-14257721
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

I had the opportunity to test OHC on a big machine.
First: it works - very happy about that :)

Some things I want to note:
* a high number of segments does not have any really measurable influence (the
default of 2 * # of cores is fine)
* throughput heavily depends on serialization (hash entry size) - Java 8 gave
about 10% to 15% improvement in some tests (either on {{Unsafe.copyMemory}} or
something related like the JNI barrier)
* the number of entries per bucket stays pretty low with the default load
factor of .75 - the vast majority have 0 or 1 entries, some 2 or 3, and a few
up to 8

Issue (not solvable yet):
It works great for hash entries up to approx. 64 kB, with good to great
throughput. Above that barrier it works well at first, but after some time the
system spends a huge amount of CPU time (~95%) in {{malloc()}} / {{free()}}
(with jemalloc; Unsafe.allocate is not worth discussing at all on Linux).
I tried to add a "memory buffer cache" that caches freed hash entries for
reuse, but it turned out that in the end it would be too complex if done right.
The current implementation is still in the code, but must be explicitly enabled
with a system property. Workloads with small entries and a high number of
threads easily trigger the Linux OOM protection (which kills the process).
Please note that it works with large hash entries - but throughput drops
dramatically to just a few thousand writes per second.

Some numbers (value sizes have a gaussian distribution). I had to do these
tests in a hurry because I had to give back the machine. The code used during
these tests is tagged as {{0.1-SNAP-Bench}} in git. Throughput is limited by
{{malloc()}} / {{free()}}, and most tests used only 50% of the available CPU
capacity (on _c3.8xlarge_ - 32 cores, Intel Xeon E5-2680v2 @2.8GHz, 64GB).
* 1k..200k value size, 32 threads, 1M keys, 90% read ratio, 32GB: 22k 
writes/sec, 200k reads/sec, ~8k evictions/sec, write: 8ms (99perc), read: 
3ms(99perc)
* 1k..64k value size, 500 threads, 1M keys, 90% read ratio, 32GB: 55k 
writes/sec, 499k reads/sec, ~2k evictions/sec, write: .1ms (99perc), read: 
.03ms(99perc)
* 1k..64k value size, 500 threads, 1M keys, 50% read ratio, 32GB: 195k 
writes/sec, 195k reads/sec, ~9k evictions/sec, write: .2ms (99perc), read: 
.1ms(99perc)
* 1k..64k value size, 500 threads, 1M keys, 10% read ratio, 32GB: 185k 
writes/sec, 20k reads/sec, ~7k evictions/sec, write: 4ms (99perc), read: 
.07ms(99perc)
* 1k..16k value size, 500 threads, 5M keys, 90% read ratio, 32GB: 110k 
writes/sec, 1M reads/sec, 30k evictions/sec, write: .04ms (99perc), read: 
.01ms(99perc)
* 1k..16k value size, 500 threads, 5M keys, 50% read ratio, 32GB: 420k 
writes/sec, 420k reads/sec, 125k evictions/sec, write: .06ms (99perc), read: 
.01ms(99perc)
* 1k..16k value size, 500 threads, 5M keys, 10% read ratio, 32GB: 435k 
writes/sec, 48k reads/sec, 130k evictions/sec, write: .06ms (99perc), read: 
.01ms(99perc)
* 1k..4k value size, 500 threads, 20M keys, 90% read ratio, 32GB: 140k 
writes/sec, 1.25M reads/sec, 50k evictions/sec, write: .02ms (99perc), read: 
.005ms(99perc)
* 1k..4k value size, 500 threads, 20M keys, 50% read ratio, 32GB: 530k 
writes/sec, 530k reads/sec, 220k evictions/sec, write: .04ms (99perc), read: 
.005ms(99perc)
* 1k..4k value size, 500 threads, 20M keys, 10% read ratio, 32GB: 665k 
writes/sec, 74k reads/sec, 250k evictions/sec, write: .04ms (99perc), read: 
.005ms(99perc)

Command line to execute the benchmark:
{code}
java -jar ohc-benchmark/target/ohc-benchmark-0.1-SNAPSHOT.jar -rkd 'uniform(1..2000)' -wkd 'uniform(1..2000)' -vs 'gaussian(1024..4096,2)' -r .1 -cap 320 -d 86400 -t 500 -dr 8

-r = read rate
-d = duration
-t = # of threads
-dr = # of driver threads that feed the worker threads
-rkd = read key distribution
-wkd = write key distribution
-vs = value size
-cap = capacity
{code}

Sample bucket histogram from 20M test:
{code}
[0..0]: 8118604
[1..1]: 5892298
[2..2]: 2138308
[3..3]: 518089
[4..4]: 94441
[5..5]: 13672
[6..6]: 1599
[7..7]: 189
[8..9]: 16
{code}

After running into that memory-management issue with allocation sizes varying
from a few kB to several MB, I think it's still worth working on our own
off-heap memory management. Maybe some block-based approach (fixed or
variable). But that's out of the scope of this ticket.


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-18 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251537#comment-14251537
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

I've nearly finished the OHC implementation. Unit tests cover all functionality
required by C*, and a separate test-only implementation is now used to verify
the implementation (entry (de)serialization is not extensively covered by the
tests yet). The OHC interface has been changed towards the functionality
required by C*.

Maven executes the unit tests both with and without jemalloc (only if jemalloc 
is installed, of course).

[~aweisberg], [~benedict] can you have a look at the current OHC code?

I'd like to know how it could/should be integrated into C*. IMO there are two
decisions to be made:
* Whether to migrate the whole OHC code into the org.apache.cassandra codebase
(with the option to either turn it on or off).
* Whether to implement a “pluggable row cache“ (to allow multiple
implementations)

I've got some ideas regarding the row cache which are out of the scope of this
ticket:
* New per-table knob to control whether to populate entries into the row cache
on reads+writes or just on reads (to target different workloads)
* Rethink whether to keep the current {{RowCacheSentinel}} implementation
as is - if I understand it correctly, it just reduces the number of cache-put
operations (a cache hit on a sentinel performs a disk read). A compromise
regarding additional serialization cost?
* Improvement of key (de)serialization (saving the row cache to disk) - use
direct I/O
* Optimization of value deserialization effort - let C* directly access a
cached row in off-heap memory instead of paying the deserialization (and
on-heap object construction) overhead.

Note: although the jemalloc allocator provides a {{getTotalAllocated()}} 
method, the result is not correct and I don't know why. The result depends on 
jemalloc configure settings ({{--en/disable-tcache}}). According to the 
man-page the result should be correct (sum of {{stats.allocated}} and 
{{stats.huge.allocated}}), but it isn't (verified with a coded memory leak of 
small allocations that didn't increase the value). Iterating over the jemalloc 
_arenas_ and _bins_ does not help since the two mentioned values are 
aggregations of these.




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-18 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251756#comment-14251756
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Ah - a burn test is still missing. I'll add some code that is able to verify 
the cache contents, key iterators, and such stuff.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-09 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239729#comment-14239729
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

Lots of cool stuff here, Robert.

Unit-test-wise, there is a lot of code that is only covered indirectly or not
at all, and the behaviors are not checked for explicitly. I don't think it
makes sense to include code that doesn't have a unit test claiming it does what
it says it does. The various input/output streams, buffering, and compression
all need unit tests. Uns.java needs a unit test for pretty much every method,
as well as the validation functionality. HashEntry has a bunch of uncovered
functions. For me the lack of test coverage is the biggest barrier.

What I am reacting to is that the tests are black box and miss things. 
OffHeapMap containsEntry has no tests. removeEntry has untested code. 
removeLink still has untested code. There is untested histogram stuff, 
deserializeEntry, serializeEntry. HashEntry classes have untested functions. 
HashEntries has many predicates that are untested.

Having a unit test that fuzzes against a parallel implementation at the same 
time using a different LRU map implementation would be great for a black box 
test. You can stripe the other implementation the same way so that the eviction 
matches.

One of my previous comments was that SegmentCacheImpl duplicates reference
counting code from OffHeapMap and should just delegate. It ends up doing that
anyway.

I would really like to see the cleanup/eviction code go away. If inserting an
entry would blow capacity, remove entries until it doesn't. I don't see a reason
to monkey with thresholds.
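
For illustration, a minimal on-heap sketch of that policy (names and the
LinkedHashMap-based layout are mine, not OHC's; OHC would do this per segment
against off-heap memory):
{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: evict least-recently-used entries until the new entry fits.
final class EvictUntilFitsCache<K> {
    private final long capacityBytes;
    private long usedBytes;
    // access-order LinkedHashMap gives LRU iteration order for free
    private final LinkedHashMap<K, byte[]> map = new LinkedHashMap<>(16, 0.75f, true);

    EvictUntilFitsCache(long capacityBytes) { this.capacityBytes = capacityBytes; }

    synchronized boolean put(K key, byte[] value) {
        if (value.length > capacityBytes) return false;        // never fits; let the caller decide
        byte[] old = map.remove(key);
        if (old != null) usedBytes -= old.length;
        while (usedBytes + value.length > capacityBytes) {      // no thresholds, just evict until it fits
            Map.Entry<K, byte[]> eldest = map.entrySet().iterator().next();
            usedBytes -= eldest.getValue().length;
            map.remove(eldest.getKey());
        }
        map.put(key, value);
        usedBytes += value.length;
        return true;
    }

    synchronized byte[] get(K key) { return map.get(key); }     // get() refreshes LRU order
}
{code}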

At some point the existing C* cache interface needs to gel with your work. 
Right now C* uses the hotN and getKeys interface to return the contents of the 
cache for persistence. I think the path of least resistance to start would be 
to implement the existing interface and then come back and look at how to get 
compression and more efficient IO into all the implementations. The existing 
stuff in C* doesn't do compression and doesn't buffer its IO. I would prefer to 
minimize major changes to the existing C* code. I want to get it working and 
then iterate further for other improvements like more efficient cache 
serialization.

You could change the OHC interface or implement an adapter. I think it's fine
to modify ICache to return iterables or iterators instead of collections, to
produce the key set and hot keys incrementally. For everything else I would
really like to see things stay the same unless there is something to be gained
by changing the interface.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-09 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240193#comment-14240193
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. Lots of cool stuff

Thx :)

Unit testing: you are absolutely right. (Will go on with that next)

bq. unit test that fuzzes against a parallel implementation at the same time 
using a different LRU map implementation 

Do you mean something like LinkedHashMap with removeEldestEntry()? It's some
effort to get a nice implementation for unit tests - but, yeah, it makes sense.
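
Something along these lines, presumably (a sketch of a reference LRU for tests;
it counts entries rather than bytes, so it would need adapting to mirror OHC's
byte-based eviction):
{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Reference LRU map for unit tests (sketch, not part of OHC).
final class ReferenceLruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    ReferenceLruMap(int maxEntries) {
        super(16, 0.75f, true);              // accessOrder = true -> LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;          // drop the least recently used entry
    }
}
{code}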

bq. duplicates reference counting code

Removed the duplicated code.

bq. cleanup/eviction code go away ... remove entries until it [fits]

Much easier; cleaner code; implemented - but I'm not completely sold on the new
implementation yet (it's still a quick hack).

bq. C* cache interface ... get compression and more efficient IO [later]

That's fair. I just noticed a few minutes ago that row-cache serialization only
persists the keys and not the values - so the existing implementation in OHC
would need to be changed / extended / whatever. I thought it persisted the
values, too.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-09 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240235#comment-14240235
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. That's fair. I just saw some minutes ago that row-cache serialization 
only persists the keys and not the values - so the existing implementation in 
OHC would need to be changed / extended / whatever. I thought it persists the 
value, too.
I was also confused by that. Persisting the values would break cache 
invalidation in a way that is hard to correct without integrating with the 
commit log.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-09 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240305#comment-14240305
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Yep - persisting the values would cause inconsistencies - either on its own or
through users deleting saved caches.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-07 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237162#comment-14237162
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Also pushed persistence of cache content using Snappy compression.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-07 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237170#comment-14237170
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Rehashing: hm - at {{o.a.c.db.ColumnFamilyStore#getThroughCache}} (better: 
{{RowCacheKey}}) we only have the token/key but no (good) hash for the key.

The saving from using a 32-bit hash is about 8 bytes per cache entry (the
reference-counter field can then be reduced from 64 bits to 32 bits while still
keeping the 8-byte boundaries for key and value data). But this does not seem to
have any measurable effect if e.g. jemalloc aligns allocated memory blocks on
bigger page sizes depending on the whole cache entry size (e.g. several kB or MB).

OHC always calculates its own murmur3 hash from the serialized cache key. I
_hope_ to achieve a better distribution across segments and buckets by using 64
bits - but I'm not sure about this. My preference for 64 hash bits is basically
that it feels better.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-04 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234696#comment-14234696
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Just pushed some OHC additions to github:
* key-iterator (used by the CacheService class to invalidate column families)
* (de)serialization of cache content to disk using direct I/O from off-heap.
This means that the row cache content does not need to go through the heap for
serialization and deserialization. Compression should also be possible off-heap
using the static methods in the Snappy class, since these expect direct buffers,
so there's nearly no pressure for that on the heap. Background: the
implementation basically puts the address and length of the hash entry into the
DirectByteBuffer class so FileChannel is able to read into it / write from it
(a rough sketch follows).
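
Roughly this idea, purely as an illustration (the reflected field names are JDK
internals from the JDK 7/8 era and may change; OHC's actual code is different):
{code}
import java.lang.reflect.Field;
import java.nio.Buffer;
import java.nio.ByteBuffer;

// Sketch: point a direct ByteBuffer at an existing off-heap region (address + length)
// so a FileChannel can read/write it without copying the data through the heap.
final class OffHeapBufferView {
    static ByteBuffer wrap(long address, int length) throws Exception {
        ByteBuffer bb = ByteBuffer.allocateDirect(0);          // tiny direct buffer used as a template
        Field addr = Buffer.class.getDeclaredField("address"); // JDK-internal fields
        Field cap = Buffer.class.getDeclaredField("capacity");
        addr.setAccessible(true);
        cap.setAccessible(true);
        addr.setLong(bb, address);                             // re-point the buffer at the hash entry
        cap.setInt(bb, length);
        bb.clear();                                            // position = 0, limit = capacity
        return bb;                                             // usable with FileChannel.read()/write()
    }
}
{code}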




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-03 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232833#comment-14232833
 ] 

Benedict commented on CASSANDRA-7438:
-

re: hash bits:

There's not really a dramatic benefit to using more than 32 bits. We will
always use the upper bits for the segment and the lower bits for the bucket,
for which 4B items is plenty, although we don't have proper entropy for all the
bits; we may have only 28 bits of good collision-freeness. We will want to
rehash the murmur hash to ensure this is spread evenly, to avoid a grow boundary
consistently failing to reduce collisions.

The one advantage of having some spare hash bits is that we can use them to
avoid running a potentially expensive comparison on a large key until we have
high confidence we've found the correct item - and as the number of unused hash
bits for indexing dwindles, the value of this goes up. But the number of
instances where this helps will be vanishingly small, since the head of the key
will be on the same cache line and a simultaneous hash collision and key-prefix
collision is pretty unlikely. It might be more significant if we were to use
open-address hashing, as we would have excellent locality and reduce the number
of expected cache misses for a lookup. But this won't be measurable above the
cache serialization costs. We do already have these hash bits calculated in C*,
typically. We are also unlikely to notice the overhead - allocations are likely
to have ~16 bytes of overhead, be padded to the nearest 8 or 16 bytes, and a row
has a lot of bumf to encode. I doubt there will be any variation in storage
costs from using all 64 bits.

i.e., whatever floats your boat



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231327#comment-14231327
 ] 

Vijay commented on CASSANDRA-7438:
--

[~snazy] I was trying to compare the OHC and found a few major bugs.

1) You have individual method synchronization on the Map, which doesn't ensure
that your get is locked against a concurrently performed put (same with clean,
hot(N), remove, etc.). Look at the SynchronizedMap source code to do it right,
else it will crash soon.
2) Even after I fix it, there is a correctness issue in the hashing algorithm, I
think. Get returns a lot of errors, and it looks like there are some memory
leaks too.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-02 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231627#comment-14231627
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

Robert, I don't seem to be getting the latest code for your work on master? For
instance, the key comparison code does 8 bytes at a time and doesn't handle
trailing bytes, as far as I can tell.

To Vijay's point: a pseudo-random test against the map that does, say, 200
million operations against a keyspace of several million entries, mirrors the
operations on a regular hash map, and periodically checks that they have the
same contents would be helpful in having some confidence in the map. Size it so
the LRU doesn't do anything. Print the seed at the beginning of the test so it
can be reproduced. I think this basically duplicates the benchmark, but having
it as a unit test is nice. We can tune the number of operations and keys down
for running in CI. You could also look at the unit tests for Guava's cache or
j.u.HashMap and borrow those. The nice thing about data structure APIs is that
the tests already exist.
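
A compilable sketch of what that might look like (the {{SizedCache}} interface
here is just a stand-in for the cache under test, not OHC's actual API; sizes
and ratios are arbitrary):
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public final class MirrorFuzzTest {
    // Stand-in for the off-heap cache under test; size it so the LRU never evicts.
    interface SizedCache {
        void put(long key, byte[] value);
        void remove(long key);
        byte[] get(long key);
    }

    static void fuzz(SizedCache cache, long operations, int keySpace) {
        long seed = System.nanoTime();
        System.out.println("fuzz seed = " + seed);            // print so a failure can be reproduced
        Random rnd = new Random(seed);
        Map<Long, byte[]> mirror = new HashMap<>();           // reference implementation

        for (long op = 1; op <= operations; op++) {
            long key = rnd.nextInt(keySpace);
            if (rnd.nextInt(10) < 7) {                        // ~70% puts, ~30% removes
                byte[] value = new byte[16 + rnd.nextInt(240)];
                rnd.nextBytes(value);
                cache.put(key, value);
                mirror.put(key, value);
            } else {
                cache.remove(key);
                mirror.remove(key);
            }
            if (op % 10_000_000 == 0) assertSameContents(mirror, cache);
        }
        assertSameContents(mirror, cache);
    }

    static void assertSameContents(Map<Long, byte[]> mirror, SizedCache cache) {
        for (Map.Entry<Long, byte[]> e : mirror.entrySet()) {
            byte[] actual = cache.get(e.getKey());
            if (actual == null || !Arrays.equals(actual, e.getValue()))
                throw new AssertionError("mismatch for key " + e.getKey());
        }
    }
}
{code}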

bq. Yes, basically from JDK. Could not get that via inheritance.
What are the licensing and attribution requirements for that code?

bq. IMO hash code should be 64 bits because 32 bits might not be sufficient.
[~benedict] might have some opinions on how to get the best bits out of 
MurmurHash3. 32 bits is 256-512 gigabytes of cache for 128 byte entries which 
is not bad. I don't feel strongly either way since I don't know whether callers 
will have the hash precomputed.

bq. Nope - would not be. But it's 2^27 (limited by a stupid constant used for 
both max# of segments and max# of buckets). Worth taking a look at it - it's 
weird, yes.
In OffHeapMap line 222 it seems to have a gate preventing rehashing to more than
2^24 buckets.

bq. (Hope I caught all of your comments)
I'll check them once you update.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-02 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231817#comment-14231817
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

[~vijay2...@yahoo.com] can you explain what kind of bugs?

bq. licensing and attribution requirements
It's already in C* code base in exactly the same way.

Also pushed some changes:
* increased the max # of segments and buckets to 2^30 (means approx. 1B segments
times 1B buckets)
* added a prototype for direct I/O for row cache serialization (zero copy) -
just as a demo (just coded, not tested yet)
* switched to Unsafe for value (de)serialization
* moved (most) statistic counters to OffHeapMap to reduce contention caused by
volatile (really makes sense)
* removed use of the Guava cache API
* corrected and improved key comparison

Regarding the 64-bit hash: it's 64 bits since OHC takes the most significant
bits for the segment and the least significant bits for the hash inside a
segment. Both are limited to 30 bits, so 60 bits in total.
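
A rough sketch of that split (illustrative only; OHC's actual shifts and masks
may differ; segmentCount and bucketCount are assumed to be powers of two):
{code}
// Split one 64-bit hash: the most significant bits pick the segment,
// the least significant bits pick the bucket within the segment's table.
final class HashSplit {
    static int segmentIndex(long hash, int segmentCount) {
        int segmentBits = Integer.numberOfTrailingZeros(segmentCount); // log2(segmentCount)
        return segmentBits == 0 ? 0 : (int) (hash >>> (64 - segmentBits));
    }

    static int bucketIndex(long hash, int bucketCount) {
        return (int) (hash & (bucketCount - 1));
    }
}
{code}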




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231878#comment-14231878
 ] 

Vijay commented on CASSANDRA-7438:
--

Never mind, my bad - it was related to the below (which needs to be more
configurable instead). The items were going missing earlier than I thought they
should, and it looks like you just evict the items per segment (if a segment is
used more, more items will disappear from that segment, and the least-used
segment's items will remain).
{code}
// 12.5% if capacity less than 8GB
// 10% if capacity less than 16 GB
// 5% if capacity is higher than 16GB
{code}

Also noticed you don't have replace, which Cassandra uses.
Anyway, I am going to stop working on this for now; let me know if someone
wants any other info.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-01 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230675#comment-14230675
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

Looks pretty nice.

Suggestions:
* Push the stats into the segments and gather them the way you do free capacity 
and cleanup count. You can drop the volatile (technically you will have to 
synchronize on read). Inside each OffHeapMap put the stats members (and 
anything mutable) as the first declared fields. In practice this can put them 
on the same cache line as the lock field in the object header. It will also be 
just one flush at the end of the critical section. Stats collection should be 
free so no reason not to leave it on all the time.
* I am not sure batch cleanup makes sense. When inserting an item into the 
cache would blow the size requirement I would just evict elements until 
inserting it wouldn't. Is there a specific efficiency you think you are going 
to get from doing it in batches?
* Cache is the wrong API to use since it doesn't allow lazy deserialization and
zero copy. Since entries are refcounted there is no need to make a copy. Might
be something to save for later since everything upstream expects a POJO of some
sort.
* Key buffer might be worth a thread local sized to a high watermark

Do we have a decent way to do line-level code review? I can't leave comments
on github unless there is a pull request. Line-level stuff:
* Don't catch exceptions and handle inside the map. Let them all propagate to 
the caller and use try/finally to do cleanup. I know you have to wrap and 
rethrow some things, but avoid where possible.
* Compare key compares 8 bytes at a time, how does it handle trailing bytes and 
alignment?
* Agrona has an Unsafe ByteBuffer implementation that looks like it makes a
little better use of various intrinsics than AbstractDataOutput. Does some
other nifty stuff as well.
https://github.com/real-logic/Agrona/blob/master/src/main/java/uk/co/real_logic/agrona/concurrent/UnsafeBuffer.java
* In OffHeapMap.touch lines 439 and 453 are not covered by tests. Coverage 
looks a little weird in that a lot of the cases are always hit but some don't 
touch both branches. If lruTail == hashEntryAddr maybe assert next is null.
* Rename the mutating OffHeapMap lruNext and lruPrev to reflect that they
mutate. In general, rename mutating methods to reflect that they do so, such as
the two versions of first.
* I don't see why the cache can't use CPU endianness since the key/value are 
just copied.
* Did you get the UTF encoded string stuff from somewhere? I see something 
similar in the jdk, can you get that via inheritance?
* HashEntryInput, AbstractDataOutput  are low on the coverage scale and have no 
tests for some pretty gnarly UTF8 stuff.
* Continuing on that theme there is a lot of unused code to satisfy the 
interfaces being implemented, would be nice to avoid that.
* By hashing the key yourself you prevent caching the hash code in the POJO. 
Maybe hashes should be 32-bits and provided by the POJO?
* If an allocation fails maybe throw OutOfMemoryError with a message
* If an entry is too large maybe return an error of some sort? Seems like 
caller should decide if not caching is OK.
* put on allocation failure calls removeInternal, but the key doesn't appear to 
be in the map yet? Is that to handle the put invalidating the previous entry?
* In put, why catch VirtualMachineError and not Error? Seems like it wants a
finally, and it shouldn't throw checked exceptions.
* If a key serializer is necessary throw in the constructor and remove other 
checks
* Hot N could use a more thorough test?
* In practice how is hot N used in C*? When people save the cache to disk do 
they save the entire cache?
* In the value loading case, I think there is some subtlety to the concurrency 
of invocations to the loader in that it doesn't call it on all of them in a 
race. It might be a minor change in behavior compared to Guava.
* Maybe do the value loading timing in nanoseconds? Performance is the same but 
precision is better.
* OffHeapMap.Table.removeLink(long,long) has no test coverage of the second 
branch that walks a bucket to find the previous entry
* I don't think storage for 16 million keys is enough? For 128 bytes per entry 
that is only 2 gigabytes. You would have to run a lot of segments which is 
probably fine, but that presents a configuration issue. Maybe allow more than 
24 bits of buckets in each segment?
* SegmentedCacheImpl contains duplicate code for dereferencing and still has to
delegate part of the work to the OffHeapMap. Maybe keep it all in OffHeapMap?
* Unit test wise there are some things not tested. The value loader interface, 
various things like putAll or invalidateAll.
* Release is not synchronized. Release should null pointers out so you get a 
good clean segfault. Close should maybe lock and close one segment 

[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-12-01 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231132#comment-14231132
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

[~aweisberg], thanks for the review :)

Some of the changes you suggested are already in. Much of the code has been 
moved to OffHeapMap.
Batch cleanup has been completely removed - it's now handled inside OffHeapMap.
It makes runtime and code much nicer.

I've delayed decent unit tests to later - the stuff changed too often.
And there would be a big change when merging the stuff into the C* code base,
removing all that unused code and the Cache interface implementation.

All of the duplicated stuff has been removed - we don't need that - even for a
general-purpose cache it would not have been useful.

bq. Key buffer might be worth a thread local sized to a high watermark
Hm - do you mean something like {{static final ThreadLocalKeyBuffer
perThreadBuffer;}} inside SegmentedCacheImpl?
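
If so, maybe something like this (a sketch only; {{ThreadLocalKeyBuffer}}
doesn't exist, this is just one way to read the suggestion):
{code}
// Per-thread, reusable key-serialization buffer that only ever grows,
// so it settles at each thread's high-watermark key size.
final class PerThreadKeyBuffer {
    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[64]);

    static byte[] get(int requiredSize) {
        byte[] buf = BUFFER.get();
        if (buf.length < requiredSize) {
            buf = new byte[Integer.highestOneBit(requiredSize - 1) << 1]; // next power of two
            BUFFER.set(buf);
        }
        return buf;
    }
}
{code}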

Regarding the line level code review (it's fine the way you did it IMO):

bq. Don't catch exceptions
Done.

bq. 8 bytes at a time, how does it handle trailing bytes and alignment?
Trailing bytes: falls back to per-byte comparison. Alignment: key and value are
aligned on 8-byte boundaries.
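
In other words, roughly this (a sketch of the strategy, not OHC's actual code;
obtaining {{Unsafe}} reflectively is shown only to keep the example
self-contained):
{code}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class OffHeapCompare {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Compare two off-heap regions: 8-byte words first, then the trailing 1..7 bytes.
    static boolean equals(long addrA, long addrB, long length) {
        long offset = 0;
        for (; offset + 8 <= length; offset += 8)
            if (UNSAFE.getLong(addrA + offset) != UNSAFE.getLong(addrB + offset))
                return false;
        for (; offset < length; offset++)                      // trailing bytes
            if (UNSAFE.getByte(addrA + offset) != UNSAFE.getByte(addrB + offset))
                return false;
        return true;
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }
}
{code}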

bq. Agrona has an Unsafe ByteBuffer implementation that looks like it makes a 
little better use of various intrinsics then AbstractDataOutput.
Good hint! Will definitely take a look at it!

bq. I don't see why the cache can't use CPU endianness since the key/value are 
just copied.
Ah - you mean that stuff in HashEntryInput/Output. No - you can't always copy
it using the Unsafe API.
I don't recall exactly why I removed that optimization (had that implemented
before), but it had something to do with data serialized for KeyBuffer and
putting it into off-heap.
But it makes sense for values (since these are always directly serialized to
off-heap).

bq. UTF encoded string stuff ... get that via inheritance?
Yes, basically from JDK. Could not get that via inheritance.

bq. hashing the key yourself ... 32-bits
Thought about it (and had that previously). Yes - if we have a good hash code, 
we can use it.
But I don't know whether the calling code has a hash code.
IMO hash code should be 64 bits because 32 bits might not be sufficient.

bq. allocation fails maybe throw OutOfMemoryError
That would shut down C* daemon ;) Maybe. Not sure about that.
I think if you run into such a situation (out of off-heap/system memory) you 
are completely lost.
It just ignores that put() and removes the old entry.

bq. entry is too large maybe return an error of some sort
No. The calling code cannot do anything meaningful with it. But the calling
code could check for that in advance (before constructing any object related
to caching), if it has enough information.

bq. catch VirtualMachineError and not error
done

bq. hotN()
I _think_ it is used to persist the hot set of the cache.

bq. concerned about materializing the full list on heap
Agree. Thought about patching cache off-heap addresses into DirectByteBuffer 
and using that for serialization.

bq. I don't think storage for 16 million keys is enough?
Nope - would not be. But it's 2^27 (limited by a stupid constant used for both 
max# of segments and max# of buckets). Worth taking a look at it - it's weird, 
yes.

bq. value loading case,
Don't think we need that API.

bq. Release is not synchronized.
Yep - will do that.

(Hope I caught all of your comments)



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14229284#comment-14229284
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Have pushed the latest changes of OHC to https://github.com/snazy/ohc. It has 
been nearly completely rewritten.

Architecture (in brief):
* OHC consists of multiple segments (default: 2 x #CPUs). Fewer segments lead
to more contention; more segments give no measurable improvement.
* Each segment consists of an off-heap hash map (defaults: table-size=8192,
load-factor=.75). (The hash table requires 8 bytes per bucket)
* Hash entries in a bucket are organized in a doubly linked list
* LRU replacement policy is built in via its own doubly linked list
* Critical sections that mutually lock a segment are pretty short (code + CPU) 
- just a 'synchronized' keyword, no StampedLock/ReentrantLock
* Capacity for the cache is configured globally and managed locally in each 
segment
* Eviction (or replacement or cleanup) is triggered when free capacity goes 
below a trigger value and cleans up to a target free capacity
* Uses murmur hash on serialized key. Most significant bits are used to find 
the segment, least significant bits for the segment's hash map. 

Non-production relevant stuff:
* Off-heap access can be started in debug mode, which checks for accesses
outside of the allocated region and produces exceptions instead of SIGSEGV or
jemalloc errors
* ohc-benchmark updated to reflect changes

About replacement policy: Currently LRU is built in - but I'm not really sold 
on LRU as is. Alternatives could be
* timestamp (not sold on this either - basically the same as LRU)
* LIRS (https://en.wikipedia.org/wiki/LIRS_caching_algorithm), big overhead 
(space)
* 2Q (counts accesses, divides counter regularly)
* LRU+random (50/50) (may give the same result as LIRS, but without LIRS'
overhead)
But replacement of LRU with something else is out of scope for this ticket and
should be done with real workloads in C* - although the last one is just an
additional config parameter.

IMO we should add a per-table option that configures whether the row cache 
receives data on reads+writes or just on reads. Might prevent garbage in the 
cache caused by write heavy tables.

{{Unsafe.allocateMemory()}} gives about a 5-10% performance improvement compared
to jemalloc. The reason for it might be the JNA library (which has some
synchronized blocks in it).
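
For context, the two back-ends being compared look roughly like this (a sketch;
the class names are mine, not OHC's actual allocator abstraction):
{code}
// JNA-backed allocator: every call goes through JNA's Native.malloc()/free(),
// which adds per-call JNI overhead and some synchronized sections.
final class JnaAllocator {
    long allocate(long bytes) { return com.sun.jna.Native.malloc(bytes); }
    void free(long address)   { com.sun.jna.Native.free(address); }
}

// Unsafe-backed allocator: calls straight into the JVM's malloc wrapper.
// (Obtaining the Unsafe instance reflectively is elided, as in the earlier sketch.)
final class UnsafeAllocator {
    private final sun.misc.Unsafe unsafe;
    UnsafeAllocator(sun.misc.Unsafe unsafe) { this.unsafe = unsafe; }
    long allocate(long bytes) { return unsafe.allocateMemory(bytes); }
    void free(long address)   { unsafe.freeMemory(address); }
}
{code}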

IMO OHC is ready to be merged into C* code base.




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228693#comment-14228693
 ] 

Benedict commented on CASSANDRA-7438:
-

Invert those two statements and the behaviour is still broken.

B: 154: map.get()
A: 187: map.remove()
A: 191: queue.deleteFromQueue()
B: 158: queue.addToQueue()



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-29 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228871#comment-14228871
 ] 

Vijay commented on CASSANDRA-7438:
--

Should be taken care of too; it should become a duplicate delete to the queue
and should work normally (via itemUnlinkQueue). Here is the adjusted test case
for it:
https://github.com/Vijay2win/lruc/blob/master/src/test/java/com/lruc/unsafe/UnsafeQueueTest.java#L81



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228306#comment-14228306
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

Pauseless resizing is a worthy design goal, but might not be necessary if you 
call it a warmup cost. I would break out the performance comparison with and 
without warming up the cache so we know how it performs when you aren't 
measuring the resize pauses. Those should only happen at startup when the cache 
is populated.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228397#comment-14228397
 ] 

Benedict commented on CASSANDRA-7438:
-

I suspect segmenting the table at a finer granularity, so that each segment is 
maintained with mutual exclusivity, would achieve better percentiles in both 
cases due to keeping the maximum resize cost down. We could settle for a 
separate LRU-q per segment, even, to keep the complexity of this code down 
significantly - it is unlikely having a global LRU-q is significantly more 
accurate at predicting reuse than ~128 of them. It would also make it much 
easier to improve the replacement strategy beyond LRU, which would likely yield 
a bigger win for performance than any potential loss from reduced concurrency. 
The critical section for reads could be kept sufficiently small that 
competition would be very unlikely with the current state of C*, by performing 
the deserialization outside of it. There's a good chance this would yield a net 
positive performance impact, by reducing the cost per access without increasing 
the cost due to contention measurably (because contention would be infrequent).



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228515#comment-14228515
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

+1 to what Benedict suggests.

One minor nit. Resize pauses will happen across stripes at almost exactly the 
same time. I know with say 12 stripes it's very bad. With more than that it 
might start to spread them out, but I haven't seen that in action.

We can iterate on resize pause issues later if necessary. It's a warmup issue 
which will be a problem for some, but might not cripple the feature.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228523#comment-14228523
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}I would break out the performance comparison with and without warming up 
the cache so we know how it performs when you aren't measuring the resize 
pauses.{quote} 
Yep, and in steady state it is similar to get, and I have verified that the
latency is due to rehash. Better benchmarks on big machines will be done on
Monday.

Unfortunately -1 on partitions; it will be a lot more complex and will be hard
to understand for users. If we have to expand the partitions, we have to figure
out a better consistent hashing algo - Cassandra within Cassandra, maybe.
Moreover, we will end up keeping the current code as is to move maps and queues
off-heap. Sorry, I don't understand the argument of code complexity.

If we are talking about code complexity: the unsafe code is 1000 lines
including the license headers :)

The current contention topic is whether to use CAS for locks, which is showing
higher CPU cost, and I agree with Pavel on latencies as shown in the numbers.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228524#comment-14228524
 ] 

Vijay commented on CASSANDRA-7438:
--

PS: all the latency spikes are in 100s of micros. It's a day-and-night
comparison to the current cache :)



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228563#comment-14228563
 ] 

Benedict commented on CASSANDRA-7438:
-

[~aweisberg]: In my experience segments tend to be imperfectly distributed, so 
whilst there is bunching of resizes simply because they take so long, with real 
work going on at the same time they should be a _little_ spread out. Though 
with murmur3 the distribution may be significantly more uniform than my prior 
experiments. Either way, they're performed in parallel (without coordination) 
if they coincide, so it's still an improvement.

[~vijay2...@yahoo.com]: When I talk about complexity, I mean the difficulties 
of concurrent programming magnified without the normal tools. For instance, 
there are the following concerns:

* We have a spin-lock - admittedly one that should _generally_ be uncontended, 
but on a grow or a small map this is certainly not the case, which could result 
in really problematic behaviour. Pure spin locks should not be used outside of 
the kernel. 
* The queue is maintained by a separate thread that requires signalling if it
isn't currently performing work - which, in a real C* instance where the cost
of linking the queue item is a fraction of the other work done to service a
request, means we are likely to incur a costly unpark() for a majority of
operations
* Reads can interleave with put/replace/remove and abort the removal of an item 
from the queue, resulting in a memory leak. 
* We perform the grow on a separate thread, but prevent all reader _or_ writer 
threads from making progress by taking the locks for all buckets immediately.
* Freeing of oldSegments is still dangerous, it's just probabilistically less 
likely to happen.
* During a grow, we can lose puts because we unlock the old segments, so with 
the right (again, unlikely) interleaving of events a writer can think the old 
table is still valid
* When growing, we only double the size of the backing table, however since 
grows happen in the background the updater can get ahead, meaning we remain 
behind and multiply the constant factor overheads, collisions and contention 
until total size tails off.

These are only the obvious problems that spring to mind from 15m perusing the
code; I'm sure there are others. This kind of stuff is really hard, and the
approach I'm suggesting is comparatively a doddle to get right, and is likely
faster to boot.

I'm not sure I understand your concern with segmentation creating complexity 
with the hashing... I'm proposing the exact method used by CHM. We have an 
excellent hash algorithm to distribute the data over the segments: murmurhash3. 
Although we need to be careful to not use the bits that don't have the correct 
entropy for selecting a segment. It's really no more than a two-tier hash 
table. The user doesn't need to know anything about this.
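As a minimal sketch of that two-tier lookup (the segment count and all names 
below are illustrative, and a 64-bit murmur3 hash of the key is assumed to be 
available):
{code}
// Hypothetical two-tier lookup: the high bits of a 64-bit murmur3 hash pick the
// segment, the low bits pick the bucket inside that segment's own table.
final class SegmentedHash {
    static final int SEGMENT_COUNT = 16;   // power of two, illustrative
    static final int SEGMENT_SHIFT = 64 - Integer.numberOfTrailingZeros(SEGMENT_COUNT);

    static int segmentIndex(long murmur3Hash) {
        return (int) (murmur3Hash >>> SEGMENT_SHIFT) & (SEGMENT_COUNT - 1);
    }

    static int bucketIndex(long murmur3Hash, int bucketCount) { // bucketCount: power of two
        return (int) murmur3Hash & (bucketCount - 1);
    }
}
{code}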



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228569#comment-14228569
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}The queue is maintained by a separate thread that requires 
signalling{quote}
The thread is only signalled if it is not performing an operation. I am lost.
{quote}resulting in a memory leak{quote}
I am 100% sure that this is not true. Can you write a test case to make this 
happen, please?
{quote}but prevent all reader or writer threads from making progress by taking 
the locks for all buckets immediately{quote}
I am sure this cannot be done - if you don't, you lose coherence and 
consistency. 
{quote}During a grow, we can lose puts because we unlock the old segments{quote}
Test case again, please. I don't think this can happen either. I spent a lot of 
time testing the exact scenario.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228575#comment-14228575
 ] 

Benedict commented on CASSANDRA-7438:
-

bq. I am 100% sure

Never be 100% sure with concurrency, please :)

bq. Test case again, please. I don't think this can happen either. I spent a 
lot of time testing the exact scenario.

You have too much faith in tests. You are testing under ideal conditions - two 
of the race conditions I highlighted will only rear their heads infrequently, 
most likely when the system is under uncharacteristic load causing very choppy 
scheduling. Analysis of the code is paramount. I will not produce a test case 
as I do not have time, however I will give you an interleaving of events that 
would trigger one of them.

Thread A is deleting an item, and is in LRUC.invalidate(), Thread B is looking 
up the same item, in LRUC.get().
A: 187: map.remove()
B: 154: map.get()
A: 191: queue.deleteFromQueue()
B: 158: queue.addToQueue()

In particular, addToQueue() sets the markAsDeleted flag to false, undoing the 
prior work of deleteFromQueue.
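To make that interleaving concrete, here is a toy model of the flag handling 
(field and method names are invented for illustration, not taken from the 
actual lruc code):
{code}
// Toy model: invalidate() marks the node for removal, but a concurrent get()
// that re-links the same node clears the flag again, so the queue thread never
// removes it and the off-heap entry leaks.
final class QueueNode {
    volatile boolean markedAsDeleted;

    void deleteFromQueue() {      // step A (from invalidate)
        markedAsDeleted = true;
    }

    void addToQueue() {           // step B (from get) - undoes step A if it runs later
        markedAsDeleted = false;
        // ... link the node at the head of the LRU queue ...
    }
}
{code}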

bq. The thread is only signalled if it is not performing an operation. I am lost.

It will generally not be performing an operation, because its work will be 
faster than any of the producers can produce work in normal C* operation.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228577#comment-14228577
 ] 

Vijay commented on CASSANDRA-7438:
--

Maybe you know better than me, but map.remove cannot be followed by a 
successful map.get because the remove is within a lock on the segment... 



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226474#comment-14226474
 ] 

Tupshin Harper commented on CASSANDRA-7438:
---

[~xedin] I'm lost in too many layers of snark and indirection (not just yours). 
Can you elaborate on what strategy you actually find appealing?



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226516#comment-14226516
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Some short notes about the last changes in OHC:

* changed from block-oriented allocation to Unsafe or JEMalloc (if available)
* added stamped locks in off-heap (quite simple and very efficient)
* triggering cleanup + rehash via a CAS-based trigger works fine
* extended the benchmark tool to specify different workload characteristics 
(read/write ratio, key distribution, value length distribution - distribution 
code taken from cassandra-stress)
* still working on a good (mostly contention free) LRU strategy

One thing I noticed during benchmarking is that (concurrent?) allocations of 
large areas (several MB) take up to 50-60 ms (OSX 10.10, 2.6 GHz Core i7 - no 
swap, of course) - small regions are allocated quite fast (total roundtrip for 
a put ~0.1 ms at the 98th percentile). It might be viable to implement some 
mixture for memory allocation: Unsafe/JEMalloc for small regions (e.g. < 1 MB) 
and pre-allocated blocks for large regions. A configuration value could 
determine the number of large region blocks to keep immediately available. 
Just an idea...
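A rough sketch of that mixture, under the assumption of a hypothetical pool of 
pre-allocated large blocks (the {{BlockPool}} type, the 1 MB cut-off and all 
names below are made up for illustration):
{code}
import sun.misc.Unsafe;

// Hypothetical hybrid allocator: small regions go straight to Unsafe, large
// regions are served from pre-allocated blocks to avoid the slow multi-MB
// allocation path seen in the benchmark.
final class HybridAllocator {
    private static final long LARGE_THRESHOLD = 1 << 20;   // 1 MB, made-up cut-off

    private final Unsafe unsafe;
    private final BlockPool largeBlockPool;                 // invented abstraction

    HybridAllocator(Unsafe unsafe, BlockPool pool) {
        this.unsafe = unsafe;
        this.largeBlockPool = pool;
    }

    long allocate(long bytes) {
        if (bytes < LARGE_THRESHOLD)
            return unsafe.allocateMemory(bytes);            // fast small-region path
        long address = largeBlockPool.tryAcquire(bytes);    // 0 if the pool is exhausted
        return address != 0L ? address : unsafe.allocateMemory(bytes);
    }

    interface BlockPool { long tryAcquire(long bytes); }
}
{code}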




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226531#comment-14226531
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

When are large regions being allocated? How common is the use case? Large would 
normally only be for table resizing, right? 

Could the row cache contain very large values with wide rows?



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226552#comment-14226552
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}One thing I noticed during benchmarking is that (concurrent?){quote}
Yes, use these options, feel free to make it more configurable if you need.
{code}
public static final String TYPE = "c";
public static final String THREADS = "t";
public static final String SIZE = "s";
public static final String ITERATIONS = "i";
public static final String PREFIX_SIZE = "p";
{code} 



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226795#comment-14226795
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

Caching entire rows when rows are very large seems like a problem workload for 
a variety of reasons. The overhead of repopulating each cache entry on 
insertion is not good.

Does the storage engine always materialize entire rows into memory for every 
query?

60 milliseconds is much longer than it takes to copy several megabytes, so it 
is expensive even with large rows, although the rest of the cost of 
materializing the row might dominate.
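As a rough sanity check: assuming a copy bandwidth on the order of 10 GB/s, a 
5 MB memcpy costs roughly 0.5 ms, so a 50-60 ms allocation would exceed the 
copy itself by about two orders of magnitude.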



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226776#comment-14226776
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

The row cache can contain very large rows AFAIK.
The idea is to pre-allocate some portion of the configured capacity for large 
blocks - new blocks could be allocated on demand (edge-trigger).
OTOH, if that amount of data is stored in the cache, that amount of time 
(20...60ms) might be irrelevant compared to the time needed for serialization - 
so maybe it would be wasted effort. Not sure about that.
Table resizing may take as long as it takes - I am not really bothered about 
allocation time for that, because no reads or writes are locked while 
allocating the new partition (segment) table.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226861#comment-14226861
 ] 

Pavel Yaskevich commented on CASSANDRA-7438:


[~tupshin] The original idea of this was to take the thing we know does the job, 
which is memcached, strip out some of the unnecessary parts and package it as a 
lib we can use over JNI, the same way snappy and others do. But now we are 
getting into the business of re-inventing things that are pretty hard to get 
right and properly test, so the argument against having lruc in its original 
form - that it would be hard to test/maintain - is, in my opinion, no longer 
valid.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227248#comment-14227248
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

bq. The row cache can contain very large rows [partitions] AFAIK

Well, it *can*, but it's almost always a bad idea.  Not something we should 
optimize for.  (http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1)

bq. Does the storage engine always materialize entire rows [partitions] into 
memory for every query?

Only when it's pulling them from the off-heap cache.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227250#comment-14227250
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

Looking at the discussion, I wonder if we're overcomplicating things.  I think 
it got a bit lost in the noise when Ariel said earlier,

bq. I also wonder if splitting the cache into several instances each with a 
coarse lock per instance wouldn't result in simpler, fast-enough code. I don't 
want to advocate doing something different for performance, but rather that 
there is the possibility of a relatively simple implementation via Unsafe.

Why not start with something like that and see if it's Good Enough?  I suspect 
that at that point other bottlenecks will be much more important, so paying a 
high complexity cost to optimize the cache further would be a bad trade overall.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224611#comment-14224611
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

[~aweisberg] thanks for that write up! A lot of very good findings, ideas and 
recommendations. Already implemented some of them - short story:
* tend to move from fixed-block-allocation to {{Unsafe.alloc}} - quick 
benchmarks show similar results
* StampedLock and LongAdder in J8 are great
* will see how to implement better partition management and an overall LRU 
story



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224904#comment-14224904
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}
sun.misc.Hashing doesn't seem to exist for me, maybe a Java 8 issue?
StatsHolder, same AtomicLongArray suggestion. Also consider LongAdder.
{quote}
Yep, and let me find alternatives for Java 8 (and until 8 for LongAdder).
{quote}
The queue really needs to be bounded, producer and consumer could proceed at 
different rates.
In Segment.java in the replace path AtomicLong.addAndGet is called back to 
back, could be called once with the math already done. I believe each of those 
stalls processing until the store buffers have flushed. The put path does 
something similar and could have the same optimization.
{quote}
Yeah, those were an oversight.
{quote}
Tasks submitted to executor services via submit will wrap the result including 
exceptions in a future which silently discards them. 
The library might take at initialization time a listener for these errors, or 
if it is going to be C* specific it could use the wrapped runnable or similar.
{quote}
Are you suggesting configurable logging/exception handling in case the 2 
threads throw exceptions? If yes, sure. Other exceptions AFAIK are already 
propagated. (Still needs cleanup though.)
{quote}
A lot of locking that was spin locking (which unbounded I don't think is great) 
is now blocking locking. There is no adaptive spinning if you don't use 
synchronized. If you are already using unsafe maybe you could do monitor 
enter/exit. Never tried it.
Having the table (segments) on heap is pretty undesirable to me. Happy to be 
proved wrong, but I think a flyweight over off heap would be better.
{quote}
Segments are small in memory so far in my tests. The spin lock is there to make 
sure the lock checks whether the segment was rehashed or not; this is better 
than having a separate, central lock. (No different than Java or memcached.)
Not sure I understand the Unsafe lock - any example will help. 
The segments are on heap mainly to handle the locking. I think we can do a bit 
of CAS, but a global lock on rehashing will be a problem (maybe an alternate 
approach is required).
{quote}
It looks like concurrent calls to rehash could cause the table to rehash twice 
since the rebalance field is not CASed. You should do the volatile read, and 
then attempt the CAS (avoids putting the cache line in exclusive state every 
time).
{quote}
Nope, it is a single-threaded Executor and the rehash boolean is already 
volatile :)
Next commit will have conditions instead (similar to C implementation).
{quote}
If the expiration lock is already locked some other thread is doing the 
expiration work. You might keep a semaphore for puts that bypass the lock so 
other threads can move on during expiration. I suppose after the first few 
evictions new puts will move on anyways. This would show up in a profiler if it 
were happening.
{quote}
Good point… Or a tryLock to spin and check if some other thread released enough 
memory.
{quote}
hotN looks like it could lock for quite a while (hundreds of milliseconds, 
seconds) depending on the size of N. You don't need to use a linked list for 
the result just allocate an array list of size N. Maybe hotN should be able to 
yield, possibly leaving behind an iterator that evictors will have to repair. 
Maybe also depends on how top N handles duplicate or multiple versions of keys. 
Alternatively hotN could take a read lock, and writers could skip the cache?
{quote}
We cannot have duplicates in the queue (remember it is a doubly linked list of 
items in the cache). A read lock on q_expiry_lock is all we need; let me fix it.


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225371#comment-14225371
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. Are you suggesting a configurable logging/exception handling in case the 2 
threads throw exceptions? If yes sure. Other exceptions AFAIK are already 
propagated. (Still needs cleanup though).
Something has to happen to exceptions generated there. Since it is a library 
and there is no caller to propagate them to, it implies that people need to 
provide a listener or a logger.

bq. Segments are small in memory so far in my tests, 
Segments are hash buckets correct? They aren't segments of several hash 
buckets. If the goal of the hash table is to have at most two or three entries 
per segment then having an on heap Java object would be a lot of overhead per. 
Just as a guess we are talking about two objects. There is the 
Segment/ReentrantLock and then the AbstractQueuedSynchronizer allocated by 
ReentrantLock which has three additional fields. It's 48 bytes without 
alignment or object headers. There is also the overhead of having an 
AtomicArray of pointers to each segment object. A hash table bucket only has to 
be a pointer plus a lock field if you are going to lock buckets. You could do 
that in 8-12 bytes.

Whether it's too much data on heap is a question of how big a cache you want 
and how small the values being cached are. The smaller the values being cached 
the more the metadata overhead of the cache (and the JVM overhead) matter.

Locking-wise, if you are only doing spin locks you can use an unsafe 
compare-and-swap to implement a lock in off-heap memory. You do have to be 
careful about alignment.

bq. Nope it is Single threaded Executor and the rehash boolean is already 
volatile. Next commit will have conditions instead (similar to C 
implementation).
The task submitted to the executor doesn't check whether another rehash is 
required it just does it. The check before submitting a task to do rehashing 
appears to have a race where two threads could submit the task at the same 
time. There is no isolation between the threads as they read the volatile field 
and then write to it. Two or more threads could read and see that no rehash is 
in progress, update the value to rehash in progress, and then submit the task.
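A minimal sketch of the CAS-guarded variant (method and field names are 
invented here), where only the thread that wins the compare-and-set submits the 
rehash task:
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Only one caller can flip the flag from false to true, so the table is
// doubled exactly once per grow, no matter how many threads notice the need.
abstract class RehashTrigger {
    private final AtomicBoolean rehashInProgress = new AtomicBoolean(false);
    private final ExecutorService rehashExecutor = Executors.newSingleThreadExecutor();

    void maybeTriggerRehash() {
        if (rehashNeeded() && rehashInProgress.compareAndSet(false, true)) {
            rehashExecutor.submit(() -> {
                try {
                    doRehash();                   // double the table once
                } finally {
                    rehashInProgress.set(false);  // re-open the gate
                }
            });
        }
    }

    abstract boolean rehashNeeded();
    abstract void doRehash();
}
{code}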



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225453#comment-14225453
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}
Segments are hash buckets correct? 
{quote}
Yes - the way memcached and lruc do the rehashing is based on this 
algorithm, and hence yes... That was the argument earlier for the JNI-based 
solution. (Also another reason I was talking about a configurable hash 
expansion capability in my previous comment.)
{code}
unsigned long current_size = cas_incr(stats.hash_items, 1);
if (current_size > (hashsize(hashpower) * 3) / 2) {
    assoc_start_expand();
}
{code}
If we don't like the constant overhead of the cache on heap, and if you are 
talking about CAS (which we already do for ref counting), then as mentioned 
before we need an alternative strategy to the global lock for rebalance if we 
go with a lock-less strategy.

{quote}
The task submitted to the executor doesn't check whether another rehash is 
required it just does it.
{quote}
Until you complete a rehash you don't know if you need to hash again or not... 
Am I missing something?



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225478#comment-14225478
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. If we don't like the constant overhead of the cache on heap, and if you are 
talking about CAS (which we already do for ref counting), then as mentioned 
before we need an alternative strategy to the global lock for rebalance if we 
go with a lock-less strategy.
Just take what you have and do it off heap. You don't need to change anything 
about how locking is done, just put the segments off heap so each segment would 
be a 4-byte lock field and an 8 byte pointer to the first entry.

bq. Until you complete a rehash you don't know if you need to hash again or 
not... Am I missing something?
https://github.com/Vijay2win/lruc/blob/master/src/main/java/com/lruc/unsafe/UnsafeConcurrentMap.java#L38

The check on line 38 races with the assignment on line 39. N threads could do 
the check and think a rehash is necessary. Each would submit a rehash task and 
the table size would be doubled N times instead of 1 time.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225531#comment-14225531
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. alignment requirements for 4 or 8 byte CAS

Intel P6: must be 8-byte aligned (if crossing a cache line) or can be unaligned 
(if within the same cache line) - but: _However, it is recommend that locked 
accesses be aligned on their natural boundaries for better system performance_
(https://stackoverflow.com/questions/1415256/alignment-requirements-for-atomic-x86-instructions).
Side note: heh - there's even support for 128-bit atomic operations (cmpxchg16b) 
- but where's the primitive for that in Java... :(



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225532#comment-14225532
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}
so each segment would be a 4-byte lock
{quote}
Are you talking about just setting 1 for lock and 0 for unlock?
Hmmm, alright, that's doable... I am guessing you have already seen how 
ReentrantLock implements locking.

{quote}
The check on line 38 races with the assignment on line 39.
{quote}
I thought we discussed this already... Yeah, that was supposed to be taken care 
of by the comment "Next commit will have conditions instead (similar to C 
implementation)" - have not committed it yet :)



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225592#comment-14225592
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. I thought we discussed this already... Yeah, that was supposed to be taken 
care of by the comment "Next commit will have conditions instead (similar to C 
implementation)" - have not committed it yet 
Sorry my bad.

Reentrant lock is just a counter of the number of acquisitions. You could do an 
8-byte lock field: store the thread-id in the first 4 bytes and the counter in 
the next 4 bytes. We probably want these to be 8-byte aligned so they don't 
cross cache lines. 

bq. Are you talking about just setting 1 for lock and 0 for unlock? Hmmm, 
alright, that's doable... I am guessing you have already seen how ReentrantLock 
implements locking.
You could do it in 8 bytes since pointers are actually only six bytes. The two 
higher order bytes are just the highest order bit sign-extended on current 
Intel processors. CAS the pointer and use the highest order bit to represent 
locked/unlocked.
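An illustrative sketch of such an 8-byte lock word (Unsafe-based, thread id in 
the high 32 bits, acquisition count in the low 32 bits; the class and all names 
are invented, and this is not production-grade code):
{code}
import sun.misc.Unsafe;

final class OffHeapReentrantLock {
    private final Unsafe unsafe;
    private final long lockAddress;    // 8-byte aligned off-heap word, 0 = unlocked

    OffHeapReentrantLock(Unsafe unsafe, long lockAddress) {
        this.unsafe = unsafe;
        this.lockAddress = lockAddress;
        unsafe.putLongVolatile(null, lockAddress, 0L);
    }

    void lock() {
        long self = Thread.currentThread().getId() & 0xffffffffL;
        for (;;) {
            long current = unsafe.getLongVolatile(null, lockAddress);
            if (current == 0L) {                       // free: try to acquire
                if (unsafe.compareAndSwapLong(null, lockAddress, 0L, (self << 32) | 1L))
                    return;
            } else if ((current >>> 32) == self) {     // re-entrant acquire by the owner
                unsafe.putLongVolatile(null, lockAddress, current + 1L);
                return;
            }
            // otherwise spin; a real implementation would back off or park
        }
    }

    void unlock() {
        long current = unsafe.getLongVolatile(null, lockAddress);
        long next = (current & 0xffffffffL) == 1L ? 0L : current - 1L;
        unsafe.putLongVolatile(null, lockAddress, next);
    }
}
{code}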



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-25 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225616#comment-14225616
 ] 

Pavel Yaskevich commented on CASSANDRA-7438:


I guess the only thing that we haven't yet re-invented in Cassandra would be an 
off-heap lock based on architecture-specific details; great that we are finally 
about to change this historical injustice. {color:red}This all definitely 
sounds a lot more reasonable than having C code as a dependency.{color}



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-24 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222799#comment-14222799
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

rehashing: growing (x2) is already implemented, shrinking (/2) shouldn't be a 
big issue either. The implementation only locks the currently processed 
partitions during rehash.
put operation: fixed (was definitely a bug), cleanup runs concurrently and is 
triggered on an out-of-memory condition
block sizes: will give it a try (fixed vs. different sizes vs. variable sized 
(no blocks))
per-partition locks: already thought about it - not sure whether it's worth the 
additional RW-lock overhead since partition lock time is very low during normal 
operation
metrics: some (very basic) metrics are already in it - will add some more timer 
metrics (configurable)

[~vijay2...@yahoo.com] can you catch {{OutOfMemoryError}} for Unsafe.allocate()? 
It should not go up the whole call stack.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-24 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222802#comment-14222802
 ] 

Pavel Yaskevich commented on CASSANDRA-7438:


bq. per-partition locks: already thought about it - not sure whether it's worth 
the additional RW-lock overhead since partition lock time is very low during 
normal operation

It depends on the operation mode: if there are e.g. 75% reads and 25% writes it 
makes more sense to use locks, because an RW lock is going to be optimized by 
the JVM to a CAS operation when there is no contention. Anyhow, it's a valid 
test to do with different modes to check CAS vs. RW.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-24 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223479#comment-14223479
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

I think that for caches the behavior you want to avoid most is slowly growing 
heap. People hate that because it's unpredictable and they don't know when it's 
going to stop. You can always start with jemalloc and get the feature working 
and then iterate on memory management.

Fixed block sizes are a baby-with-the-bathwater way to get the desirable fixed 
memory utilization property. When you want to build everything out of fixed 
size pages you have to slot the pages or do some other internal page management 
strategy so you can pack multiple things and rewrite pages as they fragment. 
You also need size-tiered free lists and fragmentation metadata for pages so 
you can find partially free pages. That kind of thing only makes sense in ye 
olden database land where rewriting an already dirty page is cheaper than more 
IOPs. In memory you can relocate objects. 

Memcached used to have the problem that, instead of the heap growing, the cache 
would lose capacity to fragmentation. FB implemented slab rebalancing in their 
fork, and then Memcached did its own implementation. The issue was internal 
fragmentation due to having too many of the wrong size slabs. 

For Robert
* Executor service shutdown: never really got why it takes a timeout nor why 
there is no blocking version. 99% of the time, if it doesn't shut down within 
the timeout it's a bug and you don't want to ignore it. We are pedantic about 
everything else, why not this? It's also unused right now.
* Stats could go into an atomic long array with padding. It really depends on 
the access pattern. You want data that is read/written at the same time on the 
same cache line. These are global counters so they will be contended by 
everyone accessing the cache; better that they only have to pull in one cache 
line with all counters than multiple and have to wait for exclusive access 
before writing to each one. Also consider LongAdder.
* If you want to do your own memory management strategy I think something like 
segregated storage as in boost pool would work, with size tiers for powers of 
two and power of two plus the previous power of two. You can CAS the head of 
the free list for each tier to make it thread safe, and lock when allocating 
out a new block instead of from the free list (see the free-list sketch after 
this list). This won't adapt to changing size distributions; for that, stuff 
needs to be relocatable.
* I'll bet you could use a stamped lock pattern and readers might not have to 
lock at all. I think getting it working with just a lock is fine.
* I am not sure shrinking is very important? The table is pretty dense and 
should be a small portion of total memory once all the other memory is 
accounted for. You would need a lot of tiny cache entries to really bloat the 
table and then the population distribution would need to change to make that a 
waste.
* LRU lists per segment seems like it's not viable. That isn't a close enough 
approximation to LRU since we want at most two or three entries per partition.
* Some loops of very similar byte munging in HashEntryAccess
* Periodic cleanup check is maybe not so nice. An edge trigger via a CAS field 
would be nicer and would move that up to 80%, since on a big-memory machine 
that is a lot of wasted cache space. Walking the entire LRU could take several 
seconds, but if it is amortized across a lot of expiration maybe it is ok.
* Some rehash required checking is duplicated in OHCacheImpl
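
Here is a rough sketch of one size tier of such a segregated free list 
(illustrative only; it assumes block addresses come from a single large 
off-heap allocation, the next pointer lives in the first 8 bytes of each free 
block, and ABA/backoff concerns are ignored):
{code}
import java.util.concurrent.atomic.AtomicLong;
import sun.misc.Unsafe;

// One size tier: free blocks form a Treiber stack whose head is swapped with CAS.
final class FreeList {
    private final Unsafe unsafe;
    private final AtomicLong head = new AtomicLong(0L);   // 0 = tier is empty

    FreeList(Unsafe unsafe) {
        this.unsafe = unsafe;
    }

    void push(long blockAddress) {
        for (;;) {
            long oldHead = head.get();
            unsafe.putLong(blockAddress, oldHead);        // block.next = oldHead
            if (head.compareAndSet(oldHead, blockAddress))
                return;
        }
    }

    long pop() {                                          // 0 means: allocate a new block
        for (;;) {
            long oldHead = head.get();
            if (oldHead == 0L)
                return 0L;
            long next = unsafe.getLong(oldHead);          // read block.next
            if (head.compareAndSet(oldHead, next))
                return oldHead;
        }
    }
}
{code}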

For Vijay
* sun.misc.Hashing doesn't seem to exist for me, maybe a Java 8 issue?
* The queue really needs to be bounded, producer and consumer could proceed at 
different rates. With striped 
* Tasks submitted to executor services via submit will wrap the result 
including exceptions in a future which silently discards them. The library 
might take at initialization time a listener for these errors, or if it is 
going to be C* specific it could use the wrapped runnable or similar.
* A lot of locking that was spin locking (which, unbounded, I don't think is 
great) is now blocking locking. There is no adaptive spinning if you don't use 
synchronized. If you are already using Unsafe maybe you could do monitor 
enter/exit. Never tried it.
* It looks like concurrent calls to rehash could cause the table to rehash 
twice since the rebalance field is not CASed. You should do the volatile read, 
and then attempt the CAS, which avoids putting the cache line in exclusive state 
every time (see the rehash-guard sketch after this list).
* StatsHolder, same AtomicLongArray suggestion. Also consider LongAdder.
* In Segment.java, in the replace path, AtomicLong.addAndGet is called back to 
back; it could be called once with the math already done. I believe each of those 
stalls processing until the store buffers have flushed. The put path does 
something similar and could have the same 
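
On the submit/Future point above, a tiny illustration of the failure mode and one 
hedged way around it using a wrapped Runnable; the ErrorListener hook is invented 
for the example and is not part of any existing API.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustration only: why submit() can hide failures, and a wrapped-Runnable workaround.
final class SilentFailureSketch {
    interface ErrorListener { void onError(Throwable t); }   // hypothetical hook

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // The exception ends up inside the returned Future; since nobody calls get(),
        // nothing is logged and the failure is silently discarded.
        pool.submit((Runnable) () -> { throw new IllegalStateException("lost"); });

        // Wrapping the task lets the library surface the error to a listener instead.
        ErrorListener listener = t -> System.err.println("cache task failed: " + t);
        pool.execute(wrap(() -> { throw new IllegalStateException("reported"); }, listener));

        pool.shutdown();
    }

    static Runnable wrap(Runnable task, ErrorListener listener) {
        return () -> {
            try {
                task.run();
            } catch (Throwable t) {
                listener.onError(t);   // report instead of swallowing
            }
        };
    }
}
{code}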
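
And a small sketch of the volatile-read-then-CAS guard for rehash mentioned above, 
so only one thread wins the right to resize; the names are illustrative, not the 
actual lruc fields.

{code}
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration of guarding a resize so concurrent triggers cannot run it twice.
final class RehashGuardSketch {
    private final AtomicBoolean rehashing = new AtomicBoolean(false);

    void maybeRehash() {
        // Cheap volatile read first: if someone is already rehashing, back off
        // without dragging the cache line into exclusive state with a failed CAS.
        if (rehashing.get())
            return;
        if (!rehashing.compareAndSet(false, true))
            return;                 // lost the race, another thread does the work
        try {
            rehash();               // only ever runs in one thread at a time
        } finally {
            rehashing.set(false);
        }
    }

    private void rehash() { /* grow the table, copy entries, swap pointers */ }
}
{code}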

[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-23 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222365#comment-14222365
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

I've spent some evenings on an alternative approach for an off heap row cache, 
too.
It uses a different concept and architecture.
* Based on a big hash table
* Each hash partition (segment) has a reference to an LRU linked list of hash 
entries. Each get operation moves the accessed entry to the head of the LRU 
linked list (a rough sketch of this follows below the list).
* Data memory is divided into uniform blocks (few kB) and managed by multiple 
(8) free-block linked lists. Just one big memory allocation during 
initialization. Pro: no fragmentation of free memory, easier to handle. Con: 
fragmentation of data.
* Proactive eviction with the goal to keep a percentage of memory free.
* Put operation (currently) fails if there's not enough memory available to 
store the data. The idea is not to block the calling code (don't put additional 
latency on an overloaded system)
* Locks (CAS based) exist on each hash partition, each hash entry and each free 
list and are held as briefly as possible (e.g. put allocates data blocks, fills 
these with the data of the new entry, acquires the lock on the hash partition, 
updates the LRU linked list pointers and finishes)
* To keep the linked lists on each hash partition (segment) short, large hash 
tables should be used
* No rehash yet - could be manageable by locking one hash partition at once and 
split it into two new partitions (more logic, but no global lock).
* No overhead in JVM heap for the cache itself (although accesses require short 
lived objects for serialization)
* The only stolen thing is Vijay's benchmark (asked him beforehand ;) ).
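
A rough, hypothetical sketch of the per-segment LRU move-to-head described above, 
holding the segment lock only while the linked-list pointers are rewired (using a 
plain ReentrantLock here rather than the CAS-based locks the list mentions). This 
is not OHC code, just the shape of the operation.

{code}
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical per-segment LRU: get() moves the touched entry to the head while
// holding the segment lock only for the pointer updates, not for data copying.
final class SegmentSketch {
    static final class Entry {
        final long key;
        Entry prev, next;      // intrusive LRU links
        Entry(long key) { this.key = key; }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private Entry lruHead;     // most recently used
    private Entry lruTail;     // eviction candidate

    void touch(Entry e) {
        lock.lock();
        try {
            if (e == lruHead)
                return;                          // already most recently used
            // unlink from the current position
            if (e.prev != null) e.prev.next = e.next;
            if (e.next != null) e.next.prev = e.prev;
            if (e == lruTail) lruTail = e.prev;
            // relink at the head
            e.prev = null;
            e.next = lruHead;
            if (lruHead != null) lruHead.prev = e;
            lruHead = e;
            if (lruTail == null) lruTail = e;
        } finally {
            lock.unlock();
        }
    }
}
{code}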

Pushed here: https://github.com/snazy/ohc - more descriptive Readme, too

Other ideas:
* If we have off heap data, it might be possible to (de)serialize the hot set 
directly to/from that off heap data (zero-copy I/O). At the cost of changing 
the on-disk data format.


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-23 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222660#comment-14222660
 ] 

Pavel Yaskevich commented on CASSANDRA-7438:


Personally I like what Vijay did a bit more, just because the main ideas were 
taken from memcached, which is proven to work fine for the majority of 
use-cases and is pretty simple inside.

Regarding Robert's implementation, I have a few comments which he'll have to 
address (if not already) before I would consider this for inclusion:

- rehashing is a must-have; we want to grow/shrink caches based on usage to 
lessen the burden on users trying to size them appropriately from day 1;
- if a put operation fails it should at least invalidate the previously inserted 
value, if any, and probably kick off maintenance activities like LRU cleanup 
and/or rehashing;
- fixed-size data blocks create a lot of allocation slop, which can sometimes 
take up the majority of allocated memory (e.g. Firefox had that problem); the 
cache should at least have blocks of different sizes to minimize that;
- it would be great to have benchmarks for per-partition CAS vs. per-partition RW 
lock in different operation modes; cache invalidation could be a noticeable 
factor for performance, as could CAS races;
- metrics (if not yet added).

Also, based on the discussion [~snazy] had with [~vijay2...@yahoo.com], I would 
avoid using DirectByteBuffer because they are problematic for GC.


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1495#comment-1495
 ] 

Vijay commented on CASSANDRA-7438:
--

Alright, the first version of the pure Java LRUCache is pushed, 
* Basically a port from the C version. (Most of the test cases pass and they 
are the same for both versions)
* As Ariel mentioned before, we could use the Disruptor for the ring buffer, but 
this doesn't use it yet.
* Expiry in the queue thread is not implemented yet.
* The algorithm to start the rehash needs to be more configurable and based on 
the capacity; will be pushing that soon.
* Overhead in JVM heap is just the segments array.

https://github.com/Vijay2win/lruc/tree/master/src/main/java/com/lruc/unsafe 

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14199976#comment-14199976
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Debugging C code via JNI and debugging Unsafe code on large data structures are 
both a nightmare.
And a simple, stupid bug of either kind can quickly make the JVM core dump.

The advantage of the Unsafe approach is that all OSes are directly supported.
The advantage of the JNI approach is that the code that handles the data 
structures is much easier to read.

Proposal:
* Extract the changes that support a pluggable ICacheProvider from this ticket 
into a separate ticket and commit that stuff
* Let Vijay continue his work on this one
* Provide an alternative implementation using Unsafe
* Let both implementations compete in some long-running tests
This is a lot of effort - but I don't know how to validate either solution 
theoretically.


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14198368#comment-14198368
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

Here's me in July:

bq. I'm not wild about taking on the complexity of building and distributing 
native libraries if we have a reasonable alternative. Vijay what do we win with 
the native approach over using java unsafe?

The objection at the time was,

bq. The win is that we can now have caches which can be bigger than the JVM with 
zero GC overhead on the items. The Unsafe approach will hold the references in 
memory and the overhead on them is reasonably high compared to the native 
approach (an example of it is integer keys), and in addition if we use a hash map 
we have segments with locks (also there are the references in the queue), so it 
is not a straightforward approach either.

... but as Ariel said, we can use the same technique to hold references 
off-heap with Unsafe, as with JNI.  Am I missing something?

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-05 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14198523#comment-14198523
 ] 

Vijay commented on CASSANDRA-7438:
--

Alright, it looks like the objection is not to the design but to the language 
choice; if I had known the implementation details it would have been an easier 
choice in the first place (the argument earlier was that we didn't have an easier 
way to lock and use the queue), for example the map vs. queue etc. The thing 
we are missing is 4 months of dev, testing and reviewers' time :). 

It's alright, let me give it a shot; after all, we'll have an alternative to 
benchmark against.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-04 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196296#comment-14196296
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

bq. No, we don't. We have locks per Segment; this is very similar to lock 
striping/Java's ConcurrentHashMap.
Thanks for clearing that up

bq. Not really; we lock globally when we reach 100% of the space, we free up to 
80% of the space, and we spread the overhead across other threads based on 
whoever has the item partition lock. It won't be hard to make this part of the 
queue thread and I will try it for the next release of lruc.

OK, that makes sense. 20% of the cache could be many milliseconds of work if you 
are using many gigabytes of cache. That's not a great thing to foist on a 
random victim thread. If you handed that to the queue thread, well, I think you 
run into another issue, which is that the ring buffer doesn't appear to check 
for queue full? The queue thread could go out to lunch for a while. Not a big 
deal, but finer-grained scheduling will probably be necessary.

bq. If you look at the code, it is quite close to memcached. Actually I started 
off stripping down the memcached code so we could run it in-process instead of as 
a separate process, removing the global locks in queue reallocation etc., and 
eventually it diverged too much from it. The other reason it doesn't use slab 
allocators is that we wanted the memory allocators to do the right thing; we 
have already tested Cassandra with jemalloc.

Ah very cool.

jemalloc is not a moving allocator, whereas it looks like memcached slabs 
implement rebalancing to accommodate changes in size distribution. That would 
actually be one of the really nice things to keep, IMO. On large-memory systems 
with a cache that scales and performs you would end up dedicating as much RAM 
as possible to the row cache/key cache and not the page cache, since the page 
cache is not as granular (correct me if the story for C* is different). If you 
dedicate 80% of RAM to the cache that doesn't leave a lot of space left for 
fragmentation. By using a heap allocator you also lose the ability to implement 
hard, predictable limits on memory used by the cache since you didn't map it 
yourself. I could be totally off base and jemalloc might be good enough.

bq. There are some comments above which have the reasoning for it (please see 
the above comments). PS: I believe there were some tickets on the current 
RowCache complaining about the overhead.
I don't have a performance beef with JNI, especially the way you have done it, 
which I think is pretty efficient. I think the overhead of JNI (one or two 
slightly more expensive function calls) would be eclipsed by things like the 
cache misses, coherence, and pipeline stalls that are part of accessing and 
maintaining a concurrent cache (Java or C++). It's all just intuition without 
comparative microbenchmarks of the two caches. Java might look a little faster 
just due to allocator performance, but we know you pay for that in other ways.

I think what you have made scratches the itch for a large cache quite well, and 
beats the status quo. I don't agree that Unsafe couldn't do the exact same 
thing with no on heap references.

The hash table, ring buffer, and individual item entries are all being malloc'ed, 
and you can do that from Java using Unsafe. You don't need to implement a ring 
buffer because you can use the Disruptor. I also wonder whether splitting the 
cache into several instances, each with a coarse lock per instance, wouldn't 
result in simpler and (I know performance is not an issue) fast enough code. I 
don't want to advocate doing something different for performance, but rather to 
point out that there is the possibility of a relatively simple implementation via 
Unsafe.

You could coalesce all the contended fields for each instance (stats, lock 
field, LRU head) into a single cache line, and then rely on a single barrier 
when releasing a coarse-grained lock. The fine-grained locking and CASing 
result in several pipeline stalls because the memory barriers that are 
implicit in each one require the store buffers to drain. There may even be a 
suitable off-heap map implementation out there already.
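
To make that concrete, a purely hypothetical layout for one coarsely locked cache 
instance: the lock word, stats and LRU pointers sit next to each other so the 
thread holding the instance lock touches as few cache lines as possible, with 
plain long fields as manual padding against false sharing. All names are 
invented; this is the shape of the idea, not proposed code.

{code}
// Hypothetical per-instance header for a coarsely locked off-heap cache.
// The hot fields are grouped so they land on as few cache lines as possible;
// the surrounding padding keeps unrelated data off those lines.
final class CacheInstanceHeaderSketch {
    // padding before the hot group (8 longs = 64 bytes)
    long p0, p1, p2, p3, p4, p5, p6, p7;

    // hot group: mutated together while the instance lock is held
    long lockWord;          // 0 = free, 1 = held
    long hits;
    long misses;
    long evictions;
    long size;
    long capacity;
    long lruHeadAddress;    // off-heap address of the most recently used entry
    long lruTailAddress;    // off-heap address of the eviction candidate

    // padding after the hot group (8 longs = 64 bytes)
    long q0, q1, q2, q3, q4, q5, q6, q7;
}
{code}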

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache 

[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-04 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196492#comment-14196492
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}
 well I think you run into another issue which is that the ring buffer doesn't 
appear to check for queue full? 
{quote}
Yeah, I thought about it; we need to handle those, and that's why I didn't have 
it in the first place. Should not be really bad though.
{quote}
I don't agree that Unsafe couldn't do the exact same thing with no on heap 
references
{quote}
Probably; since we have figured out most of the implementation details, sure we 
can, but there are always many different ways to solve the problem (even though 
it will be inefficient to copy multiple bytes to get to the next items in the map 
etc... GC and CPU overhead would be more, IMHO). For example, Memcached used the 
expiration time set by the clients to remove items, which made it easier for 
them to do the slab allocator, but this is something we removed in lruc and it is 
just a queue.
{quote}
I also wonder if splitting the cache into several instances each with a coarse 
lock per instance wouldn't result in simpler
{quote}
The problem there is how you will invalidate the least recently used items; 
since they are in different partitions you really don't know which ones to 
invalidate... there is also a problem of load balancing, when to expand the 
buckets etc., which will bring us back to the current lock striping solution, 
IMHO.

I can do some benchmarks if that's exactly what we need at this point. Thanks!


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-04 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196885#comment-14196885
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

bq. There are some comments above which have the reasoning for [why JNI is 
justified]. PS: I believe there were some tickets on the current RowCache 
complaining about the overhead.

Aren't all those objections to the current design and not to Unsafe per se?

Adding native libraries + JNI is a pretty huge step in build, QA, and runtime 
complexity.  I'd like to avoid it if at all possible.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-04 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14197029#comment-14197029
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}
Aren't all those objections to the current design
{quote}
I am fine with making it configurable and maintaining it in a separate project, 
but I didn't realize that was the case.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-03 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195418#comment-14195418
 ] 

Ariel Weisberg commented on CASSANDRA-7438:
---

RE refcount:

I think hazard pointers (never used them personally) are the no-GC, no-refcount 
way of handling this. It also won't be fetched twice if it is uncontended, which 
in many cases it will be since it should be decref'd as soon as the data is 
copied.

I think that with the right QA work this solves the problem of running 
arbitrarily large caches. That means running a validating workload in 
continuous integration that demonstrates the cache doesn't lock up, leak, or 
return the wrong answer. I would probably test directly against the cache to 
get more iterations in.

RE Implementation as a library via JNI:

We give up something by using JNI so it only makes sense if we get something 
else in return. The QA and release work created by JNI is pretty large. You 
really need a plan for running something like Valgrind or similar against a 
comprehensive suite of tests. Valgrind doesn't run well with Java AFAIK so you 
end up doing things like running the native code in a separate process, and 
have to write an interface amenable to that. Valgrind is also slow enough that 
if you try and run all your tests against a configuration using it a lot you 
end up with timeouts and many hours to run all the tests plus time spent 
interpreting results.

Unsafe is worse in that respect because there is no Valgrind and I can attest 
that debugging an off-heap red-black tree is not fun.

I am not clear on why the JNI is justified. It really seems like this could be 
written against Unsafe and then it would work on any platform. There are no 
libraries or system calls in use that are only accessible via JNI. I think JNI 
would make more sense if we were pulling in existing code like memcached that 
already handles memory pooling, fragmentation, and concurrency. If it were in 
Java you could use Disruptor for the queue and would only need to implement a 
thread safe off heap hash table.

RE Performance and implementation:

What kind of hardware was the benchmark run on? Server class NUMA? I am just 
wondering if there are enough cores to bring out any scalability issues in the 
cache implementation.

It would be nice to see a benchmark that showed the on heap cache falling over 
while the off heap cache provides good performance.

Subsequent comments aren't particularly useful if performance is satisfactory 
under relevant configurations.

Given the use of a heap allocator and locking it might not make sense to have a 
background thread do expiration. I think that splitting the cache into several 
instances with one lock around each instance might result in less contention 
overall and it would scale up in a more straightforward way.

It appears that some common operations will hit a global lock in may_expire() 
quite frequently? It seems like there are other globally shared frequently 
mutated cache lines in the write path like stats.

Is there something subtle in the locking that makes the use of the custom queue 
and maps necessary or could you use stuff from Intel TBB and still make it 
work? It is hypothetically less code to have to QA and maintain.

I still need to dig more, but I am also not clear on why locks are necessary 
for individual items. It looks like there is a table for all of them? Random 
intuition is that it could be done without a lock, or at least without a discrete 
lock per item. Striping against a padded pool of locks might make sense if that 
isn't going to cause deadlocks. Apparently every pthread_mutex_t is 40 bytes, 
according to a random Stack Overflow post. It might make sense to use the same 
cache line as the refcount to store a lock field, or the bucket in the hash table?
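
A small sketch of the striped-locks idea: the key's hash picks one of a fixed, 
power-of-two pool of pre-allocated locks, so no per-entry lock is needed. This is 
illustrative only (and written in Java rather than against pthreads), without the 
per-lock cache-line padding a production version would want.

{code}
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical lock striping: a key's hash picks one of N pre-allocated locks,
// so unrelated keys rarely contend and no 40-byte lock per entry is required.
final class StripedLocksSketch {
    private final ReentrantLock[] stripes;
    private final int mask;

    StripedLocksSketch(int stripeCount) {      // stripeCount must be a power of two
        stripes = new ReentrantLock[stripeCount];
        for (int i = 0; i < stripeCount; i++)
            stripes[i] = new ReentrantLock();
        mask = stripeCount - 1;
    }

    ReentrantLock lockFor(long keyHash) {
        // fold the high bits in so low-entropy hashes still spread across stripes
        int h = (int) (keyHash ^ (keyHash >>> 32));
        return stripes[h & mask];
    }
}
{code}

Usage would be along the lines of: take the stripe for the key's hash, lock it, 
mutate the bucket, and unlock it in a finally block.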

Another implementation question is do we want to use C++11? It would remove a 
lot of platform and compiler specific code.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with 

[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-03 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14195679#comment-14195679
 ] 

Vijay commented on CASSANDRA-7438:
--

Thanks for reviewing!
{quote}
 I am also not clear on why locks are necessary for individual items.
{quote}
No, we don't. We have locks per Segment; this is very similar to lock striping, 
or similar to Java's ConcurrentHashMap.
{quote}
 global lock in may_expire() quite frequently?
{quote}
Not really; we lock globally when we reach 100% of the space, we free up to 
80% of the space, and we spread the overhead across other threads based on 
whoever has the item partition lock. It won't be hard to make this part of the 
queue thread and I will try it for the next release of lruc.
{quote}
What kind of hardware was the benchmark run on?
{quote}
32 cores, 100 GB RAM, with NUMA and Intel Xeon. There is a benchmark util which 
is also checked in as part of the lruc code and does exactly the same kind of 
test.
{quote}
You really need a plan for running something like Valgrind
{quote}
Good point. I was part way down that road and still have the code; I can 
resurrect it for the next lruc version.
{quote}
I am not clear on why the JNI is justified
{quote}
There are some comments above which have the reasoning for it (please see the 
above comments). PS: I believe there were some tickets on the current RowCache 
complaining about the overhead.
{quote}
I think JNI would make more sense if we were pulling in existing code like 
memcached
{quote}
If you look at the code, it is quite close to memcached. Actually I started off 
stripping down the memcached code so we could run it in-process instead of as a 
separate process, removing the global locks in queue reallocation etc., and 
eventually it diverged too much from it. The other reason it doesn't use slab 
allocators is that we wanted the memory allocators to do the right thing; we 
have already tested Cassandra with jemalloc.

To comfort you a bit, lruc is already running in our production :)

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193760#comment-14193760
 ] 

Vijay commented on CASSANDRA-7438:
--

Pushed, Thanks!
{quote}
We should ensure that changes in the serialized format of saved row caches are 
detected
{quote}
I don't think we changed the format, did I?
{quote}
 item.refcount - it refcount is updated, the whole cache line needs to be 
re-fetched (CPU)
{quote}
The refcount is per item in the cache; for every item inserted, we track it in 
the item's memory location. 

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-02 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193785#comment-14193785
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. I don't think we changed the format, did i?
Ah - no. Sorry - got confused with the in-memory serialization.

bq. item.refcount
What I mean is the (Intel) CPU L1+L2 cache line size (64 bytes). If 
'refcount' is updated (e.g. just for a cache get), the whole cache line is 
invalidated (twice) and needs to be re-fetched from RAM although its content 
did not change. It's just a point for optimization - if we find a viable 
solution for that, we should implement it.
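
As a purely hypothetical illustration of one possible mitigation, an off-heap 
entry layout could reserve a separate 64-byte slot for the refcount so that 
bumping it on a get does not invalidate the line holding the rest of the entry's 
metadata. The offsets below are invented; the obvious cost is extra padding per 
entry, so whether this is worth it depends on entry sizes.

{code}
// Hypothetical off-heap entry layout keeping the hot refcount word on its own
// 64-byte line, so updating it does not evict the line holding the cold metadata.
final class EntryLayoutSketch {
    static final int CACHE_LINE       = 64;

    // line 0: metadata written rarely after insert
    static final int OFF_HASH         = 0;    // 8 bytes
    static final int OFF_KEY_LENGTH   = 8;    // 4 bytes
    static final int OFF_VALUE_LENGTH = 12;   // 4 bytes
    static final int OFF_LRU_NEXT     = 16;   // 8 bytes
    static final int OFF_LRU_PREV     = 24;   // 8 bytes

    // line 1: the refcount alone (8 bytes used, 56 bytes padding)
    static final int OFF_REFCOUNT     = CACHE_LINE;

    // key and value bytes start on line 2
    static final int OFF_DATA         = 2 * CACHE_LINE;

    static long entrySize(int keyLength, int valueLength) {
        return OFF_DATA + keyLength + valueLength;
    }
}
{code}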

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-01 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193074#comment-14193074
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. total 1 byte profit
bikeshed: could be two if serialized unencoded :)
I took some time to see whether it could be some lhf to change that - but it 
isn't really (because there are some uses of {{DataIn/Output(Plus)}} that would 
need to be changed, too - and it is used widely - even, if I saw that 
correctly, in SSTables; the point at which I stopped investigating ;) )

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192874#comment-14192874
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

[~aweisberg], would be useful to get your take on this too.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-31 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192922#comment-14192922
 ] 

Pavel Yaskevich commented on CASSANDRA-7438:


I looked through the patch and everything looks good, but one small thing:

- FBUtilities.newRowCacheProvider needs its arguments renaming, because it 
looks like it has been copied from FBUtilities.newPartitioner and has the old 
names; so instead of partitioner it should be rowCacheClassName, and rowCache 
as the last argument for FBUtilities.construct(...).

I just want to address Robert's comment regarding 
EncodedData{Input/Output}Stream: I agree that the longs of a version 1 UUID 
are not that compressible and vint encoding is actually going to add 1 byte on 
top of each long (which is pretty easy to test), but the good thing is that 
although we lose 2 bytes in long serialization we actually win back at least 2 
bytes by vint-encoding the length of the key and, in the best case, if the key 
size is less than 127 (which is highly likely), we are actually going to win 3 
bytes, which makes in total a 1 byte profit from encoding :)

 

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190711#comment-14190711
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Did a walkthrough of lruc 0.7, too...

Altogether +1 on the current state :)

Just one NIT:
* Move {{Preconditions.checkArgument(capacity > 0 ...}} in {{LRUCache.java}} 
from {{capacity()}} to {{setCapacity}}
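
For reference, a minimal sketch of what the moved check might look like, assuming 
Guava's Preconditions and a plain setter; this is not the actual lruc code.

{code}
import com.google.common.base.Preconditions;

// Sketch only: the argument check lives in setCapacity() rather than in capacity().
final class LruCacheCapacitySketch {
    private volatile long capacity;

    void setCapacity(long capacity) {
        Preconditions.checkArgument(capacity > 0,
                "capacity must be positive but was %s", capacity);
        this.capacity = capacity;
    }

    long capacity() {
        return capacity;   // nothing to validate here any more
    }
}
{code}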

One thing regarding saved row caches: We should ensure that changes in the 
serialized format of saved row caches are detected (and either converted during 
load or just discarded)

Comments would be nice to have in a future version.

* Think you need to add the APLv2 license header to all source files ;)
* The NEWS, COPYING and AUTHORS files in {{lruc/src/native}} and {{lruc}} are 
blank
* {{stats}} struct is heavily updated using CAS - maybe think of aligning the 
individual values to separate CPU cache lines to reduce CPU cache refreshes
* similar for {{item.refcount}} - if refcount is updated, the whole cache line 
needs to be re-fetched (CPU)
* {{o.a.c.cache.ICacheProvider.RowKeySerializer}} tries to "compress" the two 
{{long}} values of a UUID via 
{{EncodedDataOutputStream}}/{{EncodedDataInputStream}} - this is usually not 
possible for the long values of a UUID, resulting in bigger serialized 
representations than necessary (this is what the default serialization, e.g. 
UUIDSerializer, does)


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-29 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189224#comment-14189224
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

LGTM - but some comments:
* the comments in cassandra.yaml could be more fleshy (see below)
* the version of lruc.jar should not be a SNAPSHOT version and should be a bit 
higher than 0.0.1 (although it's just a number, people usually don't trust 
something with a '0' in front :) ) - the [lruc 
repo|https://github.com/Vijay2win/lruc/commits/master] shows v0.7 as the current 
version - I recommend using the latest lruc release in C*
* would be very nice to have these released on maven central
* after lruc-0.7 is used for this ticket, we should run a stress test against a 
cluster using OffheapCacheProvider as some kind of smoke test

{code}
# Number of keys from the row cache to save.
# Disabled by default, meaning all keys are going to be saved.
# row_cache_keys_to_save: 100

# Row cache provider to use.
# Possible values are SerializingCacheProvider and OffheapCacheProvider.
# Default is no row cache.
# 
# SerializingCacheProvider is the one used in previous versions of Cassandra.
# It is available on all platforms and uses offheap memory for the rows but
# structures on the Java heap to manage the offheap row data.
#
# OffheapCacheProvider is new in Cassandra 3.0 and only available on
# Unix platforms (Linux and OSX).
# It uses a native code library to manage the whole row cache including
# management information in native memory thus reducing heap
# pressure compared to SerializingCacheProvider.
# 
# row_cache_provider: SerializingCacheProvider
{code}


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-29 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189364#comment-14189364
 ] 

Vijay commented on CASSANDRA-7438:
--

Rebased and pushed with the latest binaries.
{quote}
the comments in cassandra.yaml could be more fleshy (see below)
{quote}
Sorry, my bad, I missed it before; thanks for the write-up, I just copied it 
into the fork.
{quote}
 recommend to use the latest lruc release in C*
{quote}
Yeah, I set up releasing and publishing to Maven Central a few weeks ago.


 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-15 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172078#comment-14172078
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Will take a look at this this week.

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-06 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160044#comment-14160044
 ] 

Vijay commented on CASSANDRA-7438:
--

Pushed most of the changes to 
https://github.com/Vijay2win/cassandra/commits/7438; not sure about moving the 
tests and code into the Cassandra code base (since I am really neutral on that). 
Other related changes, tests and refactoring are pushed as part of 3 main 
commits in https://github.com/Vijay2win/lruc/commits/master.
cc [~xedin]

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157989#comment-14157989
 ] 

Jonathan Ellis commented on CASSANDRA-7438:
---

Are you still working on this, [~vijay2...@gmail.com]?

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap, keys are still stored in 
 JVM heap as BB, 
 * There is a higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tunning.
 * Overhead in Memory for the cache entries are relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with cache. We might want to ensure that the new 
 implementation match the existing API's (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory and less memcpy's 
 (As much as possible).
 We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-10-03 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14158005#comment-14158005
 ] 

Vijay commented on CASSANDRA-7438:
--

Hi Jonathan, yes, I am adding more tests and fixing a test failure in lruc; 
going to post the patch soon. 



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-09-23 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14145338#comment-14145338
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

(note: [~vijay2...@gmail.com], please use the other nick)

Some quick notes:
* Can you add the assertion for {{capacity = 0}} to 
{{OffheapCacheProvider.create}}? The current error message when 
{{row_cache_size_in_mb}} is not set (or is invalid) could be more descriptive 
about which capacity should be set.
* Additionally, the {{capacity}} check should also reject negative values (it 
starts with a negative value - don't know what happens if it stays negative...)
* {{org.apache.cassandra.db.RowCacheTest#testRowCacheCleanup}} fails at the 
last assertion - all other unit tests seem to work
* The documentation in cassandra.yaml for row_cache_provider could be a bit more 
verbose - just a short abstract about the characteristics and limitations of 
both implementations (e.g. the off-heap provider only works on Linux + OSX)
* IMO it would be fine to have a general unit test for 
{{com.lruc.api.LRUCache}} in the C* code, too
* Please add an adapted copy of {{RowCacheTest}} for OffheapCacheProvider
* Unit tests using OffheapCacheProvider must not run on Windows builds - 
please add an assertion in OffHeapCacheProvider that it runs on Linux 
or OSX (see the sketch below)
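
For illustration only, a minimal sketch of what such a platform guard could look like. The class and method names are hypothetical, not the actual lruc/C* API; only {{System.getProperty("os.name")}} is a real call:

{code:java}
// Hypothetical sketch of a platform guard for the off-heap provider.
// Only System.getProperty("os.name") is a real API here; the class name,
// method name and message are illustrative.
public final class OffHeapPlatformCheck
{
    private OffHeapPlatformCheck() {}

    public static void assertSupportedPlatform()
    {
        String os = System.getProperty("os.name").toLowerCase();
        boolean supported = os.contains("linux") || os.contains("mac") || os.contains("darwin");
        if (!supported)
            throw new IllegalStateException(
                "Off-heap row cache provider is only supported on Linux and OSX, not on: " + os);
    }
}
{code}

Unit tests (and the provider's create method) could call this guard first, so Windows builds fail fast with a clear message instead of crashing inside the native library.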

Sorry for the late reply



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-09-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144260#comment-14144260
 ] 

Vijay commented on CASSANDRA-7438:
--

Hi [~rst...@pironet-ndh.com], I don't see a problem in copying or 
rewriting the code; once you complete the rest of the review we can see what we 
can do. I am guessing you were not waiting for my response :) Thanks!



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-31 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080708#comment-14080708
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

bq. Yeah it works in unix, but the problem is i don't have a handle since its a 
temp file after restart. So it is a best effort for cleanups.

It's a really nasty problem. I changed our Snappy integration in a similar way. IMO 
there's no better solution than messing up the temp dir.

bq. The problem is it produces a circular dependency

Ah - I meant that the lruc code would be copied into the C* code base (if the others 
agree). But this could be a second step, since it's only a bit of refactoring.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-27 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075611#comment-14075611
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

[~vijay2...@gmail.com] do you have a C* branch with lruc integrated? Or: what 
should I do to bring lruc+C* together? Is the patch up-to-date?

I've pushed a new branch 'native-plugin' with the changes for 
native-maven-plugin, separate from the other code. The Windows stuff is a bit more 
complicated - it doesn't compile; I have to dig a bit deeper. Maybe delay the Win 
port...



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-27 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075630#comment-14075630
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

Surely not a complete list, but a start...

Java code:

* com.lruc.util.Utils.getUnsafe can be safely deleted
* com.lruc.util.Utils.extractLibraryFile
** should check the return code of {{File.createNewFile}}
** A call to {{File.delete()}} for the extracted library file should be added to 
{{com.lruc.util.Utils.loadNative}}, since an unclean shutdown (kill -9) does not 
delete the so/dylib file. This is possible on Unix systems - but not on Windows.
* The classes com.lruc.jni.lruc, SWIGTYPE_p_item and SWIGTYPE_p_p_item are unused 
(refactoring relic?)
* Generally the lruc code could be more integrated into the C* code.
** Let the lruc classes implement org.apache.cassandra.io.util.DataOutputPlus 
and java.io.DataInput so that they can be used directly by C*'s 
ColumnFamilySerializer (no temporary Input/OutputStreams necessary).
** Maybe {{DataOutputPlus.write(Memory)}} can be removed in C* when lruc is 
used - not sure about that.
** Implement most DataInput/Output methods in EntryInput/OutputStream to 
benefit from Unsafe (e.g. Unsafe.getLong/putLong) - I've seen that you 
removed Abstract... some weeks ago ;)
** Using Unsafe for DataInput/Output of short/int/long/float/double has the 
drawback that Unsafe always uses the system's byte order - not (necessarily) 
the portable Java byte order. There's of course no drawback if all 
reads/writes are paired (see the sketch below).
** {{Unsafe.copyMemory}} could be used for {{write(byte[])}}/{{read(byte[])}}.
* Naming of max_size, capacity - use one common term which also makes clear 
that it's a maximum memory size, e.g. max_size_bytes. _Capacity_ is often 
used for the number of elements in a collection.
* Memory leak: {{com.lruc.api.LRUCache.hotN}} may keep references in native 
code (no {{lruc_deref}} calls) if not all items are retrieved from the 
iterator - remove _hotN_ or return an array/list instead?
* Generally I think all classes can be merged into a single package if only a 
few are left (see above)
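
To illustrate the byte-order point from the list above, a self-contained sketch (the class name is made up; sun.misc.Unsafe, java.nio.ByteOrder and Long.reverseBytes are the only real APIs used). A long written with {{Unsafe.putLong}} always reads back correctly with {{Unsafe.getLong}}, but a reader using Java's big-endian {{DataInput}} convention would see the byte-swapped value on little-endian hardware:

{code:java}
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Illustrative only - not lruc code. Shows why Unsafe-based streams are fine
// as long as writes and reads are paired, and what an unpaired big-endian
// reader of the same bytes would decode on a little-endian machine.
public class ByteOrderDemo
{
    public static void main(String[] args) throws Exception
    {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long addr = unsafe.allocateMemory(8);
        try
        {
            long value = 0x0102030405060708L;
            unsafe.putLong(addr, value);          // written in native byte order

            long paired = unsafe.getLong(addr);   // read in native byte order -> always == value

            // What a big-endian reader (java.io.DataInput convention) of the same
            // 8 bytes would decode: byte-swapped on little-endian hardware.
            long bigEndianView = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN
                               ? Long.reverseBytes(value)
                               : value;

            System.out.printf("paired read:     %016x%n", paired);
            System.out.printf("big-endian view: %016x%n", bigEndianView);
        }
        finally
        {
            unsafe.freeMemory(addr);
        }
    }
}
{code}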

C code:

* {{#define item_lock(hv) while (item_trylock(hv)) continue;}} - shouldn't there 
be something like a _yield_ inside that spin loop?
* Seems like the C code was not cleaned up after you began using 
Unsafe.allocateMemory :)
* I did not follow all possible code paths (due to the previous point)

Common:

* {{prefix_delimiter}} seems to be unused

Altogether I like that :)




[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-27 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075676#comment-14075676
 ] 

Vijay commented on CASSANDRA-7438:
--

Pushed the branch to https://github.com/Vijay2win/cassandra/tree/7438

{quote}Maybe delay Win port{quote}
We should be fine - lruc is configurable alongside the SerializingCache.
{quote}unclean shutdown (kill -9) does not delete the so/dylib file{quote}
Yeah, it works on Unix, but the problem is I don't have a handle to the temp file 
after a restart. So cleanup is best-effort.
{quote}SWIGTYPE_p_item and SWIGTYPE_p_p_item are unused {quote}
Auto-generated; they can be removed, but will be regenerated every time SWIG is run.
{quote}Generally the lruc code could be more integrated in C* code{quote}
The problem is that it produces a circular dependency; please look at 
df3857e4b9637ed6a5099506e95d84de15bf2eb7 where I removed those (the DOSP added 
back would still need to be wrapped by Cassandra's DOSP).
{quote}Naming of max_size, capacity{quote}
Yeah, let me make it consistent; the problem was I was trying to fit everything 
into the Guava interface.
{quote}remove hotN or return an array/list instead{quote}
Or maybe do a memcpy of the keys, since this doesn't need optimization (will fix).
{quote}shouldn't there be something like a yield{quote}
Actually, I removed it recently; adding or removing it doesn't make much difference 
in performance, but as a good citizen I should add it back.
{quote}Seems like the C code was not cleaned up{quote}
This cannot be removed; it is needed for the test cases.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-26 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075516#comment-14075516
 ] 

Vijay commented on CASSANDRA-7438:
--

{quote}
unsafe.memoryAllocate instead and replicate what we do with lruc_item_allocate()
{quote}
Done, Thanks!



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-25 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074452#comment-14074452
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

[~jbellis] Yes, I can review.

I agree with [~vijay2...@yahoo.com] - Unsafe only works well when big regions are 
allocated. We would then use our own malloc/free implementation to manage these big 
memory regions, which are split into small blocks. On top of that we need to 
implement a concurrent map that stores its data only in off-heap memory. I think 
we can manage that, but it takes time - we need to avoid synchronization, use 
CAS, and prevent fragmentation (best-effort).
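
Not lruc code - just a rough sketch of the "one big region, carved into small blocks, managed without locks" idea, under the assumption of a single fixed block size. All class and method names are made up; only sun.misc.Unsafe and java.util.concurrent are real APIs:

{code:java}
import java.lang.reflect.Field;
import java.util.concurrent.ConcurrentLinkedQueue;
import sun.misc.Unsafe;

// Hypothetical illustration: allocate one large off-heap region up front and
// hand out fixed-size blocks from a lock-free free list (CAS inside the queue,
// no synchronized blocks). A real allocator needs size classes, eviction, etc.
public final class RegionAllocator
{
    private final Unsafe unsafe;
    private final long base;                       // start address of the big region
    private final ConcurrentLinkedQueue<Long> free = new ConcurrentLinkedQueue<>();

    public RegionAllocator(long regionBytes, long blockBytes) throws Exception
    {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        unsafe = (Unsafe) f.get(null);
        base = unsafe.allocateMemory(regionBytes);
        for (long off = 0; off + blockBytes <= regionBytes; off += blockBytes)
            free.add(base + off);                  // pre-carve the region into blocks
    }

    /** Returns a block address, or 0 if the region is exhausted (caller must evict). */
    public long allocateBlock()
    {
        Long addr = free.poll();
        return addr == null ? 0L : addr;
    }

    /** Returns a block to the free list; nothing goes back to the OS until close(). */
    public void freeBlock(long addr)
    {
        free.add(addr);
    }

    public void close()
    {
        unsafe.freeMemory(base);
    }
}
{code}

Fragmentation stays bounded in this toy version only because every block has the same size; variable-size entries are where the real work (and the best-effort part) comes in.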



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-24 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072935#comment-14072935
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

My username on GitHub is snazy.

Do you know {{org.codehaus.mojo:native-maven-plugin}}? It allows JNI 
compilation on almost all platforms directly from Maven and does not interfere 
with SWIG - I have used it on OSX, Linux and Windows.



[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-07-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072950#comment-14072950
 ] 

Benedict commented on CASSANDRA-7438:
-

bq. Not sure what we are talking about, this == lurc? if yes the RB is 
fronting the queue so we don't need a global lock.

I was referring to [~rst...@pironet-ndh.com]'s assertion of the need for some 
kind of memory management - my only point was that you use no tools that aren't 
available through Unsafe/NativeAllocator.


