[
https://issues.apache.org/jira/browse/CASSANDRA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212021#comment-13212021
]
Daniel Doubleday commented on CASSANDRA-2864:
---------------------------------------------
OK, for the curious, I just wanted to report some findings.
Disclaimer: I ignored counters and super cols for the time being.
I did some testing on various machines with different CPU / memory profiles.
I tried different read / write / overwrite scenarios with reads being
normally distributed so I could configure cache hit ratios while keeping
everything else constant.
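As an illustration of how that knob works, here is a minimal sketch (not the actual test harness; the class and parameter names are made up): with rows keyed 0..numRows-1, reads drawn from a Gaussian centered on the middle of the key space, and a cache holding the hottest rows, shrinking the standard deviation concentrates reads on fewer rows and raises the hit ratio without changing anything else.
{code:java}
import java.util.Random;

// Hypothetical key chooser for the read stress: smaller sigma -> more skew -> higher cache hit ratio.
public class KeyChooser
{
    private final Random random = new Random();
    private final int numRows;
    private final double sigma;

    public KeyChooser(int numRows, double sigma)
    {
        this.numRows = numRows;
        this.sigma = sigma;
    }

    public int nextKey()
    {
        // Gaussian centered on the middle of the key space, clamped to valid row keys
        long key = Math.round(numRows / 2.0 + random.nextGaussian() * sigma);
        return (int) Math.min(numRows - 1, Math.max(0, key));
    }
}
{code}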
I also tried to test the impact of different I/O loads by controlled stressing
of the disks.
One of my initial major concerns was also memory footprint: how many rows can
we fit into memory without getting into real trouble with GC.
Hm, the results are, well ... difficult. In a way I tend to think that we (or maybe
it's just me) are looking in the wrong direction. Right now I believe that in
the long run caching doesn't make sense at all, but for now I just want to
report some figures:
After the first round of real testing, the results looked ambivalent:
# The alternative cache is way superior in terms of memory usage and GC. In
general I found that I can fit around 10x as many rows in the cache.
# On the other hand, performance on rather CPU-restricted machines was worse
than I had hoped. In general it didn't really make a big difference whether I was
using the cache or only had a few memtables fully cached in the page cache.
Since this sucked, I looked at where all that CPU was burned and decided to change
the serialized row format and write custom name and slice filters. I figured
that the problem was that lots of objects are deserialized right now and there's
too much search scanning going on.
So now a row in memory looks like this:
|| Header || Column Data ||
Column offsets are encoded in the header. This way I can do binary searches and
don't need to scan.
Also, the filters only ever deserialize a column when it is actually returned as
a relevant column.
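To make that concrete, here is a minimal sketch of the lookup (the exact header layout and class names in the patch may differ; this assumes the header is an int column count followed by fixed-width int offsets, that each column is written with a short name-length prefix, and that a plain Comparator stands in for the column comparator):
{code:java}
import java.nio.ByteBuffer;
import java.util.Comparator;

// Hypothetical view over a serialized cached row: || header (count + offsets) || column data ||
public class SerializedRowIndex
{
    private final ByteBuffer row;
    private final int columnCount;

    public SerializedRowIndex(ByteBuffer row)
    {
        this.row = row;
        this.columnCount = row.getInt(0);
    }

    private int offsetAt(int i)
    {
        return row.getInt(4 + i * 4);
    }

    // Binary search over the offset table; returns the column's offset or -1, no scanning.
    public int findColumn(ByteBuffer name, Comparator<ByteBuffer> comparator)
    {
        int low = 0, high = columnCount - 1;
        while (low <= high)
        {
            int mid = (low + high) >>> 1;
            int cmp = comparator.compare(nameAt(offsetAt(mid)), name);
            if (cmp < 0)
                low = mid + 1;
            else if (cmp > 0)
                high = mid - 1;
            else
                return offsetAt(mid);
        }
        return -1;
    }

    // Reads only the (short-length-prefixed) name bytes; the column itself is not deserialized here.
    private ByteBuffer nameAt(int offset)
    {
        int nameLength = row.getShort(offset) & 0xFFFF;
        ByteBuffer name = row.duplicate();
        name.position(offset + 2).limit(offset + 2 + nameLength);
        return name.slice();
    }
}
{code}
The point being that a get only ever touches the name bytes it compares against instead of materializing column objects for the whole row.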
Before I write a book... Below are some figures. These are only meant to give a
broad idea; the total performance numbers don't mean anything. This was a 4-core
server with the tester threads running on the same machine. The machine was CPU
bound in all tests.
CPU bound? Yes - right now I still can't deliver anything really conclusive in
terms of what all this means for throughput (other than that I think caching is
the wrong answer). It's all about isolated cache performance so far.
h2. Memory Footprint
Note: the memory values are from JProfiler. I'm not sure if they are bulletproof,
but they should be in the right ballpark.
Payload estimate derived as name (variable), nameLength (2), value (variable),
valueLength (4), timestamp (8), local delete (4), type (1).
10k Rows, 500 Columns, 4byte names, 32byte value
Payload: 5M Columns: 275M
|| Cache || Retained Size || Num Objects ||
| Standard | 1,280 MB | 10M |
| ByteBuffer | 277 MB | 20k |
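As a quick check of the payload estimate above (illustrative arithmetic only, not code from the patch): with 4-byte names and 32-byte values that works out to 55 bytes per serialized column, and 10k rows x 500 columns gives 5M columns, i.e. roughly 275 MB of column data, which lines up with the 277 MB retained by the ByteBuffer cache.
{code:java}
// Back-of-the-envelope check of the payload estimate (illustration only)
public class PayloadEstimate
{
    public static void main(String[] args)
    {
        int perColumn = 4 + 2 + 32 + 4 + 8 + 4 + 1;   // name, nameLength, value, valueLength, timestamp, local delete, type = 55 bytes
        long totalBytes = 10000L * 500 * perColumn;   // 10k rows x 500 columns = 5M columns
        System.out.println(totalBytes);               // 275,000,000 bytes, roughly 275 MB of column data
    }
}
{code}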
10k Rows, 50 Columns, 4byte names, 1byte value
Payload: 500k Columns: 28M
|| Cache || Retained Size || Num Objects ||
| Standard | 112 MB | 900k |
| ByteBuffer | 30 MB | 20k |
h2. Performance
All rows had 500 cols with 32-byte values and int names/keys.
For a start, the following are simple 'as fast as you can' stress tests.
The performance indicator is pages / sec.
Name Filter: random get of one column
Slice Filter: random slice of 10 columns
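For reference, the two access patterns look roughly like this through the 0.8 Thrift API (the keyspace and column family names are made up, and this is not the actual stress code):
{code:java}
import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class ReadPatterns
{
    public static void main(String[] args) throws Exception
    {
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Test");                                   // hypothetical keyspace

        ByteBuffer key = ByteBuffer.wrap(new byte[]{ 0, 0, 0, 42 });   // int row key
        ByteBuffer col = ByteBuffer.wrap(new byte[]{ 0, 0, 0, 7 });    // int column name

        // Name filter: random get of one column
        ColumnPath path = new ColumnPath("Standard1").setColumn(col);
        ColumnOrSuperColumn one = client.get(key, path, ConsistencyLevel.ONE);

        // Slice filter: random slice of 10 columns starting at some column name
        SlicePredicate predicate = new SlicePredicate().setSlice_range(
                new SliceRange(col, ByteBuffer.wrap(new byte[0]), false, 10));
        List<ColumnOrSuperColumn> ten = client.get_slice(key, new ColumnParent("Standard1"), predicate, ConsistencyLevel.ONE);

        System.out.println(ten.size() + " columns in slice");
        transport.close();
    }
}
{code}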
Comparisons:
- No row cache but everything in page cache
- Alternative Cache File System Layout (V1)
- Standard Map Cache
- Alternative Cache New Layout (V2)
h3. No row cache, non compacted (average 2.5 SSTable reads for slices)
Get: 12k
Slice: 6.5k
h3. No row cache, Compacted
Get: 12k
Slice: 9.2k
h3. Alternative Cache V1
Get: 15.9k
Slice: 14.6k
h3. Good old non serializing row cache
Get: 25.4k
Slice: 23k
h3. Alternative Cache V2
Get: 25.5k
Slice: 24k
We still plan to take this live, but since I wrote more code than initially
thought, I need to write more unit tests.
So long.
> Alternative Row Cache Implementation
> ------------------------------------
>
> Key: CASSANDRA-2864
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2864
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Daniel Doubleday
> Assignee: Daniel Doubleday
> Priority: Minor
> Attachments: rowcache.patch
>
>
> We have been working on an alternative implementation to the existing row
> cache(s).
> We have 2 main goals:
> - Decrease memory -> get more rows in the cache without suffering a huge
> performance penalty
> - Reduce gc pressure
> This sounds a lot like we should be using the new serializing cache in 0.8.
> Unfortunately our workload consists of loads of updates which would
> invalidate the cache all the time.
> The second unfortunate thing is that the idea we came up with doesn't fit the
> new cache provider api...
> It looks like this:
> Like the serializing cache, we basically only cache the serialized byte
> buffer. We don't serialize the bloom filter, and we try to do some other minor
> compression tricks (var ints etc., not done yet). The main difference is that
> we don't deserialize but use the normal sstable iterators and filters as in
> the regular uncached case.
> So the read path looks like this:
> return filter.collectCollatedColumns(memtable iter, cached row iter)
> The write path is not affected. It does not update the cache.
> During flush we merge all memtable updates with the cached rows.
> The attached patch is based on the 0.8 branch, r1143352.
> It does not replace the existing row cache but sits alongside it. There's an
> environment switch to choose the implementation. This way it is easy to
> benchmark performance differences.
> -DuseSSTableCache=true enables the alternative cache. It shares its
> configuration with the standard row cache. So the cache capacity is shared.
> We have duplicated a fair amount of code. First we actually refactored the
> existing sstable filter / reader but then decided to minimize dependencies.
> Also, this way it is easy to customize serialization for in-memory sstable
> rows.
> We have also experimented a little with compression but since this task at
> this stage is mainly to kick off discussion we wanted to keep things simple.
> But there is certainly room for optimizations.