[jira] [Commented] (CASSANDRA-2864) Alternative Row Cache Implementation

Daniel Doubleday (JIRA) Thu, 07 Jul 2011 06:46:42 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061303#comment-13061303
 ]


Daniel Doubleday commented on CASSANDRA-2864:
---------------------------------------------

bq. However, I'm reluctant to add more special cases to the read path

Well, I was thinking more of replacing the old row cache :-) 
In terms of "throughput over latency" this might be a winner. It seems read 
latencies increase only moderately while memory usage goes down ...

Also I think that implementing a variation of CASSANDRA-1956 will be pretty 
easy since we can work with the standard filters now.
So instead of putting top-level columns back in the cache, one could just cache 
the filtered columns, plus a little logic that decides whether the cache can 
handle the request.
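
A minimal sketch of what such a "can the cache handle this request" check might 
look like, assuming the cache entry remembers the inclusive column-name range 
it was filled with. The class and method names (CachedSliceSketch, canServe) 
are hypothetical and not part of the patch or of Cassandra's API; count-limited 
and reversed slices are not covered here.

```java
/** Hypothetical descriptor of the column range held in the cache for one row. */
final class CachedSliceSketch {
    final String start; // first cached column name (inclusive), assumed
    final String end;   // last cached column name (inclusive), assumed

    CachedSliceSketch(String start, String end) {
        this.start = start;
        this.end = end;
    }

    /**
     * Returns true if the requested slice [reqStart, reqEnd] lies entirely
     * inside the cached range, so the cached columns alone can answer it.
     */
    boolean canServe(String reqStart, String reqEnd) {
        return start.compareTo(reqStart) <= 0 && reqEnd.compareTo(end) <= 0;
    }
}
```

A request that falls outside the cached range would simply take the regular 
uncached read path.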

But I understand that this is quite a change, and the patch is easy to maintain, 
so we can always keep applying it ourselves.

bq. It looks like the CASSANDRA-2498 + CASSANDRA-2503 approach might offer 
similar benefits

These look promising, but it seems that they don't help with slicing 
(CASSANDRA-2503 might make the slicing case even worse), and FWIW we do slice a 
lot, even in skinny cached rows. It looks like we have the worst-case scenario 
there: lots of random updates (in terms of ranges), so even if range metadata 
for sstables were cached somewhere, I doubt that it would really work for 
us.

Anyway, I will look at CASSANDRA-2498 this weekend and see whether I can come 
up with a patch, or else report that I can't.


> Alternative Row Cache Implementation
> ------------------------------------
>
>                 Key: CASSANDRA-2864
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2864
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.1
>            Reporter: Daniel Doubleday
>            Priority: Minor
>         Attachments: rowcache.patch
>
>
> We have been working on an alternative implementation to the existing row 
> cache(s).
> We have 2 main goals:
> - Decrease memory -> get more rows in the cache without suffering a huge 
> performance penalty
> - Reduce GC pressure
> This sounds a lot like we should be using the new serializing cache in 0.8. 
> Unfortunately our workload consists of loads of updates which would 
> invalidate the cache all the time.
> The second unfortunate thing is that the idea we came up with doesn't fit the 
> new cache provider api...
> It looks like this:
> Like the serializing cache, we basically only cache the serialized byte 
> buffer. We don't serialize the bloom filter and try to do some other minor 
> compression tricks (var ints etc., not done yet). The main difference is that 
> we don't deserialize but use the normal sstable iterators and filters as in 
> the regular uncached case.
> So the read path looks like this:
> return filter.collectCollatedColumns(memtable iter, cached row iter)
> The write path is not affected. It does not update the cache.
> During flush we merge all memtable updates with the cached rows.
> The attached patch is based on 0.8 branch r1143352.
> It does not replace the existing row cache but sits alongside it. There's an 
> environment switch to choose the implementation. This way it is easy to 
> benchmark performance differences.
> -DuseSSTableCache=true enables the alternative cache. It shares its 
> configuration with the standard row cache. So the cache capacity is shared. 
> We have duplicated a fair amount of code. First we actually refactored the 
> existing sstable filter / reader but then decided to minimize dependencies. 
> Also, this way it is easy to customize serialization for in-memory sstable 
> rows. 
> We have also experimented a little with compaction but since this task at 
> this stage is mainly to kick off discussion we wanted to keep things simple. 
> But there is certainly room for optimizations.
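
The read path quoted above (filter.collectCollatedColumns(memtable iter, cached 
row iter)) can be illustrated with a simplified sketch: two column iterators, 
both sorted by name, are merged, the newer timestamp wins on a name collision, 
and a slice filter is applied on the fly. Everything here (the Column class, 
the collate method, the String column names, the start/end/limit parameters) 
is a hypothetical stand-in and not the patch's or Cassandra's actual code; 
tombstones and the serialized byte-buffer format are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class CollatedReadSketch {
    /** Hypothetical column: name, value and write timestamp. */
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /**
     * Merges two name-sorted iterators and keeps columns whose names fall in
     * [start, end], up to limit columns. On equal names the column with the
     * higher timestamp wins (the memtable wins ties, an assumption).
     */
    static List<Column> collate(Iterator<Column> memtable, Iterator<Column> cached,
                                String start, String end, int limit) {
        List<Column> out = new ArrayList<>();
        Column m = next(memtable);
        Column c = next(cached);
        while ((m != null || c != null) && out.size() < limit) {
            // An exhausted side always loses the comparison.
            int cmp = (m == null) ? 1 : (c == null) ? -1 : m.name.compareTo(c.name);
            Column pick;
            if (cmp < 0) {
                pick = m;
                m = next(memtable);
            } else if (cmp > 0) {
                pick = c;
                c = next(cached);
            } else {
                pick = (m.timestamp >= c.timestamp) ? m : c;
                m = next(memtable);
                c = next(cached);
            }
            if (pick.name.compareTo(end) > 0) {
                break; // past the end of the slice; inputs are sorted
            }
            if (pick.name.compareTo(start) >= 0) {
                out.add(pick);
            }
        }
        return out;
    }

    private static Column next(Iterator<Column> it) {
        return it.hasNext() ? it.next() : null;
    }
}
```

Because the cached row is only read through an iterator, the same filter logic 
serves both the cached and the uncached case, which is the point of the 
approach described in the issue.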

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
