[
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179876#comment-13179876
]
Jonathan Ellis commented on CASSANDRA-1956:
-------------------------------------------
bq. That is why I propose to combine current technique and filter-data and use
first for small rows and latter for wide ones
I'd rather avoid the complexity of keeping both implementations around. If the
rows are small enough that keeping the whole thing in memory is the right
tradeoff, then users can optimize that themselves by using "select *" instead
of "select x" and "select y" (i.e., the former would result in just one cache
entry for the row). I suspect it won't matter a great deal in real-world
scenarios anyway.
How about this?
- Query cache replaces row cache, with on/off heap implementations based on
existing SC/CLHC (SerializingCache/ConcurrentLinkedHashCache). Use the CLHM
(ConcurrentLinkedHashMap) weight feature to weigh entries by query result size.
- Cache key becomes (row key, query filter)
- When applying an update to row X, check query cache for filters on X. Update
cached CF with the new data for on-heap, invalidate for off-heap.
- New ticket for "pin CF in memory" feature
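To make the (row key, query filter) key and the invalidate-on-update step concrete, here is a rough sketch. It is not taken from any patch on this ticket: the class and method names are hypothetical, the filter is modeled as an opaque string, row keys are plain strings, and it only shows the invalidate path (the off-heap behavior above), with no weighting or eviction.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names come from the Cassandra code base. */
final class QueryCacheSketch<V> {

    /** Cache key: (row key, query filter). */
    static final class Key {
        final String rowKey;
        final String filter;

        Key(String rowKey, String filter) {
            this.rowKey = rowKey;
            this.filter = filter;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return rowKey.equals(k.rowKey) && filter.equals(k.filter);
        }

        @Override public int hashCode() {
            return Objects.hash(rowKey, filter);
        }
    }

    // Cached query results, one entry per (row key, filter) pair.
    private final Map<Key, V> entries = new ConcurrentHashMap<>();

    // Secondary index: row key -> filters currently cached for that row,
    // so a write to a row can find every affected cache entry.
    private final Map<String, Set<String>> filtersByRow = new ConcurrentHashMap<>();

    void put(String rowKey, String filter, V result) {
        entries.put(new Key(rowKey, filter), result);
        filtersByRow.computeIfAbsent(rowKey, k -> ConcurrentHashMap.<String>newKeySet())
                    .add(filter);
    }

    V get(String rowKey, String filter) {
        return entries.get(new Key(rowKey, filter));
    }

    /**
     * Called when an update is applied to rowKey: drops every cached query
     * result for that row and returns how many entries were removed.
     * Note: put() and invalidateRow() are not atomic with respect to each
     * other; a real implementation would need to handle that race.
     */
    int invalidateRow(String rowKey) {
        Set<String> filters = filtersByRow.remove(rowKey);
        if (filters == null) return 0;
        int removed = 0;
        for (String f : filters) {
            if (entries.remove(new Key(rowKey, f)) != null) removed++;
        }
        return removed;
    }
}
```

With this shape, "select x" and "select y" on the same row produce two entries, and a single write to that row removes both. The on-heap variant would update the cached data in place instead of removing it.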
> Convert row cache to row+filter cache
> -------------------------------------
>
> Key: CASSANDRA-1956
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Assignee: Vijay
> Priority: Minor
> Fix For: 1.2
>
> Attachments: 0001-1956-cache-updates-v0.patch,
> 0001-re-factor-row-cache.patch, 0001-row-cache-filter.patch,
> 0002-1956-updates-to-thrift-and-avro-v0.patch, 0002-add-query-cache.patch
>
>
> Changing the row cache to a row+filter cache would make it much more useful.
> We currently have to warn against using the row cache with wide rows, where
> the read pattern is typically a peek at the head, but this use case would be
> perfectly supported by a cache that stored only columns matching the filter.
> Possible implementations:
> * (copout) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is
> likely to have some gotchas for weird usage patterns, and it requires the
> list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a
> secondary index to look up cache entries by rowkey so that you can keep them
> in sync with the memtable
> * others?