[
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205528#comment-13205528
]
Sylvain Lebresne commented on CASSANDRA-1956:
---------------------------------------------
bq. The filter approach allows us to make slice-based queries more efficient
(somewhat clumsily)
What is so clumsy?
bq. but doesn't really address the inefficiency for name-based queries
Depends on what we're talking about. The filter approach would allow setting a
name-based filter. But ok, that is less convenient. But the query cache is not
perfect either. If you do different name-based queries, we will end up caching
the same data multiple times. We may be able to optimize this, but then it
becomes fairly complicated.
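To make the duplication concern concrete, here is a minimal sketch, assuming a
hypothetical query cache keyed by row key plus the exact set of requested
column names. NameQueryCache and its methods are invented for illustration;
they are not Cassandra classes and say nothing about how the attached patches
work.

```java
import java.util.*;

// Hypothetical query cache: one entry per distinct (row key, column names) query.
class NameQueryCache {
    private final Map<Map.Entry<String, Set<String>>, Map<String, String>> results =
            new HashMap<>();

    // Cache the result of a name-based query against the given row.
    void cacheQuery(String rowKey, Collection<String> names, Map<String, String> row) {
        Map<String, String> result = new TreeMap<>();
        for (String name : names) {
            if (row.containsKey(name)) {
                result.put(name, row.get(name));
            }
        }
        Set<String> nameSet = new TreeSet<>(names);
        Map.Entry<String, Set<String>> key =
                new AbstractMap.SimpleImmutableEntry<>(rowKey, nameSet);
        results.put(key, result);
    }

    // Total number of column copies held across all cached queries.
    int cachedColumnCount() {
        int total = 0;
        for (Map<String, String> result : results.values()) {
            total += result.size();
        }
        return total;
    }

    // Number of distinct (row key, column name) pairs actually represented.
    int distinctColumnCount() {
        Set<String> seen = new HashSet<>();
        for (Map.Entry<Map.Entry<String, Set<String>>, Map<String, String>> e
                : results.entrySet()) {
            for (String name : e.getValue().keySet()) {
                seen.add(e.getKey().getKey() + "\u0000" + name);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("a", "1");
        row.put("b", "2");
        row.put("c", "3");

        NameQueryCache cache = new NameQueryCache();
        cache.cacheQuery("k", Arrays.asList("a", "b"), row);
        cache.cacheQuery("k", Arrays.asList("b", "c"), row);

        // Column "b" is stored twice: 4 cached copies for 3 distinct columns.
        System.out.println(cache.cachedColumnCount());   // prints 4
        System.out.println(cache.distinctColumnCount()); // prints 3
    }
}
```

Two overlapping name-based queries on the same row each keep their own copy of
the shared column. Deduplicating that would need some shared per-row storage,
which is where the extra complexity comes in.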
bq. while with a true query cache we could do write-through updates on 2I
queries as well
I'm not sure I understand, could you clarify your idea?
Don't get me wrong, I'm not totally closed to the idea of a query cache or
something similar, but I do want to make sure we don't jump on it without good
reasoning behind it, because I do fear a query cache will come with a bunch of
complications (and while you may have good reasoning, I personally don't yet
see clearly that it's the best choice, so I'll need some convincing). The query
cache also has the risk of caching the same thing multiple times. Take a CF on
which you do some paging: provided the row receives a few updates, we'll end up
re-caching the same things multiple times (unless we're really smart about it,
but I'm pretty sure it's not a simple problem). I'm not sure how much of a
problem that'll be in practice but ...
Then there is also the fact that the way you model in C* is usually with one CF
per kind of query. So it does feel like keeping each query separately shouldn't
be necessary. But that's not a technical argument.
> Convert row cache to row+filter cache
> -------------------------------------
>
> Key: CASSANDRA-1956
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Assignee: Vijay
> Priority: Minor
> Fix For: 1.2
>
> Attachments: 0001-1956-cache-updates-v0.patch,
> 0001-commiting-block-cache.patch, 0001-re-factor-row-cache.patch,
> 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch,
> 0002-add-query-cache.patch
>
>
> Changing the row cache to a row+filter cache would make it much more useful.
> We currently have to warn against using the row cache with wide rows, where
> the read pattern is typically a peek at the head, but this use case would be
> perfectly supported by a cache that stored only columns matching the filter.
> Possible implementations:
> * (copout) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is
> likely to have some gotchas for weird usage patterns, and it requires the
> list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a
> secondary index to look up cache entries by rowkey so that you can keep them
> in sync with the memtable
> * others?
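
As an illustration of the "rowkey+filterid" option in the description above,
here is a minimal sketch of a cache with a secondary index from rowkey to its
cache entries. All names (RowFilterCache, put, get, invalidateRow) are
hypothetical, and the sketch simply drops a row's entries on a write, which is
a simplification of keeping them in sync with the memtable.

```java
import java.util.*;

// Hypothetical cache keyed by "rowkey+filterid", with a secondary index by rowkey.
class RowFilterCache {
    // Primary cache: composite key -> cached columns (name -> value).
    private final Map<String, Map<String, String>> entries = new HashMap<>();
    // Secondary index: rowkey -> composite keys currently cached for that row.
    private final Map<String, Set<String>> keysByRow = new HashMap<>();

    private static String cacheKey(String rowKey, String filterId) {
        // NUL separator is an assumption made to keep the two parts unambiguous.
        return rowKey + "\u0000" + filterId;
    }

    void put(String rowKey, String filterId, Map<String, String> columns) {
        String key = cacheKey(rowKey, filterId);
        entries.put(key, new TreeMap<>(columns));
        keysByRow.computeIfAbsent(rowKey, k -> new HashSet<>()).add(key);
    }

    // Returns the cached columns, or null on a cache miss.
    Map<String, String> get(String rowKey, String filterId) {
        return entries.get(cacheKey(rowKey, filterId));
    }

    // Called on a write to rowKey: drops every cached filter result for that
    // row and returns how many entries were removed.
    int invalidateRow(String rowKey) {
        Set<String> keys = keysByRow.remove(rowKey);
        if (keys == null) {
            return 0;
        }
        for (String key : keys) {
            entries.remove(key);
        }
        return keys.size();
    }

    public static void main(String[] args) {
        RowFilterCache cache = new RowFilterCache();
        Map<String, String> head = new HashMap<>();
        head.put("c1", "v1");

        cache.put("row1", "head10", head);
        cache.put("row1", "names:a,b", head);
        cache.put("row2", "head10", head);

        // A write to row1 touches only row1's entries, found via the index.
        System.out.println(cache.invalidateRow("row1"));         // prints 2
        System.out.println(cache.get("row1", "head10") == null); // prints true
        System.out.println(cache.get("row2", "head10") != null); // prints true
    }
}
```

Without the secondary index, a write to a row would have to scan the whole
cache to find the entries it affects, which is why the description calls the
index a requirement for this option.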