[jira] [Commented] (CASSANDRA-1956) Convert row cache to row+filter cache

Sylvain Lebresne (Commented) (JIRA) Fri, 10 Feb 2012 08:37:32 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205528#comment-13205528
 ]


Sylvain Lebresne commented on CASSANDRA-1956:
---------------------------------------------

bq. The filter approach allows us to make slice-based queries more efficient 
(somewhat clumsily)

What is so clumsy?

bq. but doesn't really address the inefficiency for name-based queries

Depends on what we're talking about. The filter approach would allow setting a 
name-based filter, though I agree that is less convenient. But the query cache 
is not perfect either: if you run different name-based queries, we will end up 
caching the same data multiple times. We may be able to optimize this, but then 
it becomes fairly complicated.
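
To make the duplication concern concrete, here is a minimal sketch of a naive 
query cache, assuming it is keyed by the row key plus the set of requested 
column names. The names `row`, `query_cache` and `read_by_names` are 
hypothetical and only illustrate the point; this is not Cassandra code.

```python
from collections import Counter

# Hypothetical wide row: column name -> value.
row = {"a": 1, "b": 2, "c": 3, "d": 4}

# Hypothetical query cache keyed by (row key, requested column names).
query_cache = {}


def read_by_names(row_key, names):
    """Return the requested columns, caching the result per distinct query."""
    key = (row_key, frozenset(names))
    if key not in query_cache:
        # Cache miss: each distinct query stores its own copy of the columns.
        query_cache[key] = {n: row[n] for n in names if n in row}
    return query_cache[key]


# Three overlapping name-based queries against the same row.
read_by_names("k1", ["a", "b"])
read_by_names("k1", ["b", "c"])
read_by_names("k1", ["a", "b", "c"])

# Count how many cache entries hold a copy of each column.
copies = Counter(name for cached in query_cache.values() for name in cached)
print(copies)
```

In this sketch column "b" is held by all three cache entries, which is the 
duplication described above. Deduplicating the overlapping entries would need 
extra bookkeeping that the sketch deliberately leaves out.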

bq. while with a true query cache we could do write-through updates on 2I 
queries as well

I'm not sure I understand, could you clarify your idea?

Don't get me wrong, I'm not totally closed to the idea of a query cache or 
something similar, but I do want to make sure we don't jump on it without good 
reasoning behind it, because I fear a query cache will come with a bunch of 
complications (and while you may have good reasoning, I personally don't yet 
see clearly that it's the best choice, so I'll need some convincing). The query 
cache also risks caching the same thing multiple times. Take a CF on which you 
do some paging: provided the row receives a few updates, we'll end up 
re-caching the same things multiple times (unless we're really smart about it, 
but I'm pretty sure that's not a simple problem). I'm not sure how much of a 
problem that will be in practice, but ...

Then there is also the fact that the way you model in C* is usually with one CF 
per kind of query. So it does feel like keeping each query separately shouldn't 
be necessary. But that's not a technical argument.
                
> Convert row cache to row+filter cache
> -------------------------------------
>
>                 Key: CASSANDRA-1956
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 0001-1956-cache-updates-v0.patch, 
> 0001-commiting-block-cache.patch, 0001-re-factor-row-cache.patch, 
> 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch, 
> 0002-add-query-cache.patch
>
>
> Changing the row cache to a row+filter cache would make it much more useful. 
> We currently have to warn against using the row cache with wide rows, where 
> the read pattern is typically a peek at the head, but this use case would be 
> perfectly supported by a cache that stored only columns matching the filter.
> Possible implementations:
> * (copout) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is 
> likely to have some gotchas for weird usage patterns, and it requires the 
> list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a 
> secondary index to lookup cache entries by rowkey so that you can keep them 
> in sync with the memtable
> * others?
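
The "rowkey+filterid" option in the quoted description can be sketched as 
follows. This is only an illustration under stated assumptions: the class 
`RowFilterCache` and its methods are hypothetical, filters are reduced to 
opaque ids, and "keeping entries in sync with the memtable" is simplified to 
dropping every cached entry for a row when that row is written.

```python
from collections import defaultdict


class RowFilterCache:
    """Hypothetical cache keyed by (row key, filter id)."""

    def __init__(self):
        # Primary map: (row_key, filter_id) -> cached columns.
        self.entries = {}
        # Secondary index: row_key -> set of filter ids cached for that row.
        self.by_row = defaultdict(set)

    def put(self, row_key, filter_id, columns):
        self.entries[(row_key, filter_id)] = dict(columns)
        self.by_row[row_key].add(filter_id)

    def get(self, row_key, filter_id):
        # Returns None on a cache miss.
        return self.entries.get((row_key, filter_id))

    def invalidate_row(self, row_key):
        # On a write to row_key, use the secondary index to find every
        # cached filter result for that row and drop it (assumed policy).
        for filter_id in self.by_row.pop(row_key, set()):
            self.entries.pop((row_key, filter_id), None)


cache = RowFilterCache()
cache.put("k1", "head-10", {"a": 1, "b": 2})
cache.put("k1", "names-a", {"a": 1})
cache.put("k2", "head-10", {"x": 9})

# A write to row k1 drops both of its cached filter results, leaving k2 alone.
cache.invalidate_row("k1")
```

Without the secondary index, a write would have to scan every cache entry to 
find those belonging to the row, which is why the description calls it out as 
a requirement. A write-through update instead of invalidation would use the 
same lookup.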

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
