[
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871885#comment-13871885
]
Sylvain Lebresne commented on CASSANDRA-5357:
---------------------------------------------
bq. The case you describe of "wanting to cache a full table" is not dependent
on rows per partition but on cache size = number of partitions cached
But if you don't want to cache a full table, you still at least need to make
sure that for each partition, all rows are cached. You still need "rows per
partition = <n> where <n> > max number of rows per partition in that table" and
all I'm saying is that "rows per partition = all" is a bit more user friendly.
It's true you also need to make sure your cache is big enough if you want to
cache the table in full but that doesn't invalidate the first part (unless I'm
missing something).
bq. We're talking about static CFs aka partition key == primary key, right?
Then there is one row per partition, so there is no need for a special "rows
per partition = all" setting.
I guess I'm saying 2 things:
# I think that what users sometimes really want is "cache full partitions".
That's the basic intention. So what's the harm in adding an "all" alias that
expresses that intention better, for user-friendliness' sake, provided adding it
doesn't require noticeable complexity? And given "all" can just be an alias for
Integer.MAX_VALUE, it doesn't add complexity so ...
# It's somewhat of a detail, but I don't think that technically "rows per
partition = 1" will work equivalently to the current row cache behavior for
static tables in practice, not always at least. More precisely, suppose you get
a query "select * from foo where pk=3", that "pk=3" is a cache hit and that
"rows_per_partition=1" on that table. Then, you can only serve the read from
the cache hit if you know *for sure* that this is a static table, i.e. that
there cannot be more rows in that partition that haven't been cached due to the
per-partition limit. And, at least for thrift, we never really know for
sure if a table is a static one. I do note that "rows_per_partition=2" would
work, because if your cache hit has 1 row and you know you cache the first 2
rows of the partition, then you can infer all rows of the partition are cached
without any more info, but at that point, I think it's a lot simpler to have an
"all" alias than to have to explain those implementation details.
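To make the two points above concrete, here is a rough sketch, not the actual patch. The class and method names are made up for illustration, and rejecting values below 1 is my own assumption. It treats "all" as an alias for Integer.MAX_VALUE and shows the inference a reader of the cache could make without knowing whether the table is static: a hit covers the whole partition only if it holds fewer rows than the per-partition limit.

```java
// Illustrative only: names are hypothetical, not taken from the Cassandra code base.
final class RowsPerPartitionSketch {

    /** Parses the setting; "all" is simply an alias for Integer.MAX_VALUE. */
    static int parseRowsPerPartition(String value) {
        String v = value.trim();
        if (v.equalsIgnoreCase("all")) {
            return Integer.MAX_VALUE;
        }
        int n = Integer.parseInt(v);
        if (n < 1) {
            // Assumption for this sketch: a limit below 1 is not meaningful.
            throw new IllegalArgumentException("rows per partition must be >= 1: " + n);
        }
        return n;
    }

    /**
     * The cache keeps the first rowsPerPartition rows of a partition. If the
     * cached entry holds fewer rows than that limit, nothing was cut off, so
     * the whole partition is known to be cached. If it holds exactly the
     * limit, more rows may exist on disk and we cannot tell from the cache.
     */
    static boolean hitCoversWholePartition(int cachedRows, int rowsPerPartition) {
        return cachedRows < rowsPerPartition;
    }

    public static void main(String[] args) {
        // One cached row with a limit of 1: cannot rule out further rows.
        System.out.println(hitCoversWholePartition(1, parseRowsPerPartition("1")));   // false
        // One cached row with a limit of 2: the partition must be complete.
        System.out.println(hitCoversWholePartition(1, parseRowsPerPartition("2")));   // true
        // "all" behaves like a limit that is effectively never reached.
        System.out.println(hitCoversWholePartition(1, parseRowsPerPartition("all"))); // true
    }
}
```

With "all", the user never has to reason about the "limit + 1" trick; the check is the same comparison either way.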
Not saying it's a big deal, just that I think it's user friendly and has no
real downside that I can see.
> Query cache / partition head cache
> ----------------------------------
>
> Key: CASSANDRA-5357
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Jonathan Ellis
> Assignee: Marcus Eriksson
> Fix For: 2.1
>
> Attachments: 0001-Cache-a-configurable-amount-of-columns.patch
>
>
> I think that most people expect the row cache to act like a query cache,
> because that's a reasonable model. Caching the entire partition is, in
> retrospect, not really reasonable, so it's not surprising that it catches
> people off guard, especially given the confusion we've inflicted on ourselves
> as to what a "row" constitutes.
> I propose replacing it with a true query cache.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)