[ https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869423#comment-13869423 ]
Sylvain Lebresne commented on CASSANDRA-5357:
---------------------------------------------

bq. If the newly cached data does not include all cells requested by user, we do another read. We cannot know if the requested cells will be included in the first N cells.

I haven't checked the patch, but what I would imagine we'd want/have to do here is distinguish between 2 types of queries:
# the "head-of-partition" type, i.e. (non-reversed) slices where the start bound is empty (and that have just one slice). For those, we'd have 2 sub-categories:
## those whose limit is <= N (where N is the number of cached rows-per-partition). For those, we can safely answer the query from a cache hit, and on a cache miss we can query the first N rows, cache them and return.
## those whose limit is > N. In that case, we can check the cache and see if what's cached covers the whole query (i.e. despite the bigger limit, the slice end is before the last cached entry). But if it doesn't, we'd have to do a read (we can start that read where the cache ends though), and we wouldn't cache the results of that 2nd read since they don't fall into "the first N rows".
# the other types of queries. Those can't be served from cache in general. That being said, we can still check the cache and see if by any chance we can guarantee it's enough, i.e. if the last possibly queried item sorts before the last cached entry. But if it's a miss or if we can't guarantee the query is fully covered by the cache, I think we should just ignore caching and just read-and-return the user query without trying to cache anything. On a cache miss in particular, I'm not really convinced that it's worth reading the first N rows of the partition when there's a very good chance they won't cover our query anyway.
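The classification above can be sketched roughly as follows. This is a minimal Python sketch of the decision logic only, not the patch and not Cassandra's API: `SliceQuery`, `Decision`, `decide` and `covered_by_cache` are hypothetical names, rows are modelled as integer positions in clustering order, and every query is assumed to have a single slice. The handling of a cache miss for a head query with limit > N is not spelled out in the comment; the sketch assumes a plain read without caching.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Decision(Enum):
    SERVE_FROM_CACHE = "serve from cache"
    READ_HEAD_AND_CACHE = "read the first N rows, cache them, return"
    READ_AFTER_CACHE_NO_CACHING = "read starting where the cache ends, do not cache"
    READ_NO_CACHING = "read the user query, do not cache"


@dataclass(frozen=True)
class SliceQuery:
    # Hypothetical single-slice query. Bounds are row positions in clustering
    # order; None stands for an empty bound (start or end of the partition).
    lower: Optional[int]
    upper: Optional[int]
    limit: int
    is_reversed: bool = False

    def is_head_of_partition(self) -> bool:
        return not self.is_reversed and self.lower is None


def covered_by_cache(query: SliceQuery, cached_head: Sequence[int]) -> bool:
    # Coverage is only guaranteed when the last row the query could touch
    # sorts before the last cached entry.
    return (
        len(cached_head) > 0
        and query.upper is not None
        and query.upper < cached_head[-1]
    )


def decide(query: SliceQuery, cached_head: Optional[Sequence[int]], n: int) -> Decision:
    """cached_head is the sorted cached rows of the partition, or None on a miss."""
    hit = cached_head is not None
    if query.is_head_of_partition():
        if query.limit <= n:
            # The first N rows always answer a head query with limit <= N.
            return Decision.SERVE_FROM_CACHE if hit else Decision.READ_HEAD_AND_CACHE
        if hit and covered_by_cache(query, cached_head):
            return Decision.SERVE_FROM_CACHE
        if hit:
            return Decision.READ_AFTER_CACHE_NO_CACHING
        # Assumption: the comment does not cover this case explicitly.
        return Decision.READ_NO_CACHING
    # Any other query: use the cache only if coverage can be guaranteed.
    if hit and covered_by_cache(query, cached_head):
        return Decision.SERVE_FROM_CACHE
    return Decision.READ_NO_CACHING
```

For example, with N = 100 and rows 0..99 cached, `decide(SliceQuery(None, None, 500), list(range(100)), 100)` falls into the "limit > N, not covered" branch, while a bounded slice such as `SliceQuery(20, 30, 5)` can be answered from the cache.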
Of course, in the longer run, maybe we can add a heuristic for "querying the first N rows is almost sure to cover that query" (for instance, if the mean number of cells-per-partition for the table is < N), but I'd rather leave that for later. Overall, I don't think caching should mean we may have to do 2 reads to answer queries; that feels wrong (and makes it easy to bite people in a way they don't expect).

I'll finish by saying that for the sake of shipping sooner rather than later, I'd be absolutely fine with a simpler first version that would only ever consider the cache for "head-of-partition with limit < N" type of queries and ignore it completely for all other cases. After that, we can incrementally cover more cases in follow-up patches. In other words, if we only cache the first N rows per partition, it's perfectly ok imo to say that the cache is only used when you query the first M < N rows of a partition initially.

Btw, it would make me really happy to preserve the current cache behavior as a "rows_per_partition: all" option (I doubt it'll be much code). I still think that for static CFs it's the "right" option, and I'm sure a few users have built legitimate uses of our existing cache-everything cache that won't be easily covered without that.

> Query cache / partition head cache
> ----------------------------------
>
> Key: CASSANDRA-5357
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Jonathan Ellis
> Assignee: Marcus Eriksson
> Fix For: 2.1
>
> Attachments: 0001-Cache-a-configurable-amount-of-columns-v2.patch,
>              0001-Cache-a-configurable-amount-of-columns.patch
>
>
> I think that most people expect the row cache to act like a query cache,
> because that's a reasonable model.
> Caching the entire partition is, in retrospect, not really reasonable, so
> it's not surprising that it catches people off guard, especially given the
> confusion we've inflicted on ourselves as to what a "row" constitutes.
> I propose replacing it with a true query cache.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)