[ https://issues.apache.org/jira/browse/CASSANDRA-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-1625.
---------------------------------------
Resolution: Duplicate
Storing the full cache tuple (and lazily evicting entries that turn out to be
obsolete) was done in CASSANDRA-1625.
> make the row cache continuously durable
> ---------------------------------------
>
> Key: CASSANDRA-1625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1625
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Peter Schuller
> Priority: Minor
>
> I was looking into how the row cache works today and realized that only the
> row keys are saved and later pre-populated on start-up (each cached row is
> re-read from disk by key).
> Given that row caches are typically used for many small rows, pre-population
> is highly likely to be seek-bound on large data sets.
> Pre-population could be made faster by increasing the I/O queue depth (via
> concurrency, or via libaio as in CASSANDRA-1576), but especially on large
> data sets the performance would be nowhere near what could be achieved if a
> reasonably sized file containing the actual rows were read sequentially on
> start-up.
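> To make the sequential-read idea concrete, here is a minimal sketch of such
> a loader. The file layout (length-prefixed key/row pairs) and all names here
> are illustrative assumptions, not an actual Cassandra format:
> {code}
> import java.io.*;
> import java.util.HashMap;
> import java.util.Map;
>
> final class RowCacheLoader
> {
>     // Reads (key, serialized row) pairs in one sequential pass, instead of
>     // seeking to each cached row by key.
>     static Map<String, byte[]> load(File dump) throws IOException
>     {
>         Map<String, byte[]> cache = new HashMap<String, byte[]>();
>         DataInputStream in = new DataInputStream(
>                 new BufferedInputStream(new FileInputStream(dump)));
>         try
>         {
>             while (true)
>             {
>                 String key;
>                 try { key = in.readUTF(); }
>                 catch (EOFException eof) { break; } // clean end of dump
>                 byte[] row = new byte[in.readInt()];
>                 in.readFully(row);
>                 cache.put(key, row);
>             }
>         }
>         finally
>         {
>             in.close();
>         }
>         return cache;
>     }
> }
> {code}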
> On the one hand, Cassandra's design means this should be easier to do
> efficiently than in many other systems; on the other hand, it is still not
> entirely trivial.
> The key problem with maintaining a continuously durable cache is that one
> must never read stale data on start-up. Stale could mean either data that was
> later deleted, or an old version of data that was updated.
> In the case of Cassandra, this means that any cache restored on start-up must
> be up-to-date with the commit log position at which commit log recovery will
> start. (Because the row cache is for an entire row, we can't couple updating
> of an on-disk row cache with memtable flushes.)
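> That invariant can be made concrete in a few lines. Assuming a snapshot
> header stamped with the commit log position the cache reflects (all types
> here are hypothetical), a restored cache is only safe if replay covers
> everything the snapshot might be missing:
> {code}
> final class CacheSnapshotHeader
> {
>     final long segmentId; // commit log segment the snapshot is current up to
>     final int position;   // byte offset within that segment
>
>     CacheSnapshotHeader(long segmentId, int position)
>     {
>         this.segmentId = segmentId;
>         this.position = position;
>     }
>
>     // Replay re-applies everything from (replaySegment, replayPosition)
>     // onwards and must invalidate any cached row it touches, so the snapshot
>     // is safe iff it is at least as new as the replay start; if it were
>     // older, writes in the gap would never reach the restored cache.
>     boolean usableGiven(long replaySegment, int replayPosition)
>     {
>         return segmentId > replaySegment
>                || (segmentId == replaySegment && position >= replayPosition);
>     }
> }
> {code}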
> I can see two main approaches:
> (a) Periodically dump the entire row cache, deferring commit log eviction
> until the dump completes.
> (b) Keep a change log of sorts, similar to the commit log but filtered to
> contain only writes that affect keys that were in the row cache at the time.
> Evicting commit log segments, or advancing the markers that determine where
> commit log recovery starts, would imply fsync()ing this change log first. An
> incremental traversal, or alternatively a periodic full dump, would be needed
> so that old row change log segments can be evicted without losing cache
> warmness. (A sketch of the write-path side of this follows the list.)
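> As a rough illustration of (b)'s write path, here is a minimal sketch under
> assumed hooks (none of these names exist in Cassandra): every mutation
> touching a currently cached key is duplicated into the change log, which is
> forced to disk before any commit log segment it shadows may be evicted:
> {code}
> import java.io.*;
> import java.nio.channels.FileChannel;
> import java.util.Set;
>
> final class RowCacheChangeLog
> {
>     private final DataOutputStream out;
>     private final FileChannel channel;
>
>     RowCacheChangeLog(File path) throws IOException
>     {
>         FileOutputStream fos = new FileOutputStream(path, true);
>         channel = fos.getChannel();
>         out = new DataOutputStream(new BufferedOutputStream(fos));
>     }
>
>     // Called on the write path after the commit log append; only mutations
>     // that touch a currently cached row are duplicated here.
>     void maybeAppend(String rowKey, byte[] mutation, Set<String> cachedKeys)
>         throws IOException
>     {
>         if (!cachedKeys.contains(rowKey))
>             return;
>         out.writeUTF(rowKey);
>         out.writeInt(mutation.length);
>         out.write(mutation);
>     }
>
>     // Must complete before a commit log segment is evicted (or before the
>     // recovery-start markers advance), so a crash can never lose changes to
>     // cached rows.
>     void syncBeforeCommitLogEviction() throws IOException
>     {
>         out.flush();
>         channel.force(true);
>     }
> }
> {code}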
> I like (b), but it also introduces significant complexity (and potential
> write-path overhead) solely for the benefit of the row cache. In the worst
> case, where hotly read data is also hotly written, the overhead could be
> particularly significant.
> I am not convinced this is a good idea for Cassandra, but I have a use case
> where a similar cache might have to be written in the application to achieve
> the desired effect (pre-population being too slow for a sufficiently large
> row cache). There are reasons why, in an ideal world, having such a
> continuously durable cache in Cassandra would be much better than something
> at the application level. The primary reason is that it would not interact
> poorly with consistency in the cluster: the cache is node-local, and
> appropriate measures would be taken to keep it consistent locally on each
> node. I.e., it would be entirely transparent to the application.
> Thoughts? Like/dislike/too complex/not worth it?