[ https://issues.apache.org/jira/browse/CASSANDRA-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1625.
---------------------------------------

    Resolution: Duplicate

Storing the full cache tuple (and lazily evicting entries that turn out to be
obsolete) was done in CASSANDRA-1625
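
Roughly, "storing the full tuple and lazily evicting obsolete entries" could
look like the following minimal sketch (hypothetical names, not the actual
implementation):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Cache the full (row, version) tuple; reads evict entries lazily once
    // they turn out to be obsolete, instead of invalidating eagerly on
    // every write.
    final class LazyEvictionCache<K, V> {
        static final class Entry<V> {
            final V value;
            final long version; // e.g. a per-key write timestamp
            Entry(V value, long version) { this.value = value; this.version = version; }
        }

        private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();

        void put(K key, V value, long version) {
            map.put(key, new Entry<>(value, version));
        }

        // currentVersion is the latest version actually on disk; if the
        // cached tuple is older, it is obsolete and evicted on this read.
        V get(K key, long currentVersion) {
            Entry<V> e = map.get(key);
            if (e == null) return null;
            if (e.version < currentVersion) {
                map.remove(key, e); // lazy eviction of the stale entry
                return null;
            }
            return e.value;
        }
    }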
                
> make the row cache continuously durable
> ---------------------------------------
>
>                 Key: CASSANDRA-1625
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1625
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Peter Schuller
>            Priority: Minor
>
> I was looking into how the row cache works today and realized that only row
> keys are saved, and that these are used to pre-populate the cache on
> start-up.
> On the premise that row caches are typically used for small rows, of which
> there may be many, pre-population is highly likely to be seek-bound on
> large data sets.
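>
> To make the current behaviour concrete, here is a minimal sketch
> (hypothetical names, not Cassandra's actual classes): only keys are
> persisted, so warming must re-read every row individually.
>
>     import java.io.IOException;
>     import java.nio.file.Files;
>     import java.nio.file.Path;
>     import java.util.Collection;
>     import java.util.HashMap;
>     import java.util.Map;
>     import java.util.function.Function;
>
>     final class KeyOnlyCacheSaver {
>         // Persist just the cache keys, one per line (toy serialization).
>         static void saveKeys(Collection<String> keys, Path file) throws IOException {
>             Files.write(file, keys);
>         }
>
>         // Pre-populate: each key goes back to the data files, i.e. one
>         // random read per key, which is seek-bound on a large, cold data set.
>         static Map<String, byte[]> warm(Path file, Function<String, byte[]> readRow)
>                 throws IOException {
>             Map<String, byte[]> cache = new HashMap<>();
>             for (String key : Files.readAllLines(file)) {
>                 byte[] row = readRow.apply(key); // random I/O per key
>                 if (row != null) cache.put(key, row);
>             }
>             return cache;
>         }
>     }
>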
> Pre-population could be made faster by increasing the I/O queue depth
> (through concurrency, or with libaio as in CASSANDRA-1576), but especially
> on large data sets the performance would be nowhere near what could be
> achieved by sequentially reading a reasonably sized file containing the
> actual rows at start-up.
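>
> For comparison, a rough sketch of the sequential alternative (the format
> and names are made up): dump the (key, row) pairs into one file so that
> warming is a single sequential scan instead of one seek per key.
>
>     import java.io.*;
>     import java.nio.charset.StandardCharsets;
>     import java.nio.file.Files;
>     import java.nio.file.Path;
>     import java.util.HashMap;
>     import java.util.Map;
>
>     final class FullRowCacheDump {
>         // Write length-prefixed (key, row) records back to back.
>         static void dump(Map<String, byte[]> cache, Path file) throws IOException {
>             try (DataOutputStream out = new DataOutputStream(
>                     new BufferedOutputStream(Files.newOutputStream(file)))) {
>                 for (Map.Entry<String, byte[]> e : cache.entrySet()) {
>                     byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
>                     out.writeInt(key.length);
>                     out.write(key);
>                     out.writeInt(e.getValue().length);
>                     out.write(e.getValue());
>                 }
>             }
>         }
>
>         // One buffered sequential pass rebuilds the whole cache.
>         static Map<String, byte[]> load(Path file) throws IOException {
>             Map<String, byte[]> cache = new HashMap<>();
>             try (DataInputStream in = new DataInputStream(
>                     new BufferedInputStream(Files.newInputStream(file)))) {
>                 while (true) {
>                     int klen;
>                     try { klen = in.readInt(); } catch (EOFException eof) { break; }
>                     byte[] key = new byte[klen];
>                     in.readFully(key);
>                     byte[] row = new byte[in.readInt()];
>                     in.readFully(row);
>                     cache.put(new String(key, StandardCharsets.UTF_8), row);
>                 }
>             }
>             return cache;
>         }
>     }
>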
> On the one hand, Cassandra's design should make this easier to do
> efficiently than in many other systems; on the other hand, it is still not
> entirely trivial.
> The key problem with maintaining a continuously durable cache is that one
> must never read stale data on start-up. "Stale" could mean either data that
> was later deleted, or an old version of data that was updated.
> In the case of Cassandra, this means that any cache restored on start-up
> must be up to date with the commit log position at which commit log
> recovery will start. (Because the row cache covers an entire row, we can't
> couple updating of an on-disk row cache with memtable flushes.)
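>
> That invariant can be expressed as a simple load-time check; the names
> here are illustrative, not an actual Cassandra API:
>
>     // A cache snapshot is only usable if it is at least as recent as the
>     // position commit log replay will start from; anything older could
>     // contain rows that the replay will not overwrite.
>     final class CacheSnapshotHeader {
>         final long segmentId; // commit log segment the snapshot is synced to
>         final int offset;     // position within that segment
>
>         CacheSnapshotHeader(long segmentId, int offset) {
>             this.segmentId = segmentId;
>             this.offset = offset;
>         }
>
>         boolean usableFor(long replaySegmentId, int replayOffset) {
>             return segmentId > replaySegmentId
>                 || (segmentId == replaySegmentId && offset >= replayOffset);
>         }
>     }
>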
> I can see two main approaches:
> (a) Periodically dump the entire row cache, deferring commit log eviction in 
> synchronization with said dumping.
> (b) Keep a change log of sorts, similar to the commit log but filtered to
> contain only writes that affect keys which were in the row cache at the
> time (sketched below). Evicting commit log segments, or updating positional
> markers that move the commit log recovery start point, would imply
> fsync()ing this change log first. An incremental traversal, or
> alternatively a periodic full dump, would be needed so that old row change
> log segments can be evicted without losing cache warmness.
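>
> A rough sketch of what the (b) write path could look like, with
> hypothetical names and a toy on-disk format:
>
>     import java.io.*;
>     import java.nio.charset.StandardCharsets;
>     import java.util.Map;
>     import java.util.concurrent.ConcurrentHashMap;
>
>     // Mutations whose key is currently cached are also appended to a
>     // cache change log; sync() must run before the commit log recycles a
>     // segment or moves its recovery start point past entries this log
>     // depends on.
>     final class RowCacheChangeLog implements Closeable {
>         private final Map<String, byte[]> rowCache = new ConcurrentHashMap<>();
>         private final FileOutputStream fileOut;
>         private final DataOutputStream log;
>
>         RowCacheChangeLog(File logFile) throws IOException {
>             this.fileOut = new FileOutputStream(logFile, true); // append
>             this.log = new DataOutputStream(new BufferedOutputStream(fileOut));
>         }
>
>         // The read path would populate rowCache; elided in this sketch.
>         void cacheRow(String key, byte[] row) { rowCache.put(key, row); }
>
>         // Called from the write path, after the commit log append.
>         synchronized void maybeRecord(String key, byte[] newRow) throws IOException {
>             if (!rowCache.containsKey(key))
>                 return; // the filter: only keys in the row cache right now
>             rowCache.put(key, newRow);
>             byte[] k = key.getBytes(StandardCharsets.UTF_8);
>             log.writeInt(k.length);
>             log.write(k);
>             log.writeInt(newRow.length);
>             log.write(newRow);
>         }
>
>         synchronized void sync() throws IOException {
>             log.flush();
>             fileOut.getFD().sync(); // the fsync()ing mentioned above
>         }
>
>         @Override
>         public synchronized void close() throws IOException { log.close(); }
>     }
>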
> I like (b), but it also introduces significant complexity (and potential
> write-path overhead) just for the sake of the row cache. In the worst case,
> where hotly read data is also hotly written, the overhead could be
> particularly significant.
> I am not convinced this is a good idea for Cassandra, but I have a use case
> where a similar cache might have to be written at the application level to
> achieve the desired effect (pre-population being too slow for a
> sufficiently large row cache). There are reasons why, in an ideal world,
> having such a continuously durable cache in Cassandra would be much better
> than something at the application level. The primary one is that it does
> not interact poorly with consistency in the cluster: the cache is
> node-local, and appropriate measures would be taken to make it consistent
> locally on each node. I.e., it would be entirely transparent to the
> application.
> Thoughts? Like/dislike/too complex/not worth it?
