[
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202145#comment-13202145
]
Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------
To be precise, what I'm saying is that (at least in theory) the following
scenario would be possible:
* A read-for-cache reads the memtables, grabbing the updates they contain.
* It then starts reading the sstables.
* While that is happening, a new update arrives. The memtable is then
flushed and happens to be fully flushed *before* our read-for-cache completes.
In that case, the new update won't be part of the cached row (ever) because
during the flush (when we would merge the memtable into the cache) the row was
not in the cache yet. That may seem far-fetched, but consider a simple
implementation of your proposition, where the 'upon flush merge memtables with
cache' phase happens in the same loop over rows that is used for flushing. It
is actually possible for a new write to be "flushed" within a few milliseconds
of being received by the node: if the update triggers the memtable threshold
*and* sorts at the very beginning of the memtable. Don't get me wrong: it
would probably be possible to deal with that problem, but it feels a bit
complicated and error-prone.
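
The interleaving above can be replayed deterministically with a toy model.
This is only a sketch under stated assumptions, not Cassandra code: the Node
class, its dict-based memtable/sstables/row cache, and run_scenario are all
hypothetical names invented for illustration. It assumes the proposed design
where cached rows are merged with the live memtable on read, and where a
flush merges memtable rows into the cache only if the row is already cached.

```python
class Node:
    """Toy model (hypothetical): one memtable, a list of sstables, a row cache."""

    def __init__(self):
        self.memtable = {}   # row key -> {column: value}
        self.sstables = []   # each sstable: row key -> {column: value}
        self.row_cache = {}  # row key -> {column: value}

    def write(self, key, column, value):
        # Writes only touch the memtable; the cache is not updated eagerly.
        self.memtable.setdefault(key, {})[column] = value

    def flush(self):
        # Assumed proposal: while flushing, merge each row into the cache,
        # but only if that row is already present in the cache.
        for key, columns in self.memtable.items():
            if key in self.row_cache:
                self.row_cache[key].update(columns)
        self.sstables.append(self.memtable)
        self.memtable = {}

    def read_cached(self, key):
        # Cache-hit path only: the cached row merged with the live memtable.
        row = dict(self.row_cache[key])
        row.update(self.memtable.get(key, {}))
        return row


def run_scenario():
    node = Node()
    node.sstables.append({"A": {"c1": "old"}})

    # Step 1: the read-for-cache grabs the memtable contents for row A.
    from_memtable = dict(node.memtable.get("A", {}))
    # Step 2: it starts reading the sstables that exist at this point.
    sstables_to_read = list(node.sstables)

    # Step 3: a new update arrives and the memtable is fully flushed
    # before the read-for-cache completes. Row A is not cached yet,
    # so the flush has nothing to merge the update into.
    node.write("A", "c2", "new")
    node.flush()

    # Step 4: the read-for-cache completes and installs its stale row.
    row = {}
    for sstable in sstables_to_read:
        row.update(sstable.get("A", {}))
    row.update(from_memtable)
    node.row_cache["A"] = row

    # Later reads hit the cache; the memtable no longer holds the update.
    return node.read_cached("A")
```

In this model run_scenario() returns a row without column "c2", even though
the update sits safely in the second sstable: neither the flush-time merge nor
the read-time memtable merge ever sees it together with the cached row.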
> RowCache misses Updates
> -----------------------
>
> Key: CASSANDRA-3862
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.0.7
> Reporter: Daniel Doubleday
> Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I
> guess I (re-)found one for the standard on-heap row cache.
> During my stress test I have lots of threads running, some of them only
> reading, others writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time, doing a getTopLevelColumns.
> - Row A, which is not in the cache yet, is updated by Writer. The row is not
> eagerly read during write (because we want fast writes), so the writer cannot
> perform a cache update.
> - Reader puts the row in the cache, which is now missing the update.
> I already asked about this some time ago on the mailing list but
> unfortunately didn't dig further after I got no answer, since I assumed that
> I had just missed something. In a way I still do, but I haven't found any
> locking mechanism that makes sure that this cannot happen.
> The problem can be reproduced with every run of my stress test. When I
> restart the server the expected column is there. It's just missing from the
> cache.
> To test I have created a patch that merges memtables with the row cache. With
> the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any
> relevant change there either so I assume the same applies there.
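
The race described in the quoted report, and the effect of merging memtables
into row-cache reads, can be sketched with plain dicts. This is a hedged toy
model, not the attached patch: cache_miss_race and its
merge_memtable_on_read flag are hypothetical names, and the three "stores"
are ordinary Python dictionaries standing in for the memtable, one sstable,
and the row cache.

```python
def cache_miss_race(merge_memtable_on_read):
    """Replay the reader/writer interleaving from the report (toy model)."""
    memtable = {}
    sstable = {"A": {"c1": "old"}}
    row_cache = {}

    # Reader: first read of row A (cache miss), collects the current columns.
    snapshot = dict(sstable.get("A", {}))
    snapshot.update(memtable.get("A", {}))

    # Writer: updates row A. The row is not cached yet and is not read
    # eagerly on write, so there is nothing in the cache to update.
    memtable.setdefault("A", {})["c2"] = "new"
    if "A" in row_cache:
        row_cache["A"]["c2"] = "new"

    # Reader: puts its snapshot in the cache; it is missing the update.
    row_cache["A"] = snapshot

    # A later read served from the row cache.
    row = dict(row_cache["A"])
    if merge_memtable_on_read:
        # Assumed idea behind the patch: merge live memtable data on read.
        row.update(memtable.get("A", {}))
    return row
```

Without the merge the later read lacks column "c2"; with the merge it is
visible again, but only for as long as the update is still in a memtable.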