[ 
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201519#comment-13201519
 ] 

Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------

{quote}
How about adopting the strategy we apply with CASSANDRA-2864:

* Writers don't update the cache at all
* Readers merge cache with memtables
* Upon flush merge memtables with cache
{quote}
The problem with that is that I don't see how we can make that work for 
counters at all. I also think it would be nice not having to merge on reads if 
we can avoid it (even if it's in-memory, it still uses CPU).

As a side note, I also suspect it's not bulletproof in theory: a memtable 
could be fully flushed while a 'read to be cached' is in progress and, with 
bad timing, we could still miss an update. Of course, that kind of 
timing has almost no chance of happening. But if a user triggers a 
flush manually, a memtable with only a handful of columns could be flushed very 
quickly, and I suspect the behavior could be observed. However unlikely that 
is, it'd be better if we can fix this problem once and for all.
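
For reference, the three quoted rules can be sketched as a toy model. This is 
only an illustration under loud assumptions: `Store`, `sstables`, `memtable` 
and `row_cache` are hypothetical names, plain dicts stand in for Cassandra's 
structures, the merge is a simple "memtable value overrides cached value", and 
everything runs single-threaded, so it shows neither counters nor the flush 
timing discussed above.

```python
class Store:
    """Toy model of the quoted strategy; not Cassandra's actual classes."""

    def __init__(self):
        self.sstables = {}   # flushed data: row key -> {column: value}
        self.memtable = {}   # unflushed writes: row key -> {column: value}
        self.row_cache = {}  # cached rows: row key -> {column: value}

    def write(self, key, column, value):
        # Rule 1: writers only touch the memtable, never the cache.
        self.memtable.setdefault(key, {})[column] = value

    def read(self, key):
        # On a cache miss, populate the cache from flushed data only.
        if key not in self.row_cache:
            self.row_cache[key] = dict(self.sstables.get(key, {}))
        # Rule 2: readers merge the cached row with the memtable.
        merged = dict(self.row_cache[key])
        merged.update(self.memtable.get(key, {}))
        return merged

    def flush(self):
        # Rule 3: on flush, merge the memtable into sstables and the cache.
        for key, columns in self.memtable.items():
            self.sstables.setdefault(key, {}).update(columns)
            if key in self.row_cache:
                self.row_cache[key].update(columns)
        self.memtable = {}
```

In this model a write made after a row was cached stays visible to readers 
through the merge in `read`, and only reaches the cached row itself when 
`flush` runs.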

I'll probably give my 'sentinel' proposal described above a shot; I don't 
think it's too much code.
                
> RowCache misses Updates
> -----------------------
>
>                 Key: CASSANDRA-3862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Daniel Doubleday
>         Attachments: include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864, I 
> guess I (re-)found one for the standard on-heap row cache.
> During my stress test I have lots of threads running, some of them only 
> reading, others writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not 
> eagerly read during write (because we want fast writes) so the writer cannot 
> perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked about this some time ago on the mailing list but unfortunately 
> didn't dig further after I got no answer, since I assumed that I had just missed 
> something. In a way I still do, but I haven't found any locking mechanism that 
> makes sure that this cannot happen.
> The problem can be reproduced with every run of my stress test. When I 
> restart the server the expected column is there. It's just missing from the 
> cache.
> To test I have created a patch that merges memtables with the row cache. With 
> the patch the problem is gone.
> I can also reproduce in 0.8. Haven't checked 1.1 but I haven't found any 
> relevant change there either so I assume the same applies there.
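
The interleaving in the report can be replayed deterministically. The sketch 
below is an assumption-laden illustration, not the reporter's stress test: 
`replay_race`, `stored` and `row_cache` are hypothetical names, a dict stands 
in for the durable data, and the three steps are run in the problematic order 
by hand instead of by real threads.

```python
def replay_race():
    """Replay the reported reader/writer interleaving in a fixed order."""
    stored = {"A": {"c1": "old"}}  # stand-in for memtables + sstables
    row_cache = {}                 # row A is not cached yet

    # Step 1: the reader misses the cache and reads row A.
    snapshot = dict(stored["A"])

    # Step 2: the writer updates row A. The row is not read during the
    # write, and it is not in the cache, so there is nothing to update.
    stored["A"]["c2"] = "new"
    if "A" in row_cache:
        row_cache["A"]["c2"] = "new"

    # Step 3: the reader caches its now stale snapshot.
    row_cache["A"] = snapshot

    return row_cache["A"], stored["A"]
```

Calling `replay_race()` returns a cached row without column `c2` next to 
stored data that contains it, which matches the symptom in the report: the 
column is present after a restart but missing from the cache.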

