[
https://issues.apache.org/jira/browse/CASSANDRA-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217127#comment-13217127
]
Sylvain Lebresne commented on CASSANDRA-3862:
---------------------------------------------
* In SerializingCache, remove is missing a "not" in the while condition (this
dates back to one of my earlier patches). We don't need that new remove method
anymore though, so it's probably simplest to just remove it from the patch.
* In CFS.getThroughCache, the following line
{noformat}
boolean sentinelSuccess = !CacheService.instance.rowCache.putIfAbsent(key, sentinel);
{noformat}
should not be negated.
* Also in CFS.getThroughCache, we won't remove the sentinel if there is an
exception during the read. It's not a big deal but it doesn't cost much to
prevent it from happening.
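To make the last two points concrete, here is a minimal sketch rather than the actual CFS code. It assumes a plain ConcurrentMap as a stand-in for the row cache, String rows, and hypothetical names (SentinelReadSketch, Sentinel, readFromStorage). Note that ConcurrentMap.putIfAbsent returns the previous value instead of a boolean, so success is "returned null"; the point is only that success is not negated and that the sentinel is cleaned up in a finally block.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

public class SentinelReadSketch {
    // Hypothetical stand-in for the row cache: a value is either a Sentinel or a cached row.
    static final ConcurrentMap<String, Object> rowCache = new ConcurrentHashMap<>();

    // Hypothetical sentinel; each read creates its own instance (identity equality).
    static final class Sentinel {}

    static String getThroughCache(String key, Function<String, String> readFromStorage) {
        Object cached = rowCache.get(key);
        if (cached instanceof String)
            return (String) cached;            // cache hit
        if (cached != null)
            return readFromStorage.apply(key); // another reader's sentinel: read without caching

        Sentinel sentinel = new Sentinel();
        // Success means our sentinel was inserted, i.e. nothing was there before. Not negated.
        boolean sentinelSuccess = rowCache.putIfAbsent(key, sentinel) == null;
        boolean replaced = false;
        try {
            String row = readFromStorage.apply(key);
            // Only cache the row if our sentinel is still in place (no concurrent invalidation).
            if (sentinelSuccess && row != null)
                replaced = rowCache.replace(key, sentinel, row);
            return row;
        } finally {
            // Also runs when the read throws, so a failed read never leaves the sentinel behind.
            if (sentinelSuccess && !replaced)
                rowCache.remove(key, sentinel);
        }
    }
}
```

The two-argument remove only deletes the entry if it still maps to our own sentinel, so it cannot evict a value put there by someone else.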
bq. True, but when I tried this I ended up with a LOT of casting cache values
to CF. I think it might be the lesser of evils the way it is.
In that case, I think I wouldn't mind the casts too much and I would prefer
getting the type safety of knowing that a method that takes a ColumnFamily can
never get a sentinel (and making it explicit when you need to care about
sentinels or not), but that's a bit subjective. There would also be some small
wins, like the fact that we wouldn't need to save the cfId when serializing a
sentinel.
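As an illustration of the trade-off, here is a rough sketch with entirely hypothetical types (CacheEntry, RowSentinel, Row); it is not the patch's code. The cache would hold CacheEntry values, and only code that talks to the cache directly has to distinguish the two cases and cast.

```java
public class TypedCacheEntrySketch {
    // Hypothetical common type for everything the cache can hold.
    interface CacheEntry {}

    // Hypothetical sentinel: it carries only an id, no row data and no cfId.
    static final class RowSentinel implements CacheEntry {
        final long id;
        RowSentinel(long id) { this.id = id; }
    }

    // Hypothetical stand-in for ColumnFamily.
    static final class Row implements CacheEntry {
        final String[] columns;
        Row(String... columns) { this.columns = columns; }
    }

    // Takes a Row, so the compiler guarantees it can never be handed a sentinel.
    static int columnCount(Row row) {
        return row.columns.length;
    }

    // The one place that must care about sentinels; the cast is explicit here.
    static int columnCountIfCached(CacheEntry entry) {
        if (entry instanceof RowSentinel)
            return -1; // a read is in flight, nothing usable is cached
        return columnCount((Row) entry);
    }
}
```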
bq. I'd rather have the reduced contention on instantiation than the 8 bytes of
space
My point was that UUIDs don't reduce contention. UUID.randomUUID() uses
SecureRandom.nextBytes(), which is synchronized (and thus likely entails a much
bigger degradation in the face of contention than an AtomicLong) and is
probably a bit CPU intensive. For reference, I did a quick micro-benchmark with
50 threads generating 10,000 ids simultaneously using both methods: using an
AtomicLong is two orders of magnitude faster.
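A comparison along those lines could look like the following. This is a reconstruction under assumptions, not the benchmark that was actually run: it assumes 10,000 ids per thread, uses a hypothetical class name, and does no JIT warm-up, so the absolute numbers should be taken with a grain of salt.

```java
import java.util.UUID;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

public class IdGenerationBenchmark {
    static final int THREADS = 50;
    static final int IDS_PER_THREAD = 10_000; // assumption: 10,000 ids per thread

    // Runs the generator from THREADS threads released together; returns elapsed nanoseconds.
    static long timeNanos(Supplier<Object> generator) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // wait so that all threads contend at the same time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < IDS_PER_THREAD; i++)
                    generator.get();
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            for (Thread worker : workers)
                worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong();
        long atomicNanos = timeNanos(counter::incrementAndGet);
        long uuidNanos = timeNanos(UUID::randomUUID);
        System.out.printf("AtomicLong: %d us, UUID.randomUUID: %d us%n",
                          atomicNanos / 1_000, uuidNanos / 1_000);
    }
}
```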
> RowCache misses Updates
> -----------------------
>
> Key: CASSANDRA-3862
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3862
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.6
> Reporter: Daniel Doubleday
> Assignee: Sylvain Lebresne
> Fix For: 1.1.0
>
> Attachments: 3862-7.txt, 3862-cleanup.txt, 3862-v2.patch,
> 3862-v4.patch, 3862-v5.txt, 3862-v6.txt, 3862.patch, 3862_v3.patch,
> include_memtables_in_rowcache_read.patch
>
>
> While performing stress tests to find any race problems for CASSANDRA-2864 I
> guess I (re-)found one for the standard on-heap row cache.
> During my stress test I have lots of threads running, some of them only
> reading, others writing and re-reading the value.
> This seems to happen:
> - Reader tries to read row A for the first time doing a getTopLevelColumns
> - Row A which is not in the cache yet is updated by Writer. The row is not
> eagerly read during write (because we want fast writes) so the writer cannot
> perform a cache update
> - Reader puts the row in the cache which is now missing the update
> I already asked about this some time ago on the mailing list but
> unfortunately didn't dig further after I got no answer, since I assumed that
> I had just missed something. In a way I still do, but I haven't found any
> locking mechanism that makes sure that this cannot happen.
> The problem can be reproduced with every run of my stress test. When I
> restart the server the expected column is there. It's just missing from the
> cache.
> To test I have created a patch that merges memtables with the row cache. With
> the patch the problem is gone.
> I can also reproduce it in 0.8. I haven't checked 1.1, but I haven't found
> any relevant change there either, so I assume the same applies there.