[
https://issues.apache.org/jira/browse/CASSANDRA-15153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413548#comment-17413548
]
Aleksei Zotov commented on CASSANDRA-15153:
-------------------------------------------
I looked into the issue a bit further and can confirm there is a bug in the
library. The explanation is as follows:
- records are physically removed from the cache asynchronously
- to avoid returning stale records, the library relies on the {{writeTime}}
field
- it checks whether a record is expired based on {{writeTime}} on every
{{get}} call
- if the record is expired, it tries to load the current value using
{{CacheLoader}}
- if the value is available, it is cached and returned; if not, the expired
data is physically removed and {{null}} is returned; if the loader throws an
exception, the problem described below occurs
The problem is that the code updates {{writeTime}} on expiration
([https://github.com/ben-manes/caffeine/blob/v2.3.5/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java#L2002])
before checking whether it can load the actual value. If an exception occurs
while loading the actual record
([https://github.com/ben-manes/caffeine/blob/v2.3.5/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java#L2008]),
the method is aborted and the exception is re-thrown to the caller. However,
the expired record already has its {{writeTime}} updated, so it looks fresh
again. In effect, the expired record is resurrected.
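To make the ordering problem concrete, here is a minimal single-threaded sketch. It is not Caffeine's code: the class {{ToyExpiringCache}}, its fields, the manual clock and the {{bumpWriteTimeBeforeLoad}} flag are all hypothetical names made up for illustration. It only models the steps listed above (expiry check on {{get}}, reload through a loader, exception propagating to the caller) and compares the two orderings of the {{writeTime}} update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of lazy expiration on read. NOT Caffeine's implementation:
// all names are hypothetical and there is no concurrency here.
final class ToyExpiringCache {
    private static final class Entry {
        String value;
        long writeTime;
        Entry(String value, long writeTime) {
            this.value = value;
            this.writeTime = writeTime;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();
    private final long ttl;
    // true models the ordering described above (writeTime bumped before the load)
    private final boolean bumpWriteTimeBeforeLoad;
    private long now = 0; // manual clock, so the example is deterministic

    ToyExpiringCache(long ttl, boolean bumpWriteTimeBeforeLoad) {
        this.ttl = ttl;
        this.bumpWriteTimeBeforeLoad = bumpWriteTimeBeforeLoad;
    }

    void advance(long ticks) { now += ticks; }

    void put(String key, String value) { data.put(key, new Entry(value, now)); }

    String get(String key, Function<String, String> loader) {
        Entry e = data.get(key);
        if (e == null) {
            return null; // the toy does not load on a miss
        }
        if (now - e.writeTime < ttl) {
            return e.value; // still fresh
        }
        if (bumpWriteTimeBeforeLoad) {
            e.writeTime = now; // entry looks fresh before the load has succeeded
        }
        String loaded = loader.apply(key); // may throw; the exception propagates
        if (loaded == null) {
            data.remove(key); // nothing to load: drop the expired entry
            return null;
        }
        e.value = loaded;
        e.writeTime = now; // safe place to refresh the timestamp
        return loaded;
    }

    public static void main(String[] args) {
        Function<String, String> failing = k -> {
            throw new IllegalStateException("backend down");
        };
        for (boolean bumpFirst : new boolean[] {true, false}) {
            ToyExpiringCache cache = new ToyExpiringCache(10, bumpFirst);
            cache.put("user", "old-permissions");
            cache.advance(11); // the entry is now expired
            try {
                cache.get("user", failing);
            } catch (IllegalStateException ex) {
                // first read fails in both variants
            }
            String second;
            try {
                second = cache.get("user", failing);
            } catch (IllegalStateException ex) {
                second = "<exception>";
            }
            System.out.println("bumpWriteTimeBeforeLoad=" + bumpFirst
                    + " -> second read: " + second);
        }
    }
}
```

With the flag set to {{true}}, the second read returns the expired value without calling the loader at all, which matches the stale {{AuthCache}} behaviour from the issue description. With the flag set to {{false}}, the entry stays expired and every read keeps hitting the loader until it succeeds.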
As [~eperott] correctly mentioned, they rewrote the whole expiration approach
in [2.5.0|https://github.com/ben-manes/caffeine/releases/tag/v2.5.0] and the
issue is probably fixed (I did not test it, though). So the only fix is to
update _Caffeine_ to a newer version. As far as I can see, development has
moved on and the current version is 3.0.3. I believe it makes sense to move to
the latest version because it has a number of fixes and performance
improvements.
[~blerer] [~mck]
I have a couple of questions:
# It is clearly a minor bug. Currently the expiration logic is used in the
{{AuthCache}} and {{ActiveRepairService}} classes. There are a few more classes
that use _Caffeine_, but they probably won't be changed. In fact, the library
upgrade seems hard to test, since possible issues (in the library itself) can
probably be reproduced only in a concurrent environment under a certain load
(i.e. production). So it is hard for me to assess the risk of backporting it to
the older versions. At the moment all versions starting from 3.0 seem to be
affected. With that said, I'm wondering which versions we need to apply the
fix to.
# Are we good to go with the latest _Caffeine_ version? Alternatively, we can
use a hybrid approach - update old versions to 2.5.0 (the first version where
the problem seems to be fixed) and 4.1 to 3.0.3 (the latest version) - if that
seems to be safer.
Please share your thoughts.
> Caffeine cache return stale entries
> -----------------------------------
>
> Key: CASSANDRA-15153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15153
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Authorization
> Reporter: Per Otterström
> Priority: Normal
> Labels: security
>
> Version 2.3.5 of the Caffeine cache that we're using in various places can
> hand out stale entries in some cases. This seems to happen when an update
> fails repeatedly, in which case Caffeine may return a previously loaded
> value. For instance, the AuthCache may hand out permissions even though the
> reload operation is failing, see CASSANDRA-15041.