[
https://issues.apache.org/jira/browse/CASSANDRA-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Huapeng Yuan updated CASSANDRA-18433:
-------------------------------------
Description:
We found this issue in our production system, which runs version 3.11.6. After
an update succeeds, a subsequent read sometimes returns stale data. The same
issue occurs with write ALL + read ONE consistency and with write QUORUM + read
QUORUM. The issue disappears once we disable the row cache.
The config for the row cache:
caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
After some investigation, we think there is a race condition in the read/write
path. The problem:
When two threads are reading and writing the same partition (for example, two
rows with the same partition key), the read thread may load stale data into the
row cache for the row that is being updated.
{panel:title=The steps of write-thread inserting a row to partition p}
W-Step 1: Inserts the value v1 into the memtable.
W-Step 2: Invalidates the row cache using the partition key.
{panel}
{panel:title=The steps of read-thread reading a row from partition p}
R-Step 1: Checks the row cache. If the row is not present in the cache, goes to
R-Step 2.
R-Step 2: Inserts a sentinel (timestamp) as the row value into the row cache to
tell other read threads to skip the row cache.
R-Step 3: Reads from the storage layer and gets value v0, which can be older
than v1.
R-Step 4: Inserts v0 into the row cache for the row if the row doesn't exist or
it still has the same sentinel. *The inconsistency is caused by this step. The
stale value should not be inserted if the sentinel no longer exists in the row
cache.*
{panel}
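One way to express the guard suggested for R-Step 4 is a compare-and-swap on the cache entry: the read publishes its value only if the entry is still the sentinel that this same read installed. The sketch below is a minimal illustration under assumptions, not Cassandra's code. The names SentinelGuardedInsert, Sentinel and publish are hypothetical, and a ConcurrentHashMap stands in for the row cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical model of R-Step 4 with the stricter check; not Cassandra's API.
class SentinelGuardedInsert {
    // Marker object a read installs in R-Step 2. It does not override equals,
    // so it is only ever equal to itself.
    static final class Sentinel {}

    // Publishes value only if the entry for key is still this read's sentinel.
    // Returns false when the entry was invalidated or replaced in the meantime.
    static boolean publish(ConcurrentMap<String, Object> cache, String key,
                           Sentinel sentinel, Object value) {
        return cache.replace(key, sentinel, value);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

        // No concurrent write: the sentinel is still there, so v0 is cached.
        Sentinel first = new Sentinel();
        cache.put("p", first);
        System.out.println(publish(cache, "p", first, "v0"));  // true

        // A write invalidated the entry before R-Step 4: nothing is cached.
        cache.clear();
        Sentinel second = new Sentinel();
        cache.put("p", second);
        cache.remove("p");                                      // W-Step 2
        System.out.println(publish(cache, "p", second, "v0")); // false
        System.out.println(cache.containsKey("p"));             // false
    }
}
```

With this shape the "row doesn't exist" branch of R-Step 4 is dropped entirely, so a value read before an invalidation cannot re-enter the cache.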
{panel:title=The sequence to reproduce the issue}
R-Step 1
R-Step 2
R-Step 3
W-Step 1
W-Step 2
R-Step 4
{panel}
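The sequence above can be replayed deterministically on a single thread, with plain maps standing in for the storage layer and the row cache. This is only a model of the steps as described in this report, not Cassandra's read path. RowCacheRaceDemo, Sentinel, replay and the requireSentinel flag are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-threaded replay of R1, R2, R3, W1, W2, R4.
class RowCacheRaceDemo {
    static final class Sentinel {}

    // Returns what the row cache holds for partition "p" after the sequence.
    // requireSentinel=false models R-Step 4 as described in the report;
    // requireSentinel=true models the stricter check the report asks for.
    static Object replay(boolean requireSentinel) {
        Map<String, Object> storage = new HashMap<>();
        Map<String, Object> rowCache = new HashMap<>();
        storage.put("p", "v0");

        // R-Step 1: the row is not in the cache, so the read continues.
        if (rowCache.containsKey("p")) {
            throw new IllegalStateException("expected a cache miss");
        }
        // R-Step 2: install a sentinel so other reads skip the cache.
        Sentinel sentinel = new Sentinel();
        rowCache.put("p", sentinel);
        // R-Step 3: read v0 from storage.
        Object read = storage.get("p");

        // W-Step 1: the write stores v1.
        storage.put("p", "v1");
        // W-Step 2: the write invalidates the cache entry (sentinel removed).
        rowCache.remove("p");

        // R-Step 4: decide whether the value read earlier may be cached.
        Object current = rowCache.get("p");
        boolean mayInsert = requireSentinel
                ? current == sentinel
                : (current == null || current == sentinel);
        if (mayInsert) {
            rowCache.put("p", read);
        }
        return rowCache.get("p");
    }

    public static void main(String[] args) {
        System.out.println("as reported: " + replay(false));      // as reported: v0
        System.out.println("sentinel required: " + replay(true)); // sentinel required: null
    }
}
```

In the first run the cache ends up holding v0 while storage holds v1, which matches the stale reads described here. In the second run the cache stays empty, so the next read goes to storage and sees v1.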
was:
We found the issue in our production system, which runs version 3.11.6. When we
did an update and then read immediately, we sometimes read stale data. The same
issue occurs with write ALL + read ONE consistency and with write QUORUM + read
QUORUM. The issue is gone once we disabled the row cache.
The config for the row cache:
caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
After some investigation, we think there is a race condition in the read/write
path. The problem:
When two threads are reading and writing the same partition (for example, two
rows with the same partition key), the read thread may load stale data into the
row cache for the row that is being updated.
{panel:title=The steps of write-thread inserting a row to partition p}
W-Step 1: inserts the value v1 to memtable.
W-Step 2: invalidates the row cache using partition key.
{panel}
{panel:title=The steps of read-thread reading a row from partition p}
R-Step 1: Checks row cache and finds whether the row is not present in cache.
If not, goes to 'R-Step 2'.
R-Step 2: Insert a sentinel (timestamp) as the row value into row cache to tell
other read threads should skip the row cache.
R-Step 3: Read from storage layer and get value v0 which can be older than v1.
R-Step 4: Insert v0 to row cache for the row by checking if the row doesn't
exist or it has the same sentinel. *The inconsistency is caused by this step.
Should not insert the stale value if the sentinel doesn't exist in row cache
any more.*
{panel}
{panel:title=The sequence to reproduce the issue}
R-Step 1
R-Step 2
R-Step 3
W-Step 1
W-Step 2
R-Step 4
{panel}
> Row cache inconsistency issue: A read can put stale data into row cache in a
> race condition
> -------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-18433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18433
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Caching
> Reporter: Huapeng Yuan
> Priority: Normal
--
This message was sent by Atlassian Jira
(v8.20.10#820010)