[ https://issues.apache.org/jira/browse/CASSANDRA-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005834#comment-13005834 ]
Jeffrey Wang commented on CASSANDRA-2305:
-----------------------------------------
I actually don't have row cache enabled (I just checked cfstats to make sure),
so I don't think that's the cause of my problem in particular. Here's some more
info that may or may not be correct:
- When I run the compaction, in ColumnFamilyStore.removeDeletedStandard() I see
that columns are being removed because of the c.timestamp() <=
cf.getMarkedForDeleteAt() condition, which makes sense since I issued a delete
on the entire row.
- However, after the compaction, I do the insert, and if I flush/compact again,
I still see the columns being removed because of that condition. It seems like
the markedForDeleteAt field on the ColumnFamily is persisting across the major
compaction, which I believe is hiding the newly inserted column.
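To make the condition in the two points above concrete, here is a minimal
standalone sketch in Python. It is not Cassandra code: the function name
is_shadowed and its parameters are hypothetical, and the only thing taken from
the observations above is the comparison itself.

```python
def is_shadowed(column_timestamp: int, marked_for_delete_at: int) -> bool:
    """Return True if a row-level delete hides the column.

    Mirrors the comparison c.timestamp() <= cf.getMarkedForDeleteAt();
    the function and parameter names are illustrative assumptions.
    """
    return column_timestamp <= marked_for_delete_at


# Illustrative values: with a row delete recorded at timestamp 1,
# a column written at timestamp 0 is hidden and one written at 2 is not.
hidden = is_shadowed(0, 1)
visible = not is_shadowed(2, 1)
```

If the row-level delete timestamp outlives the compaction, any later write with
a timestamp at or below it keeps failing this check, which would match what I
am seeing.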
Also, my initial steps to repro were not correct, which made it hard to figure
out the root cause. Here is a proper repro:
- Create a CF with gc_grace_seconds = 0 and no row cache.
- Insert row X, col A with timestamp 0.
- Insert row X, col B with timestamp 2.
- Remove row X with timestamp 1 (expect col A to disappear, col B to stay).
- Wait 1 second.
- Force flush and compaction.
- Insert row X, col A with timestamp 0.
- Read row X, col A (see nothing).
Inserting row X, col B is necessary for this to repro because if all the
columns in a row disappear, the ColumnFamily object goes away and the
markedForDeleteAt field is reset. Only when a column still exists does the
field persist across the compaction. Hope this helps!
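
The repro above can be approximated with a small toy model in Python. This is a
sketch of the behavior as I understand it, not of Cassandra's actual data
structures: ToyRow, compact, repro and MIN_TS are all hypothetical names, the
memtable/SSTable split is collapsed into a single in-memory row, and the
1-second wait is omitted because gc_grace_seconds = 0 is modeled as "purge
immediately".

```python
MIN_TS = -1  # hypothetical "never deleted" marker for this toy model


class ToyRow:
    """Toy stand-in for a row: column timestamps plus a row-level tombstone."""

    def __init__(self):
        self.columns = {}  # column name -> write timestamp
        self.marked_for_delete_at = MIN_TS

    def insert(self, name, timestamp):
        # Keep the newest write per column.
        if timestamp >= self.columns.get(name, MIN_TS):
            self.columns[name] = timestamp

    def delete_row(self, timestamp):
        self.marked_for_delete_at = max(self.marked_for_delete_at, timestamp)

    def read(self, name):
        # A column is visible only if it is newer than the row-level delete.
        ts = self.columns.get(name)
        if ts is None or ts <= self.marked_for_delete_at:
            return None
        return ts


def compact(store, key):
    """Model the reported behavior with gc_grace_seconds = 0."""
    row = store[key]
    # Purge every column hidden by the row-level delete.
    row.columns = {n: ts for n, ts in row.columns.items()
                   if ts > row.marked_for_delete_at}
    if not row.columns:
        del store[key]  # whole row gone, so the tombstone goes with it
    # Otherwise the row survives and, as reported, so does marked_for_delete_at.


def repro(with_col_b):
    store = {"X": ToyRow()}
    store["X"].insert("A", 0)
    if with_col_b:
        store["X"].insert("B", 2)
    store["X"].delete_row(1)
    compact(store, "X")
    row = store.setdefault("X", ToyRow())
    row.insert("A", 0)
    return row.read("A")
```

In this model, repro(True) returns None (col A stays hidden because col B kept
the row and its delete timestamp alive), while repro(False) returns 0 (the row
was dropped entirely, so the new insert is visible).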
> Tombstoned rows not purged from cache after gcgraceseconds
> ----------------------------------------------------------
>
> Key: CASSANDRA-2305
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2305
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.0
> Reporter: Jeffrey Wang
> Assignee: Sylvain Lebresne
> Priority: Minor
> Fix For: 0.7.4
>
> Attachments: 0001-Compaction-test.patch,
> 0002-Invalidate-row-cache-on-compaction-purge.patch
>
> Original Estimate: 2h
> Time Spent: 2h
> Remaining Estimate: 0h
>
> From email to list:
> I was wondering if this is the expected behavior of deletes (0.7.0). Let's
> say I have a 1-node cluster with a single CF which has gc_grace_seconds = 0.
> The following sequence of operations happens (in the given order):
> insert row X with timestamp T
> delete row X with timestamp T+1
> force flush + compaction
> insert row X with timestamp T
> My understanding is that the tombstone created by the delete (and row X) will
> disappear with the flush + compaction, which means the last insertion should
> show up. My experimentation, however, suggests otherwise (the last insertion
> does not show up).
> I believe I have traced this to the fact that the markedForDeleteAt field on
> the ColumnFamily does not get reset after a compaction (after
> gc_grace_seconds has passed); is this desirable? I think it introduces an
> inconsistency in how tombstoned columns work versus tombstoned CFs. Thanks.