[ 
https://issues.apache.org/jira/browse/CASSANDRA-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005834#comment-13005834
 ] 

Jeffrey Wang commented on CASSANDRA-2305:
-----------------------------------------

I actually don't have row cache enabled (I just checked cfstats to make sure), 
so I don't think that's the cause of my problem in particular. Here's some more 
info that may or may not be correct:

- When I run the compaction, in ColumnFamilyStore.removeDeletedStandard() I see 
that columns are being removed because of the c.timestamp() <= 
cf.getMarkedForDeleteAt() condition, which makes sense since I issued a delete 
on the entire row.
- However, after the compaction, I do the insert, and if I flush/compact again, 
I still see the columns being removed because of that condition. It seems like 
the markedForDeleteAt field on the ColumnFamily is persisting across the major 
compaction, which I believe is what hides the newly inserted column.

Also, my initial steps to repro were not correct, which made it hard to figure 
out the root cause. Here is a proper repro:

- Create a CF with gc_grace_seconds = 0 and no row cache.
- Insert row X, col A with timestamp 0.
- Insert row X, col B with timestamp 2.
- Remove row X with timestamp 1 (expect col A to disappear, col B to stay).
- Wait 1 second.
- Force flush and compaction.
- Insert row X, col A with timestamp 0.
- Read row X, col A (see nothing).

Inserting row X, col B is necessary for this to repro because if all the 
columns in a row disappear, the ColumnFamily object goes away and the 
markedForDeleteAt field is reset. Only when a column still exists does the 
field persist across the compaction. Hope this helps!
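
To make the suspected mechanism concrete, here is a minimal toy model of the 
repro above. It is only a sketch: ToyRow and TombstoneRepro are hypothetical 
names, not Cassandra classes, and the only logic taken from the real code is 
the timestamp <= markedForDeleteAt comparison quoted earlier. The 
purgeRowTombstone flag is an assumption that stands in for "compaction resets 
markedForDeleteAt once gc_grace_seconds has passed".

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical classes, not Cassandra's ColumnFamily or ColumnFamilyStore.
class ToyRow {
    long markedForDeleteAt = Long.MIN_VALUE;            // row-level tombstone timestamp
    final Map<String, Long> columns = new TreeMap<>();  // column name -> write timestamp

    void insert(String name, long timestamp) {
        columns.put(name, timestamp);
    }

    void deleteRow(long timestamp) {
        markedForDeleteAt = Math.max(markedForDeleteAt, timestamp);
    }

    // A column is readable only if it is newer than the row tombstone.
    boolean isVisible(String name) {
        Long ts = columns.get(name);
        return ts != null && ts > markedForDeleteAt;
    }

    // Drops shadowed columns using the condition quoted in the comment:
    // c.timestamp() <= cf.getMarkedForDeleteAt().
    // Assumption: the tombstone is reset when the row empties out (the object
    // "goes away") or when the caller asks for it to be purged.
    void compact(boolean purgeRowTombstone) {
        columns.values().removeIf(ts -> ts <= markedForDeleteAt);
        if (purgeRowTombstone || columns.isEmpty()) {
            markedForDeleteAt = Long.MIN_VALUE;
        }
    }
}

class TombstoneRepro {
    // Runs the repro steps and reports whether col A is readable at the end.
    static boolean colAVisible(boolean insertColB, boolean purgeRowTombstone) {
        ToyRow x = new ToyRow();
        x.insert("A", 0);
        if (insertColB) {
            x.insert("B", 2);
        }
        x.deleteRow(1);                 // expect A to disappear, B to stay
        x.compact(purgeRowTombstone);   // force flush and compaction
        x.insert("A", 0);               // re-insert with the old timestamp
        return x.isVisible("A");
    }

    public static void main(String[] args) {
        System.out.println("with col B, tombstone kept:   " + colAVisible(true, false));
        System.out.println("no col B, tombstone kept:     " + colAVisible(false, false));
        System.out.println("with col B, tombstone purged: " + colAVisible(true, true));
    }
}
```

In this model, col A stays hidden only when col B keeps the row alive and the 
tombstone is not purged, which matches what I see in the repro. Whether the 
real compaction path behaves this way is exactly what needs confirming.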

> Tombstoned rows not purged from cache after gcgraceseconds
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-2305
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2305
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Jeffrey Wang
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 0.7.4
>
>         Attachments: 0001-Compaction-test.patch, 
> 0002-Invalidate-row-cache-on-compaction-purge.patch
>
>   Original Estimate: 2h
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> From email to list:
> I was wondering if this is the expected behavior of deletes (0.7.0). Let's 
> say I have a 1-node cluster with a single CF which has gc_grace_seconds = 0. 
> The following sequence of operations happens (in the given order):
> - insert row X with timestamp T
> - delete row X with timestamp T+1
> - force flush + compaction
> - insert row X with timestamp T
> My understanding is that the tombstone created by the delete (and row X) will 
> disappear with the flush + compaction which means the last insertion should 
> show up. My experimentation, however, suggests otherwise (the last insertion 
> does not show up).
> I believe I have traced this to the fact that the markedForDeleteAt field on 
> the ColumnFamily does not get reset after a compaction (after 
> gc_grace_seconds has passed); is this desirable? I think it introduces an 
> inconsistency in how tombstoned columns work versus tombstoned CFs. Thanks.

