Hi Mike,

Have you checked to make sure you’re not a victim of timestamp overlap?
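
For reference, one rough way to check (the data path, file pattern, and keyspace/table names below are placeholders, and the exact output format depends on your C* version):
```
# Placeholder path/names -- adjust for your cluster and SSTable format version.
# If the min/max timestamp ranges of the TWCS SSTables overlap, newer SSTables
# can keep older, fully expired ones from being dropped.
for f in /var/lib/cassandra/data/my_ks/my_cf-*/*-Data.db; do
  echo "== $f"
  sstablemetadata "$f" | grep -E 'Minimum timestamp|Maximum timestamp'
done
```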

From: Mike Torra [mailto:mto...@salesforce.com.INVALID]
Sent: Thursday, May 02, 2019 11:09 AM
To: user@cassandra.apache.org
Subject: Re: TWCS sstables not dropping even though all data is expired

I'm pretty stumped by this, so here is some more detail if it helps.

Here is what the suspicious partition looks like in the `sstabledump` output 
(some PII, etc. redacted):
```
{
    "partition" : {
      "key" : [ "some_user_id_value", "user_id", "demo-test" ],
      "position" : 210
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 1132,
        "clustering" : [ "2019-01-22 15:27:45.000Z" ],
        "liveness_info" : { "tstamp" : "2019-01-22T15:31:12.415081Z" },
        "cells" : [
          { "some": "data" }
        ]
      }
    ]
  }
```

And here is what every other partition looks like:
```
{
    "partition" : {
      "key" : [ "some_other_user_id", "user_id", "some_site_id" ],
      "position" : 1133
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 1234,
        "clustering" : [ "2019-01-22 17:59:35.547Z" ],
        "liveness_info" : { "tstamp" : "2019-01-22T17:59:35.708Z", "ttl" : 
86400, "expires_at" : "2019-01-23T17:59:35Z", "expired" : true },
        "cells" : [
          { "name" : "activity_data", "deletion_info" : { "local_delete_time" : 
"2019-01-22T17:59:35Z" }
          }
        ]
      }
    ]
  }
```

As expected, almost all of the data has a ttl and is already expired, except this 
one suspicious partition. But if a partition isn't expired and I can see it in 
the SSTable, why wouldn't I see it when executing a CQL query against the CF? 
And why would this one SSTable prevent so many other SSTables from getting 
cleaned up?
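
For reference, the kind of query I'm running looks roughly like this (the keyspace, table, and column names are placeholders loosely guessed from the redacted dump above, not the real schema):
```
# Placeholder keyspace/table/column names -- substitute the real schema.
# writetime()/ttl() show whether the row CQL returns (if any) carries a TTL
# and when it was written, which can be compared against the sstabledump output.
cqlsh -e "SELECT writetime(activity_data), ttl(activity_data)
          FROM my_ks.my_cf
          WHERE user_id = 'some_user_id_value';"
```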

On Tue, Apr 30, 2019 at 12:34 PM Mike Torra 
<mto...@salesforce.com> wrote:
Hello -

I have a 48-node C* cluster spread across 4 AWS regions with RF=3. A few months 
ago I started noticing disk usage on some nodes increasing steadily. At first I 
solved the problem by destroying the nodes and rebuilding them, but the problem 
keeps coming back.

I did some more investigation recently, and this is what I found:
- I narrowed the problem down to a CF that uses TWCS, simply by looking at disk 
space usage
- in each region, 3 nodes have this problem of growing disk space (matching the 
replication factor)
- on each node, I tracked down the problem to a particular SSTable using 
`sstableexpiredblockers` (example invocation in the sketch after this list)
- in that SSTable, using `sstabledump`, I found a row that does not have a ttl 
like the other rows; it appears to be from someone else on the team testing 
something and forgetting to include a ttl
- all other rows show "expired: true" except this one, hence my suspicion
- when I query for that particular partition key, I get no results
- I tried deleting the row anyway, but that didn't seem to change anything
- I also tried `nodetool scrub`, but that didn't help either
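
In case it helps, here is roughly what I ran to find the suspect row (keyspace, table, and SSTable path are placeholders):
```
# Placeholder keyspace/table names and SSTable path -- adjust for your cluster.
# sstableexpiredblockers reports which SSTable(s) are keeping fully expired
# TWCS SSTables on disk; dumping the SSTable it names and filtering for
# liveness_info entries with no "ttl" field surfaces rows written without a TTL.
sstableexpiredblockers my_ks my_cf
sstabledump /var/lib/cassandra/data/my_ks/my_cf-*/md-1234-big-Data.db \
  | grep '"liveness_info"' | grep -v '"ttl"'
```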

Would this rogue row without a ttl explain the problem? If so, why? If not, 
does anyone have any other ideas? Why does the row show up in `sstabledump` 
output but not when I query for it?

I appreciate any help or suggestions!

- Mike
