Thanks for the insights, Jeff! I did go through the tickets around dropping
expired sstables that have overlaps - from what I understand, the only
undesirable impact of that would be possible data resurrection.
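
To spell out that risk (a hypothetical sketch - our tables issue no
explicit deletes, and the names below are made up): an "expired" sstable
can still carry a tombstone shadowing live data in an overlapping
sstable, and dropping it wholesale discards the tombstone.

    -- a row written without TTL lands in an sstable in an old window
    INSERT INTO ks.events (key, value) VALUES ('k1', 'v1');
    -- a later delete puts a tombstone in a newer-window sstable
    DELETE FROM ks.events WHERE key = 'k1';
    -- if the newer sstable is judged fully expired and dropped despite
    -- the overlap, the tombstone goes with it and 'k1' is readable again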
I have now attached the output of sstableslicer to this mail. Will submit
a patch for review.
On Tue, Aug 8, 2017 at 9:49 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> The most likely cause is read repair triggered by the consistency level
> itself (digest mismatch), which the read_repair_chance settings do not
> control. The only way to actually eliminate read repair is to read with
> CL:ONE, which almost nobody does (at least in time series use cases,
> because it implies you probably write with ALL, or run repair, which -
> as you've noted - often isn't necessary in ttl-only use cases).
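
For reference, the 2.1-era knobs in play here look roughly like this
(keyspace/table names are made up); note that the chance-based settings
only govern probabilistic read repair and do not stop digest-mismatch
repair at CL > ONE:

    -- disable probabilistic (background) read repair on the table
    ALTER TABLE ks.events
      WITH read_repair_chance = 0
      AND dclocal_read_repair_chance = 0;

    -- in cqlsh: reads at CL ONE skip digest comparison entirely
    CONSISTENCY ONE;
    SELECT value FROM ks.events WHERE key = 'k1';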
> I can't see the image, but more tools for understanding sstable state are
> never a bad thing (as long as they're generally useful and maintainable).
> For what it's worth, there are tickets in flight for being more aggressive
> at dropping overlaps, but there are companies that use tools that stop the
> cluster, use sstablemetadata to identify sstables known to be fully
> expired, and manually remove them (/bin/rm) before starting cassandra
> again. It works reasonably well IF (and only if) you write all data with
> TTLs, and you can identify fully expired sstables based on maximum
> timestamp.
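
For reference, a rough sketch of that manual workflow (paths, names, and
the generation number are made up, and sstablemetadata's exact output
format varies by version):

    # with cassandra stopped on the node
    sstablemetadata /var/lib/cassandra/data/ks/events-*/ks-events-ka-1234-Data.db \
        | grep -i timestamp
    # if maximum timestamp + TTL + gc_grace is safely in the past,
    # remove every component of that sstable generation
    rm /var/lib/cassandra/data/ks/events-*/ks-events-ka-1234-*
    # and start cassandra again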
> On Tue, Aug 8, 2017 at 8:51 PM, Sumanth Pasupuleti <
> sumanth.pasupuleti...@gmail.com> wrote:
> > Hi,
> >> We use TWCS in a few of the column families that hold TTL-based
> >> time-series data, and no explicit deletes are issued. Over time, we
> >> have observed disk usage increasing beyond the expected levels. The
> >> data directory on a particular node shows SSTables that are more than
> >> 16 days old, while the bucket size is configured at 12 hours, TTL is
> >> at 15 days, and GC grace at 1 hour.
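
For concreteness, the settings described above correspond roughly to a
table definition like the following (keyspace/table names are made up;
on 2.1, TWCS ships as an external jar, so 'class' would be its fully
qualified class name rather than the short form):

    ALTER TABLE ks.events WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'HOURS',
        'compaction_window_size': '12'
      }
      AND default_time_to_live = 1296000  -- 15 days
      AND gc_grace_seconds = 3600;        -- 1 hour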
> >> Upon running sstableexpiredblockers, we got quite a few sets of blocking
> >> and blocked SSTables. The SSTable metadata shown in its output indicates
> >> an overlap in the MinTS-MaxTS period between the blocking SSTable
> >> and the blocked SSTables, which is preventing the older SSTables from
> >> getting dropped/deleted.
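
For anyone reproducing this, sstableexpiredblockers takes the keyspace
and table name and prints each blocking SSTable along with the expired
SSTables it blocks (output shape paraphrased, not exact):

    sstableexpiredblockers ks events
    # [sstable X] (minTS, maxTS, maxLDT) blocks 2 expired sstables from
    # getting dropped: [sstable Y (minTS, maxTS, maxLDT), sstable Z (...)]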
> >> Following are the possible root causes we considered:
> >> 1. Hints - old data hints getting replayed from the coordinator node.
> >> We ruled this out, since hints live for no more than 1 day based on
> >> our configuration.
> >> 2. External compactions - no external compactions were run that
> >> could have compacted SSTables across the TWCS buckets.
> >> 3. Read repairs - ruled out as well, since we never ran external
> >> repairs, and read repair chance on the TWCS column families has
> >> been set to 0.
> >> 4. Application team writing data with older timestamps (in newer
> >> SSTables).
> >> 1. We wanted to identify the specific row keys with older timestamps
> >> in the blocking SSTable that could be causing this issue to occur.
> >> We considered using sstablekeys/sstable2json; however, since both
> >> tools output the entire content/keys of the SSTable in key order,
> >> they were not helpful in this case.
> >> 2. Since we wanted to see the few oldest cells along with their
> >> timestamps, we created a tool, sstableslicer, based largely on
> >> sstable2json, to output the 'n' top/bottom cells in an SSTable,
> >> ordered either by writetime or localDeletionTime. This helped us
> >> identify the specific cells in new SSTables with older timestamps,
> >> which further helped debugging on the application end. From the
> >> application team's perspective, however, data with old timestamps is
> >> not a possible scenario (see the writetime() cross-check sketch after
> >> the sample output below).
> >> 3. Below is a sample output of sstableslicer:
> >> [image: sample sstableslicer output (inline image not shown)]
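
As a cheap application-level cross-check of suspicion 4, writetime()
exposes the timestamp a client actually wrote for a cell (keyspace,
table, and column names are made up):

    -- writetime() is in microseconds since epoch; a value far older
    -- than the insert's wall-clock time points at clients supplying
    -- their own USING TIMESTAMP
    SELECT key, value, writetime(value), ttl(value)
    FROM ks.events WHERE key = 'k1';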
> >> Looking for suggestions, especially around the following two things:
> >> 1. Did we miss any other case in TWCS that could be causing such an
> >> overlap?
> >> 2. Does sstableslicer seem valuable enough to be included in Apache C*?
> >> If yes, I shall create a JIRA and submit a PR/patch for review.
> >> The C* version we use is 2.1.17.
> >> Thanks,
> >> Sumanth