[
https://issues.apache.org/jira/browse/CASSANDRA-14941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcus Eriksson reassigned CASSANDRA-14941:
-------------------------------------------
Assignee: Marcus Eriksson
> Expired secondary index sstables are not promptly discarded under TWCS
> ----------------------------------------------------------------------
>
> Key: CASSANDRA-14941
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14941
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/2i Index
> Reporter: Samuel Klock
> Assignee: Marcus Eriksson
> Priority: Normal
>
> We have a table in a cluster running 3.0.17 storing roughly time-series data
> using TWCS with a secondary index. We've noticed that while expired sstables
> for the table are discarded mostly when we expect them to be, the expired
> sstables for the secondary index would linger for weeks longer than expected
> – essentially indefinitely. Eventually the sstables would fill disks, which
> would require manual steps (deleting ancient index sstables) to address. We
> verified with {{sstableexpiredblockers}} that there wasn't anything on disk
> blocking the expired sstables from being dropped, so this looks like a bug.
> Through some debugging, we traced the problem to the index's memtables, which
> were consistently (except _just_ after node restarts) reporting a minimum
> timestamp from September 2015 – much older than any of our live data – which
> causes {{CompactionController.getFullyExpiredSSTables()}} to consistently
> return an empty set. The reason that the index sstables report this minimum
> timestamp is because of how index updates are created, using
> {{PartitionUpdate.singleRowUpdate()}}:
> {code:java}
> public static PartitionUpdate singleRowUpdate(CFMetaData metadata,
> DecoratedKey key, Row row, Row staticRow)
> {
> MutableDeletionInfo deletionInfo = MutableDeletionInfo.live();
> Holder holder = new Holder(
> new PartitionColumns(
> staticRow == null ? Columns.NONE :
> Columns.from(staticRow.columns()),
> row == null ? Columns.NONE : Columns.from(row.columns())
> ),
> row == null ? BTree.empty() : BTree.singleton(row),
> deletionInfo,
> staticRow == null ? Rows.EMPTY_STATIC_ROW : staticRow,
> EncodingStats.NO_STATS
> );
> return new PartitionUpdate(metadata, key, holder, deletionInfo,
> false);
> }
> {code}
> The use of {{EncodingStats.NO_STATS}} makes it appear as though the earliest
> timestamp in the resulting {{PartitionUpdate}} is from September 2015. That
> timestamp becomes the minimum for the memtable.
> Modifying this version of {{PartitionUpdate.singleRowUpdate()}} to:
> {code:java}
> public static PartitionUpdate singleRowUpdate(CFMetaData metadata,
> DecoratedKey key, Row row, Row staticRow)
> {
> MutableDeletionInfo deletionInfo = MutableDeletionInfo.live();
> staticRow = (staticRow == null ? Rows.EMPTY_STATIC_ROW : staticRow);
> EncodingStats stats = EncodingStats.Collector.collect(staticRow,
> (row == null ?
>
> Collections.emptyIterator() :
>
> Iterators.singletonIterator(row)),
> deletionInfo);
> Holder holder = new Holder(
> new PartitionColumns(
> staticRow == Rows.EMPTY_STATIC_ROW ? Columns.NONE :
> Columns.from(staticRow.columns()),
> row == null ? Columns.NONE : Columns.from(row.columns())
> ),
> row == null ? BTree.empty() : BTree.singleton(row),
> deletionInfo,
> staticRow,
> stats
> );
> return new PartitionUpdate(metadata, key, holder, deletionInfo,
> false);
> }
> {code}
> (i.e., computing an {{EncodingStats}} from the contents of the update) seems
> to fix the problem. However, we're not certain whether A) there's a
> functional reason the method was using {{EncodingStats.NO_STATS}} previously
> or B) whether the {{EncodingStats}} the revised version creates is correct
> (in particular, the use of {{deletionInfo}} feels a little suspect). We're
> also not sure whether there's a more appropriate fix (e.g., changing how the
> memtables compute the minimum timestamp, particularly in the {{NO_STATS}}
> case).
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]