On Tue, Dec 19, 2017 at 11:07 PM, Alexander Dejanovski < a...@thelastpickle.com> wrote:
> Hi folks, > > I'm working on CASSANDRA-8527 > <https://issues.apache.org/jira/browse/CASSANDRA-8527> and would need to > discuss a few things. > > The ticket makes it visible in tracing and metrics that rows shadowed by > range tombstones were scanned during reads. > Currently, scanned range tombstones aren't reported anywhere which hides > the cause of performance issues during reads when the users perform primary > key deletes. > As such, they do not count in the warn and failure thresholds. > > While the number of live rows and tombstone cells is counted in the > ReadCommand class, it is currently not possible to count the number of > range tombstones there as they are merged with the rows they shadow before > reaching the class. > Instead, we can count the number of deleted rows that were read , which > already improves diagnosis and show that range tombstones were scanned : > > if (row.hasLiveData(ReadCommand.this.nowInSec(), enforceStrictLiveness)) > ++liveRows; > else if (!row.primaryKeyLivenessInfo().isLive(ReadCommand.this. > nowInSec())) > { > // We want to detect primary key deletions only. > // If all cells have expired they will count as tombstones. > ++deletedRows; > } > > Deleted rows would be part of the warning threshold so that we can spot the > range tombstone scans in the logs and tracing would look like this : > > WARN [ReadStage-2] 2017-12-18 18:22:31,352 ReadCommand.java:491 - > Read 2086 live rows, 2440 deleted rows and 0 tombstone cells for > query.. > > > Are there any caveats to that approach ? > Should we include the number of deleted rows in the failure threshold or > make it optional, knowing that it could make some queries fail while they > were passing before ? > > Thanks for sending the email. Unfortunately, it's a horrible time of year to get much feedback. This is the type of question we should give folks a chance to answer before rushing to merge. I'm not personally sure of the answer without re-reading that code, and it's the week before Christmas, so it's not going to happen until the new year. I wouldn't mind seeing the deleted rows in the log, but I suspect the number of tombstones matters more than the number of deleted rows - 1 RT that shadows a thousand deleted rows is very different than a thousand RTs shadowing a thousand rows. If we end up with something only marginally useful, we should make sure the changes to the hot path justify that marginal gain.