Hi,

While working on CASSANDRA-13120
<https://issues.apache.org/jira/browse/CASSANDRA-13120>, I realized that
since 3.0, SSTablesPerReadHistogram, Tracing and SSTableReader::readCount
all use the number of SSTables for which we did a partition lookup, not the
number of SSTables that were actually merged to produce the response.

I also noticed that SSTablesPerReadHistogram and SSTableReader::readCount
are not updated for partition range queries. That makes sense for internal
range queries but not for user queries.

As SSTableReader::readCount seems to be used to avoid compacting cold
SSTables with size-tiered compaction, having an invalid readCount will
probably result in suboptimal compactions.

Similarly, in SinglePartitionReadCommand
(https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L816)
the number used to trigger compaction is the number of SSTables for which
we did a partition lookup, not the number of merged SSTables.

As partition lookups are not cheap (especially index lookups), Jake
suggested, in an offline discussion, keeping the SSTablesPerReadHistogram
metric as it is and adding a new one, mergedSSTablePerReadHistogram, to
track how many SSTables have actually been merged.
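
To make the difference between the two numbers concrete, here is a minimal
sketch. It is not Cassandra code: ReadCountSketch, ReadCounts and read are
hypothetical names, an SSTable is reduced to the set of partition keys it
holds, and I assume for simplicity that an SSTable counts as "merged"
whenever it contains the requested partition.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Cassandra.
public class ReadCountSketch {

    // The two per-read numbers discussed above.
    static final class ReadCounts {
        final int lookedUp; // what SSTablesPerReadHistogram records today
        final int merged;   // what the proposed new metric would record

        ReadCounts(int lookedUp, int merged) {
            this.lookedUp = lookedUp;
            this.merged = merged;
        }
    }

    // Each "SSTable" is modeled as the set of partition keys it contains.
    static ReadCounts read(List<Set<String>> sstables, String key) {
        int lookedUp = 0;
        int merged = 0;
        for (Set<String> sstable : sstables) {
            lookedUp++;                  // every partition lookup has a cost
            if (sstable.contains(key)) {
                merged++;                // only these contribute to the response
            }
        }
        return new ReadCounts(lookedUp, merged);
    }

    public static void main(String[] args) {
        List<Set<String>> sstables = List.of(
            Set.of("a", "b"),
            Set.of("b"),
            Set.of("c"));
        ReadCounts counts = read(sstables, "a");
        // prints "lookedUp=3 merged=1"
        System.out.println("lookedUp=" + counts.lookedUp + " merged=" + counts.merged);
    }
}
```

In this toy read, three SSTables are looked up but only one is merged, which
is the gap the two histograms would expose.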

The number of SSTables actually merged should also be the one used for the
Trace message and for determining whether the SSTables must be compacted.

Before making these changes, I wanted to check whether any of you has
concerns about them, as I might have underestimated their impact.

Benjamin
