Todd Lipcon created KUDU-2047:
---------------------------------
Summary: Lazy cfile open and maintenance op stat caching cause
fruitful delta compaction ops to never run
Key: KUDU-2047
URL: https://issues.apache.org/jira/browse/KUDU-2047
Project: Kudu
Issue Type: Bug
Components: perf, tablet
Affects Versions: 1.4.0
Reporter: Todd Lipcon
I was just looking at a cluster which has a large amount of REDO data on some
of its tablets, and wasn't sure why it wasn't ever compacting it. The issue
appears to be the following:
- in DiskRowSet::DeltaStoresCompactionPerfImprovementScore(), we call through
to GetColumnIdsWithUpdates() to see which columns may need compaction
-- if the REDO delta block is not open (e.g., when the server has recently
started), this skips the stats of the unopened delta file and does not include
them in the result
-- we thus determine that the compaction is not fruitful
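
To make the steps above concrete, here is a minimal toy model in Python. It is
not Kudu's C++ code: DeltaFile, column_ids_with_updates() and
compaction_looks_fruitful() are hypothetical names that only mimic the behavior
described above, on the assumption that update stats are readable only once a
delta file has been opened.

```python
from dataclasses import dataclass


@dataclass
class DeltaFile:
    """Toy stand-in for a REDO delta file (hypothetical, not a Kudu class)."""
    updated_column_ids: set
    is_open: bool = False  # opened lazily, e.g. by the first scan


def column_ids_with_updates(delta_files):
    """Union of updated column ids, skipping files that are not open yet."""
    ids = set()
    for delta in delta_files:
        if not delta.is_open:
            continue  # stats of an unopened file are unknown, so it is skipped
        ids |= delta.updated_column_ids
    return ids


def compaction_looks_fruitful(delta_files):
    """Assumed rule: compaction is fruitful if any column has known updates."""
    return bool(column_ids_with_updates(delta_files))


# Right after a restart nothing is open, so the updates are invisible.
deltas = [DeltaFile({1, 3}), DeltaFile({3, 7})]
before_open = compaction_looks_fruitful(deltas)

# Once something (for example a scan) opens the files, the updates show up.
for delta in deltas:
    delta.is_open = True
after_open = compaction_looks_fruitful(deltas)
```

In this sketch before_open is falsy even though both files carry updates, which
is the "not fruitful" verdict described above.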
This was a conscious decision to keep the MM (maintenance manager) from eagerly
opening every delta file on its first pass through computing compaction stats.
We figured that, if the deltas were worth compacting, then someone would
probably scan the data, forcing the deltas to get opened and thus made eligible
for compaction.
However, the MM tries to be smart about caching the statistics (see commit
e7fe0c1a94cac364522c09b8208c98480947d794). In particular, if it sees that the
tablet has not run any flushes or compactions, it won't bother to recalculate
the stats, assuming they haven't changed.
So, if you have a completely read-only tablet with some uncompacted deltas, the
stats computed while the deltas were still unopened are never refreshed, and
the delta compaction MM op will never run.
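
The interaction with stat caching can be sketched the same way. Again this is a
toy Python model with hypothetical names (ToyTablet, CachingOp, change_count),
not Kudu's implementation. It assumes, as described above, that cached stats
are recomputed only when the tablet has run a flush or compaction, and that
opening a delta file does not count as such a change.

```python
class ToyTablet:
    """Hypothetical tablet: tracks whether deltas are open and a change count."""

    def __init__(self):
        self.deltas_open = False
        self.change_count = 0  # assumed to be bumped only by flushes/compactions

    def open_deltas(self):
        # A scan opens the delta files but does not flush or compact anything.
        self.deltas_open = True

    def flush(self):
        self.change_count += 1

    def compute_score(self):
        # Updates are only visible (and thus scored) once the deltas are open.
        return 1.0 if self.deltas_open else 0.0


class CachingOp:
    """Hypothetical maintenance op that caches its perf-improvement score."""

    def __init__(self, tablet):
        self.tablet = tablet
        self.cached_score = None
        self.cached_at = None

    def perf_improvement(self):
        # Recompute only if the tablet has flushed or compacted since last time.
        if self.cached_at != self.tablet.change_count:
            self.cached_score = self.tablet.compute_score()
            self.cached_at = self.tablet.change_count
        return self.cached_score


tablet = ToyTablet()
op = CachingOp(tablet)

first_pass = op.perf_improvement()   # deltas unopened: score cached as 0.0
tablet.open_deltas()                 # a scan opens the deltas
stale_pass = op.perf_improvement()   # no flush/compaction: cache is reused
fresh_score = tablet.compute_score() # what a recomputation would report

tablet.flush()                       # any write activity invalidates the cache
after_flush = op.perf_improvement()
```

In the model, the stale score persists until something bumps change_count,
which never happens on a read-only tablet.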