[ 
https://issues.apache.org/jira/browse/KUDU-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-2047:
------------------------------
    Component/s: compaction

> Lazy cfile open and maintenance op stat caching cause fruitful delta 
> compaction ops to never run
> ------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2047
>                 URL: https://issues.apache.org/jira/browse/KUDU-2047
>             Project: Kudu
>          Issue Type: Bug
>          Components: compaction, perf, tablet
>    Affects Versions: 1.4.0
>            Reporter: Todd Lipcon
>            Assignee: William Berkeley
>            Priority: Major
>
> I was just looking at a cluster which has a large amount of REDO data on some 
> of its tablets, and wasn't sure why it wasn't ever compacting it. The issue 
> appears to be the following:
> - in DiskRowSet::DeltaStoresCompactionPerfImprovementScore(), we call through 
> to GetColumnIdsWithUpdates() to see which columns may need compaction
> -- if the REDO delta block is not open (e.g. when the server has recently 
> started), this will skip the unopened delta file stats and not include them 
> in the result
> -- we thus determine that the compaction is not fruitful
> This was a conscious decision to keep the maintenance manager (MM) from 
> eagerly opening every delta on its first pass through computing compaction 
> stats. We figured that, 
> if it were worth compacting, then probably someone would scan the data, 
> forcing the deltas to get opened and thus made eligible for compaction.
> However, the MM tries to be smart about caching the statistics (see commit 
> e7fe0c1a94cac364522c09b8208c98480947d794). In particular, if it sees that the 
> tablet has not run any flushes or compactions, it won't bother to recalculate 
> the stats, assuming they haven't changed.
> So, if you have a completely read-only tablet with some uncompacted deltas, 
> the MM op will never run.
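
The interaction described above can be sketched with a small toy model. This is
not Kudu code: the class names (DeltaFile, ToyRowSet, ToyCompactionOp), the
mutation_count field, and the "score = number of updated columns" metric are all
assumptions made for illustration. The sketch only mirrors the two behaviours in
the report: unopened delta files are skipped when collecting updated columns, and
the score is recomputed only after a flush or compaction.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaFile:
    """Toy stand-in for a REDO delta file (hypothetical, not a Kudu class)."""
    updated_columns: set
    is_open: bool = False


@dataclass
class ToyRowSet:
    deltas: list = field(default_factory=list)

    def column_ids_with_updates(self):
        # As described in the report: unopened delta files are skipped,
        # so their updates do not show up in the result.
        cols = set()
        for delta in self.deltas:
            if delta.is_open:
                cols |= delta.updated_columns
        return cols

    def scan(self):
        # A read opens the delta files as a side effect, but it is
        # neither a flush nor a compaction.
        for delta in self.deltas:
            delta.is_open = True


class ToyCompactionOp:
    def __init__(self, rowset):
        self.rowset = rowset
        self.mutation_count = 0   # assumed: bumped only by flushes/compactions
        self._cached_score = None
        self._cached_at = None

    def perf_improvement_score(self):
        # Stat caching: reuse the previous score unless a flush or
        # compaction has happened since it was computed.
        if self._cached_at == self.mutation_count:
            return self._cached_score
        # Made-up metric for the sketch: one point per updated column.
        score = float(len(self.rowset.column_ids_with_updates()))
        self._cached_score = score
        self._cached_at = self.mutation_count
        return score


rowset = ToyRowSet([DeltaFile({1, 2}), DeltaFile({2, 3})])
op = ToyCompactionOp(rowset)

first = op.perf_improvement_score()    # deltas still closed
rowset.scan()                          # read-only workload opens the deltas
second = op.perf_improvement_score()   # served from the stale cache
uncached = float(len(rowset.column_ids_with_updates()))

print(first, second, uncached)
```

In this model the scan makes the updates visible to column_ids_with_updates(),
but because nothing bumps mutation_count on a read-only tablet, the cached zero
score keeps being returned and the op never looks fruitful.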



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
