weizuo93 opened a new issue #4834:
URL: https://github.com/apache/incubator-doris/issues/4834
A large number of small segment files makes scan operations inefficient.
Compaction can merge multiple small files into one large file. So we could
take tablet scan frequency into consideration when selecting a tablet for
compaction, and preferentially compact the tablets that have been scanned
frequently during the most recent period of time.
Using the compaction strategy of `Kudu` for reference, a `scan frequency` can
be calculated for each tablet over the most recent period of time and taken
into consideration when calculating the compaction score. The new compaction
score can be calculated like this:
`new_compaction_score = k1 * tablet_scan_frequency + k2 * old_compaction_score`
`k1` and `k2` can be set dynamically through the http interface
`/api/update_config`.
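As a rough illustration of the weighted score above, here is a minimal Python
sketch (the real change would live in the BE code; this is only a model of the
formula). The names `CompactionWeights`, `pick_tablet` and the default weights
are hypothetical and are not part of Doris.

```python
from dataclasses import dataclass


@dataclass
class CompactionWeights:
    # Hypothetical defaults; in the proposal k1 and k2 are runtime-configurable.
    k1: float = 1.0  # weight of tablet_scan_frequency
    k2: float = 1.0  # weight of old_compaction_score


def new_compaction_score(tablet_scan_frequency, old_compaction_score, weights):
    """new_compaction_score = k1 * tablet_scan_frequency + k2 * old_compaction_score"""
    return (weights.k1 * tablet_scan_frequency
            + weights.k2 * old_compaction_score)


def pick_tablet(candidates, weights):
    """Return the id of the tablet with the highest new compaction score.

    candidates maps tablet id -> (tablet_scan_frequency, old_compaction_score).
    """
    return max(
        candidates,
        key=lambda tablet_id: new_compaction_score(*candidates[tablet_id], weights),
    )


# Tablet "b" has a lower old score but is scanned far more often.
candidates = {"a": (0.0, 10.0), "b": (5.0, 8.0)}
chosen_without_scans = pick_tablet(candidates, CompactionWeights(k1=0.0, k2=1.0))
chosen_with_scans = pick_tablet(candidates, CompactionWeights(k1=1.0, k2=1.0))
```

Setting `k1 = 0` reproduces the old selection behaviour, so the scan-frequency
term can be switched off without changing any other code path.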
We can add a metric `query_scan_count` for each tablet which records the
scan count of the tablet. Thus, tablet scan frequency can be calculated like
this:
`tablet_scan_frequency = (now_query_scan_count - last_query_scan_count) /
(now_time - last_time)`
`last_query_scan_count` and `last_time` will be updated every time an
`interval` passes, and `interval` can be configured (for example, `300`
seconds).
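The bookkeeping described above could look roughly like the following Python
sketch. It is an assumption-laden model, not the proposed BE implementation:
the class `TabletScanTracker` and its methods are hypothetical, and time is
passed in explicitly so the example stays deterministic.

```python
class TabletScanTracker:
    """Tracks query_scan_count for one tablet and derives its scan frequency."""

    def __init__(self, interval_s=300.0, now=0.0):
        self.interval_s = interval_s        # the configurable `interval`
        self.query_scan_count = 0           # monotonically increasing metric
        self.last_query_scan_count = 0      # snapshot taken at last_time
        self.last_time = now
        self.tablet_scan_frequency = 0.0    # scans per second

    def record_scan(self, count=1):
        # Called whenever a query scans this tablet.
        self.query_scan_count += count

    def maybe_update(self, now):
        """Recompute the frequency once at least `interval_s` has elapsed."""
        elapsed = now - self.last_time
        if elapsed <= 0 or elapsed < self.interval_s:
            return self.tablet_scan_frequency  # keep the previous value
        self.tablet_scan_frequency = (
            (self.query_scan_count - self.last_query_scan_count) / elapsed
        )
        # Roll the snapshot forward for the next interval.
        self.last_query_scan_count = self.query_scan_count
        self.last_time = now
        return self.tablet_scan_frequency


tracker = TabletScanTracker(interval_s=300.0, now=0.0)
tracker.record_scan(600)
early = tracker.maybe_update(now=100.0)   # interval not reached yet
first = tracker.maybe_update(now=300.0)   # 600 scans over 300 seconds
second = tracker.maybe_update(now=600.0)  # no scans in the second interval
```

Because the frequency is recomputed only once per interval, a tablet that
stops being scanned drops back to a frequency of zero after one quiet interval.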