[
https://issues.apache.org/jira/browse/HDDS-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450958#comment-17450958
]
Mark Gui commented on HDDS-6053:
--------------------------------
Hi [~adoroszlai], thanks for reminding me about the throttling. I agree that
with the throttler in place, the data scanner should not disturb normal IO too
much, so I should correct my earlier wording.
What actually caught my attention is that "metadata.scan.interval" defaults
to 3h, while "data.scan.interval" defaults to 1m.
IMHO, a data scan is heavier than a metadata scan even with the throttler in
place, so I think the heavier scan should be scheduled at a longer
interval.
HDFS has similar scanners: the directory scanner and the block scanner, for
metadata and data respectively. The directory scanner is scheduled at an
interval of 6h, while the block scanner is scheduled at an interval of
3 weeks.
FYI:
[https://blog.cloudera.com/hdfs-datanode-scanners-and-disk-checker-explained/]
[https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml]
In theory, a data scan with checksum verification mainly guards against bit
flips on hard disks. Linux local filesystems like ext4 do not provide built-in
data checksums, so a datanode needs to do the check itself.
But bit flips do not happen that frequently, so we don't have to run the
data scanner at an interval as short as 1 minute.
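To illustrate what a checksum-based data scan checks, here is a minimal sketch.
It assumes CRC32 purely for illustration (the checksum type Ozone actually uses
is not stated here), and the class and method names are hypothetical, not
Ozone code.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not part of Ozone.
public class ChecksumScanSketch {

    // Compute a CRC32 checksum over a chunk of bytes (CRC32 is an assumption).
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // A scan pass re-reads the data and compares it with the stored checksum.
    static boolean isIntact(byte[] data, long storedChecksum) {
        return checksum(data) == storedChecksum;
    }

    public static void main(String[] args) {
        byte[] chunk = "container chunk payload".getBytes(StandardCharsets.UTF_8);
        long stored = checksum(chunk); // recorded when the data was written

        System.out.println("intact before flip: " + isIntact(chunk, stored));

        chunk[3] ^= 0x01; // simulate a single bit flip on disk
        System.out.println("intact after flip: " + isIntact(chunk, stored));
    }
}
```

Every pass has to read all of the data back, which is why the read load grows
with how often the scan is scheduled.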
Since Ozone is a "younger" project than HDFS, I think a shorter schedule
interval than the one HDFS uses is good for it now, since it may not yet have
the same data integrity guarantees as HDFS. Thus the values:
* 3h interval for the metadata scanner
* 1d interval for the data scanner
should be reasonable for the Ozone datanode.
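As a rough comparison of the intervals mentioned above, the sketch below
computes how many times longer the data scan interval is than the metadata
scan interval in each setup. The class and method names are hypothetical, and
the durations are just the values quoted in this comment.

```java
import java.time.Duration;

// Hypothetical illustration only; not part of Ozone or HDFS.
public class ScanIntervalSketch {

    // Ratio of the data scan interval to the metadata scan interval.
    // A value below 1 means the heavier data scan runs more often.
    static double dataToMetadataRatio(Duration dataInterval, Duration metadataInterval) {
        return (double) dataInterval.toMillis() / metadataInterval.toMillis();
    }

    public static void main(String[] args) {
        Duration ozoneMetadata = Duration.ofHours(3);
        Duration ozoneDataCurrent = Duration.ofMinutes(1);
        Duration ozoneDataProposed = Duration.ofDays(1);
        Duration hdfsDirectoryScanner = Duration.ofHours(6);
        Duration hdfsBlockScanner = Duration.ofDays(21); // 3 weeks

        System.out.println("Ozone, current default: "
            + dataToMetadataRatio(ozoneDataCurrent, ozoneMetadata));
        System.out.println("Ozone, proposed: "
            + dataToMetadataRatio(ozoneDataProposed, ozoneMetadata));
        System.out.println("HDFS: "
            + dataToMetadataRatio(hdfsBlockScanner, hdfsDirectoryScanner));
    }
}
```

With the proposed values the data scan runs 8 times less often than the
metadata scan, compared with 84 times less often in HDFS.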
> Fix too short container scrubber data scan interval.
> ----------------------------------------------------
>
> Key: HDDS-6053
> URL: https://issues.apache.org/jira/browse/HDDS-6053
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Mark Gui
> Assignee: Mark Gui
> Priority: Minor
> Labels: pull-request-available
>
> I've found that the datanode container data scrubber defaults to a scan
> interval of 1 minute:
> {code:java}
> // ContainerScrubberConfiguration.java
> @Config(key = "data.scan.interval",
> type = ConfigType.TIME,
> defaultValue = "1m",
> ...{code}
> This is too short, since a data scan is a heavy read load that may slow down
> normal IO, so a longer interval, for example 1d, should be used as the
> default.
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)