[jira] [Commented] (HDDS-6053) Fix too short container scrubber data scan interval.

Mark Gui (Jira) Tue, 30 Nov 2021 00:49:05 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450958#comment-17450958
 ]


Mark Gui commented on HDDS-6053:
--------------------------------

Hi [~adoroszlai], thanks for reminding me about the throttling. I agree that 
with the throttler in place, the data scanner should not interfere with normal 
IO too much, so I should correct my earlier statement.

 

Actually, I only noticed this because "metadata.scan.interval" defaults to 3h, 
while "data.scan.interval" defaults to 1m.

IMHO, a data scan is still heavier than a metadata scan, even with the 
throttler in place, so I think the heavier scan should be scheduled at a 
longer interval.

HDFS has similar scanners: the directory scanner and the block scanner, for 
metadata and data respectively. The directory scanner is scheduled at an 
interval of 6h, while the block scanner is scheduled at an interval of 
3 weeks.

FYI: 
[https://blog.cloudera.com/hdfs-datanode-scanners-and-disk-checker-explained/]

[https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml]

 

In theory, a data scan with checksum verification mainly guards against bit 
flips on hard disks. Local Linux filesystems such as ext4 do not provide 
built-in data checksums, so a datanode needs to do the check itself. 
But this problem does not happen very often, so we don't have to run the 
data scanner at an interval as short as 1 minute.
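
As a minimal illustration of what such a checksum check does, here is a sketch 
using the JDK's java.util.zip.CRC32. It is not the actual Ozone scanner code, 
and the class and method names (BitFlipCheck, checksum, verify) are made up for 
this example. The checksum type used here is also only an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative only: hypothetical names, not the Ozone data scanner. */
public class BitFlipCheck {

    /** Computes a CRC32 checksum over the given bytes. */
    public static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Returns true if the data still matches the checksum recorded at write time. */
    public static boolean verify(byte[] data, long expected) {
        return checksum(data) == expected;
    }

    public static void main(String[] args) {
        byte[] chunk = "container chunk payload".getBytes(StandardCharsets.UTF_8);
        long stored = checksum(chunk);   // recorded when the chunk is written

        System.out.println("intact chunk verifies:  " + verify(chunk, stored));

        chunk[3] ^= 0x01;                // simulate a single bit flip on disk
        System.out.println("flipped chunk verifies: " + verify(chunk, stored));
    }
}
```

The point is only that detection requires re-reading all the data and 
recomputing the checksum, which is why the scan is a heavy read load.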

 

Since Ozone is a "younger" project than HDFS and may not yet have the same 
data integrity guarantees, I think scan intervals shorter than the HDFS ones 
are good for it now. So the following values:
 * 3h interval for the metadata scanner
 * 1d interval for the data scanner

should be reasonable for the Ozone datanode.
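
To put these numbers in perspective, here is a rough sketch that turns the 
interval strings into an upper bound on scan passes started per day. The helper 
class ScanIntervals and its suffix parsing are assumptions made for this 
example, not Ozone's config parsing, and the bound assumes each pass finishes 
instantly and the next starts exactly one interval later.

```java
import java.time.Duration;

/** Illustrative only: a hypothetical helper, not part of Ozone. */
public class ScanIntervals {

    /** Parses values such as "1m", "3h" or "1d" (suffix set assumed for this sketch). */
    public static Duration parse(String value) {
        long n = Long.parseLong(value.substring(0, value.length() - 1));
        char unit = value.charAt(value.length() - 1);
        switch (unit) {
            case 's': return Duration.ofSeconds(n);
            case 'm': return Duration.ofMinutes(n);
            case 'h': return Duration.ofHours(n);
            case 'd': return Duration.ofDays(n);
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }

    /** Upper bound on scan passes started per day for the given interval. */
    public static long maxPassesPerDay(Duration interval) {
        return Duration.ofDays(1).toMillis() / interval.toMillis();
    }

    public static void main(String[] args) {
        // 1m is the current data scan default; 3h and 1d are the values proposed above.
        for (String v : new String[] {"1m", "3h", "1d"}) {
            System.out.println(v + " -> at most " + maxPassesPerDay(parse(v)) + " passes/day");
        }
    }
}
```

Under these assumptions, 1m allows up to 1440 passes per day, 3h allows 8, and 
1d allows 1.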

> Fix too short container scrubber data scan interval.
> ----------------------------------------------------
>
>                 Key: HDDS-6053
>                 URL: https://issues.apache.org/jira/browse/HDDS-6053
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Mark Gui
>            Assignee: Mark Gui
>            Priority: Minor
>              Labels: pull-request-available
>
> I've found that the datanode container data scrubber defaults to a scan 
> interval of 1 minute:
> {code:java}
> // ContainerScrubberConfiguration.java
> @Config(key = "data.scan.interval",
>     type = ConfigType.TIME,
>     defaultValue = "1m", 
> ...{code}
> It is too short since data scan is a heavy read load which may make normal IO 
> slow,
> so a longer interval should be used as default, for example 1d.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

