[ 
https://issues.apache.org/jira/browse/HDDS-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782756#comment-16782756
 ] 

Yiqun Lin commented on HDDS-1163:
---------------------------------

Thanks for providing the initial patch, [~sdeka]. Some comments from me:

Can we define two independent executor to run full-check and fast-check for the 
containers? We know the full-check will be a more expensive behaviour. 
Currently, the check behaviour is directly invoked by container check scanner.

Why not to persist the info of last-checked container id? So that the node 
doesn't need to check from the beginning again when doing the restart.

Current throttle mechanism is a little coarse-grained and only in container 
level, no more design for throttle in block level? There will be millions of 
blocks be accessed by checking.


> Basic framework for Ozone Data Scrubber
> ---------------------------------------
>
>                 Key: HDDS-1163
>                 URL: https://issues.apache.org/jira/browse/HDDS-1163
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: Ozone Datanode
>            Reporter: Supratim Deka
>            Assignee: Supratim Deka
>            Priority: Major
>         Attachments: HDDS-1163.000.patch
>
>
> Included in the scope:
> 1. Background scanner thread to iterate over container set and dispatch check 
> tasks for individual containers
> 2. Fixed rate scheduling - dispatch tasks at a pre-determined rate (for 
> example 1 container/s)
> 3. Check disk layout of Container - basic check for integrity of the 
> directory hierarchy inside the container, include chunk directory and 
> metadata directories
> 4. Check container file - basic sanity checks for the container metafile
> 5. Check Block Database - iterate over entries in the container block 
> database and check for the existence and accessibility of the chunks for each 
> block.
> Not in scope (will be done as separate subtasks):
> 1. Dynamic scheduling/pacing of background scan based on system load and 
> available resources.
> 2. Detection and handling of orphan chunks
> 3. Checksum verification for Chunks
> 4. Corruption handling - reporting (to SCM) and subsequent handling of any 
> corruption detected by the scanner. The current subtask will simply log any 
> corruption which is detected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to