[jira] [Commented] (HDDS-1163) Basic framework for Ozone Data Scrubber

Supratim Deka (JIRA) Sun, 03 Mar 2019 08:18:17 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782768#comment-16782768
 ]


Supratim Deka commented on HDDS-1163:
-------------------------------------

Thanks a lot for taking a look at the patch, Yiqun. You have pointed out 3 
important issues. I will try to address them as follows: 
1. Perfectly valid point regarding the coarse granularity of scheduling. 5GB is 
too large a unit to effectively control the scanner.

However, I was thinking of the scan as two mostly independent parts - a 
metadata scan and a data (checksum) scan. The current patch only covers the 
metadata scan. The container check scanner is the metadata scan and will not 
load the chunk contents. It will load and verify only the metadata in every 
container, so I thought container granularity might be good enough. Yes?

For the chunk checksum scanner, your point is correct: we will have to rework 
the job dispatch to a finer granularity. I have a separate sub-task jira for 
the chunk checksum scanner.

2. I will also file a sub-task for checkpointing the scanner state. That was in 
the plan, but I forgot to file the jira. Thanks for pointing that out. However, 
I would prefer not to include that in the current patch. Does that sound 
reasonable?

3. Regarding separate executors for full-check and fast-check: fast-check is 
actually a subset of the full-check. The idea is to restrict Containers which 
are still Open to fast-check, because this is simple. For everything else, 
always do full-check. Are you suggesting we have an executor which keeps 
running fast-check on all Containers independently (and at a higher frequency 
than full-check)?
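
To make the fast-check/full-check relationship in point 3 concrete, here is a 
rough sketch of how the check level could be picked per container. This is not 
code from the patch: the State and Check enums, the checksFor method, and in 
particular the choice of which checks belong to fast-check are all assumptions 
made up for illustration.

```java
import java.util.EnumSet;

public class CheckLevelSketch {
    // Hypothetical container states; the real Ozone state set may differ.
    enum State { OPEN, CLOSING, CLOSED }

    // Hypothetical names for the individual checks listed in the issue scope.
    enum Check { DISK_LAYOUT, CONTAINER_FILE, BLOCK_DB }

    // Assumption: fast-check skips the block database walk. The comment only
    // says that fast-check is a subset of full-check, not which subset.
    static final EnumSet<Check> FAST =
        EnumSet.of(Check.DISK_LAYOUT, Check.CONTAINER_FILE);
    static final EnumSet<Check> FULL = EnumSet.allOf(Check.class);

    // Open containers get the fast-check; everything else gets the full-check.
    static EnumSet<Check> checksFor(State state) {
        return state == State.OPEN ? FAST : FULL;
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + checksFor(s));
        }
    }
}
```

With this shape a single executor is enough, since the level is decided per 
container at dispatch time rather than per executor.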

   

> Basic framework for Ozone Data Scrubber
> ---------------------------------------
>
>                 Key: HDDS-1163
>                 URL: https://issues.apache.org/jira/browse/HDDS-1163
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>          Components: Ozone Datanode
>            Reporter: Supratim Deka
>            Assignee: Supratim Deka
>            Priority: Major
>         Attachments: HDDS-1163.000.patch
>
>
> Included in the scope:
> 1. Background scanner thread to iterate over container set and dispatch check 
> tasks for individual containers
> 2. Fixed rate scheduling - dispatch tasks at a pre-determined rate (for 
> example 1 container/s)
> 3. Check disk layout of Container - basic check for integrity of the 
> directory hierarchy inside the container, including the chunk directory and 
> metadata directories
> 4. Check container file - basic sanity checks for the container metafile
> 5. Check Block Database - iterate over entries in the container block 
> database and check for the existence and accessibility of the chunks for each 
> block.
> Not in scope (will be done as separate subtasks):
> 1. Dynamic scheduling/pacing of background scan based on system load and 
> available resources.
> 2. Detection and handling of orphan chunks
> 3. Checksum verification for Chunks
> 4. Corruption handling - reporting (to SCM) and subsequent handling of any 
> corruption detected by the scanner. The current subtask will simply log any 
> corruption which is detected.
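
Items 1 and 2 of the scope above (a background thread that walks the container 
set and dispatches one check task at a fixed rate) could be sketched roughly as 
follows. This is only an illustration using java.util.concurrent, not the 
attached patch: the class name, the use of String container IDs, and the 
Consumer-based check task are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class FixedRateScannerSketch {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private final List<String> containerIds;   // hypothetical container set
    private final Consumer<String> checkTask;  // hypothetical per-container check
    private Iterator<String> cursor;           // position in the current pass

    FixedRateScannerSketch(List<String> containerIds, Consumer<String> checkTask) {
        // Copy the list so the scanner iterates over a stable snapshot.
        this.containerIds = new ArrayList<>(containerIds);
        this.checkTask = checkTask;
    }

    // Dispatch one container check every periodMillis milliseconds.
    void start(long periodMillis) {
        scheduler.scheduleAtFixedRate(
            this::dispatchNext, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Check the next container, wrapping around to start a new pass.
    void dispatchNext() {
        if (cursor == null || !cursor.hasNext()) {
            cursor = containerIds.iterator();
        }
        if (!cursor.hasNext()) {
            return; // empty container set, nothing to do
        }
        String id = cursor.next();
        try {
            checkTask.accept(id);
        } catch (RuntimeException e) {
            // An uncaught exception would cancel all future scheduled runs,
            // so log it and move on, matching the "simply log" scope.
            System.err.println("check failed for " + id + ": " + e);
        }
    }

    void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        FixedRateScannerSketch scanner = new FixedRateScannerSketch(
            List.of("c1", "c2", "c3"),
            id -> System.out.println("checking " + id));
        scanner.start(1000); // the "1 container/s" example rate from the scope
        Thread.sleep(3500);
        scanner.stop();
    }
}
```

Because the unit of work here is a whole container, the rate only controls how 
often a container check starts, not how much I/O each check performs.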



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
