[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14245106#comment-14245106
 ] 

Colin Patrick McCabe commented on HDFS-7430:
--------------------------------------------

bq. Any reason BlockScanner is contained in DataNode rather than FsDatasetImpl? 
It seems like their lifecycles are the same. Might also move to the fsdataset 
package if you agree.

I think the logic is general enough to be used with {{FsDataset}} 
implementations other than {{FsDatasetImpl}}, so I'd prefer to keep it in 
{{DataNode}}.  The block iterator abstracts away the details of reading the 
blocks from a volume, and other {{Volume}} implementations could provide their 
own.  Actually, I think the abstraction is better now because we cut the 
scanner's link to reading paths.
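
To make the iterator idea concrete, here is a minimal sketch of how a scanner 
might walk a volume through an iterator without ever seeing a file path.  All 
of the names below ({{BlockIteratorSketch}}, {{VolumeSketch}}, 
{{InMemoryVolumeSketch}}, {{ScannerSketch}}) are hypothetical and chosen for 
illustration; this is not the interface from the patch.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; not the interface from the patch.
interface BlockIteratorSketch extends Closeable {
  /** Returns the next block ID, or null when the volume is exhausted. */
  Long nextBlock();

  /** Moves the cursor back to the first block. */
  void rewind();

  /** Narrowed so callers need not handle IOException in this sketch. */
  @Override
  void close();
}

/** Anything that can hand out a block iterator counts as a volume here. */
interface VolumeSketch {
  BlockIteratorSketch newBlockIterator();
}

/** An in-memory volume used only to exercise the interfaces. */
final class InMemoryVolumeSketch implements VolumeSketch {
  private final List<Long> blockIds;

  InMemoryVolumeSketch(Long... ids) {
    this.blockIds = new ArrayList<>(Arrays.asList(ids));
  }

  @Override
  public BlockIteratorSketch newBlockIterator() {
    return new BlockIteratorSketch() {
      private int cursor = 0; // the only state the iterator keeps

      @Override
      public Long nextBlock() {
        return cursor < blockIds.size() ? blockIds.get(cursor++) : null;
      }

      @Override
      public void rewind() {
        cursor = 0;
      }

      @Override
      public void close() {
        // Nothing to release in this sketch.
      }
    };
  }
}

final class ScannerSketch {
  /** Visits every block via the iterator; storage layout stays hidden. */
  static List<Long> scanAll(VolumeSketch volume) {
    List<Long> scanned = new ArrayList<>();
    try (BlockIteratorSketch iter = volume.newBlockIterator()) {
      Long id;
      while ((id = iter.nextBlock()) != null) {
        scanned.add(id);
      }
    }
    return scanned;
  }
}
```

Because {{ScannerSketch}} depends only on {{VolumeSketch}}, a different volume 
type could plug in its own iterator without any change to the scanning loop.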

bq. There's a bunch of time conversion scattered about, it'd be better to use 
TimeUnit.MILLISECONDS.toHours(millis) and similar where we can. I like this 
form better than TimeUnit#convert since it's very obvious.

Good idea.
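
For reference, a small sketch of the two forms being compared.  The class and 
method names are made up for illustration; only the {{java.util.concurrent.TimeUnit}} 
calls are real.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper class; only the TimeUnit calls are real API.
final class TimeConversionSketch {
  /** Suggested form: source unit and target unit both read off the call. */
  static long toHours(long millis) {
    return TimeUnit.MILLISECONDS.toHours(millis);
  }

  /** Equivalent TimeUnit#convert form: the receiver is the target unit. */
  static long toHoursViaConvert(long millis) {
    return TimeUnit.HOURS.convert(millis, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) {
    long millis = 3L * 60 * 60 * 1000; // three hours in milliseconds
    System.out.println(toHours(millis));            // 3
    System.out.println(toHoursViaConvert(millis));  // 3
    System.out.println(TimeUnit.HOURS.toMillis(3)); // 10800000
  }
}
```

Both forms truncate toward zero when converting to a coarser unit, so the 
choice between them is purely about readability.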

bq. Javadoc, need to put a <p/> tag to actually get line-breaks.

Added.  Hopefully I got all the spots.
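
A minimal illustration of the paragraph-break point, using a made-up class 
(the name and the returned value are placeholders, not anything from the 
patch).  As far as I know, stricter doclint settings in newer JDKs reject the 
self-closing form and expect a plain <p> instead, so which spelling to use 
depends on the toolchain.

```java
/**
 * Hypothetical class, used only to show Javadoc paragraph breaks.
 * <p/>
 * Without the tag above, Javadoc would render this sentence in the same
 * paragraph as the first one, because blank comment lines are ignored.
 */
final class JavadocParagraphSketch {
  /**
   * Returns a scan period in hours.
   * <p/>
   * The value is a placeholder for illustration, not a real default.
   */
  static long scanPeriodHours() {
    return 24L;
  }
}
```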

bq. Can add a log when removeVolumeScanner is called and not enabled

Added.

I have to close this window now; I'll address the other comments in a bit.

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> ---------------------------------------------------------------------
>
>                 Key: HDFS-7430
>                 URL: https://issues.apache.org/jira/browse/HDFS-7430
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
> HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
