[ https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14245106#comment-14245106 ]
Colin Patrick McCabe commented on HDFS-7430:
--------------------------------------------
bq. Any reason BlockScanner is contained in DataNode rather than FsDatasetImpl?
It seems like their lifecycles are the same. Might also move to the fsdataset
package if you agree.
I think the logic is general enough that it could be used with {{FsDataset}}
implementations other than {{FsDatasetImpl}}, so I'd prefer to keep it in
{{DataNode}}. The block iterator abstracts away the details of reading the
blocks from a volume, and could be implemented by other {{Volume}}
implementations. I actually think the abstraction is better now, because we cut
the link to reading paths.
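To illustrate the idea only (this is not the code in the patch), here is a minimal sketch of a scanner that sees a volume purely through a block iterator. The names {{VolumeBlockIterator}}, {{InMemoryVolume}} and {{ScannerSketch}} are hypothetical and exist just for this example; the point is that the scanner never touches a path, and that the iterator's position is the only per-scan state.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified iterator over the block IDs stored on one volume. */
interface VolumeBlockIterator {
    /** Returns the next block ID, or null once the volume has been fully walked. */
    Long nextBlock();

    /** Restarts iteration from the first block. */
    void rewind();
}

/** Hypothetical volume backed by an in-memory list instead of a directory tree. */
final class InMemoryVolume {
    private final List<Long> blockIds;

    InMemoryVolume(Long... ids) {
        this.blockIds = Arrays.asList(ids);
    }

    VolumeBlockIterator newBlockIterator() {
        return new VolumeBlockIterator() {
            private int position = 0; // the only state kept between calls

            @Override
            public Long nextBlock() {
                return position < blockIds.size() ? blockIds.get(position++) : null;
            }

            @Override
            public void rewind() {
                position = 0;
            }
        };
    }
}

final class ScannerSketch {
    /** Walks one volume through the iterator only; returns how many blocks it visited. */
    static int scanOnce(VolumeBlockIterator it) {
        int scanned = 0;
        while (it.nextBlock() != null) {
            scanned++; // a real scanner would verify the block's checksums here
        }
        return scanned;
    }

    public static void main(String[] args) {
        InMemoryVolume volume = new InMemoryVolume(101L, 102L, 103L);
        System.out.println(ScannerSketch.scanOnce(volume.newBlockIterator()));
    }
}
```

Any other volume type would only need to supply its own iterator; {{scanOnce}} would not change.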
bq. There's a bunch of time conversion scattered about, it'd be better to use
TimeUnit.MILLISECONDS.toHours(millis) and similar where we can. I like this
form better than TimeUnit#convert since it's very obvious.
Good idea.
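For reference, a small sketch of the three styles side by side. {{TimeUnit.MILLISECONDS.toHours}} and {{TimeUnit#convert}} are the standard {{java.util.concurrent.TimeUnit}} methods; the class name, method names and the 21-day period are made up for the example.

```java
import java.util.concurrent.TimeUnit;

final class TimeConversionSketch {
    /** Hand-rolled arithmetic: the units are implicit and a factor is easy to get wrong. */
    static long hoursByHand(long millis) {
        return millis / (60L * 60L * 1000L);
    }

    /** Named conversion: source unit and target unit both read left to right. */
    static long hoursNamed(long millis) {
        return TimeUnit.MILLISECONDS.toHours(millis);
    }

    /** TimeUnit#convert: the receiver is the target unit, the argument is the source unit. */
    static long hoursConvert(long millis) {
        return TimeUnit.HOURS.convert(millis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        // Hypothetical scan period, chosen only for illustration.
        long scanPeriodMs = TimeUnit.DAYS.toMillis(21);
        System.out.println(hoursNamed(scanPeriodMs)); // prints 504
    }
}
```

All three return the same truncated value; the named form just makes the direction of the conversion hard to misread.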
bq. Javadoc, need to put a <p/> tag to actually get line-breaks.
Added... hopefully I got all the spots.
bq. Can add a log when removeVolumeScanner is called and not enabled
Added.
I have to close this window now; I'll address the other comments in a bit.
> Refactor the BlockScanner to use O(1) memory and use multiple threads
> ---------------------------------------------------------------------
>
> Key: HDFS-7430
> URL: https://issues.apache.org/jira/browse/HDFS-7430
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.7.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch,
> HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by
> keeping track of what block was scanned last, rather than by tracking the
> scan status of all blocks in memory. Also, instead of having just one
> thread, we should have a verification thread per hard disk (or other volume),
> scanning at a configurable rate of bytes per second.