[ 
https://issues.apache.org/jira/browse/HADOOP-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612231#action_12612231
 ] 

Konstantin Shvachko commented on HADOOP-3732:
---------------------------------------------

# Right now the DataBlockScanner is initialized during data-node startup, and 
reads block information from the data-node storage directory at that time. See 
init() in DataBlockScanner(). A better time to obtain block information would 
be when the block scanner actually starts scanning and verifying, that is, in 
DataBlockScanner.run(). During a regular startup the difference in timing is 
negligible, but during a distributed upgrade it can take a while before the 
scanner gets a chance to run. Reading at startup may also pick up incorrect 
information, since the scanner would be reading the pre-upgrade block files.
# The message itself is confusing for administrators, mostly because it is 
repeated 80,000 times (2 * #blocks). I am not sure how to deal with this. If we 
downgrade it to DEBUG level, we risk missing it during regular dn operation. 
Maybe we should postpone creating the dn block map until the upgrade is done. 
It is the same thing as with the block scanner: before the upgrade is finished, 
block information might not be correct from the new software's point of view.
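
The deferral proposed in both points above could look roughly like the sketch 
below. This is only an illustration, not the actual DataBlockScanner code: the 
class name DeferredBlockScanner, the upgradeInProgress hook, the blockLister 
supplier and the polling interval are all assumptions made for the example. The 
constructor touches no storage, and run() loads the block list only once the 
upgrade is reported as finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: block information is read in run(), not at construction.
class DeferredBlockScanner implements Runnable {
    // Assumed hook that reports whether a distributed upgrade is still running.
    private final BooleanSupplier upgradeInProgress;
    // Assumed stand-in for reading block names from the storage directory.
    private final Supplier<List<String>> blockLister;
    private final long pollMillis;
    // Stays null until run() has loaded the block list.
    private volatile List<String> blocks;

    DeferredBlockScanner(BooleanSupplier upgradeInProgress,
                         Supplier<List<String>> blockLister,
                         long pollMillis) {
        // Deliberately no storage access here, so data-node startup reads nothing.
        this.upgradeInProgress = upgradeInProgress;
        this.blockLister = blockLister;
        this.pollMillis = pollMillis;
    }

    boolean isInitialized() {
        return blocks != null;
    }

    int blockCount() {
        List<String> current = blocks;
        return current == null ? 0 : current.size();
    }

    @Override
    public void run() {
        try {
            // Wait until the upgrade is done, so pre-upgrade files are never read.
            while (upgradeInProgress.getAsBoolean()) {
                Thread.sleep(pollMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        // Block information is obtained only now, right before scanning starts.
        blocks = new ArrayList<>(blockLister.get());
        // The actual scanning and verification loop would follow here.
    }
}
```

With this shape the storage directory is read once by the scanner, after the 
upgrade, instead of once at startup. Whether the data-node's own block map can 
be postponed the same way is a separate question that the sketch does not 
address.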

> Block scanner should read block information during initialization.
> ------------------------------------------------------------------
>
>                 Key: HADOOP-3732
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3732
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.0
>            Reporter: Konstantin Shvachko
>            Assignee: Raghu Angadi
>            Priority: Blocker
>             Fix For: 0.18.0
>
>
> We see a lot of warning messages on each data-node during startup and 
> upgrading from 0.17 to 0.18 saying:
> {code}
> 2008-07-08 22:22:15,711 WARN org.apache.hadoop.dfs.DataNode: Block 
> /grid/3/hadoop/var/hdfs/data/current/blk_3359714082706415785 does not have a 
> metafile!
> {code}
> The message is logged twice for each block, because the block information is 
> first read by the data-node itself and then by the block scanner.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
