[ https://issues.apache.org/jira/browse/HDFS-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wellington Chevreuil updated HDFS-12182:
----------------------------------------
Description:
Currently, the *BlockManager.metaSave* method (which is called by the
"-metasave" dfs CLI command) reports both "under replicated" and "missing"
blocks under the same metric, *Metasave: Blocks waiting for reconstruction:*,
as shown in the code snippet below:
{noformat}
synchronized (neededReconstruction) {
  out.println("Metasave: Blocks waiting for reconstruction: "
      + neededReconstruction.size());
  for (Block block : neededReconstruction) {
    dumpBlockMeta(block, out);
  }
}
{noformat}
*neededReconstruction* is an instance of *LowRedundancyBlocks*, which
currently wraps 5 priority queues. 4 of these queues store different
under-replication scenarios, but the 5th is dedicated to corrupt blocks.
As a result, the metasave report may suggest that some corrupt blocks are
merely under replicated. This can mislead admins and operators trying to track
missing or corrupt blocks, or other issues related to *BlockManager* metrics.
I would like to propose a patch with trivial changes that reports corrupt
blocks separately.
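To illustrate the idea, here is a minimal, self-contained sketch of reporting the two groups separately. It is not the proposed patch and does not use the Hadoop classes: *MetaSaveSketch*, its methods, the queue index used for corrupt blocks, and the "Corrupt blocks" label are all hypothetical names chosen for this example. Only the five-queue layout and the first label come from the description above.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a five-level low-redundancy queue. NOT the Hadoop class;
// every name here is hypothetical.
final class MetaSaveSketch {
  static final int LEVELS = 5;         // queue count given in the description
  static final int CORRUPT_LEVEL = 4;  // assumed index of the corrupt-block queue

  private final List<List<String>> queues = new ArrayList<>();

  MetaSaveSketch() {
    for (int i = 0; i < LEVELS; i++) {
      queues.add(new ArrayList<>());
    }
  }

  void add(String blockId, int level) {
    queues.get(level).add(blockId);
  }

  // Number of blocks held in the four non-corrupt queues.
  int lowRedundancyCount() {
    int n = 0;
    for (int level = 0; level < LEVELS; level++) {
      if (level != CORRUPT_LEVEL) {
        n += queues.get(level).size();
      }
    }
    return n;
  }

  int corruptCount() {
    return queues.get(CORRUPT_LEVEL).size();
  }

  // Writes two sections instead of one combined count.
  void metaSave(PrintWriter out) {
    out.println("Metasave: Blocks waiting for reconstruction: "
        + lowRedundancyCount());
    for (int level = 0; level < LEVELS; level++) {
      if (level != CORRUPT_LEVEL) {
        for (String b : queues.get(level)) {
          out.println(b);
        }
      }
    }
    // Hypothetical label for the separate corrupt-block section.
    out.println("Metasave: Corrupt blocks: " + corruptCount());
    for (String b : queues.get(CORRUPT_LEVEL)) {
      out.println(b);
    }
    out.flush();
  }

  public static void main(String[] args) {
    MetaSaveSketch q = new MetaSaveSketch();
    q.add("blk_1001", 0);
    q.add("blk_1002", 2);
    q.add("blk_1003", CORRUPT_LEVEL);
    StringWriter sw = new StringWriter();
    q.metaSave(new PrintWriter(sw));
    System.out.print(sw);
  }
}
```

With this split, a block in the corrupt queue no longer inflates the "waiting for reconstruction" count, so the two numbers can be read independently.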
was:
Currently, *BlockManager.metaSave* method (which is called by "-metasave" dfs
CLI command) reports both "under replicated" and "corrupt" blocks under same
metric *Metasave: Blocks waiting for reconstruction:* as shown on below code
snippet:
{noformat}
synchronized (neededReconstruction) {
  out.println("Metasave: Blocks waiting for reconstruction: "
      + neededReconstruction.size());
  for (Block block : neededReconstruction) {
    dumpBlockMeta(block, out);
  }
}
{noformat}
*neededReconstruction* is an instance of *LowRedundancyBlocks*, which actually
wraps 5 priority queues currently. 4 of these queues store different under
replicated scenarios, but the 5th one is dedicated for corrupt blocks.
Thus, metasave report may suggest some corrupt blocks are just under
replicated. This can be misleading for admins and operators trying to track
block corruption issues, and/or other issues related to *BlockManager* metrics.
I would like to propose a patch with trivial changes that would report corrupt
blocks separately.
> BlockManager.metaSave does not distinguish between "under replicated" and
> "missing" blocks
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-12182
> URL: https://issues.apache.org/jira/browse/HDFS-12182
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Trivial
> Labels: newbie
> Fix For: 3.0.0-alpha3