[jira] [Updated] (HDFS-12182) BlockManager.metaSave does not distinguish between "under replicated" and "missing" blocks

Wellington Chevreuil (JIRA) Fri, 21 Jul 2017 08:55:08 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wellington Chevreuil updated HDFS-12182:
----------------------------------------
    Description: 
Currently, the *BlockManager.metaSave* method (which is called by the "-metasave" 
dfsadmin CLI command) reports both "under replicated" and "missing" blocks under 
the same metric, *Metasave: Blocks waiting for reconstruction:*, as shown in the 
code snippet below:

{noformat}
    synchronized (neededReconstruction) {
      out.println("Metasave: Blocks waiting for reconstruction: "
          + neededReconstruction.size());
      for (Block block : neededReconstruction) {
        dumpBlockMeta(block, out);
      }
    }
{noformat}

*neededReconstruction* is an instance of *LowRedundancyBlocks*, which currently 
wraps 5 priority queues. 4 of these queues hold different under-replication 
scenarios, but the 5th one is dedicated to corrupt blocks. 

Thus, the metasave report may suggest that some corrupt blocks are merely under 
replicated. This can mislead admins and operators trying to track missing or 
corrupt blocks, and/or other issues related to *BlockManager* metrics.

I would like to propose a patch with trivial changes that would report corrupt 
blocks separately.
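
To illustrate the idea, here is a minimal, self-contained sketch of the split reporting. It is not the actual patch: *RedundancyQueues*, its methods, the string block IDs and the "Blocks currently missing" label are hypothetical stand-ins for *LowRedundancyBlocks*, *Block* and whatever wording the patch ends up using. The only assumption taken from the description above is that the last of the 5 queues holds corrupt blocks.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MetaSaveSketch {

  /** Hypothetical stand-in for LowRedundancyBlocks: 5 queues, the last one for corrupt blocks. */
  static class RedundancyQueues {
    static final int LEVELS = 5;
    static final int CORRUPT_LEVEL = LEVELS - 1;

    private final List<Set<String>> queues = new ArrayList<>();

    RedundancyQueues() {
      for (int i = 0; i < LEVELS; i++) {
        queues.add(new LinkedHashSet<>());
      }
    }

    void add(String blockId, int level) {
      queues.get(level).add(blockId);
    }

    Set<String> level(int level) {
      return queues.get(level);
    }

    /** Counts only the queues that hold under-replicated blocks. */
    int lowRedundancyCount() {
      int total = 0;
      for (int i = 0; i < CORRUPT_LEVEL; i++) {
        total += queues.get(i).size();
      }
      return total;
    }

    /** Counts only the queue that holds corrupt blocks. */
    int corruptCount() {
      return queues.get(CORRUPT_LEVEL).size();
    }
  }

  /** Writes under-replicated and corrupt blocks under two separate headings. */
  static void metaSave(RedundancyQueues queues, PrintWriter out) {
    synchronized (queues) {
      out.println("Metasave: Blocks waiting for reconstruction: "
          + queues.lowRedundancyCount());
      for (int i = 0; i < RedundancyQueues.CORRUPT_LEVEL; i++) {
        for (String block : queues.level(i)) {
          out.println(block);
        }
      }
      // Hypothetical label for the separate corrupt/missing section.
      out.println("Metasave: Blocks currently missing: " + queues.corruptCount());
      for (String block : queues.level(RedundancyQueues.CORRUPT_LEVEL)) {
        out.println(block);
      }
    }
  }

  /** Convenience helper that returns the report as a string. */
  static String report(RedundancyQueues queues) {
    StringWriter buffer = new StringWriter();
    PrintWriter out = new PrintWriter(buffer);
    metaSave(queues, out);
    out.flush();
    return buffer.toString();
  }
}
```

With two blocks in the under-replicated queues and one in the corrupt queue, the first heading reports 2 and the second reports 1, instead of a single combined count of 3.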

  was:
Currently, the *BlockManager.metaSave* method (which is called by the "-metasave" 
dfsadmin CLI command) reports both "under replicated" and "corrupt" blocks under 
the same metric, *Metasave: Blocks waiting for reconstruction:*, as shown in the 
code snippet below:

{noformat}
    synchronized (neededReconstruction) {
      out.println("Metasave: Blocks waiting for reconstruction: "
          + neededReconstruction.size());
      for (Block block : neededReconstruction) {
        dumpBlockMeta(block, out);
      }
    }
{noformat}

*neededReconstruction* is an instance of *LowRedundancyBlocks*, which currently 
wraps 5 priority queues. 4 of these queues hold different under-replication 
scenarios, but the 5th one is dedicated to corrupt blocks. 

Thus, the metasave report may suggest that some corrupt blocks are merely under 
replicated. This can mislead admins and operators trying to track block 
corruption issues, and/or other issues related to *BlockManager* metrics.

I would like to propose a patch with trivial changes that would report corrupt 
blocks separately.


> BlockManager.metaSave does not distinguish between "under replicated" and 
> "missing" blocks
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12182
>                 URL: https://issues.apache.org/jira/browse/HDFS-12182
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Trivial
>              Labels: newbie
>             Fix For: 3.0.0-alpha3
>


