[ https://issues.apache.org/jira/browse/HDFS-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-10512:
-----------------------------------
    Description: 
VolumeScanner may terminate due to an unexpected NullPointerException thrown in 
{{DataNode.reportBadBlocks()}}. This is different from HDFS-8850/HDFS-9190.

I observed this bug in a production CDH 5.5.1 cluster, and the same bug still 
persists in upstream trunk.

{noformat}
2016-04-07 20:30:53,830 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-1444425156296:blk_1170134484_96468685 on /dfs/dn
2016-04-07 20:30:53,831 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting because of exception
java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
        at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
        at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
        at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
        at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:30:53,832 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting.
{noformat}

I think the NPE comes from the {{volume}} variable in the following code snippet. 
Somehow the volume scanner knows the volume, but the datanode cannot look up the 
volume using the block, so {{getVolume(block)}} presumably returns null.
{code}
public void reportBadBlocks(ExtendedBlock block) throws IOException {
  BPOfferService bpos = getBPOSForBlock(block);
  FsVolumeSpi volume = getFSDataset().getVolume(block);
  bpos.reportBadBlocks(
      block, volume.getStorageID(), volume.getStorageType());
}
{code}
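
One possible way to keep the scanner thread alive is to check the result of the volume lookup before dereferencing it. The sketch below is only an illustration of that null guard, not the actual patch: {{Volume}}, {{Dataset}}, the string block IDs and the {{reports}} list are hypothetical stand-ins for {{FsVolumeSpi}}, the dataset, {{ExtendedBlock}} and {{BPOfferService}}, so that the example runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Null-guard sketch. None of these types are Hadoop classes. */
final class BadBlockReportSketch {

  /** Hypothetical stand-in for FsVolumeSpi. */
  static final class Volume {
    final String storageId;
    final String storageType;

    Volume(String storageId, String storageType) {
      this.storageId = storageId;
      this.storageType = storageType;
    }
  }

  /** Hypothetical stand-in for the dataset's block-to-volume lookup. */
  static final class Dataset {
    private final Map<String, Volume> volumeByBlock = new HashMap<>();

    void put(String blockId, Volume volume) {
      volumeByBlock.put(blockId, volume);
    }

    /** Returns null when the block is unknown (assumed behaviour). */
    Volume getVolume(String blockId) {
      return volumeByBlock.get(blockId);
    }
  }

  /**
   * Records a bad-block report if the block's volume can be resolved.
   * Returns false, instead of throwing, when the lookup yields null.
   */
  static boolean reportBadBlock(
      Dataset dataset, String blockId, List<String> reports) {
    Volume volume = dataset.getVolume(blockId);
    if (volume == null) {
      // Log and skip rather than let an NPE end the calling thread.
      System.err.println(
          "Cannot find volume for " + blockId + "; skipping report");
      return false;
    }
    reports.add(blockId + "@" + volume.storageId + "/" + volume.storageType);
    return true;
  }

  public static void main(String[] args) {
    Dataset dataset = new Dataset();
    dataset.put("blk_known", new Volume("DS-example", "DISK"));
    List<String> reports = new ArrayList<>();

    System.out.println(reportBadBlock(dataset, "blk_known", reports));
    System.out.println(reportBadBlock(dataset, "blk_missing", reports));
    System.out.println(reports);
  }
}
```

Whether skipping the report is the right behaviour is a separate question: the bad block would go unreported in that case, so a real fix may also need to explain why the lookup fails in the first place.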

  was:
VolumeScanner may terminate due to unexpected NullPointerException thrown in 
{{DataNode.reportBadBlocks()}}. This is difference from HDFS-8850/HDFS-9190

I observed this bug in a production CDH 5.5.1 cluster and the same bug still 
persist in upstream trunk.

{noformat}
2016-04-07 20:30:53,830 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-1800173197-10.204.68.5-1444425156296:blk_1170134484_96468685 on /dfs/dn
2016-04-07 20:30:53,831 ERROR 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, 
DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting because of exception
java.lang.NullPointerException
        at 
org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
        at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
        at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
        at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
        at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:30:53,832 INFO 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, 
DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting.
{noformat}

I think the NPE comes from the volume variable in the following code snippet. 
Somehow the volume scanner know the volume, but the datanode can not lookup the 
volume using the block.
{code}
public void reportBadBlocks(ExtendedBlock block) throws IOException{
    BPOfferService bpos = getBPOSForBlock(block);
    FsVolumeSpi volume = getFSDataset().getVolume(block);
    bpos.reportBadBlocks(
        block, volume.getStorageID(), volume.getStorageType());
  }
{code}


> VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
> ------------------------------------------------------------------
>
>                 Key: HDFS-10512
>                 URL: https://issues.apache.org/jira/browse/HDFS-10512
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Wei-Chiu Chuang



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
