[ 
https://issues.apache.org/jira/browse/HDFS-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530591#comment-14530591
 ] 

Walter Su commented on HDFS-8113:
---------------------------------

patch is good.

Hi, [~chengbing.liu]. Have you tried restart NN? I think so. Fsimage saves 
Files, and block Ids belonging to some files.
when fsimage is loaded, before first block report, and NN in safe mode. Each 
block should belong to some file. Because stored BlockInfo is created according 
to INodeFile proto. It's impossible NN has orphan blocks. should we try find 
null bc here?
When first block reports finished, I think we should try looking for null bc 
again.


> NullPointerException in BlockInfoContiguous causes block report failure
> -----------------------------------------------------------------------
>
>                 Key: HDFS-8113
>                 URL: https://issues.apache.org/jira/browse/HDFS-8113
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Chengbing Liu
>            Assignee: Chengbing Liu
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-8113.02.patch, HDFS-8113.patch
>
>
> The following copy constructor can throw NullPointerException if {{bc}} is 
> null.
> {code}
>   protected BlockInfoContiguous(BlockInfoContiguous from) {
>     this(from, from.bc.getBlockReplication());
>     this.bc = from.bc;
>   }
> {code}
> We have observed that some DataNodes keeps failing doing block reports with 
> NameNode. The stacktrace is as follows. Though we are not using the latest 
> version, the problem still exists.
> {quote}
> 2015-03-08 19:28:13,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> RemoteException in offerService
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.(BlockInfo.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockToMarkCorrupt.(BlockManager.java:1696)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkReplicaCorrupt(BlockManager.java:2185)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReportedBlock(BlockManager.java:2047)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1950)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1823)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1750)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1069)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:152)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26382)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1623)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to