[ 
https://issues.apache.org/jira/browse/HDFS-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792663#comment-16792663
 ] 

lujie edited comment on HDFS-14372 at 3/14/19 1:51 PM:
-------------------------------------------------------

I have checked the other places that use "shouldRun", and they are fine. For example, 
BPServiceActor#retrieveNamespaceInfo:
{code:java}
while (shouldRun()) {
  try {
    nsInfo = bpNamenode.versionRequest();
    break;
  } catch (IOException e) {
    // NN not reachable; handling elided here, the loop retries
  }
}

if (nsInfo != null) {
  checkNNVersion(nsInfo);
} else { // the check is here
  throw new IOException("DN shut down before block pool connected");
}
{code}
Following BPServiceActor#retrieveNamespaceInfo, the patch adds the same kind of check to BPServiceActor#register:
{code:java}
+ if (bpRegistration == null) {
+    throw new IOException("DN shut down before block pool registered");
+ }
{code}
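To illustrate the idea behind the check, here is a minimal, self-contained sketch. It is not the Hadoop code: the class RegisterGuardSketch, the nested Registration type, and the "uuid-1" value are hypothetical stand-ins, and the RPC to the NN is replaced by a plain assignment. It only shows that when the loop is skipped, the null check turns a later NPE into a descriptive IOException.

```java
import java.io.IOException;

public class RegisterGuardSketch {
    /** Hypothetical stand-in for the registration returned by the NN. */
    static final class Registration {
        final String datanodeUuid;
        Registration(String datanodeUuid) { this.datanodeUuid = datanodeUuid; }
    }

    private final boolean shouldRun;     // false models "DN is shutting down"
    private Registration bpRegistration; // stays null if the loop never runs

    RegisterGuardSketch(boolean shouldRun) { this.shouldRun = shouldRun; }

    /** Same shape as register(): a retry loop, then a null check before use. */
    Registration register() throws IOException {
        while (shouldRun) {
            // Assumption: the registration call succeeds on the first attempt.
            bpRegistration = new Registration("uuid-1");
            break;
        }
        if (bpRegistration == null) {
            throw new IOException("DN shut down before block pool registered");
        }
        return bpRegistration;
    }

    public static void main(String[] args) {
        try {
            System.out.println(new RegisterGuardSketch(true).register().datanodeUuid);
            new RegisterGuardSketch(false).register();
        } catch (IOException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

With the guard in place, the caller sees a checked exception that names the cause instead of a NullPointerException raised further down the call chain.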
Besides:

The log statement at line 811:
{code:java}
LOG.info("Block pool " + this + " successfully registered with NN");
{code}
will print "Block pool Block pool BP-1037925912-10.3.1.11-1552404387201 ", so the 
string "Block pool " is printed twice, which is redundant.

I removed the leading "Block pool " from the statement at line 811.
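The duplicated prefix can be reproduced with a small sketch. It assumes, based only on the output quoted above, that the object's toString() already begins with "Block pool "; the Actor class and the before/after helper names are hypothetical and not part of Hadoop.

```java
public class LogPrefixSketch {
    /** Hypothetical actor whose toString() already starts with "Block pool ". */
    static final class Actor {
        private final String blockPoolId;
        Actor(String blockPoolId) { this.blockPoolId = blockPoolId; }
        @Override
        public String toString() { return "Block pool " + blockPoolId; }
    }

    /** Message as built before the change: the prefix appears twice. */
    static String before(Actor actor) {
        return "Block pool " + actor + " successfully registered with NN";
    }

    /** Message after dropping the literal prefix: the prefix appears once. */
    static String after(Actor actor) {
        return actor + " successfully registered with NN";
    }

    public static void main(String[] args) {
        Actor actor = new Actor("BP-1037925912-10.3.1.11-1552404387201");
        System.out.println(before(actor));
        System.out.println(after(actor));
    }
}
```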


> NPE while DN is shutting down
> -----------------------------
>
>                 Key: HDFS-14372
>                 URL: https://issues.apache.org/jira/browse/HDFS-14372
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: lujie
>            Assignee: lujie
>            Priority: Major
>         Attachments: HDFS-14372_0.patch
>
>
> Consider the code in BPServiceActor#register:
> {code:java}
> while (shouldRun()) {
>   try {
>     // Use returned registration from namenode with updated fields
>     newBpRegistration = bpNamenode.registerDatanode(newBpRegistration);
>     newBpRegistration.setNamespaceInfo(nsInfo);
>     bpRegistration = newBpRegistration;
>     break;
>   } catch(EOFException e) { // namenode might have just restarted
>     ....
>   }
> }
> LOG.info("Block pool " + this + " successfully registered with NN");
> bpos.registrationSucceeded(this, bpRegistration);
> {code}
> If the DN is shutting down, the code above skips the loop and bpRegistration 
> is still null. That null value is then used in DataNode#bpRegistrationSucceeded:
> {code:java}
> if(!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid()))
> {code}
> Hence the NPE:
> {code:java}
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.datanode.DataNode.bpRegistrationSucceeded(DataNode.java:1583)
> at org.apache.hadoop.hdfs.server.datanode.BPOfferService.registrationSucceeded(BPOfferService.java:425)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:807)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:294)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:840)
> at java.lang.Thread.run(Thread.java:745)
> {code}
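
The failure mode described in the issue can be sketched in isolation. This is a simplified model under stated assumptions, not the DataNode code: SkippedLoopNpeSketch, Registration, registerUnguarded, and uuidMatches are hypothetical names, and shouldRun is passed as a plain boolean. It shows only that a skipped loop leaves the reference null and that the later equals(...) call dereferences it.

```java
public class SkippedLoopNpeSketch {
    /** Hypothetical stand-in for the registration object. */
    static final class Registration {
        String getDatanodeUuid() { return "uuid-1"; }
    }

    /** Returns null when shouldRun is false, because the loop body never executes. */
    static Registration registerUnguarded(boolean shouldRun) {
        Registration bpRegistration = null;
        while (shouldRun) {
            bpRegistration = new Registration(); // pretend the NN call succeeded
            break;
        }
        return bpRegistration; // no null check before handing it on
    }

    /** Stand-in for the caller that dereferences the registration. */
    static boolean uuidMatches(String storageUuid, Registration bpRegistration) {
        return storageUuid.equals(bpRegistration.getDatanodeUuid());
    }

    public static void main(String[] args) {
        try {
            uuidMatches("uuid-1", registerUnguarded(false));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from the skipped loop");
        }
    }
}
```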


