[ https://issues.apache.org/jira/browse/HDFS-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ConfX updated HDFS-17099:
-------------------------
    Description: 
h2. What happened:

A NullPointerException is thrown when the namesystem is stopped in HDFS.
h2. Buggy code:

 
{code:java}
  void stopActiveServices() {
    ...
    if (dir != null && getFSImage() != null) {
      if (getFSImage().editLog != null) {    // <--- Checks whether editLog is null
        getFSImage().editLog.close();
      }
      // Update the fsimage with the last txid that we wrote
      // so that the tailer starts from the right spot.
      // <--- BUG: even if editLog is null, the next line still executes and throws a NullPointerException
      getFSImage().updateLastAppliedTxIdFromWritten();
    }
    ...
  }

  public void updateLastAppliedTxIdFromWritten() {
    // <--- Throws a NullPointerException if editLog is null
    this.lastAppliedTxId = editLog.getLastWrittenTxId();
  }
{code}
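
The control flow above can be illustrated outside Hadoop with a minimal, self-contained sketch. The classes {{EditLog}} and {{Image}} and the methods {{stopUnguarded}} and {{stopGuarded}} are hypothetical stand-ins invented for this illustration; they are not Hadoop classes. The guarded variant shows one possible way to avoid the exception, under the assumption that skipping the txid update is acceptable when no edit log exists. It is not the actual patch.

```java
public class EditLogGuardSketch {

  /** Hypothetical stand-in for an edit log; only tracks the last written txid. */
  static final class EditLog {
    private final long lastWrittenTxId;
    private boolean closed;

    EditLog(long lastWrittenTxId) {
      this.lastWrittenTxId = lastWrittenTxId;
    }

    void close() {
      closed = true;
    }

    boolean isClosed() {
      return closed;
    }

    long getLastWrittenTxId() {
      return lastWrittenTxId;
    }
  }

  /** Hypothetical stand-in for the image object; editLog may be null. */
  static final class Image {
    final EditLog editLog;
    long lastAppliedTxId;

    Image(EditLog editLog) {
      this.editLog = editLog;
    }

    // Same shape as the reported method: dereferences editLog unconditionally.
    void updateLastAppliedTxIdFromWritten() {
      this.lastAppliedTxId = editLog.getLastWrittenTxId();
    }
  }

  /** Mirrors the reported flow: only close() is protected by the null check. */
  static void stopUnguarded(Image image) {
    if (image != null) {
      if (image.editLog != null) {
        image.editLog.close();
      }
      image.updateLastAppliedTxIdFromWritten(); // throws if editLog is null
    }
  }

  /** One possible guard: both calls sit under the same null check. */
  static void stopGuarded(Image image) {
    if (image != null && image.editLog != null) {
      image.editLog.close();
      image.updateLastAppliedTxIdFromWritten();
    }
  }

  public static void main(String[] args) {
    Image withoutEditLog = new Image(null);
    try {
      stopUnguarded(withoutEditLog);
    } catch (NullPointerException e) {
      System.out.println("unguarded: NullPointerException");
    }
    stopGuarded(withoutEditLog);
    System.out.println("guarded: completed without exception");
  }
}
```

An alternative would be to place the null check inside {{updateLastAppliedTxIdFromWritten()}} itself, which would protect every caller rather than only the shutdown path.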
h2. StackTrace:

 
{code:java}
java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSImage.updateLastAppliedTxIdFromWritten(FSImage.java:1553)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopActiveServices(FSNamesystem.java:1463)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.close(FSNamesystem.java:1815)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:1017)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:248)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:194)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:181)
{code}
h2. How to reproduce:

(1) Set {{dfs.namenode.top.windows.minutes}} to {{37914516,32,0}}, or set {{dfs.namenode.top.window.num.buckets}} to {{244111242}}.
(2) Run the test {{org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServerXFrame#testSecondaryNameNodeXFrame}}.
h2. What's more:

I'm still investigating how the parameter {{dfs.namenode.top.windows.minutes}} triggers the buggy code.

For easy reproduction, run the attached reproduce.sh.

We are happy to provide a patch if this issue is confirmed.


> Null Pointer Exception when stop namesystem in HDFS
> ---------------------------------------------------
>
>                 Key: HDFS-17099
>                 URL: https://issues.apache.org/jira/browse/HDFS-17099
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ConfX
>            Priority: Critical
>         Attachments: reproduce.sh
>
>


