ConfX created HADOOP-18811:
------------------------------

             Summary: Buggy ZKFCRpcServer constructor creates null object and 
crashes the rpcServer
                 Key: HADOOP-18811
                 URL: https://issues.apache.org/jira/browse/HADOOP-18811
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: ConfX


h2. What happened:

In ZKFailoverController.java, initRPC() gets the ZKFC RpcServer binding 
address and creates a new ZKFCRpcServer object, rpcServer. However, rpcServer 
can be left null when the ZKFCRpcServer constructor is passed a null policy 
provider: the constructor dereferences the provider and fails, so the 
assignment never completes and any later use of rpcServer throws a 
NullPointerException.
h2. Buggy code:

In ZKFailoverController.java
{code:java}
protected void initRPC() throws IOException {
  InetSocketAddress bindAddr = getRpcAddressToBindTo();
  LOG.info("ZKFC RpcServer binding to {}", bindAddr);
  // <-- getPolicyProvider() might return null here
  rpcServer = new ZKFCRpcServer(conf, bindAddr, this, getPolicyProvider());
}
{code}
The ZKFCRpcServer constructor eventually calls refreshWithLoadedConfiguration(), 
shown below. This function uses provider directly without a null check, so it 
throws when provider is null and leaves rpcServer above null.

In ServiceAuthorizationManager.java
{code:java}
  @Private
  public void refreshWithLoadedConfiguration(Configuration conf,
      PolicyProvider provider) {
    ...
    // Parse the config file
    // <--- provider might be null here
    Service[] services = provider.getServices();
    ... {code}
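One possible guard is sketched below. It is a minimal, self-contained illustration and not the project's actual fix: {{ServiceSource}} and {{AclRefresher}} are hypothetical stand-ins for PolicyProvider and ServiceAuthorizationManager, and the choice to fail fast with a descriptive message (rather than skip the refresh) is an assumption.

```java
import java.util.Objects;

// Hypothetical stand-in for PolicyProvider; not a Hadoop class.
interface ServiceSource {
    String[] getServices();
}

// Hypothetical stand-in for the refresh logic; not a Hadoop class.
class AclRefresher {
    // Returns the number of services that would be processed.
    static int refresh(ServiceSource provider) {
        // Fail fast with a descriptive message instead of a bare
        // NullPointerException from provider.getServices().
        Objects.requireNonNull(provider,
            "policy provider must not be null when authorization is enabled");
        String[] services = provider.getServices();
        return services == null ? 0 : services.length;
    }
}
```

With this guard a null provider still aborts the call, but the exception message names the cause. An alternative would be to treat a null provider as "no services" and return early; which behavior is appropriate for a security-related path is a design decision for the maintainers.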
h2. How to trigger this bug:

(1) Set hadoop.security.authorization to true

(2) Run the test org.apache.hadoop.ha.TestZKFailoverControllerStress#testRandomExpirations

(3) You will see the following stack trace:
{code:java}
java.lang.NullPointerException
        at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:258)
        at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:63)
        at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:181)
        at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:177)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:503)
        at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:177)
        at org.apache.hadoop.ha.MiniZKFCCluster$DummyZKFCThread.doWork(MiniZKFCCluster.java:301)
        at org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)
{code}
(4) The null pointer exception here is due to the null {{rpcServer}} object 
caused by the bug described above.
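
The mechanism can be reproduced in isolation. The sketch below uses hypothetical stand-in classes ({{PolicySource}}, {{DemoServer}}, {{DemoController}}, none of them Hadoop classes) to show that when a constructor throws, the field being assigned keeps its previous value, here null.

```java
// Hypothetical stand-in for PolicyProvider; not a Hadoop class.
interface PolicySource {
    String[] services();
}

// Hypothetical stand-in for ZKFCRpcServer; not a Hadoop class.
class DemoServer {
    final int serviceCount;

    DemoServer(PolicySource provider) {
        // Dereferences provider without a null check, like the reported code.
        this.serviceCount = provider.services().length;
    }
}

// Hypothetical stand-in for ZKFailoverController; not a Hadoop class.
class DemoController {
    DemoServer server; // remains null if the constructor throws

    void init(PolicySource provider) {
        // If the constructor throws, this assignment never happens.
        server = new DemoServer(provider);
    }

    // Returns true when init failed and left the field unassigned.
    boolean initLeavesServerNull(PolicySource provider) {
        try {
            init(provider);
        } catch (NullPointerException e) {
            // Constructor failed; fall through and inspect the field.
        }
        return server == null;
    }
}
```

Calling {{new DemoController().initLeavesServerNull(null)}} returns true, while passing a non-null provider returns false. Any code that later dereferences the field without checking it would then hit a second NullPointerException, which matches the symptom reported here.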

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
