ConfX created HADOOP-18811:
------------------------------

             Summary: Buggy ZKFCRpcServer constructor creates null object and crashes the rpcServer
                 Key: HADOOP-18811
                 URL: https://issues.apache.org/jira/browse/HADOOP-18811
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: ConfX

h2. What happened:

In ZKFailoverController.java, initRPC() obtains the ZKFC RpcServer binding address and creates a new ZKFCRpcServer object, rpcServer. However, when getPolicyProvider() returns null, the ZKFCRpcServer constructor receives a null policy provider and fails, so rpcServer is left null and any later use of rpcServer throws a NullPointerException.

h2. Buggy code:

In ZKFailoverController.java
{code:java}
  protected void initRPC() throws IOException {
    InetSocketAddress bindAddr = getRpcAddressToBindTo();
    LOG.info("ZKFC RpcServer binding to {}", bindAddr);
    rpcServer = new ZKFCRpcServer(conf, bindAddr, this, getPolicyProvider()); // <-- getPolicyProvider() might return null here
  }
{code}
The ZKFCRpcServer constructor eventually calls the refreshWithLoadedConfiguration() function below. This function dereferences provider without a null check, which is what leaves rpcServer above null.

In ServiceAuthorizationManager.java
{code:java}
  @Private
  public void refreshWithLoadedConfiguration(Configuration conf, PolicyProvider provider) {
    ...
    // Parse the config file
    Service[] services = provider.getServices(); // <--- provider might be null here
    ...
{code}

h2. How to trigger this bug:

(1) Set hadoop.security.authorization to true.

(2) Run the test org.apache.hadoop.ha.TestZKFailoverControllerStress#testRandomExpirations.

(3) You will see the following stack trace:
{code:java}
java.lang.NullPointerException
    at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:258)
    at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:63)
    at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:181)
    at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:177)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:503)
    at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:177)
    at org.apache.hadoop.ha.MiniZKFCCluster$DummyZKFCThread.doWork(MiniZKFCCluster.java:301)
    at org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)
{code}
(4) The NullPointerException here is caused by the null {{rpcServer}} object left behind by the bug described above.
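
The failure mode can be reproduced outside Hadoop with a small standalone sketch. Everything in it is a hypothetical stand-in: the nested PolicyProvider interface, the class name NullProviderGuardSketch, and both refresh methods are simplified illustrations, not Hadoop's real classes and not necessarily the fix adopted for this issue. It only shows how a null check before the dereference could turn the delayed NullPointerException into an immediate, descriptive error.

```java
import java.util.ArrayList;
import java.util.List;

public class NullProviderGuardSketch {

    /** Hypothetical, simplified stand-in for a policy provider. */
    interface PolicyProvider {
        String[] getServices();
    }

    /** Mirrors the unguarded pattern: dereferences provider directly. */
    static List<String> refreshUnguarded(PolicyProvider provider) {
        List<String> loaded = new ArrayList<>();
        // Throws NullPointerException here if provider is null.
        for (String service : provider.getServices()) {
            loaded.add(service);
        }
        return loaded;
    }

    /** One possible guard (an assumption): fail fast with a clear message. */
    static List<String> refreshGuarded(PolicyProvider provider) {
        if (provider == null) {
            throw new IllegalArgumentException(
                "policy provider must not be null when authorization is enabled");
        }
        return refreshUnguarded(provider);
    }

    public static void main(String[] args) {
        // A non-null provider loads its services normally.
        System.out.println(refreshGuarded(() -> new String[] {"exampleService"}));

        // A null provider is rejected up front instead of failing later.
        try {
            refreshGuarded(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether to throw, log and skip authorization setup, or fall back to a default provider is a design decision for the maintainers; the sketch picks fail-fast only because it makes the root cause visible at the call site.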