[ https://issues.apache.org/jira/browse/ZOOKEEPER-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Enrico Olivelli reassigned ZOOKEEPER-4296:
------------------------------------------
Assignee: Enrico Olivelli
> NullPointerException when ClientCnxnSocketNetty is closed without being opened
> ------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-4296
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4296
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.9, 3.5.3, 3.6.3, 3.6.2
> Reporter: Colvin Cowie
> Assignee: Enrico Olivelli
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> I believe this bug was originally reported as ZOOKEEPER-2966 but that was
> closed as not reproducible in February 2019. I left a comment with these
> details on that issue in December. I can create a PR with a fix at some point
> this week.
>
> In ZooKeeper 3.6.2, in the context of the SolrJ client, we hit the NPE
> reported on ZOOKEEPER-2966 when a DNS error causes an exception after the
> SolrZkClient tries to connect to ZooKeeper, and the client then immediately
> calls close on the {{ClientCnxn}}
> [https://github.com/apache/solr/blob/releases/lucene-solr%2F8.7.0/solr/solrj/src/java/org/apache/solr/common/cloud/SolrZkClient.java#L158-L204].
> {noformat}
> java.lang.NullPointerException: null
>     at org.apache.zookeeper.ClientCnxnSocketNetty.onClosing(ClientCnxnSocketNetty.java:247) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.zookeeper.ClientCnxn$SendThread.close(ClientCnxn.java:1445) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.zookeeper.ClientCnxn.disconnect(ClientCnxn.java:1488) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1517) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:1614) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.solr.common.cloud.SolrZooKeeper.close(SolrZooKeeper.java:97) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
>     at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:198) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
>     at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:127) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
>     at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:122) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
>     at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:109) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
> {noformat}
> This happens if the {{ClientCnxnSocketNetty}}'s {{onClosing()}} is called
> before {{connect(...)}} (or if connect isn't called at all) because the
> {{firstConnect}} {{CountDownLatch}} is only initialized in {{connect(...)}}.
>
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L129]
>
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L247]
> A null check in {{onClosing()}} will fix it, but I don't know if there's any
> greater change required, e.g. some synchronization around connect and
> onClosing.
> The code in
> [3.5.3|https://github.com/apache/zookeeper/blame/1507f67a06175155003722297daeb60bc912af1d/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L206]
> looks very similar, so the bug appears to have been present since the initial
> commit of {{ClientCnxnSocketNetty}}.
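
The failure mode described in the report, and the null check it suggests, can be illustrated with a minimal sketch. This is not the ZooKeeper source or the actual patch: {{UnguardedSocket}}, {{GuardedSocket}}, {{LatchGuardDemo}}, {{pendingCount()}} and {{throwsNpe()}} are hypothetical names, and only the {{firstConnect}} latch, {{connect()}} and {{onClosing()}} mirror names from the report. Marking the field {{volatile}} and reading it once are assumptions of the sketch, not a claim about what synchronization the real fix needs.

```java
import java.util.concurrent.CountDownLatch;

class LatchGuardDemo {
    /** Runs the action and reports whether it threw a NullPointerException. */
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Close without ever calling connect(), as in the reported scenario.
        System.out.println("unguarded close before connect throws NPE: "
                + throwsNpe(() -> new UnguardedSocket().onClosing()));
        System.out.println("guarded close before connect throws NPE: "
                + throwsNpe(() -> new GuardedSocket().onClosing()));
    }
}

/** Hypothetical stand-in showing the failure mode; not the real ClientCnxnSocketNetty. */
class UnguardedSocket {
    // Left null until connect() runs, as the report describes for firstConnect.
    private CountDownLatch firstConnect;

    void connect() {
        firstConnect = new CountDownLatch(1);
    }

    void onClosing() {
        // Throws NullPointerException when connect() was never called.
        firstConnect.countDown();
    }
}

/** Same shape, with the null check suggested in the report. */
class GuardedSocket {
    // volatile is an assumption of this sketch, so a close issued from
    // another thread observes the latch once connect() has assigned it.
    private volatile CountDownLatch firstConnect;

    void connect() {
        firstConnect = new CountDownLatch(1);
    }

    void onClosing() {
        // Read the field once so the check and the call use the same reference.
        CountDownLatch latch = firstConnect;
        if (latch != null) {
            latch.countDown();
        }
    }

    /** Returns the latch count, or -1 if connect() has not run yet. */
    long pendingCount() {
        CountDownLatch latch = firstConnect;
        return latch == null ? -1 : latch.getCount();
    }
}
```

The sketch only covers the close-before-connect ordering; whether {{connect(...)}} and {{onClosing()}} can race in the real client, and what that would require, is the open question raised in the description.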