Joe Witt created NIFI-10414:
-------------------------------
Summary: Zookeeper/Curator NPE during pod restarts in K8S
environment/Openshift
Key: NIFI-10414
URL: https://issues.apache.org/jira/browse/NIFI-10414
Project: Apache NiFi
Issue Type: Bug
Affects Versions: 1.16.1
Environment: NiFi 1.16.1 on RedHat OpenShift 4.9
Reporter: Joe Witt
Walter Moar, 10:51 AM:
Hello, I'm running NiFi 1.16.1 on RedHat OpenShift 4.9. When OpenShift
operations cause ZooKeeper pod restarts, the NiFi cluster loses its controller.
When this happens I have to scale the NiFi pods down and then back up again.
The error in the logs is:
2022-08-30 17:02:19,452 ERROR [main-EventThread] o.a.c.f.imps.CuratorFrameworkImpl Background exception was not retry-able or retry gave up
java.lang.NullPointerException: null
    at org.apache.curator.utils.Compatibility.getHostAddress(Compatibility.java:116)
    at org.apache.curator.framework.imps.EnsembleTracker.configToConnectionString(EnsembleTracker.java:185)
    at org.apache.curator.framework.imps.EnsembleTracker.processConfigData(EnsembleTracker.java:206)
    at org.apache.curator.framework.imps.EnsembleTracker.access$300(EnsembleTracker.java:50)
    at org.apache.curator.framework.imps.EnsembleTracker$2.processResult(EnsembleTracker.java:150)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:926)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:683)
    at org.apache.curator.framework.imps.WatcherRemovalFacade.processBackgroundOperation(WatcherRemovalFacade.java:152)
    at org.apache.curator.framework.imps.GetConfigBuilderImpl$2.processResult(GetConfigBuilderImpl.java:222)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:598)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
The NPE should be fixed by https://issues.apache.org/jira/browse/CURATOR-538.
Once CURATOR-538 lands, we need to pull in the new Curator version. That should
address the issue seen above, but it may expose something else.
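
For context, here is a minimal Java sketch of one way a host-address lookup
like the frame at the top of the trace could hit a null. This is an assumption
about the failure mode, not a reading of Curator's source: in the JDK,
InetSocketAddress.getAddress() returns null when the socket address is
unresolved, which can plausibly happen while a ZooKeeper pod's DNS name is
briefly unavailable during a restart. The class name, the safeHostAddress
helper, and the host name "zk-0.zk-headless" are hypothetical and exist only
for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class UnresolvedAddressSketch {

    // Hypothetical null-safe helper (NOT Curator's implementation).
    // Returns the IP literal when the address is resolved, otherwise
    // falls back to the host string the address was created with.
    static String safeHostAddress(InetSocketAddress address) {
        if (address == null) {
            return "unknown";
        }
        // getAddress() is null for an unresolved InetSocketAddress.
        InetAddress resolved = address.getAddress();
        return (resolved != null) ? resolved.getHostAddress() : address.getHostString();
    }

    public static void main(String[] args) {
        // createUnresolved performs no DNS lookup, so this models a host
        // name that cannot currently be resolved (assumed scenario).
        InetSocketAddress unresolved =
                InetSocketAddress.createUnresolved("zk-0.zk-headless", 2181);

        // Calling unresolved.getAddress().getHostAddress() here would throw
        // a NullPointerException because getAddress() returns null.
        System.out.println("isUnresolved = " + unresolved.isUnresolved());
        System.out.println("safeHostAddress = " + safeHostAddress(unresolved));
    }
}
```

If this is indeed the failure mode, a guard of this shape in the client would
avoid the NPE, though whether the session then recovers cleanly would still
need to be verified against a real ZooKeeper restart.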