Hi Andy,

Thanks for the additional info.  I think I saw a link to that while searching 
but was wary since it was such an old version.

I have two VMs (nifi1 and nifi2), both running NiFi with identical configs, and 
I am trying to use the built-in ZK to cluster them.

If I only mention a single machine within the config (e.g. if nifi1 doesn’t 
refer to nifi2, or vice versa), I don’t get any startup errors.

Phil

From: Andy LoPresto
Sent: Tuesday, 2 October 2018 1:00 PM
To: [email protected]
Subject: Re: Zookeeper - help!

Hi Phil, 

Nathan’s advice is correct but I think he was assuming all other configurations 
are correct as well. Are you trying to run both NiFi nodes and ZK instances on 
the same machine? In that case you will have to ensure that the ports in use 
are different for each service so they don’t conflict. Setting them all to the 
same value only works if each service is running on an independent physical 
machine, virtual machine, or container. 
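
As a rough illustration of the point above, the following Python sketch groups 
listeners by host and port and reports any endpoint claimed more than once. The 
function name find_port_conflicts, the service labels, and the port numbers are 
all made up for the example; it also treats each host name as a single address 
and ignores wildcard binds such as 0.0.0.0.

```python
from collections import defaultdict


def find_port_conflicts(listeners):
    """Return {(host, port): [service, ...]} for endpoints claimed more than once.

    `listeners` is an iterable of (host, service, port) tuples.
    """
    claimed = defaultdict(list)
    for host, service, port in listeners:
        claimed[(host, port)].append(service)
    return {endpoint: names for endpoint, names in claimed.items() if len(names) > 1}


# Hypothetical layout: two NiFi nodes on ONE machine, both using port 10000.
same_machine = [
    ("localhost", "nifi-node-1 cluster protocol", 10000),
    ("localhost", "nifi-node-2 cluster protocol", 10000),
]

# Hypothetical layout: one node per VM, so reusing the port number is fine.
separate_machines = [
    ("nifi1.domain", "nifi-node-1 cluster protocol", 10000),
    ("nifi2.domain", "nifi-node-2 cluster protocol", 10000),
]

print(find_port_conflicts(same_machine))       # one conflicting endpoint
print(find_port_conflicts(separate_machines))  # no conflicts
```

The same check could be extended with the Zookeeper client port and any other 
listener you configure, as long as every entry carries the host it runs on.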

I find Pierre’s guide [1] to be a helpful step-by-step instruction list as well 
as a good explanation of how the clustering concepts work in practice. When you 
get that working, and you’re ready to set up a secure cluster, he has a 
follow-on guide for that as well [2]. Even as someone who has set up many 
clustered instances of NiFi, I use his guides regularly to ensure I haven’t 
forgotten a step. 

They were originally written for versions 1.0.0 and 1.1.0, but the only thing 
that has changed is the authorizer configuration for the secure instances 
(you’ll need to put the Initial Admin Identity and Node Identities in two 
locations in the authorizers.xml file instead of just once). 

Hopefully this helps you get a working cluster up and running so you can 
experiment. Good luck. 

[1] https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/
[2] https://pierrevillard.com/2016/11/29/apache-nifi-1-1-0-secured-cluster-setup/


Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Oct 1, 2018, at 2:45 PM, Phil H <[email protected]> wrote:

Thanks Nathan,

I changed the protocol.port to 10002 on both servers.

On server 1, I now just see endless copies of the second error from my original 
message (“KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss”) – I don’t know if that’s normal when there’s only a single 
member of a cluster alive and running?  Seems like the logs will fill up very 
quickly if it is!

On server 2, I get a bind exception on the Zookeeper client port.  It doesn’t 
matter what I set it to (In this example, I changed it to 10500) I always get 
the same result.  If I run netstat when nifi isn’t running, there’s nothing 
listening on the port.  It’s like NiFi is starting two Zookeeper instances?!  
There’s no repeat of this in the start up sequence though.  Both servers are 
running completely vanilla 1.6.0 – I don’t even have any flow defined yet, this 
is purely for teaching myself clustering config – so I don’t know why one is 
behaving differently to the other.

2018-10-02 17:36:31,610 INFO [QuorumPeer[myid=2]/0.0.0.0:10500] o.a.zookeeper.server.ZooKeeperServer Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir ./state/zookeeper/version-2 snapdir ./state/zookeeper/version-2
2018-10-02 17:36:31,612 ERROR [QuorumPeer[myid=2]/0.0.0.0:10500] o.apache.zookeeper.server.quorum.Leader Couldn't bind to nifi2.domain/192.168.10.102:10500
java.net.BindException: Address already in use (Bind failed)
        at java.net.PlainSocketImpl.socketBind(Native Method)
        at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
        at java.net.ServerSocket.bind(ServerSocket.java:375)
        at java.net.ServerSocket.bind(ServerSocket.java:329)
        at org.apache.zookeeper.server.quorum.Leader.<init>(Leader.java:193)
        at org.apache.zookeeper.server.quorum.QuorumPeer.makeLeader(QuorumPeer.java:605)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:798)
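
One way to double-check the netstat result is to try binding the address 
directly. The sketch below is a minimal Python example using only the standard 
socket module; the helper name port_is_free is hypothetical, and it reports 
False for any OSError, which includes "Address already in use" but also 
permission problems or a host name that does not resolve.

```python
import socket


def port_is_free(host, port):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Covers EADDRINUSE as well as other bind failures.
            return False
        return True
```

Run on nifi2 while NiFi is stopped, port_is_free("nifi2.domain", 10500) would 
be expected to agree with netstat. It only describes the moment of the call, so 
it cannot show whether something inside NiFi binds the same port twice during 
start up.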




From: Nathan Gough
Sent: Tuesday, 2 October 2018 2:22 AM
To: [email protected]
Subject: Re: Zookeeper - help!

Hi Phil,

One thing I notice with your config is that nifi.cluster.node.protocol.port and 
the zookeeper ports are the same; they should be different. The node protocol 
port is used by the NiFi cluster to communicate between nodes, whereas the port 
in nifi.zookeeper.connect.string should be the port that the zookeeper service 
is listening on. The zookeeper port is configured by the clientPort property in 
the zookeeper.properties file. This would make your connect string: 
'nifi.zookeeper.connect.string=nifi1.domain:2180,nifi2.domain:2180', where 2180 
stands for whatever clientPort is configured to.
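
To make the relationship between these properties concrete, here is a small 
Python sketch that parses fragments of nifi.properties and zookeeper.properties 
and flags the two problems described above. The property names come from this 
thread; the helper names (parse_properties, check_zookeeper_ports) and the 
sample values are assumptions for illustration only, and the parser handles 
nothing beyond plain key=value lines.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_zookeeper_ports(nifi_props, zk_props):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    protocol_port = int(nifi_props["nifi.cluster.node.protocol.port"])
    client_port = int(zk_props["clientPort"])
    for entry in nifi_props["nifi.zookeeper.connect.string"].split(","):
        host, _, port = entry.strip().rpartition(":")
        port = int(port)
        if port != client_port:
            problems.append(
                f"{host}: connect string uses {port}, but clientPort is {client_port}"
            )
        if port == protocol_port:
            problems.append(
                f"{host}: connect string reuses the node protocol port {port}"
            )
    return problems


# Assumed zookeeper.properties fragment (2180 is only an example value).
ZK_PROPERTIES = "clientPort=2180\n"

# Fragment as originally posted for nifi1.
ORIGINAL_NIFI = """
nifi.cluster.node.protocol.port=10000
nifi.zookeeper.connect.string=nifi2.domain:10000,nifi1.domain:10000
"""

# Fragment with the connect string pointing at clientPort instead.
SUGGESTED_NIFI = """
nifi.cluster.node.protocol.port=10000
nifi.zookeeper.connect.string=nifi1.domain:2180,nifi2.domain:2180
"""

zk = parse_properties(ZK_PROPERTIES)
for problem in check_zookeeper_ports(parse_properties(ORIGINAL_NIFI), zk):
    print(problem)
print(check_zookeeper_ports(parse_properties(SUGGESTED_NIFI), zk))
```

The check only compares port numbers; it does not verify that a Zookeeper 
server is actually reachable on each host.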

You can read more about how NiFi uses Zookeeper and how to configure it here: 
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#state_management.

Let us know what happens once these properties are configured correctly.

Nathan


On 9/30/18, 11:07 PM, "Phil H" <[email protected]> wrote:

   Hi guys,

   Pulling my hair out trying to solve my Zookeeper problems.  I have two 1.6.0 
servers that I am trying to cluster.

   Here is the excerpt from the properties files – all other properties are 
default so omitted for clarity.   The servers are set up to run HTTPS, and the 
interface works via the browser, so I believe the certificates are correctly 
installed.

   Server nifi1.domain:
   nifi.cluster.is.node=true
   nifi.cluster.node.address=nifi1.domain
   nifi.cluster.node.protocol.port=10000

   nifi.zookeeper.connect.string=nifi2.domain:10000,nifi1.domain:10000
   nifi.zookeeper.root.node=/nifi

   Server nifi2.domain:
   nifi.cluster.is.node=true
   nifi.cluster.node.address=nifi2.domain
   nifi.cluster.node.protocol.port=10000

   nifi.zookeeper.connect.string=nifi1.domain:10000,nifi2.domain:10000
   nifi.zookeeper.root.node=/nifi

   I am getting these errors (this is from server 2, but seeing the same on 
server 1 apart from a different address, of course):

   2018-10-01 20:54:16,332 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 10000
   2018-10-01 20:54:16,381 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow
   2018-10-01 20:54:16,435 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: nifi2.domain:443
   2018-10-01 20:54:16,769 ERROR [Process Cluster Protocol Request-1] o.a.nifi.security.util.CertificateUtils The incoming request did not contain client certificates and thus the DN cannot be extracted. Check that the other endpoint is providing a complete client certificate chain
   2018-10-01 20:54:16,771 WARN [Process Cluster Protocol Request-1] o.a.n.c.p.impl.SocketProtocolListener Failed processing protocol message from nifi2 due to org.apache.nifi.cluster.protocol.ProtocolException: java.security.cert.CertificateException: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
   org.apache.nifi.cluster.protocol.ProtocolException: java.security.cert.CertificateException: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
           at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.getRequestorDN(SocketProtocolListener.java:225)
           at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.dispatchRequest(SocketProtocolListener.java:131)
           at org.apache.nifi.io.socket.SocketListener$2$1.run(SocketListener.java:136)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: java.security.cert.CertificateException: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
           at org.apache.nifi.security.util.CertificateUtils.extractPeerDNFromClientSSLSocket(CertificateUtils.java:314)
           at org.apache.nifi.security.util.CertificateUtils.extractPeerDNFromSSLSocket(CertificateUtils.java:269)
           at org.apache.nifi.cluster.protocol.impl.SocketProtocolListener.getRequestorDN(SocketProtocolListener.java:223)
           ... 5 common frames omitted
   Caused by: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
           at sun.security.ssl.SSLSessionImpl.getPeerCertificates(SSLSessionImpl.java:440)
           at org.apache.nifi.security.util.CertificateUtils.extractPeerDNFromClientSSLSocket(CertificateUtils.java:299)
           ... 7 common frames omitted



   2018-10-01 20:54:32,249 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
   2018-10-01 20:54:32,250 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
   org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
           at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
           at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
           at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
           at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
           at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
           at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)





