Hi Bryan,

Yes, I plan on having an odd number of ZooKeeper nodes, but for now I am just
trying to get the secure cluster set up across multiple VMs (in this case, 2).
For whatever reason, one of my two machines is currently not able to ping or
be pinged, and I am investigating the cause now. Once that is resolved I will
retry the cluster setup and follow up on whether it fixes this issue (it makes
sense that it would).
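
For reference, once the networking issue is sorted out, this is roughly how I
plan to verify connectivity from each instance (assuming netcat is available;
the hostnames are the placeholders from the configs in my original mail):

ping -c 3 my-host-name-2
nc -zv my-host-name-2 2181   # ZooKeeper client port
nc -zv my-host-name-2 2888   # ZooKeeper quorum port
nc -zv my-host-name-2 3888   # ZooKeeper leader election port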

Cheers,

Ryan H.

On Tue, Mar 7, 2017 at 11:00 AM, Bryan Bende <[email protected]> wrote:

> Hi Ryan,
>
> Is each instance able to successfully ping the hostname of the other
> instance? (using the hostname you are using in the nifi config)
>
> Are ports 2181, 2888, and 3888 definitely open on both nodes?
>
> Also, I don't think this is the cause of your problem, but in general
> you should have an odd number of ZooKeeper nodes, so typically 1, 3,
> or 5 nodes.
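>
> As an example (purely illustrative; my-host-name-3 is hypothetical), a
> three-node ensemble would list all three servers in zookeeper.properties
> on every node:
>
> server.1=my-host-name-1:2888:3888
> server.2=my-host-name-2:2888:3888
> server.3=my-host-name-3:2888:3888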
>
> -Bryan
>
>
> On Tue, Mar 7, 2017 at 9:39 AM, Ryan H
> <[email protected]> wrote:
> >
> > Hi All,
> >
> > I am running into another issue setting up a secure NiFi cluster across 2
> > EC2 instances in AWS. Shortly after starting up the two nodes, the
> > nifi-app.log is completely spammed with the following error message(s):
> >
> > 2017-03-06 13:48:06,029 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
> > org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) [curator-framework-2.11.0.jar:na]
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
> >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_121]
> >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_121]
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
> >     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> > 2017-03-06 13:48:06,029 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background retry gave up
> > org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:838) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) [curator-framework-2.11.0.jar:na]
> >     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) [curator-framework-2.11.0.jar:na]
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
> >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_121]
> >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_121]
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
> >     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> >
> >
> > The error looks very similar to the following post:
> > http://apache-nifi-developer-list.39713.n7.nabble.com/Zookeeper-error-td13915.html
> >
> > However, the fix suggested there (clearing the state and flow from the
> > NiFi nodes) did not resolve the issue. As an aside and FWIW, when stopping
> > the nodes using ./bin/nifi.sh stop I get the following shutdown message:
> >
> > ERROR [main] org.apache.nifi.bootstrap.Command Failed to send shutdown
> > command to port 32993 due to java.net.SocketTimeoutException: Read timed
> > out. Will kill the NiFi Process with PID 5984.
> >
> > This issue also looks very close to the following bug documented in the
> > Apache JIRA:
> > https://issues.apache.org/jira/browse/CURATOR-209
> >
> > Here is what I have previously set up successfully during my development
> > efforts:
> >
> > Single standalone unsecured NiFi.
> > Multiple unsecured (clustered) nodes on a single EC2 instance.
> > Multiple unsecured nodes across multiple EC2 instances.
> > Single standalone secured NiFi.
> > Multiple secured (clustered) nodes on a single EC2 instance.
> >
> > Below are the relevant config files for my 2 nodes. Any help is greatly
> > appreciated!
> >
> > Update: I suspect there may be an issue with the hostname I am using, but
> > I am not sure. I am using the hostname returned by the "hostname" command
> > in the terminal. From what I understand, this is just the internal EC2
> > hostname and not an FQDN. I am not sure whether this is relevant.
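> >
> > For reference, the value I am using can be compared against the internal
> > DNS name that EC2 reports (the last command below hits the standard EC2
> > instance metadata endpoint):
> >
> > hostname
> > hostname -f
> > curl http://169.254.169.254/latest/meta-data/local-hostname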
> >
> > Cheers,
> >
> > Ryan H.
> >
> >
> > -----------------------------------------
> >
> > EC2 Instance 1
> >
> > -----------------------------------------
> >
> > nifi.properties
> >
> > nifi.state.management.embedded.zookeeper.start=true
> >
> > # Site to Site properties
> > nifi.remote.input.host=my-host-name-1
> > nifi.remote.input.secure=true
> > nifi.remote.input.socket.port=10443
> > nifi.remote.input.http.enabled=true
> > nifi.remote.input.http.transaction.ttl=30 sec
> >
> > # web properties #
> > nifi.web.war.directory=./lib
> > nifi.web.http.host=
> > nifi.web.http.port=
> > nifi.web.https.host=my-host-name-1
> > nifi.web.https.port=443
> > nifi.web.jetty.working.directory=./work/jetty
> > nifi.web.jetty.threads=200
> >
> > # cluster common properties (all nodes must have same values) #
> > nifi.cluster.protocol.heartbeat.interval=5 sec
> > nifi.cluster.protocol.is.secure=true
> >
> > # cluster node properties (only configure for cluster nodes) #
> > nifi.cluster.is.node=true
> > nifi.cluster.node.address=my-host-name-1
> > nifi.cluster.node.protocol.port=11443
> > nifi.cluster.node.protocol.threads=10
> > nifi.cluster.node.event.history.size=25
> > nifi.cluster.node.connection.timeout=5 sec
> > nifi.cluster.node.read.timeout=5 sec
> > nifi.cluster.firewall.file=
> > nifi.cluster.flow.election.max.wait.time=1 mins
> > nifi.cluster.flow.election.max.candidates=
> >
> > # zookeeper properties, used for cluster management #
> > nifi.zookeeper.connect.string=my-host-name-1:2181,my-host-name-2:2181
> > nifi.zookeeper.connect.timeout=3 secs
> > nifi.zookeeper.session.timeout=3 secs
> > nifi.zookeeper.root.node=/nifi
> >
> > -----------------------------------------
> >
> > state-management.xml
> >
> > <cluster-provider>
> >         <id>zk-provider</id>
> >         <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
> >         <property name="Connect String">my-host-name-1:2181,my-host-name-2:2181</property>
> >         <property name="Root Node">/nifi</property>
> >         <property name="Session Timeout">10 seconds</property>
> >         <property name="Access Control">Open</property>
> >     </cluster-provider>
> >
> > -----------------------------------------
> >
> > zookeeper.properties
> >
> > clientPort=2181
> > initLimit=10
> > autopurge.purgeInterval=24
> > syncLimit=5
> > tickTime=2000
> > dataDir=./state/zookeeper
> > autopurge.snapRetainCount=30
> >
> > server.1=my-host-name-1:2888:3888
> > server.2=my-host-name-2:2888:3888
> >
> > -----------------------------------------
> >
> > authorizers.xml
> >
> > <authorizer>
> >         <identifier>file-provider</identifier>
> >         <class>org.apache.nifi.authorization.FileAuthorizer</class>
> >         <property name="Authorizations
> > File">./conf/authorizations.xml</property>
> >         <property name="Users File">./conf/users.xml</property>
> >         <property name="Initial Admin Identity">CN=admin,
> OU=NIFI</property>
> >         <property name="Legacy Authorized Users File"></property>
> >         <property name="Node Identity 1">CN=my-host-name-1,
> > OU=NIFI</property>
> >         <property name="Node Identity 2">CN=my-host-name-2,
> > OU=NIFI</property>
> >     </authorizer>
> >
> > -----------------------------------------
> >
> > EC2 Instance 2
> >
> > -----------------------------------------
> >
> > nifi.properties
> >
> > nifi.state.management.embedded.zookeeper.start=true
> >
> > # Site to Site properties
> > nifi.remote.input.host=my-host-name-2
> > nifi.remote.input.secure=true
> > nifi.remote.input.socket.port=10443
> > nifi.remote.input.http.enabled=true
> > nifi.remote.input.http.transaction.ttl=30 sec
> >
> > # web properties #
> > nifi.web.war.directory=./lib
> > nifi.web.http.host=
> > nifi.web.http.port=
> > nifi.web.https.host=my-host-name-2
> > nifi.web.https.port=443
> > nifi.web.jetty.working.directory=./work/jetty
> > nifi.web.jetty.threads=200
> >
> > # cluster common properties (all nodes must have same values) #
> > nifi.cluster.protocol.heartbeat.interval=5 sec
> > nifi.cluster.protocol.is.secure=true
> >
> > # cluster node properties (only configure for cluster nodes) #
> > nifi.cluster.is.node=true
> > nifi.cluster.node.address=my-host-name-2
> > nifi.cluster.node.protocol.port=11443
> > nifi.cluster.node.protocol.threads=10
> > nifi.cluster.node.event.history.size=25
> > nifi.cluster.node.connection.timeout=5 sec
> > nifi.cluster.node.read.timeout=5 sec
> > nifi.cluster.firewall.file=
> > nifi.cluster.flow.election.max.wait.time=1 mins
> > nifi.cluster.flow.election.max.candidates=
> >
> > # zookeeper properties, used for cluster management #
> > nifi.zookeeper.connect.string=my-host-name-1:2181,my-host-name-2:2181
> > nifi.zookeeper.connect.timeout=3 secs
> > nifi.zookeeper.session.timeout=3 secs
> > nifi.zookeeper.root.node=/nifi
> >
> > -----------------------------------------
> >
> > state-management.xml
> >
> > <cluster-provider>
> >         <id>zk-provider</id>
> >         <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
> >         <property name="Connect String">my-host-name-1:2181,my-host-name-2:2181</property>
> >         <property name="Root Node">/nifi</property>
> >         <property name="Session Timeout">10 seconds</property>
> >         <property name="Access Control">Open</property>
> >     </cluster-provider>
> >
> > -----------------------------------------
> >
> > zookeeper.properties
> >
> > clientPort=2181
> > initLimit=10
> > autopurge.purgeInterval=24
> > syncLimit=5
> > tickTime=2000
> > dataDir=./state/zookeeper
> > autopurge.snapRetainCount=30
> >
> > server.1=my-host-name-1:2888:3888
> > server.2=my-host-name-2:2888:3888
> >
> > -----------------------------------------
> >
> > authorizers.xml
> >
> > <authorizer>
> >         <identifier>file-provider</identifier>
> >         <class>org.apache.nifi.authorization.FileAuthorizer</class>
> >         <property name="Authorizations
> > File">./conf/authorizations.xml</property>
> >         <property name="Users File">./conf/users.xml</property>
> >         <property name="Initial Admin Identity">CN=admin,
> OU=NIFI</property>
> >         <property name="Legacy Authorized Users File"></property>
> >         <property name="Node Identity 1">CN=my-host-name-1,
> > OU=NIFI</property>
> >         <property name="Node Identity 2">CN=my-host-name-2,
> > OU=NIFI</property>
> >     </authorizer>
>
