Karthik, understood. Do you have those logs?
On Tue, Oct 16, 2018, 9:59 AM Karthik Kothareddy (karthikk) [CONT - Type 2] <[email protected]> wrote:

> Joe,
>
> The slow node is Node04 in this case, but we get one such slow response
> from a random node (Node01, Node02, Node03) every time we see this warning.
>
> -Karthik
>
> From: Joe Witt [mailto:[email protected]]
> Sent: Tuesday, October 16, 2018 7:55 AM
> To: [email protected]
> Subject: [EXT] Re: Cluster Warnings
>
> The logs show the fourth node is the slowest by far in all cases.
> Possibly a DNS or other related issue? But definitely focus on that node
> as the outlier; presuming the NiFi config is identical, it suggests
> system/network differences from the other nodes.
>
> Thanks
>
> On Tue, Oct 16, 2018, 9:51 AM Karthik Kothareddy (karthikk) [CONT - Type 2] <[email protected]> wrote:
>
> Hello,
>
> We're running a 4-node cluster on NiFi 1.7.1. The fourth node was added
> recently, and as soon as we added it we started seeing the warning below:
>
> Response time from NODE2 was slow for each of the last 3 requests made.
> To see more information about timing, enable DEBUG logging for
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
>
> Initially we thought the problem was with the recently added node, but we
> cross-checked all the configs on that box and everything seemed fine.
> After enabling DEBUG logging for the cluster, we noticed that the warning
> is not specific to any one node: every time we see a warning like the one
> above, there is one node that takes far longer than the others to respond
> (in this case the slow node is NIFI04). Sometimes these slow responses
> lead to node disconnects that need manual intervention.
> DEBUG [Replicate Request Thread-50]
> o.a.n.c.c.h.r.ThreadPoolRequestReplicator Node Responses for GET
> /nifi-api/site-to-site (Request ID b2c6e983-5233-4007-bd54-13d21b7068d5):
> NIFI04:8443: 1386 millis
> NIFI02:8443: 3 millis
> NIFI01:8443: 5 millis
> NIFI03:8443: 3 millis
>
> DEBUG [Replicate Request Thread-41]
> o.a.n.c.c.h.r.ThreadPoolRequestReplicator Node Responses for GET
> /nifi-api/site-to-site (Request ID d182fdab-f1d4-4ac9-97fd-e24c41dc4622):
> NIFI04:8443: 1143 millis
> NIFI02:8443: 22 millis
> NIFI01:8443: 3 millis
> NIFI03:8443: 2 millis
>
> DEBUG [Replicate Request Thread-31]
> o.a.n.c.c.h.r.ThreadPoolRequestReplicator Node Responses for GET
> /nifi-api/site-to-site (Request ID e4726027-27c7-4bbb-8ab6-d02bb41f1920):
> NIFI04:8443: 1053 millis
> NIFI02:8443: 3 millis
> NIFI01:8443: 3 millis
> NIFI03:8443: 2 millis
>
> We tried changing configuration values in nifi.properties, such as bumping
> up "nifi.cluster.node.protocol.max.threads", but none of them seems to
> help and we're still stuck with the slow communication between the nodes.
> We use an external ZooKeeper, as this is our production server.
>
> Below are some of our configs:
>
> # cluster node properties (only configure for cluster nodes) #
> nifi.cluster.is.node=true
> nifi.cluster.node.address=fslhdppnifi01.imfs.micron.com
> nifi.cluster.node.protocol.port=11443
> nifi.cluster.node.protocol.threads=100
> nifi.cluster.node.protocol.max.threads=120
> nifi.cluster.node.event.history.size=25
> nifi.cluster.node.connection.timeout=90 sec
> nifi.cluster.node.read.timeout=90 sec
> nifi.cluster.node.max.concurrent.requests=1000
> nifi.cluster.firewall.file=
> nifi.cluster.flow.election.max.wait.time=30 sec
> nifi.cluster.flow.election.max.candidates=
>
> Any thoughts on why this is happening?
>
> -Karthik
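For anyone digging through a large nifi-app.log full of these replicator lines, a small script can rank the nodes by average response time and surface the outlier. This is a sketch, not part of NiFi itself; it assumes the exact "NODE:port: N millis" format shown in the DEBUG output above:

```python
import re
from collections import defaultdict

# Matches per-node timing lines emitted by ThreadPoolRequestReplicator at
# DEBUG level, e.g. "NIFI04:8443: 1386 millis".
TIMING_LINE = re.compile(r"^(\S+):\d+: (\d+) millis$")

def slowest_nodes(log_text):
    """Return node names sorted by average response time, slowest first."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        m = TIMING_LINE.match(line.strip())
        if m:
            times[m.group(1)].append(int(m.group(2)))
    return sorted(times, key=lambda n: sum(times[n]) / len(times[n]), reverse=True)

sample = """
NIFI04:8443: 1386 millis
NIFI02:8443: 3 millis
NIFI01:8443: 5 millis
NIFI03:8443: 3 millis
"""
print(slowest_nodes(sample))  # NIFI04 ranks first on this sample
```

Feeding it the whole log (rather than a single request's block) shows whether one node is consistently the outlier or whether the slowness moves around.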
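Joe's DNS hypothesis can also be sanity-checked directly from each node. A minimal sketch (the NIFI0x hostnames are placeholders for the cluster's actual node addresses) that times forward resolution, since a lookup that is consistently slow for one name or from one host points at a resolver or network difference rather than a NiFi configuration problem:

```python
import socket
import time

def resolve_times(hosts, port=8443):
    """Time forward DNS resolution per host; returns {host: (millis, status)}."""
    results = {}
    for host in hosts:
        start = time.monotonic()
        try:
            socket.getaddrinfo(host, port)
            status = "ok"
        except socket.gaierror:
            status = "unresolved"
        results[host] = ((time.monotonic() - start) * 1000, status)
    return results

# Placeholder names; substitute the cluster's real node addresses.
for host, (ms, status) in resolve_times(["NIFI01", "NIFI02", "NIFI03", "NIFI04"]).items():
    print(f"{host}: {ms:.1f} ms ({status})")
```

Running this on all four nodes (and also checking reverse lookups of each peer's IP) narrows down whether the slow replication responses line up with slow name resolution.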
