[jira] [Created] (NIFI-7266) NIFI 1.4.0 gets unresponsive after heavy load

Manuel Loayza (Jira) Tue, 17 Mar 2020 12:25:41 -0700

Manuel Loayza created NIFI-7266:
-----------------------------------

             Summary: NIFI 1.4.0 gets unresponsive after heavy load
                 Key: NIFI-7266
                 URL: https://issues.apache.org/jira/browse/NIFI-7266
             Project: Apache NiFi
          Issue Type: Bug
          Components: Configuration
    Affects Versions: 1.4.0, 1.3.0, 1.2.0
            Reporter: Manuel Loayza
         Attachments: Screen Shot 2020-03-17 at 3.18.27 PM.png


We have 2 clusters (6 instances each) running NIFI 1.1.2 + JDK 8u121 on 
CentOS Linux.

The traffic is divided between those 2 clusters:

1. TPS: 2700 - EAST cluster

2. TPS: 980 - WEST cluster

We have tried to migrate to NIFI 1.2.0, 1.3.0, and 1.4.0, but the cluster with 
the higher TPS (EAST) got stuck after 4 hours of intensive traffic. Its web 
console also became unresponsive.

I've tried many things to fix this, but the only result was extending the 
time before it fails from 4 to 6 hours.

Our current instances are running on AWS, and each EC2 instance has 8 CPUs 
(c5.2xlarge) and 16GB RAM.

I've also tried c5.4xlarge (which doubles the CPU and RAM), but I got the same 
outcome.

I don't have a clue what the issue is. I also have a Datadog dashboard 
tracking some Java heap metrics, but everything looks normal.

What should I do to find out why those new, better instances are failing? Is it 
memory, disk space, or stuck threads? Why does an old NIFI cluster configuration 
work better than a new NIFI?

Hope you can help me with this. 

Thanks

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
