Hi Joe,

Thanks for your feedback.
It is deployed as part of the Hortonworks stack, so upgrading NiFi might require upgrading the entire stack. We also have some integrations with NiFi through the REST API, along with some custom processors. We plan to upgrade down the line, but not immediately because of these challenges.

We have roughly 5-10M flow files in queues when bringing NiFi up. However, we observe this issue even with fewer flow files.

Thanks & Regards,
Avinash M V

From: Joe Witt <[email protected]>
Sent: 29 June 2021 17:15
To: [email protected]
Cc: Modepalli Venkata Avinash <[email protected]>
Subject: Re: Taking Huge Time for Connect all Nodes to NIFI cluster

Hello

A cluster that size should be fine. We did make various improvements to cluster behavior and startup times, though. What prevents you from moving to 1.13?

How many flowfiles are in the repo when restarting takes that long?

Thanks

On Tue, Jun 29, 2021 at 7:38 AM Joe Obernberger <[email protected]> wrote:

Impressive cluster size! I do not have an answer for you, but could you change your architecture so that instead of one large NiFi cluster you have 2 or 3 smaller clusters? Very curious about the answer here, as I have also noticed UI slow-downs as the number of nodes increases.

-Joe

On 6/29/2021 3:45 AM, Modepalli Venkata Avinash wrote:

Hi List,

We have a 13-node NiFi cluster in production, and a NiFi restart takes a very long time to complete. According to our analysis, flow election and validation of the flow from the other nodes against the coordinator is what takes the most time, approximately 30 hours. Even after all 13 nodes are connected, the NiFi UI responds very slowly.

Please find the cluster details below.

Apache NiFi Version : 1.9.0
Flow.xml.gz size : 13MB (gz compressed)
OS : RHEL 7.6
JDK : jdk1.8.0_151
GC : The default GC of JDK 1.8 (Parallel GC) is in place. G1GC is commented out because of numerous bugs in JDK 8 when it is used with the WriteAheadProvenanceRepository.
Min & Max Memory : 140GB
Server Memory Per Node : 256GB
CPU/CORE : 48
Number of Nodes in Cluster : 13
Max Timer Driven Thread : 100
Running Processors Count : 12K
Stopped Processors Count : 10K
Disabled Processors Count : 25K
Total Processor Count : 47K

We couldn’t find any abnormalities in the app logs, bootstrap logs, or GC logging. Could you please share any input that would help identify and resolve this issue?

Thanks for your help.

Thanks & Regards,
Avinash M V
