Hi Purushotham,

Thanks for attaching the thread dumps. All of them show the same pattern:

"Timer-Driven Process Thread-6" Id=155 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4f94fece
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
    at com.simba.athena.athena.dataengine.AJStreamResultSet.dequeue(Unknown Source)
    at com.simba.athena.athena.dataengine.AJStreamResultSet.<init>(Unknown Source)
    at com.simba.athena.athena.dataengine.AJQueryExecutor.execute(Unknown Source)
    at com.simba.athena.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
    at com.simba.athena.jdbc.common.SPreparedStatement.executeQuery(Unknown Source)
    - waiting on com.simba.athena.jdbc.jdbc42.S42PreparedStatement@4a5576ae
    at org.apache.commons.dbcp2.PoolableConnection.validate(PoolableConnection.java:287)
    at org.apache.commons.dbcp2.PoolableConnectionFactory.validateConnection(PoolableConnectionFactory.java:389)
    at org.apache.commons.dbcp2.PoolableConnectionFactory.validateObject(PoolableConnectionFactory.java:375)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:484)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:365)
    at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
    at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1563)
    at org.apache.nifi.dbcp.DBCPConnectionPool.getConnection(DBCPConnectionPool.java:470)
    at org.apache.nifi.dbcp.DBCPService.getConnection(DBCPService.java:49)
    at sun.reflect.GeneratedMethodAccessor481.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:87)
    at com.sun.proxy.$Proxy128.getConnection(Unknown Source)
    at org.apache.nifi.processors.standard.AbstractExecuteSQL.onTrigger(AbstractExecuteSQL.java:222)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1162)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:209)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Number of Locked Synchronizers: 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@4054f861

So we see here that NiFi is using a PoolableConnection and calling "validate" when the connection is borrowed from the pool, which attempts to run the validation query ("SELECT 1" in your case). The thread is parked in ArrayBlockingQueue.take() inside the Athena JDBC driver's AJStreamResultSet, waiting for a result that never arrives, so the driver never returns.

Unfortunately, I don't know that there is much that can be done about that on the NiFi side. Searching for issues around JDBC connections hanging with Athena did turn up an AWS troubleshooting document [1] about queries hanging, which resolves the problem by changing ICMP/MTU settings. I would recommend trying the recommendations there, if you haven't already.
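
To illustrate why the thread in the dump above never recovers, here is a minimal sketch using only the JDK. It is not NiFi, DBCP, or Athena driver code: the class ValidationTimeoutSketch, the method validateWithTimeout, and the empty queue standing in for the driver's internal result queue are all hypothetical names made up for this example. It shows that take() on a queue nobody fills blocks indefinitely, and how a caller-side timeout could bound such a call, assuming the blocked code responds to interruption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; not NiFi, DBCP, or Athena driver code. */
class ValidationTimeoutSketch {

    /**
     * Runs a validation check on a helper thread and gives up after timeoutMillis.
     * Returns true only if the check finished in time and reported success.
     */
    static boolean validateWithTimeout(Callable<Boolean> check, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "validation-sketch");
            t.setDaemon(true); // do not keep the JVM alive if the check never returns
            return t;
        });
        Future<Boolean> result = executor.submit(check);
        try {
            return Boolean.TRUE.equals(result.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            result.cancel(true); // interrupts the helper thread; take() responds to interruption
            return false;
        } catch (ExecutionException e) {
            return false; // the check itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            result.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the driver's result queue: nothing is ever enqueued,
        // so take() parks the calling thread, as in the dump.
        BlockingQueue<String> neverFilled = new ArrayBlockingQueue<>(1);
        Callable<Boolean> hangingCheck = () -> neverFilled.take() != null;
        Callable<Boolean> healthyCheck = () -> true;

        System.out.println("hanging check valid: " + validateWithTimeout(hangingCheck, 200));
        System.out.println("healthy check valid: " + validateWithTimeout(healthyCheck, 200));
    }
}
```

In the real stack the blocking call sits inside the driver, so a wrapper like this would only help if the driver's code reacts to interruption. I have not verified that for the Athena driver.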

[1] https://docs.aws.amazon.com/redshift/latest/mgmt/connecting-drop-issues.html

On Jun 10, 2019, at 3:24 AM, Purushotham Pushpavanthar <[email protected]> wrote:

Hi Mark,

I ran into the same issue today, which let me capture thread dumps. Attached are the thread dumps from our 3-node cluster. Our nodes are running as Docker containers on c5.xlarge instances. NiFi version 1.9.2.

Regards,
Purushotham Pushpavanth

On Fri, 7 Jun 2019 at 17:51, Mark Payne <[email protected]> wrote:

Purushotham,

Generally, if you run into a situation where you have a stuck thread, you will need to provide a thread dump to understand what is going on. It’s easiest to do that by running “bin/nifi.sh dump dump1.txt” and then attaching the created dump1.txt to the email.

Thanks
-Mark

Sent from my iPhone

On Jun 7, 2019, at 3:45 AM, Purushotham Pushpavanthar <[email protected]> wrote:

Hi,

I've been using ExecuteSQL to execute some DDL statements whenever there is an update to my S3. This was working fine for me except for one glitch: the processor stops processing incoming flowfiles while its threads remain running. Once the processor gets into this state, it never recovers. It's not possible to stop the processor without forcefully terminating it. It starts working fine again once I restart it through forceful termination.

I went through the mail thread at http://apache-nifi-users-list.2361937.n4.nabble.com/ExecuteSQL-question-how-do-I-stop-long-running-queries-td3039.html and tried adding a Validation Query, but it didn't help. I'm sending very lightweight DDL statements like ALTER TABLE ADD PARTITION. I don't think this is causing much load on the Athena end.

I've attached my ExecuteSQL and DBCPConnectionPool configuration. Kindly review it and help me resolve this, or let me know of a workaround.

Regards
Purushotham Pushpavanth

<Screenshot 2019-06-07 at 12.58.33 PM.png>
<Screenshot 2019-06-07 at 12.59.01 PM.png>
<Screenshot 2019-06-07 at 12.58.41 PM.png>
<dump3.txt>
<dump2.txt>
<dump1.txt>
