wido opened a new issue #4657:
URL: https://github.com/apache/cloudstack/issues/4657


   ##### ISSUE TYPE
    * Bug Report
   
   
   ##### COMPONENT NAME
   ~~~
   Management Server
   MySQL
   ~~~
   
   ##### CLOUDSTACK VERSION
   ~~~
   4.13.1
   ~~~
   
   ##### CONFIGURATION
   This is a standard Management Server with a connection to a MariaDB server
   
   
   ##### OS / ENVIRONMENT
   - Ubuntu 16.04.7
   - 10.2.34-MariaDB-10.2.34+maria~xenial-log
   - CloudStack 4.13.1
   
   Settings in db.properties:
   
   <pre>
   
db.cloud.url.params=prepStmtCacheSize=517&cachePrepStmts=true&sessionVariables=sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
   db.cloud.maxIdle=30
   db.cloud.maxActive=250
   db.cloud.reconnectAtTxEnd=true
   db.cloud.autoReconnectForPools=true
   db.cloud.autoReconnect=tru
   </pre>
   
   ##### SUMMARY
   The Management Server seems to loose it's connection to MySQL while it is 
actually still up and running. Causing some components to fail inside the 
Management Server.
   
   
   ##### STEPS TO REPRODUCE
   Not known yet
   
   ##### ACTUAL RESULTS
   In the logs suddenly it shows:
   ~~~
   2021-02-06 03:59:04,748 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(AgentManager-Handler-12:null) (logid:) VM state report. host: 30, vm id: 1019, 
power state: PowerOn
   2021-02-06 03:59:04,751 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(AgentManager-Handler-12:null) (logid:) VM state report. host: 30, vm id: 187, 
power state: PowerOn
   2021-02-06 03:59:04,754 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(AgentManager-Handler-12:null) (logid:) VM state report. host: 30, vm id: 702, 
power state: PowerOn
   2021-02-06 03:59:04,780 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(AgentManager-Handler-12:null) (logid:) Done with process of VM state report. 
host: 30
   2021-02-06 03:59:04,836 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-14:null) (logid:) Ping from 34(s-1072-VM)
   2021-02-06 03:59:10,284 DEBUG [c.c.s.StatsCollector] 
(StatsCollector-3:ctx-9527a5cd) (logid:2204af40) AutoScaling Monitor is 
running...
   2021-02-06 03:59:13,074 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-3:null) (logid:) Ping from Routing host 
3(xx.c01.p01.xx.xx)
   2021-02-06 03:59:13,075 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(AgentManager-Handler-3:null) (logid:) Process host VM state report from ping 
process. host: 3
   2021-02-06 03:59:13,369 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-5:null) (logid:) SeqA 35-21752: Processing Seq 35-21752:  
{ Cmd , MgmtId: -1, via: 35, Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":1071,"_loadInfo":"{\n
  \"connections\": []\n}","wait":0}}] }
   2021-02-06 03:59:14,662 DEBUG [c.c.a.t.Request] 
(StatsCollector-1:ctx-dcc6eeba) (logid:098d3bcd) Seq 1-8767382574583524527: 
Received:  { Ans: , MgmtId: 90520732637524, via: 1(xx.c01.p01.xx), Ver: v1, 
Flags: 10, { GetVolumeStatsAnswer } }
   2021-02-06 03:59:14,662 DEBUG [c.c.a.m.AgentManagerImpl] 
(StatsCollector-1:ctx-dcc6eeba) (logid:098d3bcd) Details from executing class 
com.cloud.agent.api.GetVolumeStatsCommand: 
   2021-02-06 03:59:14,741 DEBUG [c.c.a.t.Request] 
(StatsCollector-1:ctx-dcc6eeba) (logid:098d3bcd) Seq 1-8767382574583524528: 
Received:  { Ans: , MgmtId: 90520732637524, via: 1(xx.c01.p01.xx), Ver: v1, 
Flags: 10, { GetVolumeStatsAnswer } }
   2021-02-06 03:59:14,741 DEBUG [c.c.a.m.AgentManagerImpl] 
(StatsCollector-1:ctx-dcc6eeba) (logid:098d3bcd) Details from executing class 
com.cloud.agent.api.GetVolumeStatsCommand: 
   2021-02-06 03:59:14,782 ERROR [o.a.c.f.j.d.SyncQueueItemDaoImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-6c122c53) (logid:f5164750) Unexpected sql 
exception, 
   java.sql.SQLException: Cannot get a connection, pool error Timeout waiting 
for idle object
           at 
org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
           at 
com.cloud.utils.db.TransactionLegacy.getConnection(TransactionLegacy.java:557)
           at 
com.cloud.utils.db.TransactionLegacy.prepareStatement(TransactionLegacy.java:468)
           at 
com.cloud.utils.db.TransactionLegacy.prepareAutoCloseStatement(TransactionLegacy.java:461)
           at 
org.apache.cloudstack.framework.jobs.dao.SyncQueueItemDaoImpl.getNextQueueItems(SyncQueueItemDaoImpl.java:106)
           at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
           at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:338)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at 
com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174)
           at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
           at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
           at com.sun.proxy.$Proxy195.getNextQueueItems(Unknown Source)
           at 
org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl$3.doInTransactionWithoutResult(SyncQueueManagerImpl.java:140)
           at 
com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
           at 
com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:50)
           at com.cloud.utils.db.Transaction.execute(Transaction.java:40)
           at com.cloud.utils.db.Transaction.execute(Transaction.java:47)
           at 
org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl.dequeueFromAny(SyncQueueManagerImpl.java:137)
           at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$6.reallyRun(AsyncJobManagerImpl.java:813)
           at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$6.runInContext(AsyncJobManagerImpl.java:801)
           at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
           at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
           at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
           at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
           at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
           at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
           at 
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:447)
           at 
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:361)
           at 
org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
           ... 36 more
   ~~~
   
   This then keeps repeating and the management server doesn't work properly 
until restarted.
   
   Other log entries:
   
   <pre>
   2021-02-06 03:59:15,230 DEBUG [c.c.c.ClusterManagerImpl] 
(Cluster-Heartbeat-1:ctx-6b154ebc) (logid:4a081a66) Management server heartbeat 
takes too long to finish. profiler: Done. Duration: 10003ms, 
profilerHeartbeatUpdate: Started but not stopped, profilerPeerScan: Not Started
   2021-02-06 03:59:15,230 ERROR [c.c.c.ClusterManagerImpl] 
(Cluster-Heartbeat-1:ctx-6b154ebc) (logid:4a081a66) Unexpected exception in 
cluster heartbeat
   java.lang.RuntimeException: Cannot get a connection, pool error Timeout 
waiting for idle object
           at 
com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:148)
           at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
           at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:338)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at 
com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174)
           at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
           at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
           at com.sun.proxy.$Proxy176.update(Unknown Source)
           at 
com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:555)
           at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
           at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
           at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
           at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
           at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
           at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: java.sql.SQLException: Cannot get a connection, pool error 
Timeout waiting for idle object
           at 
org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
           at 
com.cloud.utils.db.TransactionLegacy.getConnection(TransactionLegacy.java:557)
           at 
com.cloud.utils.db.TransactionLegacy.prepareStatement(TransactionLegacy.java:468)
           ... 27 more
   Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
           at 
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:447)
           at 
org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:361)
           at 
org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
           ... 29 more
   </pre>
   
   <pre>
   2021-02-06 03:59:15,227 WARN  [c.c.c.d.ManagementServerHostDaoImpl] 
(Cluster-Heartbeat-1:ctx-6b154ebc) (logid:4a081a66) Unexpected exception, 
   java.sql.SQLException: Cannot get a connection, pool error Timeout waiting 
for idle object
           at 
org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:142)
           at 
com.cloud.utils.db.TransactionLegacy.getConnection(TransactionLegacy.java:557)
           at 
com.cloud.utils.db.TransactionLegacy.prepareStatement(TransactionLegacy.java:468)
           at 
com.cloud.utils.db.TransactionLegacy.prepareAutoCloseStatement(TransactionLegacy.java:461)
           at 
com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:134)
           at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
           at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:338)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at 
com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174)
           at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
           at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
           at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
           at com.sun.proxy.$Proxy176.update(Unknown Source)
           at 
com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:555)
           at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
           at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
           at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
           at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
           at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
           at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
           at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
   </pre>
   
   
   Increasing *db.cloud.maxActive=250* might help, but I would rather not 
increase this until we are sure that's the root-cause.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to