Thank you, Rafael. I tested your fix and it seems to give the expected result. You can see the exceptions raised during database failover in the logs below. Note that I replaced the jar for both cloudstack-management and cloudstack-usage:
/usr/share/cloudstack-usage/lib/cloud-framework-cluster-4.9.3.0.jar
/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-framework-cluster-4.9.3.0.jar
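For anyone repeating this test, the replacement step can be sketched roughly as below. This is only an illustration of "stop ACS, back up the original jar, copy the patched one, start ACS": `replace_jar` is a hypothetical helper (not part of CloudStack), and the service names in the comments are my assumption about a packaged install.

```shell
#!/bin/sh
# Sketch only: back up an installed jar, then overwrite it with a patched copy.
# replace_jar is a hypothetical helper, not something shipped with CloudStack.
replace_jar() {
    patched="$1"   # path to the patched jar
    target="$2"    # path to the installed jar that will be overwritten
    if [ ! -f "$patched" ] || [ ! -f "$target" ]; then
        echo "missing file: $patched or $target" >&2
        return 1
    fi
    # Create the backup only once, so re-running never overwrites the original copy.
    if [ ! -e "$target.orig" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi
    cp "$patched" "$target"
}

# Intended use (service names are assumed; stop ACS first, start it afterwards):
#   service cloudstack-management stop; service cloudstack-usage stop
#   replace_jar ./cloud-framework-cluster-4.9.3.0.jar /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-framework-cluster-4.9.3.0.jar
#   replace_jar ./cloud-framework-cluster-4.9.3.0.jar /usr/share/cloudstack-usage/lib/cloud-framework-cluster-4.9.3.0.jar
#   service cloudstack-management start; service cloudstack-usage start
# To restore after testing: stop the services and move each "<jar>.orig" back over "<jar>".
```

The backup keeps a ".orig" suffix so that its name no longer ends in ".jar"; if you prefer, keep the backups outside the lib directories entirely.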
Logs:

WARN [c.c.c.d.ManagementServerHostDaoImpl] (Cluster-Heartbeat-1:ctx-073cca55) (logid:e652d00b) Unexpected exception,
com.cloud.utils.exception.CloudRuntimeException: Unable to commit or close the connection.
    at com.cloud.utils.db.TransactionLegacy.commit(TransactionLegacy.java:740)
    at com.cloud.cluster.dao.ManagementServerHostDaoImpl.update(ManagementServerHostDaoImpl.java:140)
    at sun.reflect.GeneratedMethodAccessor103.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
    at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    at com.sun.proxy.$Proxy203.update(Unknown Source)
    at com.cloud.cluster.ClusterManagerImpl$4.runInContext(ClusterManagerImpl.java:555)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    ... 46 more
INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-5ef0f4d1) (logid:4bfa48b2) Begin cleanup expired async-jobs
INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-5ef0f4d1) (logid:4bfa48b2) End cleanup expired async-jobs
ERROR [c.c.u.d.ConnectionConcierge] (ConnectionConcierge-1:ctx-d3460aeb) (logid:b8c62262) Unable to keep the db connection for LockMaster1
java.sql.SQLException: Connection was killed
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
    at com.mysql.jdbc.LoadBalancedMySQLConnection.execSQL(LoadBalancedMySQLConnection.java:155)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2283)
    at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.mysql.jdbc.LoadBalancingConnectionProxy$ConnectionErrorFiringInvocationHandler.invoke(LoadBalancingConnectionProxy.java:103)
    at com.mysql.jdbc.FailoverConnectionProxy$FailoverInvocationHandler.invoke(FailoverConnectionProxy.java:51)
    at com.sun.proxy.$Proxy257.executeQuery(Unknown Source)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at com.cloud.utils.db.ConnectionConcierge$ConnectionConciergeManager.testValidity(ConnectionConcierge.java:148)
    at com.cloud.utils.db.ConnectionConcierge$ConnectionConciergeManager$1.runInContext(ConnectionConcierge.java:203)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
    at java.lang.Thread.run(Thread.java:748)
INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-22399f19) (logid:0deff5fe) Begin cleanup expired async-jobs
INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-22399f19) (logid:0deff5fe) End cleanup expired async-jobs
INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-6004a86b) (logid:64e4a3b3) Begin cleanup expired async-jobs
INFO [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-6004a86b) (logid:64e4a3b3) End cleanup expired async-jobs
INFO [c.c.h.v.m.HostMO] (DirectAgent-19:ctx-351cbbac host01.cloud.local, cmd: GetVmStatsCommand) (logid:56c8e250) VM i-2-124-VM not found in host cache
INFO [c.c.h.v.m.HostMO] (DirectAgent-1:ctx-9c0f5042 host02.cloud.local, cmd: GetVmStatsCommand) (logid:56c8e250) VM i-2-125-VM not found in host cache
ERROR [c.c.u.d.ConnectionConcierge] (ConnectionConcierge-1:ctx-544154d7) (logid:bd0f585a) Unable to keep the db connection for LockMaster1
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 19,479 milliseconds ago. The last packet sent successfully to the server was 19,479 milliseconds ago.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1116)
    at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3352)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1971)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
    at com.mysql.jdbc.LoadBalancedMySQLConnection.execSQL(LoadBalancedMySQLConnection.java:155)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2283)
    at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.mysql.jdbc.LoadBalancingConnectionProxy$ConnectionErrorFiringInvocationHandler.invoke(LoadBalancingConnectionProxy.java:103)
    at com.mysql.jdbc.FailoverConnectionProxy$FailoverInvocationHandler.invoke(FailoverConnectionProxy.java:51)
    at com.sun.proxy.$Proxy257.executeQuery(Unknown Source)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at com.cloud.utils.db.ConnectionConcierge$ConnectionConciergeManager.testValidity(ConnectionConcierge.java:148)
    at com.cloud.utils.db.ConnectionConcierge$ConnectionConciergeManager$1.runInContext(ConnectionConcierge.java:203)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Broken pipe (Write failed)
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:115)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:161)
    ...
36 more

On Mon, Dec 18, 2017 at 2:03 PM, Rafael Weingärtner <rafaelweingart...@gmail.com> wrote:
> Here is a fix:
> https://www.dropbox.com/s/kgakhs3v05uz88x/cloud-framework-cluster-4.9.3.0.jar?dl=1
> You need to replace this jar file in the CloudStack installation. You should
> also back up the original jar and restore it as soon as you finish testing.
> To replace the JARs, you need to stop ACS, and only then start it again.
>
> If everything works fine, I will open a PR against master, and with a bit
> of luck we can push it into 4.11
>
> On Sat, Dec 16, 2017 at 8:03 AM, Alireza Eskandari <astro.alir...@gmail.com> wrote:
>
>> I'm using CS 4.9.3.0-shapeblue0
>>
>> On Sat, Dec 16, 2017 at 12:49 PM, Rafael Weingärtner
>> <rafaelweingart...@gmail.com> wrote:
>> > Awesome!
>> > I found one method that might be the cause of the problem.
>> > What is the version of ACS that you are using?
>> >
>> > On Sat, Dec 16, 2017 at 4:10 AM, Alireza Eskandari <astro.alir...@gmail.com> wrote:
>> >
>> >> Hi
>> >>
>> >> Gabriel,
>> >> My configuration is the same as your suggestion, but I still get the errors.
>> >>
>> >> Rafael,
>> >> You are right. I confirm that CS works normally but I get those warnings.
>> >> It would make me happy to help you with this fix :)
>> >>
>> >>
>> >> On Tue, Dec 12, 2017 at 3:30 PM, Rafael Weingärtner
>> >> <rafaelweingart...@gmail.com> wrote:
>> >> > Alireza,
>> >> > This is a warning and should not cause you much trouble. I have been trying
>> >> > to pinpoint this problem for quite some time now.
>> >> > If I generate a fix, would you be willing to test it?
>> >> >
>> >> > On Tue, Dec 12, 2017 at 8:56 AM, Gabriel Beims Bräscher <gabrasc...@gmail.com> wrote:
>> >> >
>> >> >> Hi Alireza,
>> >> >>
>> >> >> I have production environments with Master to Master replication and
>> >> >> we have no problems. We may need more details of your configuration.
>> >> >> Have you configured the slave database? Are you sure that you
>> >> >> configured the HA heuristic correctly?
>> >> >>
>> >> >> Considering that you already configured replication and "my.cnf", I will
>> >> >> focus on the CloudStack db.properties file.
>> >> >>
>> >> >> When configuring Master-Master replication, you should have in
>> >> >> /etc/cloudstack/management/db.properties something like:
>> >> >> -----------------------------
>> >> >> db.cloud.autoReconnectForPools=true
>> >> >>
>> >> >> #High Availability And Cluster Properties
>> >> >> db.ha.enabled=true
>> >> >>
>> >> >> db.cloud.queriesBeforeRetryMaster=5000
>> >> >> db.usage.failOverReadOnly=false
>> >> >> db.cloud.slaves=acs-db-02
>> >> >>
>> >> >> cluster.node.IP=<cluster node IP>
>> >> >>
>> >> >> db.usage.autoReconnect=true
>> >> >>
>> >> >> db.cloud.host=acs-db-01
>> >> >> db.usage.host=acs-db-01
>> >> >>
>> >> >> #db.ha.loadBalanceStrategy=com.mysql.jdbc.SequentialBalanceStrategy
>> >> >> db.ha.loadBalanceStrategy=com.cloud.utils.db.StaticStrategy
>> >> >>
>> >> >> db.cloud.failOverReadOnly=false
>> >> >> db.usage.slaves=acs-db-02
>> >> >> -----------------------------
>> >> >>
>> >> >> "db.ha.loadBalanceStrategy" is configured with the heuristic
>> >> >> "com.cloud.utils.db.StaticStrategy"
>> >> >>
>> >> >> "db.ha.enabled" needs to be “true”
>> >> >>
>> >> >> The primary database is configured with the variable “db.cloud.host”. The
>> >> >> secondary database(s) is(are) configured with the variable
>> >> >> “db.usage.slaves”. One variable that is different from both Apache
>> >> >> CloudStack servers is “cluster.node.IP”, being the ACS mgt IP.
>> >> >> Additionally, you will need to create a folder
>> >> >> “/usr/share/cloudstack-mysql-ha/lib/” and move the jar file
>> >> >> “cloud-plugin-database-mysqlha-4.9.3.0.jar” into the new folder.
>> >> >>
>> >> >> -----------------------------
>> >> >> mkdir -p /usr/share/cloudstack-mysql-ha/lib/
>> >> >> cp /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-plugin-database-mysqlha-4.9.3.0.jar /usr/share/cloudstack-mysql-ha/lib/
>> >> >> -----------------------------
>> >> >>
>> >> >> Cheers,
>> >> >> Gabriel.
>> >> >>
>> >> >> 2017-12-12 6:30 GMT-02:00 Alireza Eskandari <astro.alir...@gmail.com>:
>> >> >>
>> >> >> > I have opened a new jira ticket about this problem:
>> >> >> > https://issues.apache.org/jira/browse/CLOUDSTACK-10186
>> >> >> >
>> >> >>
>> >> >
>> >> > --
>> >> > Rafael Weingärtner
>> >>
>> >
>> > --
>> > Rafael Weingärtner
>>
>
> --
> Rafael Weingärtner
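
P.S. Regarding the db.properties settings quoted above: a quick way to confirm that a management server actually has the HA keys set might look like the sketch below. `get_prop` and `check_db_ha` are hypothetical helper names of mine, and the check only looks at the keys named in the quoted message; it does not validate hostnames or connectivity.

```shell
#!/bin/sh
# Sketch only: report the HA-related keys from a CloudStack db.properties file.
# get_prop and check_db_ha are hypothetical helpers, not CloudStack tools.

# Print the value of the last uncommented "key=value" line for key $2 in file $1.
get_prop() {
    awk -F= -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); val = $0 } END { print val }' "$1"
}

# Return 0 only if db.ha.enabled is true and the other HA keys are non-empty.
check_db_ha() {
    file="$1"
    status=0
    if [ "$(get_prop "$file" db.ha.enabled)" != "true" ]; then
        echo "db.ha.enabled is not true"
        status=1
    fi
    for key in db.cloud.slaves db.usage.slaves db.ha.loadBalanceStrategy; do
        value=$(get_prop "$file" "$key")
        if [ -z "$value" ]; then
            echo "$key is not set"
            status=1
        else
            echo "$key=$value"
        fi
    done
    return "$status"
}

# Usage: check_db_ha /etc/cloudstack/management/db.properties
```

Commented-out lines such as "#db.ha.loadBalanceStrategy=..." are ignored because the key comparison is exact.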