[
https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696396#comment-15696396
]
ASF GitHub Bot commented on CLOUDSTACK-9595:
--------------------------------------------
Github user serg38 commented on the issue:
https://github.com/apache/cloudstack/pull/1762
@rafaelweingartner You might be right that pod_vlan_map should be in the
join. Maybe I didn't find the correct methods after all. @jburwell @rhtyd What
do you think?
I was able to find the management server log for Deadlock 1. It looks like one
of the transactions came from the findAndUpdateDirectAgentToLoad method in
HostDaoImpl, which creates a rather complex transaction:
2016-11-24 15:04:39,284 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager
Timer:ctx-a8e9449c) Resetting hosts suitable for reconnect
2016-11-24 15:04:39,320 DEBUG [db.Transaction.Transaction]
(ClusteredAgentManager Timer:ctx-a8e9449c) Rolling back the transaction: Time =
36 Name = ClusteredAgentManager Timer; called by
-TransactionLegacy.rollback:879-TransactionLegacy.removeUpTo:822-TransactionLegacy.close:646-TransactionContextInterceptor.invoke:36-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocation.proceed:172-JdkDynamicAopProxy.invoke:204-$Proxy48.findAndUpdateDirectAgentToLoad:-1-ClusteredAgentManagerImpl.scanDirectAgentToLoad:195-ClusteredAgentManagerImpl.runDirectAgentScanTimerTask:185-ClusteredAgentManagerImpl.access$100:99
2016-11-24 15:04:39,322 ERROR [agent.manager.ClusteredAgentManagerImpl]
(ClusteredAgentManager Timer:ctx-a8e9449c) Unexpected exception DB Exception
on: com.mysql.jdbc.JDBC4PreparedStatement@1e58727c: SELECT host.id,
host.disconnected, host.name, host.status, host.type, host.private_ip_address,
host.private_mac_address, host.private_netmask, host.public_netmask,
host.public_ip_address, host.public_mac_address, host.storage_ip_address,
host.cluster_id, host.storage_netmask, host.storage_mac_address,
host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2,
host.hypervisor_type, host.proxy_port, host.resource, host.fs_type,
host.available, host.setup, host.resource_state, host.hypervisor_version,
host.update_count, host.uuid, host.data_center_id, host.pod_id,
host.cpu_sockets, host.cpus, host.url, host.speed, host.ram, host.parent,
host.guid, host.capabilities, host.total_size, host.last_ping,
host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed
FROM host WHERE host.resource IS NOT NULL AND host.mgmt_server_id =
345048964870 AND host.last_ping <= 1445339907 AND host.cluster_id IS NOT NULL
AND host.status IN ('Disconnected','Down','Alert') AND host.removed IS NULL
FOR UPDATE
Caused by:
com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock
found when trying to get lock; try restarting transaction
The beginning of the second transaction was:
SELECT host.id, host.disconnected, host.name, host.status, host.type,
host.private_ip_address, host.private_mac_address, host.private_netmask,
host.public_netmask, host.public_ip_address, host.public_mac_address,
host.storage_ip_address, host.cluster_id, host.storage_netmask,
host.storage_mac_address, host.storage_ip_address_2, host.storage_netmask_2,
host.storage_mac_address_2, host.hypervisor_type, host.proxy_port,
host.resource, host.fs_type, host.available, host.setup, host.resource_state,
host.hypervisor_version, host.update_count, host.uuid, host.data_center_id,
host.pod_id, host.cpu_sockets, host.cpus, host.url, host.speed, host.ram,
host.parent, host.guid, host.capabilities, host.total_size, host.last_ping,
host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed
FROM host LEFT OUTER JOIN op_host_transfer ON host.id=op_host_transfer.id IN
I will try to trace it to the ACS method.
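For reference, the retry behaviour the issue title asks for could be sketched roughly as below. This is only an illustration, not the ACS transaction code: the class DeadlockRetry and the methods isDeadlock and executeWithRetry are hypothetical names, and the sketch assumes the callable opens, commits and rolls back its own transaction so that a retry re-runs the whole unit of work. It relies on MySQL reporting a deadlock as error code 1213 with SQLSTATE 40001.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of CloudStack: retries a unit of work
// when the database reports a deadlock.
public class DeadlockRetry {
    // MySQL reports ER_LOCK_DEADLOCK as vendor code 1213, SQLSTATE 40001.
    public static final int MYSQL_ER_LOCK_DEADLOCK = 1213;
    public static final String SQLSTATE_SERIALIZATION_FAILURE = "40001";

    // Walks the cause chain, since DAO layers usually wrap the SQLException
    // (for example in a runtime exception).
    public static boolean isDeadlock(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException) {
                SQLException e = (SQLException) c;
                if (e.getErrorCode() == MYSQL_ER_LOCK_DEADLOCK
                        || SQLSTATE_SERIALIZATION_FAILURE.equals(e.getSQLState())) {
                    return true;
                }
            }
        }
        return false;
    }

    // Runs the work up to maxAttempts times. Only deadlocks are retried;
    // any other failure, or a deadlock on the last attempt, is rethrown.
    public static <T> T executeWithRetry(Callable<T> work, int maxAttempts,
                                         long backoffMillis) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                if (!isDeadlock(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Linear backoff so the competing transaction can finish.
                Thread.sleep(backoffMillis * attempt);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated work: deadlocks twice, then succeeds.
        String result = executeWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException(new SQLException(
                        "Deadlock found when trying to get lock; try restarting transaction",
                        SQLSTATE_SERIALIZATION_FAILURE, MYSQL_ER_LOCK_DEADLOCK));
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Retrying is only safe if the whole transaction is repeated from the start, because MySQL rolls back the deadlock victim entirely. A wrapper like this would have to sit around the outermost transaction boundary, not around a single statement inside it.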
> Transactions are not getting retried in case of database deadlock errors
> ------------------------------------------------------------------------
>
> Key: CLOUDSTACK-9595
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Affects Versions: 4.8.0
> Reporter: subhash yedugundla
> Fix For: 4.8.1
>
>
> A customer is seeing occasional 'Deadlock found when trying to get lock;
> try restarting transaction' errors in their management server logs. This
> happens at least once a day. The following error is seen:
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer]
> (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception
> executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on:
> com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM
> instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
> at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> at com.sun.proxy.$Proxy237.expunge(Unknown Source)
> at
> com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
> at
> com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
> at
> com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
> at
> com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)