[
https://issues.apache.org/jira/browse/CLOUDSTACK-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175321#comment-14175321
]
ASF subversion and git services commented on CLOUDSTACK-7749:
-------------------------------------------------------------
Commit dbf12d58e713ee5d3a120999cebf5a87d728b209 in cloudstack's branch
refs/heads/4.5 from [~minchen07]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=dbf12d5 ]
CLOUDSTACK-7749: AsyncJob GC thread cannot purge queue items that have been
blocking for too long if exception is thrown in expunging some unfinished or
completed old jobs, this will make some future jobs stuck.
> AsyncJob GC thread cannot purge queue items that have been blocking for too
> long if exception is thrown in expunging some unfinished or completed old
> jobs, this will make some future jobs stuck.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-7749
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7749
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.3.0
> Reporter: Min Chen
> Assignee: Min Chen
> Priority: Critical
> Fix For: 4.5.0
>
>
> AsyncJobManager has a GC thread to clean up some unfinished job and complete
> jobs that are too old. In this same thread, we are also forcefully cancel
> blocking queue items if they've been staying there for too long. Currently if
> there is an exception thrown in expunging one job, for example, like the one
> below:
> 2014-10-14 17:57:26,347 INFO [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-67ae4177) Expunging unfinished job AsyncJobVO
> {id:18443, userId: 60, accountId: 69, instanceType: null, instanc eId: null,
> cmd: com.cloud.vm.VmWorkStart, cmdInfo:
> rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIACkoABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljb
>
> HVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL
>
> 01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAABFAAAAAAAAADwAAAAAAAACdXQAGVZpcnR1YWxNYWNoaW5lTWFuY
>
> WdlckltcGwAAAAAAAAAAHBwcHBwcHBzcgARamF2YS51dGlsLkhhc2hNYXAFB9rBwxZg0QMAAkYACmxvYWRGYWN0b3JJAAl0aHJlc2hvbGR4cD9AAAAAAAAMdwgAAAAQAAAAAXQAClZtUGFzc3dvcmR0ABxyTzBBQlhRQURuTmhkbVZrWDNCaGMzTjNiM0preHA,
> cmdVersi on: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 244536014864905, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: Wed Aug 20 18:14:13 EDT 2014}
> 2014-10-14 17:57:26,350 DEBUG [c.c.u.d.T.Transaction]
> (AsyncJobMgr-Heartbeat-1:ctx-67ae4177) Rolling back the transaction: Time = 2
> Name = AsyncJobMgr-Heartbeat-1; called by -TransactionLegacy.rollback:8
> 96-TransactionLegacy.removeUpTo:839-TransactionLegacy.close:663-TransactionContextInterceptor.invoke:35-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocat
> ion.proceed:172-JdkDynamicAopProxy.invoke:204-$Proxy151.expunge:-1-AsyncJobManagerImpl$8.doInTransactionWithoutResult:802-TransactionCallbackNoReturn.doInTransaction:25-Transaction$2.doInTransaction:49
> 2014-10-14 17:57:26,368 ERROR [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-67ae4177) Unexpected exception when trying to
> execute queue item,
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on:
> com.mysql.jdbc.JDBC4PreparedStatement@39fdfb52: DELETE FROM async_job WHERE
> async_job.id= 18443
> at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1144)
> at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:622)
> at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:33)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> at com.sun.proxy.$Proxy151.expunge(Unknown Source)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$8.doInTransactionWithoutResult(AsyncJobManagerImpl.java:802)
> at
> com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl.expungeAsyncJob(AsyncJobManagerImpl.java:799)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$7.reallyRun(AsyncJobManagerImpl.java:762)
> at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$7.runInContext(AsyncJobManagerImpl.java:738)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:50)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:47)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:701)
> Caused by:
> com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException:
> Cannot delete or update a parent row: a foreign key constraint fails
> (`cloud`.`async_job_join_map`, CONSTRAINT
> `fk_async_job_join_map__join_job_id` FOREIGN KEY (`join_job_id`) REFERENCES
> `async_job` (`id`))
> all the following purge queue item action will not get chances to run at all.
> This will cause potential job stuck issues since we are serializing VM
> operations per VM.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)