I did some further checking and with the database configured as db.mode internal , everything works fine. For the internal db mode, db connections are associated with default geronimo transaction manager ( Database.java ). I think for the default external db configuration given , this does not happen and hence the issue.
Regards Nandika On Fri, Mar 1, 2013 at 2:17 PM, Nandika Jayawardana <jayaw...@gmail.com>wrote: > Hi Sathwik, > > I am running ode with tomcat 7.0.29 and mysql 5.5.29 version. I used the > configuration settings given under "Configuring ODE in Tomcat with MySql > database". from ode war deployment guide. { > http://ode.apache.org/war-deployment.html }. > As you have explained, when the JobProcessorException is thrown due to > instance lock timeout , the transaction will be rollback and default retry > setting of 3 times will happen. However, the restoration of the deleted job > back to job table does not happen. Therefore subsequent retries will also > result in JobNoLongerInDbException. At execTransaction method, when the > retry loop is over, the exception thrown will also be > JobNoLongerInDbException. Since this exception is caught at "catch > (JobNoLongerInDbException jde) " block, it will never go into the > exponential back off setting. > > Is there any additional configuration settings I need to do ? > > Regards > Nandika > > > On Fri, Mar 1, 2013 at 12:21 PM, Sathwik B P <sathwik...@gmail.com> wrote: > >> Hi, >> This is really strange. >> >> This is the ideal behaviour: >> If a job fails for any reason it gets retried defined by the parameter >> (immediateRetryCount default 3 times with a time interval >> _immediateTransactionRetryInterval default 1 sec) and then the scheduler >> will put it on a exponential backoff defined by pow(5,retryCount) where >> retryCount is <= 10. >> >> If the rollback doesn't happen incase of any exception then none of the >> jobs will ever complete since it will never go into the exponential >> backoff >> path. >> >> In my opinion the transaction manager will maintain the jdbc connection >> object throughout it's execution, no matter how many times the connection >> is borrowed during the transaction. >> >> Which database are you using and what configuration changes have you done >> in ode-axis.properties. >> >> regards, >> sathwik >> >> On Fri, Mar 1, 2013 at 1:31 AM, Nandika Jayawardana <jayaw...@gmail.com >> >wrote: >> >> > Hi All, >> > I am running ode trunk build with apache tomcat as described in [1] . I >> > have an asynchronous bpel process which has a receive, invoke and a >> > receive. When I run this process for a while, I see that there are few >> > incomplete instances, although all the expected messages reached ode. >> From >> > the debug logs, I figured that it is happening as follows. >> > >> > If a thread executing a job tries to acquire the process instance >> lock, >> > while another thread is executing on the same instance and times out, it >> > will throw a timeout exception at InstanceLockManager which will be >> wrapped >> > to a JobProcessorException. >> > >> > In SimpleScheduler, RunJob.call method, when the execution of a job >> starts, >> > it will try to delete the job from the db. For the initial try, it >> > would succeed since the job is in db. However, when >> > the JobProcessorException exception happens due to timeout on instance >> > lock, the transaction gets rolled back. Ideally, the job should be >> restored >> > back when the rollback happens. However, the job does not get restored >> to >> > db as the transaction manager and db resources are not associated. Hence >> > when the scheduler tries to retry 3 times by default, it will fail with >> > job no longer in db error. This results in few of the process instances >> > never completing since the job was abandoned even though the messages >> > reached ode. >> > >> > Following log extracts from the ode log explains the scenario. >> > >> > grep instanceid >> > >> > 16:36:12,115 ODEServer-78 DEBUG [InstanceLockManager] >> > Thread[ODEServer-78,5,main]: lock(iid=36423, time=1MICROSECONDS) >> > 16:36:12,115 ODEServer-78 DEBUG [InstanceLockManager] >> > Thread[ODEServer-78,5,main]: lock(iid=36423, >> > time=1MICROSECONDS)-->WAITING(held by Thread[ODEServer-9,5,main]) >> > 16:36:12,115 ODEServer-78 DEBUG [InstanceLockManager] >> > Thread[ODEServer-78,5,main]: lock(iid=36423, >> time=1MICROSECONDS)-->TIMEOUT >> > (held by Thread[ODEServer-9,5,main]) >> > 16:36:12,115 ODEServer-78 DEBUG [BpelEngineImpl] Instance 36423 is busy, >> > rescheduling job. >> > 16:36:12,239 ODEServer-9 DEBUG [InstanceLockManager] >> > Thread[ODEServer-9,5,main]: unlock(iid=36423) >> > 16:36:15,120 ODEServer-78 DEBUG [SimpleScheduler] job no longer in db >> > forced rollback: Job hqejbhcnphr8357nokgnxp time: 2013-02-28 16:36:11 >> IST >> > transacted: true persisted: true details: JobDetails( instanceId: 36423 >> > mexId: null processId: null type: MATCHER channel: null correlatorId: >> > DebugCallbackPL.debugOpCallback correlationKeySet: >> > @2[CorrelationSet~746ee3bf-4c4c-4da9-bdb0-233a760ce377] retryCount: null >> > inMem: false detailsExt: {}) >> > >> > grep jobid >> > >> > 16:36:11,960 ODEServer-9 DEBUG [JdbcDelegate] insertJob >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94 loaded=true >> > 16:36:12,007 ODEServer-1 DEBUG [SimpleScheduler] todo.enqueue job from >> db: >> > hqejbhcnphr8357nokgnxp for 1362049571960(16:36:11,960) >> > 16:36:12,007 ODEServer-78 DEBUG [JdbcDelegate] deleteJob >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94 >> > 16:36:12,032 ODEServer-9 DEBUG [SimpleScheduler] scheduled immediate >> job: >> > hqejbhcnphr8357nokgnxp >> > 16:36:12,239 ODEServer-9 DEBUG [SimpleScheduler] Job >> hqejbhcnphr8357nokgnxp >> > is being processed (outstanding job) >> > 16:36:13,116 ODEServer-78 DEBUG [JdbcDelegate] deleteJob >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94 >> > org.apache.ode.scheduler.simple.JobNoLongerInDbException: Job no longer >> in >> > db: hqejbhcnphr8357nokgnxp nodeId=hqejbhcnphr8357nokgj94 >> > 16:36:14,118 ODEServer-78 DEBUG [JdbcDelegate] deleteJob >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94 >> > org.apache.ode.scheduler.simple.JobNoLongerInDbException: Job no longer >> in >> > db: hqejbhcnphr8357nokgnxp nodeId=hqejbhcnphr8357nokgj94 >> > 16:36:15,119 ODEServer-78 DEBUG [JdbcDelegate] deleteJob >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94 >> > 16:36:15,120 ODEServer-78 DEBUG [SimpleScheduler] job no longer in db >> > forced rollback: Job hqejbhcnphr8357nokgnxp time: 2013-02-28 16:36:11 >> IST >> > transacted: true persisted: true details: JobDetails( instanceId: 36423 >> > mexId: null processId: null type: MATCHER channel: null correlatorId: >> > DebugCallbackPL.debugOpCallback correlationKeySet: >> > @2[CorrelationSet~746ee3bf-4c4c-4da9-bdb0-233a760ce377] retryCount: null >> > inMem: false detailsExt: {}) >> > >> > >> > Is this the expected behavior or is there any additional settings i >> should >> > configure to make transaction manager restore job to db at rollback ? >> > Will reinserting the job back to db when the JobProcessorException >> happens >> > fix this problem ? >> > >> > Regards >> > Nandika >> > >> > [1] http://ode.apache.org/war-deployment.html >> > >> > >