Hi,

I have been able to solve the problem by using the config setting specified
in the BTM jira http://jira.codehaus.org/browse/BTM-35

The original problem is still a question unanswered. Why is the connection
retrieved during deleteJob(..) is not enlisted in the transaction when
using Geronimo TM?

regards,
sathwik

On Wed, Jun 26, 2013 at 3:37 PM, Sathwik B P <sathwik...@gmail.com> wrote:

> Hi,
>
> I am able to see this problem on the trunk. PFA the logs.
>
> When the job is being executed by the scheduler RunJob.call()
>
> 1) if (!_db.deleteJob(job.jobId, _nodeId))
>           throw new JobNoLongerInDbException(job.jobId, _nodeId);
> 2) processor.onScheduledJob(jobInfo);
>
> The job is deleted first and then the processor.onScheduledJob(jobInfo)
> throws an exception in this case due to non availability of instance lock.
> This exception gets caught in the execTransaction() and goes in for a
> transaction rollback, followed by a retry. But the rollback doesn't reverse
> the deleted job. And during the second iteration it throws a
> JobNoLongerInDbException.
>
> line 286: 2013-06-24 12:00:27,803 6058570 DEBUG
> [org.apache.ode.scheduler.simple.SimpleScheduler] (ODEServer-18:) Job :
> hqejbhcnphr8dw7m78o3qq has been deleted
> line 291: 2013-06-24 12:00:27,804 6058571 DEBUG
> [org.apache.ode.bpel.engine.BpelEngineImpl] (ODEServer-18:) Instance 2608
> is busy, rescheduling job.
> line 294: org.apache.ode.bpel.iapi.Scheduler$JobProcessorException
> line 313: 2013-06-24 12:00:28,804 6059571 DEBUG
> [org.apache.ode.scheduler.simple.SimpleScheduler] (ODEServer-18:) Beginning
> a new transaction
> line 314: 2013-06-24 12:00:28,805 6059572 DEBUG
> [org.apache.ode.scheduler.simple.JdbcDelegate] (ODEServer-18:) deleteJob
> hqejbhcnphr8dw7m78o3qq on node hqejbhcnphr8dw7m78o3p4
> line 318: org.apache.ode.scheduler.simple.JobNoLongerInDbException: Job no
> longer in db: hqejbhcnphr8dw7m78o3qq nodeId=hqejbhcnphr8dw7m78o3p4
>
>
> I then integrated Bitronix transaction manager in my tomcat. Bitronix is
> also complaining that a second new connection is being enlisted.
>
> 2013-06-26 13:54:38,604 89798 DEBUG
> [org.apache.ode.scheduler.simple.JdbcDelegate] (ODEServer-2:) insertJob
> hqejbhcnphr8e518sj4h72 on node hqejbhcnphr8e518sj4h70 loaded=true
> 2013-06-26 13:54:38,607 89801 ERROR
> [org.apache.ode.scheduler.simple.SimpleScheduler] (ODEServer-2:) Database
> error.
> org.apache.ode.scheduler.simple.DatabaseException: java.sql.SQLException:
> error enlisting a JdbcConnectionHandle of a JdbcPooledConnection from
> datasource jdbc/ode in state ACCESSIBLE with usage count 1 wrapping a JDBC
> LrcXAConnection on com.mysql.jdbc.JDBC4Connection@49d456d0 on a JDBC
> LrcConnectionHandle on a JDBC LrcXAResource in state NO_TX
>
>
> regards,
> sathwik
>
>
> On Sun, Mar 24, 2013 at 6:13 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>
>> Hi,
>> The transaction manager is initialised even before the database config is
>> loaded. Without initialization of the transaction manager the ode server
>> will not start. Irrespective of the DB mode the transaction manager will be
>> set on the Database object that has being configured, be it internal or
>> external.
>> Please check the init(..) method in org.apache.ode.axis2.ODEServer
>>
>> init(...){
>>     initTxMgr()
>>     initDataSource();
>>     initDAO();
>> }
>>
>> Please share your configuration files and logs files.
>>
>> regards,
>> sathwik
>>
>>
>> On Sat, Mar 23, 2013 at 9:32 PM, Nandika Jayawardana 
>> <jayaw...@gmail.com>wrote:
>>
>>> I did some further checking and with the database configured as db.mode
>>> internal , everything works fine. For the internal db mode, db
>>> connections
>>> are associated with default geronimo transaction manager ( Database.java
>>> ).
>>> I think for the default external db configuration given , this does not
>>> happen and hence the issue.
>>>
>>> Regards
>>> Nandika
>>>
>>> On Fri, Mar 1, 2013 at 2:17 PM, Nandika Jayawardana <jayaw...@gmail.com
>>> >wrote:
>>>
>>> > Hi Sathwik,
>>> >
>>> > I am running ode with tomcat 7.0.29 and mysql 5.5.29 version. I used
>>> the
>>> > configuration settings given under "Configuring ODE in Tomcat with
>>> MySql
>>> > database". from ode war deployment guide. {
>>> > http://ode.apache.org/war-deployment.html }.
>>> > As you have explained, when the JobProcessorException is thrown due to
>>> >  instance lock timeout , the transaction will be rollback and default
>>> retry
>>> > setting of 3 times will happen. However, the restoration of the
>>> deleted job
>>> > back to job table does not happen. Therefore subsequent retries will
>>> also
>>> > result in JobNoLongerInDbException. At execTransaction method, when the
>>> > retry loop is over, the exception thrown will also be
>>> >  JobNoLongerInDbException. Since this exception is caught at  "catch
>>> > (JobNoLongerInDbException jde) " block, it will never go into the
>>> > exponential back off setting.
>>> >
>>> > Is there any additional configuration settings I need to do ?
>>> >
>>> > Regards
>>> > Nandika
>>> >
>>> >
>>> > On Fri, Mar 1, 2013 at 12:21 PM, Sathwik B P <sathwik...@gmail.com>
>>> wrote:
>>> >
>>> >> Hi,
>>> >> This is really strange.
>>> >>
>>> >> This is the ideal behaviour:
>>> >> If a job fails for any reason it gets retried defined by the parameter
>>> >> (immediateRetryCount default 3 times with a time interval
>>> >> _immediateTransactionRetryInterval default 1 sec) and then the
>>> scheduler
>>> >> will put it on a exponential backoff defined by pow(5,retryCount)
>>> where
>>> >> retryCount is <= 10.
>>> >>
>>> >> If the rollback doesn't happen incase of any exception then none of
>>> the
>>> >> jobs will ever complete since it will never go into the exponential
>>> >> backoff
>>> >> path.
>>> >>
>>> >> In my opinion the transaction manager will maintain the jdbc
>>> connection
>>> >> object throughout it's execution, no matter how many times the
>>> connection
>>> >> is borrowed during the transaction.
>>> >>
>>> >> Which database are you using and what configuration changes have you
>>> done
>>> >> in ode-axis.properties.
>>> >>
>>> >> regards,
>>> >> sathwik
>>> >>
>>> >> On Fri, Mar 1, 2013 at 1:31 AM, Nandika Jayawardana <
>>> jayaw...@gmail.com
>>> >> >wrote:
>>> >>
>>> >> > Hi All,
>>> >> > I am running ode trunk build with apache tomcat as described in [1]
>>> . I
>>> >> > have an asynchronous bpel process which has a receive, invoke and a
>>> >> > receive. When I run this process for a while, I see that there are
>>> few
>>> >> > incomplete instances, although all the expected messages reached
>>> ode.
>>> >> From
>>> >> > the debug logs, I figured that it is happening as follows.
>>> >> >
>>> >> >   If a thread executing a job tries to acquire the process instance
>>> >> lock,
>>> >> > while another thread is executing on the same instance and times
>>> out, it
>>> >> > will throw a timeout exception at InstanceLockManager which will be
>>> >> wrapped
>>> >> > to a  JobProcessorException.
>>> >> >
>>> >> > In SimpleScheduler, RunJob.call method, when the execution of a job
>>> >> starts,
>>> >> > it will try to delete the job from the db. For the initial try, it
>>> >> > would succeed since the job is in db. However, when
>>> >> > the JobProcessorException exception happens due to timeout on
>>> instance
>>> >> > lock, the transaction gets rolled back. Ideally, the job should be
>>> >> restored
>>> >> > back when the rollback happens. However, the job does not get
>>> restored
>>> >> to
>>> >> > db as the transaction manager and db resources are not associated.
>>> Hence
>>> >> > when the scheduler  tries to retry 3 times by default, it will fail
>>> with
>>> >> > job no longer in db error.  This results in few of the process
>>> instances
>>> >> > never completing since the job was abandoned even though the
>>> messages
>>> >> > reached ode.
>>> >> >
>>> >> > Following log extracts from the ode log explains the scenario.
>>> >> >
>>> >> > grep instanceid
>>> >> >
>>> >> > 16:36:12,115 ODEServer-78 DEBUG [InstanceLockManager]
>>> >> > Thread[ODEServer-78,5,main]: lock(iid=36423, time=1MICROSECONDS)
>>> >> > 16:36:12,115 ODEServer-78 DEBUG [InstanceLockManager]
>>> >> > Thread[ODEServer-78,5,main]: lock(iid=36423,
>>> >> > time=1MICROSECONDS)-->WAITING(held by Thread[ODEServer-9,5,main])
>>> >> > 16:36:12,115 ODEServer-78 DEBUG [InstanceLockManager]
>>> >> > Thread[ODEServer-78,5,main]: lock(iid=36423,
>>> >> time=1MICROSECONDS)-->TIMEOUT
>>> >> > (held by Thread[ODEServer-9,5,main])
>>> >> > 16:36:12,115 ODEServer-78 DEBUG [BpelEngineImpl] Instance 36423 is
>>> busy,
>>> >> > rescheduling job.
>>> >> > 16:36:12,239 ODEServer-9 DEBUG [InstanceLockManager]
>>> >> > Thread[ODEServer-9,5,main]: unlock(iid=36423)
>>> >> > 16:36:15,120 ODEServer-78 DEBUG [SimpleScheduler] job no longer in
>>> db
>>> >> > forced rollback: Job hqejbhcnphr8357nokgnxp time: 2013-02-28
>>> 16:36:11
>>> >> IST
>>> >> > transacted: true persisted: true details: JobDetails( instanceId:
>>> 36423
>>> >> > mexId: null processId: null type: MATCHER channel: null
>>> correlatorId:
>>> >> > DebugCallbackPL.debugOpCallback correlationKeySet:
>>> >> > @2[CorrelationSet~746ee3bf-4c4c-4da9-bdb0-233a760ce377] retryCount:
>>> null
>>> >> > inMem: false detailsExt: {})
>>> >> >
>>> >> > grep jobid
>>> >> >
>>> >> > 16:36:11,960 ODEServer-9 DEBUG [JdbcDelegate] insertJob
>>> >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94 loaded=true
>>> >> > 16:36:12,007 ODEServer-1 DEBUG [SimpleScheduler] todo.enqueue job
>>> from
>>> >> db:
>>> >> > hqejbhcnphr8357nokgnxp for 1362049571960(16:36:11,960)
>>> >> > 16:36:12,007 ODEServer-78 DEBUG [JdbcDelegate] deleteJob
>>> >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94
>>> >> > 16:36:12,032 ODEServer-9 DEBUG [SimpleScheduler] scheduled immediate
>>> >> job:
>>> >> > hqejbhcnphr8357nokgnxp
>>> >> > 16:36:12,239 ODEServer-9 DEBUG [SimpleScheduler] Job
>>> >> hqejbhcnphr8357nokgnxp
>>> >> > is being processed (outstanding job)
>>> >> > 16:36:13,116 ODEServer-78 DEBUG [JdbcDelegate] deleteJob
>>> >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94
>>> >> > org.apache.ode.scheduler.simple.JobNoLongerInDbException: Job no
>>> longer
>>> >> in
>>> >> > db: hqejbhcnphr8357nokgnxp nodeId=hqejbhcnphr8357nokgj94
>>> >> > 16:36:14,118 ODEServer-78 DEBUG [JdbcDelegate] deleteJob
>>> >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94
>>> >> > org.apache.ode.scheduler.simple.JobNoLongerInDbException: Job no
>>> longer
>>> >> in
>>> >> > db: hqejbhcnphr8357nokgnxp nodeId=hqejbhcnphr8357nokgj94
>>> >> > 16:36:15,119 ODEServer-78 DEBUG [JdbcDelegate] deleteJob
>>> >> > hqejbhcnphr8357nokgnxp on node hqejbhcnphr8357nokgj94
>>> >> > 16:36:15,120 ODEServer-78 DEBUG [SimpleScheduler] job no longer in
>>> db
>>> >> > forced rollback: Job hqejbhcnphr8357nokgnxp time: 2013-02-28
>>> 16:36:11
>>> >> IST
>>> >> > transacted: true persisted: true details: JobDetails( instanceId:
>>> 36423
>>> >> > mexId: null processId: null type: MATCHER channel: null
>>> correlatorId:
>>> >> > DebugCallbackPL.debugOpCallback correlationKeySet:
>>> >> > @2[CorrelationSet~746ee3bf-4c4c-4da9-bdb0-233a760ce377] retryCount:
>>> null
>>> >> > inMem: false detailsExt: {})
>>> >> >
>>> >> >
>>> >> > Is this the expected behavior or is there any additional settings i
>>> >> should
>>> >> > configure to make transaction manager restore job to db at rollback
>>> ?
>>> >> > Will reinserting the job back to db when the  JobProcessorException
>>> >> happens
>>> >> >  fix this problem ?
>>> >> >
>>> >> > Regards
>>> >> > Nandika
>>> >> >
>>> >> > [1] http://ode.apache.org/war-deployment.html
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>
>>
>

Reply via email to