[ 
https://issues.apache.org/jira/browse/SYNCOPE-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278436#comment-16278436
 ] 

gonzalad commented on SYNCOPE-1244:
-----------------------------------

Hi Francesco, thanks for your great support and sorry for the delay.

The patch works (I've restarted a few time, and didn't get the issue).
But, looking at the code, it doesn't solve all cases (I think it diminishes the 
failure window, but the window is there even if tiny).

i.e. in JobManagerImpl#load, we could have the following case :
1 - instance 1 *and* instance 2 call register(task) at a similar time (hence 
both call registerJob)
2 - both instance call isRunning check at a similar time, so the job is not 
running.
3 - and then we run the risk of having the initial error (call to SELECT * FROM 
QRTZ_LOCKS WHERE SCHED_NAME = 'scheduler' AND LOCK_NAME = ? FOR UPDATE by both 
instance)
=> at this moment, one of the 2 transaction instance is maked as failed by 
postgres.
4 - the failed instance calls reportDAO.findAll
    This generates an error and the instance startup fails.

I'm attaching a patch SYNCOPE-1244-2.diff (based on yours) that should solve 
this issue.
My patch code is brittle, another solution would be just to stop load method 
execution on first error.

wdyt ?


> Error creating bean with name 'logicInitializer' on startup related to quartz 
> clustering 
> -----------------------------------------------------------------------------------------
>
>                 Key: SYNCOPE-1244
>                 URL: https://issues.apache.org/jira/browse/SYNCOPE-1244
>             Project: Syncope
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.0.2
>            Reporter: gonzalad
>             Fix For: 2.0.7, 2.1.0
>
>         Attachments: SYNCOPE-1244-2.diff, SYNCOPE-1244.diff, TPSVC-3882.zip
>
>
> I use syncope 2.0.2 in high availability mode (4 instances).
> I'm using syncope with postgresql db.
> Sometimes on startup I have the following error, and the instance fails to 
> start:
> {code}
> 17-12-03 18:39:10.781 ERROR - [syncope] - [localhost-startStop-5] 
> org.springframework.web.context.ContextLoader - Context initialization failed
> org.springframework.beans.factory.BeanCreationException: Error creating bean 
> with name 'logicInitializer' defined in URL 
> [jar:file:/usr/local/tomcat/webapps/syncope/WEB-INF/lib/syncope-core-logic-2.0.2_1.jar!/org/apache/syncope/core/logic/init/LogicInitializer.class]:
>  Invocation of init method failed; nested exception is 
> <openjpa-2.4.2-r422266:1777108 fatal general error> 
> org.apache.openjpa.persistence.PersistenceException: ERROR: current 
> transaction is aborted, commands ignored until end of transaction block 
> {prepstmnt 977966121 SELECT t0.id, t0.active, t0.cronExpression, t0.name, 
> t1.id, t1.csvTemplate, t1.foTemplate, t1.htmlTemplate FROM Report t0 INNER 
> JOIN ReportTemplate t1 ON t0.template_id = t1.id} [code=0, state=25P02]
> FailedObject: SELECT e FROM JPAReport e [java.lang.String]
>       at 
> org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1628)
>  ~[spring-beans-4.3.6.RELEASE.jar:4.3.6.RELEASE]
>       at 
> org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:555)
>  ~[spring-beans-4.3.6.RELEASE.jar:4.3.6.RELEASE]
>       at 
> org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
>  ~[
> Caused by: org.apache.openjpa.persistence.PersistenceException: ERROR: 
> current transaction is aborted, commands ignored until end of transaction 
> block {prepstmnt 977966121 SELECT t0.id, t0.active, t0.cronExpression, 
> t0.name, t1.id, t1.csvTemplate, t1.foTemplate, t1.htmlTemplate FROM Report t0 
> INNER JOIN ReportTemplate t1 ON t0.template_id = t1.id} [code=0, state=25P02]
>       at 
> org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:5003) 
> ~[openjpa-jdbc-2.4.2.jar:2.4.2]
> ...
>       at 
> org.apache.openjpa.persistence.QueryImpl.getResultList(QueryImpl.java:290) 
> ~[openjpa-persistence-2.4.2.jar:2.4.2]
>       at 
> org.apache.syncope.core.persistence.jpa.dao.JPAReportDAO.findAll(JPAReportDAO.java:55)
>  ~[syncope-core-persistence-jpa-2.0.2_1.jar:2.0.2_1]
> ...
>       at 
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
>  ~[spring-aop-4.3.6.RELEASE.jar:4.3.6.RELEASE]
>       at com.sun.proxy.$Proxy427.findAll(Unknown Source) ~[?:?]
>       at 
> org.apache.syncope.core.logic.init.JobManagerImpl$3.exec(JobManagerImpl.java:327)
>  ~[syncope-core-logic-2.0.2_1.jar:2.0.2_1]
>       at 
> org.apache.syncope.core.logic.init.JobManagerImpl$3.exec(JobManagerImpl.java:310)
>  ~[syncope-core-logic-2.0.2_1.jar:2.0.2_1]
>       at 
> org.apache.syncope.core.spring.security.AuthContextUtils.execWithAuthContext(AuthContextUtils.java:114)
>  ~[syncope-core-spring-2.0.2_1.jar:2.0.2_1]
>       at 
> org.apache.syncope.core.logic.init.JobManagerImpl.load(JobManagerImpl.java:310)
>  ~[syncope-core-logic-2.0.2_1.jar:2.0.2_1]
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_131]
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_131]
> ...
>       at 
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
>  ~[spring-aop-4.3.6.RELEASE.jar:4.3.6.RELEASE]
>       at com.sun.proxy.$Proxy428.load(Unknown Source) ~[?:?]
>       at 
> org.apache.syncope.core.logic.init.LogicInitializer.afterPropertiesSet(LogicInitializer.java:70)
>  ~[syncope-core-logic-2.0.2_1.jar:2.0.2_1]
> Caused by: org.apache.openjpa.lib.jdbc.ReportingSQLException: ERROR: current 
> transaction is aborted, commands ignored until end of transaction block 
> {prepstmnt 977966121 SELECT t0.id, t0.active, t0.cronExpression, t0.name, 
> t1.id, t1.csvTemplate, t1.foTemplate, t1.htmlTemplate FROM Report t0 INNER 
> JOIN ReportTemplate t1 ON t0.template_id = t1.id} [code=0, state=25P02]
> ...
>       at org.apache.openjpa.persistence.QueryImpl.execute(QueryImpl.java:274) 
> ~[openjpa-persistence-2.4.2.jar:2.4.2]
>       at 
> org.apache.openjpa.persistence.QueryImpl.getResultList(QueryImpl.java:290) 
> ~[openjpa-persistence-2.4.2.jar:2.4.2]
>       at 
> org.apache.syncope.core.persistence.jpa.dao.JPAReportDAO.findAll(JPAReportDAO.java:55)
>  ~[syncope-core-persistence-jpa-2.0.2_1.jar:2.0.2_1]
> ...
>       at 
> org.apache.syncope.core.logic.init.JobManagerImpl$3.exec(JobManagerImpl.java:327)
>  ~[syncope-core-logic-2.0.2_1.jar:2.0.2_1]
> {code} 
> From what I understand, this happens when one instance already holds the 
> quartz lock, hence the other instance tries to grasp it and fails with a 
> LockException.
> The JobManagerImpl executes this kind of code
> https://github.com/apache/syncope/blob/cea47da1a1510446b7bd62fbc8e45cc8757d6611/core/provisioning-java/src/main/java/org/apache/syncope/core/provisioning/java/job/JobManagerImpl.java#L326-L351
>  :
> {code}
>                     // 1. jobs for SchedTasks
>                     Set<SchedTask> tasks = new 
> HashSet<>(taskDAO.<SchedTask>findAll(TaskType.SCHEDULED));
>                     tasks.addAll(taskDAO.<PullTask>findAll(TaskType.PULL));
>                     tasks.addAll(taskDAO.<PushTask>findAll(TaskType.PUSH));
>                     for (SchedTask task : tasks) {
>                         try {
>                             register(task, task.getStartAt(), 
> conf.getRight());
>                         } catch (Exception e) {
>                             LOG.error("While loading job instance for task " 
> + task.getKey(), e);
>                         }
>                     }
>                     // 2. jobs for Reports
>                     for (Report report : reportDAO.findAll()) {
>                         try {
>                             register(report, null, conf.getRight());
>                         } catch (Exception e) {
>                             LOG.error("While loading job instance for report 
> " + report.getName(), e);
>                         }
>                     }
> {code}
> The LockException happens during the first call to register.
> The exception is caught.
> Problem is in the logged stackTrace I see:
> {code}
> 17-12-03 18:39:06.659 ERROR - [syncope] - [localhost-startStop-5] 
> org.apache.syncope.core.provisioning.api.job.JobManager - Could not remove 
> job taskJobe84fd2bc-e472-48a1-8fd2-bce472c8a1ba
> org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: 
> ERROR: current transaction is aborted, commands ignored until end of 
> transaction block {prepstmnt 1348492786 SELECT * FROM QRTZ_LOCKS WHERE 
> SCHED_NAME = 'scheduler' AND LOCK_NAME = ? FOR UPDATE} [code=0, state=25P02]
> {code}
> Postgres says `commands ignored until end of transaction block`.
> And the real exception happens when we call reportDAO.findAll(), with 
> {code}
> Caused by: <openjpa-2.4.2-r422266:1777108 fatal general error> 
> org.apache.openjpa.persistence.PersistenceException: ERROR: current 
> transaction is aborted, commands ignored until end of transaction block 
> {prepstmnt 977966121 SELECT t0.id, t0.active, t0.cronExpression, t0.name, 
> t1.id, t1.csvTemplate, t1.foTemplate, t1.htmlTemplate FROM Report t0 INNER 
> JOIN ReportTemplate t1 ON t0.template_id = t1.id} [code=0, state=25P02]
> FailedObject: SELECT e FROM JPAReport e [java.lang.String]
> {code}
> So postgresql seems to fulfill its promise and raise an error on any 
> subsequent SQL statement issued under the same transaction context.
> Perhaps, we need either :
>  * to start a new transaction on each call to register.
>  * to stop the jobManagerImpl.load method execution on the first error.
>    I don't know if that would be enough (i.e. what happend when the 
> TransactionInterceptor surrounding JobManagerImpl executes - would it throw 
> an error ?) 
>  
> I'll attach the full logs (the interesting part starts at line 4405).
> Thanks,
> Adrian



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to