Hi Team, 

I apologize for the lengthy message. 

I am having a problem with my Taverna system and haven't been able to
get to the bottom of what is causing it. Curiously, I've been using this
system for 12-16 months without problems. It's been great. However,
recently our MySQL DB was upgraded, forcing a change in the required
java_connector version. I've also changed from TV 1.7.1 to TV 1.7.2.
Seemingly tiny changes.

The system in question consists of client machines running TV 1.7.2, a
server machine with TV 1.7.2 + Remote Execution service, and a MySQL DB
instead of Derby.

I currently use a slightly modified version (DB configuration
modifications plus changes to the sizes of some of the columns) of the
remote execution service (0.5+).

The problem is that the DB gets into an inconsistent state where, for a
particular job, its QueueEntry row is removed but its queueEntry_id in
Job is not null. We determined this by checking the SQL logs of the
MySQL DB in question. In fact, we've observed (not 100% reproducibly)
that the QueueEntry occasionally gets deleted before queueEntry_id is
set to null. This causes failures in the remote execution service. To
get the webservice back up and accessing the DB, one simply needs to
manually set the "orphaned" queueEntry_id value to null.
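
As an illustration, the orphaned state described above can be spotted
with a query along these lines. This is only a sketch: it uses Python
with an in-memory SQLite database standing in for MySQL, the sample rows
mirror the ones reported further down in this message, and the Job.id
column and its values are hypothetical (only queueEntry_id,
QueueEntry.id and queue_id come from the actual tables).

```python
import sqlite3

# SQLite stands in for MySQL; only the columns shown in this report are
# modelled. Job.id and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE QueueEntry (id INTEGER PRIMARY KEY, queue_id TEXT);
    CREATE TABLE Job (id TEXT PRIMARY KEY, queueEntry_id INTEGER);
""")
queue = "b042a005-50f2-4398-93fa-1791a969d676"
conn.executemany("INSERT INTO QueueEntry VALUES (?, ?)",
                 [(4, queue), (5, queue), (6, queue)])
conn.executemany("INSERT INTO Job VALUES (?, ?)",
                 [("job-1", None), ("job-2", None), ("job-3", 3),
                  ("job-4", 4), ("job-5", 5), ("job-6", 6)])

# Jobs whose queueEntry_id points at a QueueEntry row that no longer exists.
FIND_ORPHANS = """
    SELECT j.id, j.queueEntry_id
    FROM Job j
    LEFT JOIN QueueEntry q ON q.id = j.queueEntry_id
    WHERE j.queueEntry_id IS NOT NULL
      AND q.id IS NULL
"""
orphans = conn.execute(FIND_ORPHANS).fetchall()
print(orphans)
```

The SELECT itself is plain SQL and should carry over to MySQL as
written, though that has not been checked against this installation.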

What I don't quite understand is why this is happening. (Some OS upgrade
was also performed, but it's hard for me to get specific details about
what was changed.)


Example run scenario: Six jobs were launched from Taverna on the client
machine. One queue was enabled in the execution service (remote server),
so jobs run one at a time. The first two completed fine. The third
completed (ascertained after the fact), but a problem occurred in the
remote execution service. No further jobs would run.


1) The exception found in catalina.out looks (in part) like this.

> Exception in thread "Console reader for Job 1dbde4cf-a7bc-46cd-b2f6-957aefa2a119" javax.persistence.EntityNotFoundException: Unable to find net.sf.taverna.service.datastore.bean.QueueEntry with id 3
>         at org.hibernate.ejb.Ejb3Configuration$Ejb3EntityNotFoundDelegate.handleEntityNotFound(Ejb3Configuration.java:107)
>         at org.hibernate.event.def.DefaultLoadEventListener.load(DefaultLoadEventListener.java:145)
>         at org.hibernate.event.def.DefaultLoadEventListener.proxyOrLoad(DefaultLoadEventListener.java:195)
>         at org.hibernate.event.def.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:103)
>         at org.hibernate.impl.SessionImpl.fireLoad(SessionImpl.java:878)
>         at org.hibernate.impl.SessionImpl.internalLoad(SessionImpl.java:846)
>         at org.hibernate.type.EntityType.resolveIdentifier(EntityType.java:557)
>         at org.hibernate.type.EntityType.resolve(EntityType.java:379)
>         at org.hibernate.engine.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:116)
>         at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:842)
>         at org.hibernate.loader.Loader.doQuery(Loader.java:717)
>         at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:224)
>         at org.hibernate.loader.Loader.loadEntity(Loader.java:1851)
>         at org.hibernate.loader.entity.AbstractEntityLoader.load(AbstractEntityLoader.java:48)
>         at org.hibernate.loader.entity.AbstractEntityLoader.load(AbstractEntityLoader.java:42)
(snip)

>         at net.sf.taverna.service.datastore.dao.jpa.GenericDaoImpl.reread(GenericDaoImpl.java:70)
>         at net.sf.taverna.service.backend.executor.ConsoleReaderThread.run(ProcessJobExecutor.java:281)
> Exception in thread "Queue Monitor Thread" javax.persistence.EntityNotFoundException: Unable to find net.sf.taverna.service.datastore.bean.QueueEntry with id 3
>         at org.hibernate.ejb.Ejb3Configuration$Ejb3EntityNotFoundDelegate.handleEntityNotFound(Ejb3Configuration.java:107)
>         at org.hibernate.event.def.DefaultLoadEventListener.load(DefaultLoadEventListener.java:14

2) Now, the DB looks like this. Note how queueEntry_id=3 still exists in
Job, but the corresponding entry in QueueEntry has been deleted. I think
this is the problem: I believe the delete-QueueEntry step is supposed to
be last in the series of Job table updates for a COMPLETED job cleanup.


Job table:
> +---------------+
> | queueEntry_id |
> +---------------+
> |          NULL | 
> |          NULL | 
> |             3 | 
> |             4 | 
> |             5 | 
> |             6 | 
> +---------------+
> 6 rows in set (0.00 sec)
> 
> 
QueueEntry table: 
> +----+--------------------------------------+
> | id | queue_id                             |
> +----+--------------------------------------+
> |  4 | b042a005-50f2-4398-93fa-1791a969d676 | 
> |  5 | b042a005-50f2-4398-93fa-1791a969d676 | 
> |  6 | b042a005-50f2-4398-93fa-1791a969d676 | 
> +----+--------------------------------------+
> 3 rows in set (0.00 sec)
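
To make the suspected ordering concrete, here is a minimal sketch of the
two cleanup orders, again using Python with in-memory SQLite rather than
MySQL. It assumes a foreign key from Job.queueEntry_id to QueueEntry.id
that the database actually enforces; nothing in this report establishes
whether the real tables have one. Job.id and both function names are
hypothetical, and this is not the remote execution service's own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE QueueEntry (id INTEGER PRIMARY KEY);
    CREATE TABLE Job (
        id TEXT PRIMARY KEY,                             -- hypothetical column
        queueEntry_id INTEGER REFERENCES QueueEntry(id)  -- assumed constraint
    );
    INSERT INTO QueueEntry VALUES (3);
    INSERT INTO Job VALUES ('job-3', 3);
""")


def delete_then_clear(conn, entry_id):
    """Suspected failing order: the QueueEntry row is deleted first."""
    with conn:  # one transaction, rolled back if a statement fails
        conn.execute("DELETE FROM QueueEntry WHERE id = ?", (entry_id,))
        conn.execute(
            "UPDATE Job SET queueEntry_id = NULL WHERE queueEntry_id = ?",
            (entry_id,))


def clear_then_delete(conn, entry_id):
    """Expected order: clear the Job reference, then delete the QueueEntry."""
    with conn:
        conn.execute(
            "UPDATE Job SET queueEntry_id = NULL WHERE queueEntry_id = ?",
            (entry_id,))
        conn.execute("DELETE FROM QueueEntry WHERE id = ?", (entry_id,))


try:
    delete_then_clear(conn, 3)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the enforced foreign key refuses the early delete

clear_then_delete(conn, 3)
remaining = conn.execute("SELECT COUNT(*) FROM QueueEntry").fetchone()[0]
job_ref = conn.execute("SELECT queueEntry_id FROM Job").fetchone()[0]
print(rejected, remaining, job_ref)
```

With an enforced constraint the delete-first order is refused outright
instead of leaving an orphan behind, which is one way to tell the two
orders apart.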


3) Persistence.xml is configured as follows. I've attempted to disable
pooling, but changing those values had little effect on the failure.

> <persistence-unit name="tavernaService">
>         <provider>org.hibernate.ejb.HibernatePersistence</provider>
>         <class>net.sf.taverna.service.datastore.bean.Job</class>
>         <class>net.sf.taverna.service.datastore.bean.Workflow</class>
>         <class>net.sf.taverna.service.datastore.bean.DataDoc</class>
>         <class>net.sf.taverna.service.datastore.bean.User</class>
>         <class>net.sf.taverna.service.datastore.bean.Worker</class>
>         <class>net.sf.taverna.service.datastore.bean.Queue</class>
>         <class>net.sf.taverna.service.datastore.bean.QueueEntry</class>
>         <class>net.sf.taverna.service.datastore.bean.Configuration</class>
>         <properties>
>                 <property name="hibernate.archive.autodetection" value="class, hbm" />
>                 <property name="hibernate.show_sql" value="false" />
>                 <property name="hibernate.format_sql" value="true" />
>                 <property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver" />
>                 <property name="hibernate.connection.url" value="jdbc:mysql://db.edc.org:3306/name?user=user&amp;password=password&amp;create=true"/>
>                 <property name="hibernate.c3p0.min_size" value="5" />
>                 <property name="hibernate.c3p0.max_size" value="0" />
>                 <property name="hibernate.c3p0.timeout" value="30" />
>                 <property name="hibernate.c3p0.max_statements" value="0" />
>                 <property name="hibernate.c3p0.idle_test_period" value="300" />
>                 <property name="hibernate.dialect" value="org.hibernate.dialect.DerbyDialect" />
>                 <property name="hibernate.dialect" value="org.hibernate.dialect.MySQLDialect" />
>                 <property name="hibernate.hbm2ddl.auto" value="update" />
>         </properties>
> </persistence-unit>
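
One detail in the properties above: hibernate.dialect is set twice
(DerbyDialect, then MySQLDialect). Which value takes effect has not been
verified here, so this is noted only as something to check. A small
Python sketch for listing repeated property names follows; it embeds a
shortened copy of the fragment above and assumes the file has no XML
namespace, as in the excerpt. The function name is made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the posted fragment (no xmlns, as in the excerpt).
PERSISTENCE_XML = """
<persistence-unit name="tavernaService">
  <properties>
    <property name="hibernate.c3p0.min_size" value="5" />
    <property name="hibernate.c3p0.max_size" value="0" />
    <property name="hibernate.dialect"
              value="org.hibernate.dialect.DerbyDialect" />
    <property name="hibernate.dialect"
              value="org.hibernate.dialect.MySQLDialect" />
    <property name="hibernate.hbm2ddl.auto" value="update" />
  </properties>
</persistence-unit>
"""


def duplicate_properties(xml_text):
    """Return {property name: [values]} for names that occur more than once."""
    root = ET.fromstring(xml_text)
    values = {}
    for prop in root.iter("property"):
        values.setdefault(prop.get("name"), []).append(prop.get("value"))
    return {name: vals for name, vals in values.items() if len(vals) > 1}


dupes = duplicate_properties(PERSISTENCE_XML)
print(dupes)
```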


4) Taverna running on the client machine can no longer interact with the
remote execution service on the server machine and throws this exception:

> Exception in thread "AWT-EventQueue-0" java.lang.RuntimeException: Could not get document from https://compute.system.org:8443/network-0.5.1/v1/users/jtilson/jobs
>         at net.sf.taverna.service.rest.client.RESTContext.loadDocument(RESTContext.java:302)
>         at net.sf.taverna.service.rest.client.AbstractREST.loadDocument(AbstractREST.java:77)
>         at net.sf.taverna.service.rest.client.LinkedREST.getDocument(LinkedREST.java:30)
>         at net.sf.taverna.service.rest.client.JobsREST.getJobs(JobsREST.java:33)
>         at net.sf.taverna.service.rest.client.JobsREST.iterator(JobsREST.java:41)
>         at net.sf.taverna.service.executeremotely.ui.JobsPanel.addJobs(JobsPanel.java:187)
>         at net.sf.taverna.service.executeremotely.ui.JobsPanel.access$500(JobsPanel.java:27)


5) If I manually set queueEntry_id=3 to null, everything works again.
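
For completeness, the manual workaround can be written so that it does
not hard-code id 3 but clears every queueEntry_id that no longer has a
matching QueueEntry row. The sketch below is Python with in-memory SQLite
standing in for MySQL. Job.id is a hypothetical column, and the statement
only treats the symptom, not whatever deletes the row too early.

```python
import sqlite3

# Recreate the reported state in SQLite: job-3 still points at QueueEntry 3,
# which is gone. Job.id and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE QueueEntry (id INTEGER PRIMARY KEY);
    CREATE TABLE Job (id TEXT PRIMARY KEY, queueEntry_id INTEGER);
    INSERT INTO QueueEntry VALUES (4), (5), (6);
    INSERT INTO Job VALUES ('job-1', NULL), ('job-2', NULL), ('job-3', 3),
                           ('job-4', 4), ('job-5', 5), ('job-6', 6);
""")

# Null out only the references whose target row is missing.
CLEAR_ORPHANS = """
    UPDATE Job
    SET queueEntry_id = NULL
    WHERE queueEntry_id IS NOT NULL
      AND queueEntry_id NOT IN (SELECT id FROM QueueEntry)
"""
with conn:  # commit on success, roll back on error
    cleared = conn.execute(CLEAR_ORPHANS).rowcount

refs = [row[0] for row in
        conn.execute("SELECT queueEntry_id FROM Job ORDER BY id")]
print(cleared, refs)
```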



Any thoughts, comments, or ideas are welcome. I am thinking this is simply a
configuration problem, but I am not sure.

--jeff




_______________________________________________
taverna-hackers mailing list
[email protected]
Web site: http://www.taverna.org.uk
Mailing lists: http://www.taverna.org.uk/taverna-mailing-lists/
Developers Guide: http://www.mygrid.org.uk/tools/developer-information