[ 
https://issues.apache.org/jira/browse/TAVERNA-871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696959#comment-14696959
 ] 

Stian Soiland-Reyes commented on TAVERNA-871:
---------------------------------------------


When a job is waiting for a control link, it is queued.


https://github.com/apache/incubator-taverna-engine/blob/master/taverna-workflowmodel-impl/src/main/java/org/apache/taverna/workflowmodel/processor/dispatch/impl/DispatchStackImpl.java#L188

is where the error occurs. My first guess would be that:

    queues.get(owningProcess)  returns null

which is odd given that it just did:

    synchronized (queues) {
       // ...
       queues.containsKey(owningProcess)



But down here where the queue is purged it is NOT synchronized:
https://github.com/apache/incubator-taverna-engine/blob/master/taverna-workflowmodel-impl/src/main/java/org/apache/taverna/workflowmodel/processor/dispatch/impl/DispatchStackImpl.java#L302

So perhaps changing this to:

    synchronized (queues) {
        queues.remove(owningProcess);
    }

would do the trick.
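
To make the suspected race concrete, here is a minimal, self-contained sketch of the pattern. It is not the real DispatchStackImpl: the class name, the String job type and the method bodies are made up for illustration, and I am assuming queues is a plain HashMap keyed by owningProcess. The point is only that the check, the read and the purge all have to take the same lock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for the queue bookkeeping in DispatchStackImpl. */
class QueueRegistrySketch {
    // Assumption: a plain HashMap, so every access must hold the same monitor.
    private final Map<String, BlockingQueue<String>> queues = new HashMap<>();

    /** Queue a job that is waiting for a control link. */
    public void enqueue(String owningProcess, String job) {
        synchronized (queues) {
            queues.computeIfAbsent(owningProcess, k -> new LinkedBlockingQueue<>())
                  .add(job);
        }
    }

    /**
     * containsKey() and get() happen under one lock, so a purge on another
     * thread cannot remove the entry between the check and the read.
     */
    public List<String> satisfyConditions(String owningProcess) {
        List<String> released = new ArrayList<>();
        synchronized (queues) {
            if (queues.containsKey(owningProcess)) {
                queues.get(owningProcess).drainTo(released);
            }
        }
        return released;
    }

    /** The proposed change: removal takes the same lock as the readers. */
    public void purge(String owningProcess) {
        synchronized (queues) {
            queues.remove(owningProcess);
        }
    }
}
```

If purge() skipped the synchronized block, another thread could remove the entry right after containsKey() returned true, and the following get() would return null, which would match the NullPointerException in the log.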


To test this with the Taverna 2.5 command line, you can make the same change 
starting from the old/core-1.5 tag (note: different folders and Java package 
name):

https://github.com/apache/incubator-taverna-engine/blob/old/core-1.5/workflowmodel-impl/src/main/java/net/sf/taverna/t2/workflowmodel/processor/dispatch/impl/DispatchStackImpl.java#L314

Then run mvn clean install in workflowmodel-impl, copy the rebuilt 
workflowmodel-impl-1.5.jar (which strictly speaking is no longer 1.5) over the 
existing one in the corresponding folder under repository/ in your Taverna 
installation, and rerun the workflow tests.

Is anyone able to have a go with this and see if this fixes the issue? 
(Obviously you would first need to verify that loop-running 
control-0003x0005.t2flow breaks it)

> Race condition: occasional NullPointerException in DispatchStackImpl
> --------------------------------------------------------------------
>
>                 Key: TAVERNA-871
>                 URL: https://issues.apache.org/jira/browse/TAVERNA-871
>             Project: Apache Taverna
>          Issue Type: Bug
>          Components: Taverna Engine
>            Reporter: Stian Soiland-Reyes
>             Fix For: engine 3.1.0
>
>         Attachments: control-0003x0005.t2flow
>
>
> Raised by Javier Rojas Balderrama on users:
> {quote}
> I'm executing some workflows by command line and from time to time the 
> execution freezes so I have to start over the execution. The log associated 
> to this is below. Is this only a command line issue or a general behaviour? 
> {quote}
> {code}
> INFO  2015-07-29 08:49:11,170 
> (de.uni_luebeck.inb.knowarc.usecases.invocation.local.LocalUseCaseInvocation:116)
>  - mainTempDirectory is /tmp
> INFO  2015-07-29 08:49:11,170 
> (de.uni_luebeck.inb.knowarc.usecases.invocation.local.LocalUseCaseInvocation:117)
>  - Using tempDir /tmp/usecase1293979972903664991dir
> INFO  2015-07-29 08:49:11,170 
> (net.sf.taverna.t2.activities.externaltool.ExternalToolActivity:237) - Run id 
> is cddff932-9804-4a55-bef7-4679eab931c5
> INFO  2015-07-29 08:49:11,170 
> (de.uni_luebeck.inb.knowarc.usecases.invocation.local.LocalUseCaseInvocation:351)
>  - cmds[0] = /bin/sh
> INFO  2015-07-29 08:49:11,170 
> (de.uni_luebeck.inb.knowarc.usecases.invocation.local.LocalUseCaseInvocation:351)
>  - cmds[1] = -c
> INFO  2015-07-29 08:49:11,170 
> (de.uni_luebeck.inb.knowarc.usecases.invocation.local.LocalUseCaseInvocation:351)
>  - cmds[2] = sleep 0
> INFO  2015-07-29 08:49:11,170 
> (de.uni_luebeck.inb.knowarc.usecases.invocation.local.LocalUseCaseInvocation:353)
>  - Command is sleep 0 in directory /tmp/usecase1293979972903664991dir
> WARN  2015-07-29 08:49:11,173 
> (net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke:236) - 
> Failed (INVOCATION) invoking 
> net.sf.taverna.t2.activities.externaltool.ExternalToolActivity@415ce54f for 
> job DispatchJobEvent facade0:control-0045x0035:S_1567[]: Uncaught exception 
> while invoking 
> net.sf.taverna.t2.activities.externaltool.ExternalToolActivity@415ce54f
> java.lang.NullPointerException
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.impl.DispatchStackImpl.satisfyConditions(DispatchStackImpl.java:188)
>       at 
> net.sf.taverna.t2.workflowmodel.impl.ProcessorImpl$2.finishedWith(ProcessorImpl.java:176)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.impl.DispatchStackImpl$TopLayer.sendCachePurge(DispatchStackImpl.java:313)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.impl.DispatchStackImpl$TopLayer.receiveResult(DispatchStackImpl.java:281)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize.receiveResult(Parallelize.java:165)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractDispatchLayer.receiveResult(AbstractDispatchLayer.java:85)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractErrorHandlerLayer.receiveResult(AbstractErrorHandlerLayer.java:136)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractErrorHandlerLayer.receiveResult(AbstractErrorHandlerLayer.java:136)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractDispatchLayer.receiveResult(AbstractDispatchLayer.java:85)
>       at 
> net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke$InvokeCallBack.receiveResult(Invoke.java:352)
>       at 
> net.sf.taverna.t2.activities.externaltool.ExternalToolActivity$1.run(ExternalToolActivity.java:272)
>       at java.lang.Thread.run(Thread.java:745)
> ERROR 2015-07-29 08:49:11,174 
> (net.sf.taverna.t2.workflowmodel.processor.dispatch.AbstractErrorHandlerLayer:200)
>  - Could not find any active jobs for facade0:control-0045x0035:S_1567
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
