I've attached the script here. It writes a bit of output to stdout, and
I've verified that it runs locally. Running it through the PGA, however,
produces no output whatsoever, not even the autogenerated SLURM submit
script, which is why I doubt that it is running at all. I've set the
experiment up to write stdout and stderr to file, and it did so with
the echo experiment.
I think these are the relevant parts of the log (after this it only
has the logging for auto-refresh):
2016-05-23 11:15:00,387 [pool-15-thread-10] INFO
org.apache.airavata.gfac.server.GfacServerHandler -
-----------------------------------7-----------------------------------------
2016-05-23 11:15:00,387 [pool-15-thread-10] INFO
org.apache.airavata.gfac.server.GfacServerHandler -
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf
2016-05-23 11:15:00,405 [pool-15-thread-10] INFO
org.apache.airavata.gfac.impl.GFacEngineImpl - expId:
juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf, get process cancel data
from zookeeper node
/experiments/juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19/PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf/cancelListener
2016-05-23 11:15:00,912 [pool-19-thread-11] INFO
org.apache.airavata.gfac.core.context.ProcessContext - expId:
juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf :- Process status changed
STARTED -> CONFIGURING_WORKSPACE
2016-05-23 11:15:00,961 [pool-19-thread-11] INFO
org.apache.airavata.messaging.core.impl.RabbitMQStatusPublisher -
Publishing status to rabbitmq...
2016-05-23 11:15:00,962 [pool-19-thread-11] INFO
org.apache.airavata.gfac.core.context.TaskContext - expId:
juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf, taskId:
TASK_b42f5d4e-3b70-4daa-a42f-c5e571cf97c8, type: ENV_SETUP:- Task status
changed CREATED -> EXECUTING
2016-05-23 11:15:00,962 [pool-8-thread-5] INFO
org.apache.airavata.orchestrator.server.OrchestratorServerHandler -
expId: juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf :- Process status changed
event received for status CONFIGURING_WORKSPACE
2016-05-23 11:15:01,003 [pool-19-thread-11] INFO
org.apache.airavata.messaging.core.impl.RabbitMQStatusPublisher -
Publishing status to rabbitmq...
2016-05-23 11:15:01,003 [pool-19-thread-11] INFO
org.apache.airavata.gfac.impl.Factory - SSH Session validation
succeeded, key :jeff_localhost_22
2016-05-23 11:15:01,006 [pool-19-thread-11] INFO
org.apache.airavata.gfac.impl.Factory - Channel creation test
succeeded, key :jeff_localhost_22
2016-05-23 11:15:01,006 [pool-19-thread-11] INFO
org.apache.airavata.gfac.impl.Factory - Reuse SSH session for
:jeff_localhost_22
2016-05-23 11:15:01,006 [pool-19-thread-11] INFO
org.apache.airavata.gfac.impl.HPCRemoteCluster - Creating directory:
localhost:/tmp/PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf
2016-05-23 11:15:01,010 [pool-19-thread-11] INFO
org.apache.airavata.gfac.core.context.TaskContext - expId:
juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf, taskId:
TASK_b42f5d4e-3b70-4daa-a42f-c5e571cf97c8, type: ENV_SETUP:- Task status
changed EXECUTING -> COMPLETED
2016-05-23 11:15:01,053 [pool-19-thread-11] INFO
org.apache.airavata.messaging.core.impl.RabbitMQStatusPublisher -
Publishing status to rabbitmq...
2016-05-23 11:15:01,053 [pool-19-thread-11] INFO
org.apache.airavata.gfac.core.context.ProcessContext - expId:
juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf :- Process status changed
CONFIGURING_WORKSPACE -> INPUT_DATA_STAGING
2016-05-23 11:15:01,095 [pool-19-thread-11] INFO
org.apache.airavata.messaging.core.impl.RabbitMQStatusPublisher -
Publishing status to rabbitmq...
2016-05-23 11:15:01,095 [pool-19-thread-11] INFO
org.apache.airavata.gfac.core.context.TaskContext - expId:
juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf, taskId:
TASK_f5028495-bc51-4400-90d3-f56ca06acbde, type: DATA_STAGING:- Task
status changed CREATED -> EXECUTING
2016-05-23 11:15:01,095 [pool-8-thread-6] INFO
org.apache.airavata.orchestrator.server.OrchestratorServerHandler -
expId: juhygf_ff4b4196-aaa7-4e05-9dfd-74978e9cfa19, processId:
PROCESS_71118aff-fa37-449f-ae7f-fb170243d0bf :- Process status changed
event received for status INPUT_DATA_STAGING
On 05/23/2016 11:55 AM, Pierce, Marlon wrote:
Hi Jeff,
What is the script that you are trying to run (see below)? Can you add some
debugging messages there?
Marlon
On 5/23/16, 11:47 AM, "Jeff" <[email protected]> wrote:
With my current setup, I can run jobs that do not require external
scripts (e.g., echo <some string>), but when I try to run any kind of
script the experiment never completes.
##########################################################################
# this script was generated by openmm-builder. to customize it further,
# you can save the file to disk and edit it with your favorite editor.
##########################################################################
from __future__ import print_function
from simtk.openmm import app
import simtk.openmm as mm
from simtk import unit
import sys
pdb = app.PDBFile(sys.argv[1])
forcefield = app.ForceField('amber03.xml', 'amber03_obc.xml')
system = forcefield.createSystem(pdb.topology, nonbondedMethod=app.NoCutoff,
    constraints=None, rigidWater=False)
integrator = mm.LangevinIntegrator(300*unit.kelvin, 91/unit.picoseconds,
    1.0*unit.femtoseconds)
platform = mm.Platform.getPlatformByName('CPU')
simulation = app.Simulation(pdb.topology, system, integrator, platform)
simulation.context.setPositions(pdb.positions)
print('Minimizing...')
simulation.minimizeEnergy()
print('Equilibrating...')
simulation.step(100)
simulation.reporters.append(app.DCDReporter('trajectory.dcd', 1000))
simulation.reporters.append(app.StateDataReporter('sim.csv', 1000, step=True,
    potentialEnergy=True, totalEnergy=True, temperature=True, separator='\t'))
print('Running Production...')
simulation.step(10000)
print('Done!')
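
Following up on the request for debugging messages, here is a rough sketch of a preamble that could go at the top of the script. It is not part of the attached script. The helper names log_debug and report_environment are made up for illustration, and it assumes the experiment really does capture stderr to file. Each message is flushed immediately, so anything written before a hang or a kill should still reach the file.

```python
from __future__ import print_function
import os
import sys


def log_debug(message, stream=sys.stderr):
    """Write one diagnostic line and flush it right away."""
    print('[debug] ' + message, file=stream)
    stream.flush()


def report_environment(argv, stream=sys.stderr):
    """Record how the script was started.

    Returns True if argv[1] names a readable file, False otherwise.
    Both helper names are hypothetical; they do not come from Airavata
    or OpenMM.
    """
    log_debug('python: %s' % sys.executable, stream)
    log_debug('cwd: %s' % os.getcwd(), stream)
    log_debug('argv: %r' % (argv,), stream)
    if len(argv) < 2:
        log_debug('no input PDB path was passed', stream)
        return False
    path = argv[1]
    readable = os.path.isfile(path) and os.access(path, os.R_OK)
    log_debug('input %s readable: %s' % (path, readable), stream)
    return readable
```

Calling report_environment(sys.argv) before the app.PDBFile(sys.argv[1]) line would show whether the script starts at all on the compute side, which interpreter it runs under, and whether the staged input file is where the script expects it.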