Hi, a user's job manager gets into a state where submission with globusrun hangs and the job is never submitted.
The server logs say:

Sep 13 15:35:30 gram5 gridinfo[30804]: ts=2011-09-13T03:35:30.104424Z id=30804 event=gram.job.start level=INFO gramid=/16145890501405029996/576663433152357309/ peer=130.216.189.203:57672
Sep 13 15:35:30 gram5 gridinfo[30804]: ts=2011-09-13T03:35:30.104582Z id=30804 event=gram.add_request.end level=WARN gramid=/16145890501405029996/576663433152357309/ status=-130 reason="the job manager was sent a stop signal (job is still running)"
Sep 13 15:35:30 gram5 gridinfo[30804]: ts=2011-09-13T03:35:30.104885Z id=30804 event=gram.job.end level=INFO gramid=/16145890501405029996/576663433152357309/ status=-130 msg="Request start failed" reason="the job manager was sent a stop signal (job is still running)"

Submission with globusrun hangs:

globusrun -batch -r gram5.ceres.auckland.ac.nz '&(executable=echo)(arguments= hello)(job_type=single)(count=1)(hostCount=1)(vo="/nz/nesi")(maxWalltime=10)(directory=/home/smas036)'
globus_gram_client_callback_allow successful
GRAM Job submission successful
https://gram5.ceres.auckland.ac.nz:40398/16145891598704212781/576663433152357309/

Submission with two-phase does not hang and results in:

globusrun -batch -r gram5.ceres.auckland.ac.nz '&(two_phase=5)(executable=echo)(arguments= hello)(job_type=single)(count=1)(hostCount=1)(vo="/nz/nesi")(maxWalltime=10)(directory=/home/smas036)'
globus_gram_client_callback_allow successful
GRAM Job submission failed because the job contact string does not match any which the job manager is handling (error code 156)
https://gram5.ceres.auckland.ac.nz:40398/16145891597960224316/576663433152357309/

Our users run into this problem all the time, but I cannot reproduce whatever puts the job manager into that state. They can submit again once I kill it. We had not seen this before our job submission software started submitting jobs with two-phase.

Cheers,
Yuriy
