Hi,

On 10.01.2012 at 15:44, Hugh Macdonald wrote:

> We've been using Grid (Currently Son of Grid Engine, 8.0.0b) for about 9
> months now, but the person who knew it best recently left, so I'm spending a
> lot of my time trying to get my head around things.
>
> I'm trying to get Grid to accurately respond to return codes (99 for retry,
> 100 for error)
>
> It seems that, if I use 'qsub' to submit a job, it works fine, but if I
> submit the same job through drmaa, it doesn't respond correctly.
>
> The script I've been running is purely the following:
>
> sleep 5
> exit 100
>
> When I submit it using drmaa (via our own internal interface), I checked the
> DRMAA's job template, and found a load of command-line args in
> 'nativeSpecification'. I used these to replicate, as close as possible, the
> same command in qsub.

yes, it's a known issue:

http://arc.liv.ac.uk/pipermail/gridengine-users/2010-June/031158.html

-- Reuti

> This is the qsub command-line that I used.
>
> qsub -w n -l clamp=0,arch=lx-amd64 -p -800 -shell yes -js 0 -P myproject -V
> -q default -R yes -pe XX 1-1 -wd `pwd` -j y ./test_exit.sh
>
> I find that the DRMAA jobs, as soon as they finish, disappear from Grid.
> However, when the qsub submitted jobs finish, they get (correctly) marked up
> as Errored.
>
> I grabbed the 'qstat -j <jobid>' output from each of the created tasks. The
> only differences were the job name, and that the DRMAA task has the following
> in it:
>
> stdout_path_list: NONE:myhostname:/jobs/myproject/seq/shot/work/app/logs
>
> Any ideas what might be going on here? It'd be good to be able to get my jobs
> to be able to error/retry properly.
>
> Cheers
>
> Hugh Macdonald
> nvizible – VISUAL EFFECTS
> www.nvizible.com
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
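
For anyone wanting to reproduce the comparison above, here is a rough sketch of a parameterised variant of the quoted test script, so the same file can exercise both the retry (99) and error (100) codes mentioned in the question. The function name run_test_exit and its two arguments are made up for illustration; they are not part of Grid Engine or DRMAA, and the sketch does not work around the DRMAA behaviour itself.

```shell
#!/bin/sh
# Sketch only: a parameterised version of the test script quoted in the thread.
# Exit codes as described in the question: 99 = retry, 100 = error.
# run_test_exit is a hypothetical helper name, not a Grid Engine command.
run_test_exit() {
    code="${1:-100}"   # status to report; defaults to 100 (error)
    delay="${2:-5}"    # seconds to sleep first; defaults to 5 as in the thread
    sleep "$delay"
    return "$code"
}

# Possible use at the end of a job script (left commented out here):
# run_test_exit 99 5; exit $?
```

Submitting one copy with 99 and one with 100, once through qsub and once through DRMAA, should make it easy to see which submission path honours which code.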
