Hi all, I have a queue configured for USR2 notification on qdel, with a 10-second delay.
It works fine for batch jobs: I can see that the USR2 signal is sent 10 seconds before the KILL. But with parallel jobs, I have several issues:

1) Subjobs submitted using qrsh -inherit are killed right away (every time). Is there a way to inherit -notify to subtasks?
2) The master job also gets killed right away "randomly" (about 1 time out of 10), just after being sent the USR2 signal.

I pasted simple reproducer scripts here: https://gist.github.com/nicoulaj/91a18d5c0ed952cbd027bae53bbbedbd
- test.sh is the submit command
- test-master.sh is the parallel job script
- test-slave.sh is the parallel subtask script

I found this old issue, which makes me think SGE is supposed to handle this correctly: https://arc.liv.ac.uk/trac/SGE/ticket/660

Any ideas?

Best regards,
Julien
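
P.S. For anyone unfamiliar with the setup, here is a minimal sketch of the kind of USR2 handling a job script would need. It is not the content of the gist scripts: the names notified, on_usr2 and run_interruptible are made up for illustration, and it assumes the job was submitted with -notify so that USR2 arrives before the KILL.

```shell
#!/bin/sh
# Illustrative sketch only, NOT the gist's test-master.sh.
# Assumption: the job is submitted with -notify, so USR2 precedes KILL.

notified=0   # hypothetical flag, set once USR2 has been seen

on_usr2() {
    # Runs when the shell receives USR2; real cleanup would go here.
    notified=1
    echo "USR2 received, cleaning up before KILL"
}

trap on_usr2 USR2

# Hypothetical helper: run a command in the background and wait for it.
# Waiting on a background child lets the shell run the trap as soon as
# the signal arrives, instead of after a foreground command finishes.
run_interruptible() {
    "$@" &
    child=$!
    wait "$child"
}
```

In a real job script the work would be started with something like `run_interruptible ./my-work.sh` (my-work.sh being a placeholder), and the script would check `$notified` afterwards to decide whether to clean up and exit.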