Greetings.

We have a queue defined with the following soft and hard wall-clock limits:

    qconf -sq free64 | egrep "_rt|notify"
    notify    00:05:00
    s_rt      48:00:00
    h_rt      48:05:00

Jobs get killed correctly after 2 days of wall-clock run time.

We now have Grid Engine checkpointing set up. Instead of being killed, we would like jobs to be sent the suspend signal so that the checkpoint process takes over.

After some reading and some tests with the queue "suspend_method" setting, I am not sure I am on the right track.

So what is the correct way to do this? That is, how do we *not* have jobs killed, but have the checkpoint process take over when s_rt is reached?

Joseph
