On 13.06.2012 at 02:10, Joseph Farran wrote:

> Well, for our needs, we *REALLY* need Parallel Job suspension. It's not
> even a choice for us.
>
> If Torque/Maui can do it, I am sure OGE can do it without issues.
>
> Can someone please tell me what patch I need to install to un-break / turn-on
> Parallel job suspension?
>
> If you guys are that paranoid about PE suspension, how about adding an on/off
> flag for this since the code is already there and let the admin pick.

Yep, but there is also the case that a slave gets suspended and you have to distribute it back to the master process of the parallel job. Therefore I had the idea to set:

qmaster_params SUSPEND_PARALLEL_GROUP=yes

in the RFE. But now I wonder whether it would be better to put it in the PE definition, as some parallel libraries might not like it. This is related to another RFE, whether a job is eligible for suspension or not.

https://arc.liv.ac.uk/trac/SGE/ticket/735

-- Reuti

> Joseph
>
>
> On 06/12/2012 06:52 AM, Dave Love wrote:
>> "Joseph A. Farran"<[email protected]> writes:
>>
>>> If you guys are taking requests, *please* add suspension and ignore old Sun
>>> recommendation.
>> Support for suspension exists, it's just broken (per the issue Reuti
>> pointed to). The use of | is clearly wrong, but the other bit isn't
>> clear. It's one of the available patches I wanted to understand before
>> applying (and had forgotten about). Can anyone cast more light on it?
>>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
