On 26.03.2013 at 17:10, Reuti wrote:

> Hi,
>
> On 26.03.2013 at 12:17, Arnau Bria wrote:
>
>> I'm migrating a bash jsv script to perl and adding some
>> modifications, but I have some doubts:
>>
>> 1) jsv_correct vs jsv_accept. From man:
>>
>> If the result_type is ACCEPTED the job will be accepted as it was
>> initially submitted by the end user. All param_commands and
>> env_commands which might have been sent before the
>> result_command are ignored in this case. The result_type CORRECT
>> indicates that the job should be accepted after all modifications sent
>> via param_commands and env_commands are applied to the job
>>
>> But if I do modifications (I'm doing jsv_sub_add_param) and then
>> jsv_accept, the job is modified and submitted, so, why is jsv_correct
>> needed? what could happen if I do not correct but accept?
>
> I would say it's a bug, that the changes made to the job are committed. They
> should be ignored.

As far as I remember, there is just one implementation which sends only
the differences in the case of jsv_correct. It might be the Java JSV. I
agree that it would be easier for JSV implementers as well as for users
to have just one statement.

>> 2) core binding. I have it configured for serial and smp jobs, but
>> which is the correct strategy and configuration for mpi jobs?
>> Is linear going to span jobs across different host sockets?
>
> AFAICS the request is applied on all machines which you get granted for the
> job. I.e. applied per `qrsh -inherit ...` besides setting it for the
> jobscript already. This is hard to handle in case of a round robin
> allocation, as you don't know in advance whether you get just one slot per
> machine or more. Maybe the best would be to use it with a fixed allocation
> rule only.

Yes, linear spans across sockets, but it tries to allocate cores on one
socket first. Basically it chooses the socket with the most free cores
and fills it up, then it chooses the second socket, and so on. Something
like "packing" jobs close to shared cache levels.

In Univa Grid Engine it is no longer applied per `qrsh -inherit` call (as
it is in SGE 6.2u5). It is now a per-host request, because core
management was moved in 8.1.0 from the execd level into the scheduler
itself. The scheduler has a global view of the used resources.

When requesting linear with JSV you need to request "linear_automatic",
since "linear" is equivalent to something like
"qsub -binding linear:2:0,0", while "linear_automatic" is equivalent to
the more common "qsub -binding linear:2".

If you are using OpenMPI you can also generate a rankfile out of the PE
hostfile and delegate the core selection to OpenMPI. But in SGE the same
core selection is used on every host, hence the jobs must run
host-exclusively, which is no real advantage. In Univa Grid Engine you
don't have this limitation anymore, again because the scheduler selects
cores with a global view.
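
A rough sketch of the rankfile idea, in Python, not a tested
production script. It assumes the job was submitted with binding type
"pe", so that the fourth column of the PE hostfile lists the selected
cores as colon-separated "socket,core" pairs (for example "0,0:0,1"),
and that each slot on a host gets exactly one of those cores. The
function names and the host names in the example are made up for
illustration.

```python
def parse_pe_hostfile(text):
    """Return a list of (host, slots, [(socket, core), ...]) tuples.

    Assumed line format: <host> <slots> <queue> <socket,core:socket,core:...>
    A fourth column that is not in that form raises ValueError.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        host, slots, _queue, binding = fields[:4]
        cores = []
        for pair in binding.split(":"):
            socket, core = (int(n) for n in pair.split(","))
            cores.append((socket, core))
        entries.append((host, int(slots), cores))
    return entries


def build_rankfile(entries):
    """Build OpenMPI rankfile text, assigning one core per rank."""
    lines = []
    rank = 0
    for host, slots, cores in entries:
        if len(cores) < slots:
            raise ValueError(f"{host}: {slots} slots but only {len(cores)} cores")
        for socket, core in cores[:slots]:
            lines.append(f"rank {rank}={host} slot={socket}:{core}")
            rank += 1
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical hostfile content; in a real job it would be read
    # from the file named by $PE_HOSTFILE.
    sample = (
        "node01 2 all.q@node01 0,0:0,1\n"
        "node02 2 all.q@node02 1,0:1,1\n"
    )
    print(build_rankfile(parse_pe_hostfile(sample)))
```

In a job script the output would be written to a file and handed to
mpirun via its rankfile option (-rf or --rankfile).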

Maybe this is interesting for you:

http://www.gridengine.eu/grid-engine-internals/119-boosting-openmpi-performance-with-rankfiles-core-binding-and-univa-grid-engine

Daniel

> -- Reuti

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
