> On 05.05.2015 at 11:52, Stefano Bridi <[email protected]> wrote:
> 
> Ok, sorry, yesterday I missed replying to the list.
> Today is not a busy day for that queue, so I had to recreate the problem:
> by doing this I saw that while the queue is empty all works as expected
> (for the seconds between the submit and the start of the job, "qw" is
> displayed by 'qstat -q E5m' as expected).
> The E5m queue is built from 5 nodes: n010[4-8]. At the moment only one is
> under real use, so I need to submit 5 jobs to get one into "qw".
> 
> $ qsub sleeper.sh
> Your job 876766 ("sleeper.sh") has been submitted
> $ qsub sleeper.sh
> Your job 876767 ("sleeper.sh") has been submitted
> $ qsub sleeper.sh
> Your job 876768 ("sleeper.sh") has been submitted
> $ qsub sleeper.sh
> Your job 876769 ("sleeper.sh") has been submitted
> $ qsub sleeper.sh
> Your job 876770 ("sleeper.sh") has been submitted
> $ qalter -w v 876770
> Job 876770 cannot run in queue "opteron" because it is not contained in its hard queue list (-q)
> Job 876770 cannot run in queue "x5355" because it is not contained in its hard queue list (-q)
> Job 876770 cannot run in queue "e5645" because it is not contained in its hard queue list (-q)
> Job 876770 cannot run in queue "x5560" because it is not contained in its hard queue list (-q)
> Job 876770 cannot run in queue "x5670" because it is not contained in its hard queue list (-q)
> Job 876770 cannot run in queue "E5" because it is not contained in its hard queue list (-q)
> Job 876770 (-l exclusive=true) cannot run at host "n0104" because exclusive resource (exclusive) is already in use
> Job 876770 (-l exclusive=true) cannot run at host "n0105" because exclusive resource (exclusive) is already in use
> Job 876770 (-l exclusive=true) cannot run at host "n0106" because exclusive resource (exclusive) is already in use
> Job 876770 (-l exclusive=true) cannot run at host "n0107" because exclusive resource (exclusive) is already in use
> Job 876770 (-l exclusive=true) cannot run at host "n0108" because exclusive resource (exclusive) is already in use
> verification: no suitable queues
> $
> 
> Does this mean that the "exclusive" complex requested via "qsub -l excl=true"
> is evaluated on the node before the check against the hard queue list?

It's not related only to the "exclusive" use. There seem to be some side
effects depending on whether a complex is attached to an exechost and/or a
queue: I see jobs disappearing and reappearing, depending on other running
jobs, when I use "-q". Nevertheless, maybe you can attach the complex at the
queue level too.

-- Reuti
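A minimal sketch of the queue-level attachment suggested above, assuming the
queue and complex names used in this thread (E5m, exclusive) and stock qconf
behaviour; the exact -aattr form is an assumption, so check qconf(1) and
queue_conf(5) before running it with SGE manager rights:

  # Append exclusive=true to the complex_values list of the E5m queue
  # (-aattr adds an entry to a list attribute instead of replacing it).
  qconf -aattr queue complex_values exclusive=true E5m

  # Equivalent: edit the queue interactively and set complex_values there.
  qconf -mq E5m

  # Afterwards the consumable can be inspected per queue instance.
  qstat -F exclusive -q E5m

A consumable listed in a queue's complex_values is booked per queue instance,
while one listed on an exechost is booked per host across all queues, which is
why it can make a difference where the complex is attached.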
> If I am correct, is there another way to have both 'qstat -q' and the
> exclusive use of nodes working?
> thanks
> stefano
> 
> On 04/May/2015 13:46, "Reuti" <[email protected]> wrote:
> Hi,
> 
> > On 04.05.2015 at 13:25, Stefano Bridi <[email protected]> wrote:
> > 
> > Hi all,
> > I need to give users the possibility to reserve one or more nodes for
> > exclusive use for their runs.
> > It is a mixed environment, and if they don't reserve nodes for exclusive
> > use, the serial and low-core-count jobs fragment the availability of
> > cores across many nodes.
> > The problem is that now the "exclusive" jobs are no longer listed in the
> > "per queue" qstat.
> > 
> > We solved the exclusive request by setting up a new complex:
> > 
> > # qconf -sc excl
> > #name        shortcut   type   relop   requestable   consumable   default   urgency
> > #------------------------------------------------------------------------------------
> > exclusive    excl       BOOL   EXCL    YES           YES          0         1000
> > 
> > and setting the corresponding complex on every node usable in this way
> > (is there a way to set this system-wide?):
> > 
> > # qconf -se n0108
> > hostname              n0108
> > load_scaling          NONE
> > complex_values        exclusive=true
> > load_values           arch=linux-x64,num_proc=20,....[snip]
> > processors            20
> > user_lists            NONE
> > xuser_lists           NONE
> > projects              NONE
> > xprojects             NONE
> > usage_scaling         NONE
> > report_variables      NONE
> > 
> > Now if I submit a job like:
> > $ cat sleeper.sh
> > #!/bin/bash
> > 
> > #
> > #$ -cwd
> > #$ -j y
> > #$ -q E5m
> > #$ -S /bin/bash
> > #$ -l excl=true
> > #
> > date
> > sleep 20
> > date
> > 
> > $
> > all works as expected except qstat.
> > A generic 'qstat' reports:
> > job-ID   prior     name         user      state   submit/start at       queue   slots   ja-task-ID
> > ----------------------------------------------------------------------------------------------------
> > 876735   0.50601   sleeper.sh   s.bridi   qw      05/04/2015 12:20:45            1
> > 
> > and 'qstat -j 876735' reports:
> > ==============================================================
> > job_number:            876735
> > exec_file:             job_scripts/876735
> > submission_time:       Mon May 4 12:20:45 2015
> > owner:                 s.bridi
> > uid:                   65535
> > group:                 domusers
> > gid:                   15000
> > sge_o_home:            /home/s.bridi
> > sge_o_log_name:        s.bridi
> > sge_o_path:            /sw/openmpi/142/bin:.:/ge/bin/linux-x64:/usr/lib64/qt-3.3/bin:/ge/bin/linux-x64:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/s.bridi/bin
> > sge_o_shell:           /bin/bash
> > sge_o_workdir:         /home/s.bridi/testexcl
> > sge_o_host:            login0
> > account:               sge
> > cwd:                   /home/s.bridi/testexcl
> > merge:                 y
> > hard resource_list:    exclusive=true
> > mail_list:             s.bridi@login0
> > notify:                FALSE
> > job_name:              sleeper.sh
> > jobshare:              0
> > hard_queue_list:       E5m
> > shell_list:            NONE:/bin/bash
> > env_list:
> > script_file:           sleeper.sh
> > scheduling info:       [snip]
> > 
> > while 'qstat -q E5m' doesn't list the job!
> 
> Usually this means that the job is not allowed to run in this queue.
> 
> What does:
> 
> $ qalter -w v 876735
> 
> output?
> 
> -- Reuti
> 
> > Thanks
> > Stefano
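Regarding the "is there a way to set this system-wide?" question above: a
consumable put into the complex_values of the pseudo host "global" is booked
cluster-wide (one counter shared by all nodes), so per-node exclusivity needs
the value on each exechost, or on the queue as sketched earlier. A possible
loop over the hosts named in this thread, again assuming the usual qconf
-aattr semantics; treat it as a sketch and verify against qconf(1) first:

  # Add exclusive=true to complex_values on every node backing the E5m queue
  # (use -mattr instead of -aattr if a host already carries an exclusive entry).
  for h in n0104 n0105 n0106 n0107 n0108; do
      qconf -aattr exechost complex_values exclusive=true "$h"
  done

  # Spot-check the result on one host.
  qconf -se n0108 | grep complex_values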

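The 'qalter -w v' check used in the thread can also be run before a job
exists: qsub accepts the same -w v validation level, which (per qsub(1))
verifies the request without actually submitting the job. A short sketch
reusing the sleeper.sh script, queue and job number from above; the exact
wording of the verification output will differ per installation:

  # Dry-run the request without creating a job.
  qsub -w v -q E5m -l excl=true sleeper.sh

  # Or verify a job that is already pending, as done above.
  qalter -w v 876770

  # When checking why a pending job is missing from the per-queue view,
  # include all users explicitly.
  qstat -q E5m -u '*'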