On 18 November 2011 14:21, Gerard Henry <[email protected]> wrote:
> hello all,
>
> i got trouble to confgure a queue on SGE 6.2u5 (linux)
>
> I have two machines amd64, with this topology: SCCSCC so the total of
> cores is 8.
>
> first, i defined a group:
> # qconf -shgrp @qlong
> group_name @qlong
> hostlist charybde scylla
>
> then a queue:
> # qconf -sq long1
> qname                 long1
> hostlist              @qlong
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              0
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               make
> rerun                 FALSE
> slots                 4
> tmpdir                /tmp
> shell                 /bin/csh
> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            NONE
> xuser_lists           NONE
> subordinate_list      NONE
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         default
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 INFINITY
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                INFINITY
>
> but when i try to submit a job, it fails with:
> % qsub -w v ./script1.sh
> Job 14431 cannot run in PE "mpi_labo" because it only offers 0 slots
>
> the beginning of the script is:
> ...
> #$ -q long1
> #$ -pe mpi_labo 6
>
>
> and the PE is defined by:
> qconf -sp mpi_labo
> pe_name            mpi_labo
> slots              8
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $pe_slots

I think the above line is the problem: $pe_slots means that "the full
range of processes as specified with the qsub(1) -pe switch has to be
allocated on a single host". You only have 4 slots per host, so jobs
larger than that won't run. If you want the job spread across both
hosts, set allocation_rule to $fill_up or $round_robin instead (see
sge_pe(5)).

> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary FALSE
>
>
> If i try to submit with "-pe mpi_labo 4", it works. What am i missing?
>
> I also tried to augment the value:
> qconf -mq long1
> slots                 8
> but in this case, the program executes his 8 threads on the same host,
> that's not what i want;
>
> thanks in advance for help,
>
> gerard

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
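
P.S. To make the allocation_rule point above concrete, here is a small Python sketch of how the three rules would treat a 6-slot request on two hosts with 4 free slots each. It is only a toy model, not the real scheduler: the function name allocate() is made up for illustration, hosts are simply tried in the order given (the real scheduler orders them by suitability), and load thresholds, the queue's pe_list and the PE's own slot limit are all ignored.

```python
from typing import Dict, Optional


def allocate(rule: str, requested: int, free: Dict[str, int]) -> Optional[Dict[str, int]]:
    """Toy model (hypothetical helper): slots granted per host, or None if unsatisfiable."""
    hosts = list(free)  # assumption: hosts are tried in the order given

    if rule == "$pe_slots":
        # All requested slots must come from one single host.
        for host in hosts:
            if free[host] >= requested:
                return {host: requested}
        return None

    if sum(free.values()) < requested:
        return None  # not enough slots even across all hosts

    granted = {host: 0 for host in hosts}
    remaining = requested

    if rule == "$fill_up":
        # Exhaust one host before moving on to the next.
        for host in hosts:
            take = min(free[host], remaining)
            granted[host] = take
            remaining -= take
            if remaining == 0:
                break
    elif rule == "$round_robin":
        # Hand out one slot per host in turn until the request is met.
        while remaining > 0:
            for host in hosts:
                if remaining > 0 and granted[host] < free[host]:
                    granted[host] += 1
                    remaining -= 1
    else:
        raise ValueError(f"unsupported rule in this sketch: {rule}")

    return {host: n for host, n in granted.items() if n > 0}


if __name__ == "__main__":
    free_slots = {"charybde": 4, "scylla": 4}  # slots 4 in queue long1, two hosts
    for rule in ("$pe_slots", "$fill_up", "$round_robin"):
        print(rule, allocate(rule, 6, free_slots))
```

In this model $pe_slots yields nothing for 6 slots but succeeds for 4, which matches what you observed; the other two rules split the 6 slots across charybde and scylla.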
