Hi,

On 25.02.2013 at 20:42, Arnau wrote:
> I have an Open MPI parallel environment configured in our cluster and it was
> working fine until now, when we have lots of simple jobs in the queue and the
> ompi ones are not being scheduled. They've been in the queue for some time and
> now they are the first ones to be scheduled, but they never find enough free
> slots. Every time a slot becomes free, some job with low priority starts:
>
> I've added "-R y" to force resource reservation, but the jobs are still
> in the queue.

Did you also specify a runtime for all jobs? Otherwise the default runtime will be used. What are the settings of:

$ qconf -ssconf
...
max_reservation 16
default_duration 9999:00:00

-- Reuti

> So I'm missing some configuration step, and I've been reading and looking
> around but I've not found what it is...
>
>
> 65316 0.05000 wath.sh XXXXXX r 02/25/2013 20:37:50 1
>
> ############################################################################
> - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
> 63070 0.26805 mpi_astrid YYYYY qw 02/25/2013 12:20:45 20
>
> Some of my conf:
>
> # qconf -sp ompi
> pe_name ompi
> slots 128
> user_lists NONE
> xuser_lists NONE
> start_proc_args /bin/true
> stop_proc_args /bin/true
> allocation_rule $fill_up
> control_slaves TRUE
> job_is_first_task FALSE
> urgency_slots min
> accounting_summary TRUE
>
> # qconf -sq default|grep omp
> pe_list smp ompi
>
> qstat:
>
> job_number: 63070
> exec_file: job_scripts/63070
> submission_time: Mon Feb 25 12:20:45 2013
> owner: XXXX
> uid: XXX
> group: XXXX
> gid: 6171
> sge_o_home: /users/jXXXX
> sge_o_log_name: XXXX
> sge_o_path:
> /usr/lib64/openmpi/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/lib64/openmpi/bin/:/usr/lib64/compat-openmpi/bin/:/users/jjaeger/dcicin/bin
> sge_o_shell: /bin/bash
> sge_o_workdir: /nfs/users/XXXXX
> sge_o_host: ant-XXX
> account: sge
> cwd: /users/XXXXX
> reserve: y
> merge: y
> hard resource_list: virtual_free=12G
> mail_list: XXX@ant-XXXXXes
> notify: FALSE
> job_name: mpi_astrid.sh
> jobshare: 0
> shell_list: NONE:/bin/bash
> env_list:
> script_file: mpi_astrid.sh
> parallel environment: ompi range: 20
> version: 1
> [...]
> cannot run in PE "ompi" because it only offers 2 slots
>
>
> I'm sure I'm missing some conf, but I don't know which file it is...
>
> Could anyone give me a hand?
>
> TIA,
> Arnau
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
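
The two points raised in the reply (whether the scheduler makes reservations at all, and whether every job carries a runtime) can be checked roughly as follows. This is only a sketch: the helper name `check_reservation` is hypothetical and not part of Grid Engine, the sample input reuses the two values quoted above, and the `h_rt` value in the commented `qsub` line is an arbitrary placeholder. It assumes that a `max_reservation` of 0 means no reservations are made, which is how the scheduler configuration is usually documented.

```shell
#!/bin/sh
# Hypothetical helper: reads `qconf -ssconf` output on stdin and reports
# whether the scheduler is configured to make any reservations.
check_reservation() {
    awk '
        $1 == "max_reservation"  { max = $2 }
        $1 == "default_duration" { dur = $2 }
        END {
            if (max + 0 > 0)
                printf "reservation enabled (max_reservation=%s, default_duration=%s)\n", max, dur
            else
                printf "reservation disabled (max_reservation=%s)\n", max
        }'
}

# Sample input built from the two settings quoted in the reply.
printf 'max_reservation 16\ndefault_duration 9999:00:00\n' | check_reservation
# prints: reservation enabled (max_reservation=16, default_duration=9999:00:00)

# On a live cluster (assumes qconf and qsub are on PATH):
#   qconf -ssconf | check_reservation
# Submitting the parallel job with a reservation and an explicit runtime
# (02:00:00 is a placeholder, not a recommendation):
#   qsub -R y -l h_rt=02:00:00 -pe ompi 20 mpi_astrid.sh
```

Jobs submitted without `-l h_rt=...` are treated as running for `default_duration`, so the small serial jobs need a realistic runtime too if the scheduler is to decide whether they fit in front of the reserved slots.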
