Hi,

On 25.02.2013 at 20:42, Arnau wrote:

> I have an Open MPI parallel environment configured in our cluster, and it was 
> working fine until now. We have lots of simple jobs in the queue and the ompi 
> ones are not being scheduled. They've been in the queue for some time and are 
> now the first ones to be scheduled, but they never find enough free slots. 
> Every time a slot becomes free, some job with low priority starts:
> 
> I've added "-R y" to force the resource reservation, but the jobs are still 
> in the queue.

Did you also specify a runtime for all jobs? Otherwise the default runtime 
(default_duration) will be used for them. What are the settings of:

$ qconf -ssconf
...
max_reservation                   16
default_duration                  9999:00:00
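
As a rough sketch of how these two settings could be checked (check_sconf is 
a hypothetical helper, not part of Grid Engine, and the h_rt value in the 
comment is only an example): to my knowledge, a max_reservation of 0 means 
no reservations are made at all, and default_duration is what the scheduler 
assumes for any job submitted without a runtime request.

```shell
#!/bin/sh
# Sketch only: check_sconf is a hypothetical helper, not a Grid Engine tool.
# It reads "qconf -ssconf"-style output on stdin and reports the two
# settings that matter for resource reservation.
check_sconf() {
    awk '
        $1 == "max_reservation"  { mr = $2 }
        $1 == "default_duration" { dd = $2 }
        END {
            if (mr == "" || mr + 0 == 0)
                print "max_reservation is 0 or unset: no reservations are made"
            else
                print "max_reservation=" mr ": reservations enabled"
            if (dd != "")
                print "default_duration=" dd ": assumed for jobs without a runtime request"
        }
    '
}

# On a live cluster one would run:  qconf -ssconf | check_sconf
# Here the values shown above are fed in as sample input instead.
printf '%s\n' 'max_reservation                   16' \
              'default_duration                  9999:00:00' | check_sconf

# A job that requests a runtime and a reservation could be submitted like
# this (the h_rt value is an assumed example, not taken from the thread):
#   qsub -R y -l h_rt=02:00:00 -pe ompi 20 mpi_astrid.sh
```

If the short serial jobs carry no runtime request, each of them is treated 
as running for the whole default_duration, so it may help to give them an 
h_rt as well.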

-- Reuti


> So I'm missing some configuration step. I've been reading and looking 
> around, but I've not found what it is...
> 
> 
>   65316 0.05000 wath.sh    XXXXXX     r     02/25/2013 20:37:50     1        
> 
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>   63070 0.26805 mpi_astrid YYYYY       qw    02/25/2013 12:20:45    20        
> 
> some of my conf:
> 
> # qconf -sp ompi
> pe_name            ompi
> slots              128
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $fill_up
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary TRUE
> 
> # qconf -sq default|grep omp
> pe_list               smp ompi
> 
> qstat:
> 
> job_number:                 63070
> exec_file:                  job_scripts/63070
> submission_time:            Mon Feb 25 12:20:45 2013
> owner:                      XXXX
> uid:                        XXX
> group:                      XXXX
> gid:                        6171
> sge_o_home:                 /users/jXXXX
> sge_o_log_name:             XXXX
> sge_o_path:                 
> /usr/lib64/openmpi/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/lib64/openmpi/bin/:/usr/lib64/compat-openmpi/bin/:/users/jjaeger/dcicin/bin
> sge_o_shell:                /bin/bash
> sge_o_workdir:              /nfs/users/XXXXX
> sge_o_host:                 ant-XXX
> account:                    sge
> cwd:                        /users/XXXXX
> reserve:                    y
> merge:                      y
> hard resource_list:         virtual_free=12G
> mail_list:                  XXX@ant-XXXXXes
> notify:                     FALSE
> job_name:                   mpi_astrid.sh
> jobshare:                   0
> shell_list:                 NONE:/bin/bash
> env_list:                   
> script_file:                mpi_astrid.sh
> parallel environment:  ompi range: 20
> version:                    1
> [...]
>                       cannot run in PE "ompi" because it only offers 2 slots
> 
> 
> I'm sure I'm missing some configuration, but I don't know which file it is...
> 
> Could anyone give me a hand?
> 
> TIA,
> Arnau
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

