Ok, thanks.
I tried dividing the total memory by the number of threads and it worked. I
hadn't tried it before because doing the same in other parallel jobs had
failed in the past. I suppose those were MPI jobs and the like.
As far as I understand (correct me if I'm wrong), the h_vmem limit is
set only for the shepherd process, isn't it? In that
case, if all slots are assigned to the same node, the limit is set to
the whole memory (say 10 GB). But if SGE schedules the job on 5 different
nodes, then the limit on each node is set to its proportional share (say
2 GB).
Is that right, or am I missing something?
Txema
PS: Sorry for the delay in answering. I wanted to try a few things
before emailing again, but I'm having problems with MPI right now. I'll
open a new thread later on.
On 22/02/12 19:39, Reuti wrote:
On 22.02.2012 at 18:57, Txema Heredia Genestar wrote:
Thanks for your answers. I'll go through them one by one, but first, a few clarifications:
1- We are stuck with 6.1u4. In a few weeks we will install a new cluster, with
a more recent version.
2- I don't care about "smp". In fact, before reading your answers I never understood
properly the differences between $pe_slots and $fill_up. I have a $pe_slots parallel environment
called "threaded" and the problem is still there. Basically, I just want my PE to NOT
multiply the memory reservation.
Now, your answers:
Bob - I would like to use "consumable JOB" but, unfortunately, this is not
available until SGE 6.2. Even so, it would break any MPI job trying to run in
our cluster. We mainly run single-core jobs, but from time to time some threaded or MPI
jobs need to be run.
Mazouzi - Right now I have PE's only available in two "testing" nodes. The
problem happens in them both.
Reuti - I have tried both combinations: 1-queue@1-node and 1-queue@N-nodes. No
luck, same problem everywhere. In fact, one node has 48Gb while the other has
56Gb, so when I ask for a 6-threaded 10Gb job (60Gb total), one node replies
stating that it only offers 4 slots, and the other offers 5.
Sure, if you request a particular node, then $fill_up and $pe_slots will have
the same effect. So you are limited to the installed memory. 60 GB isn't
installed, so you can't get it.
You need to divide by hand beforehand and request a rounded-up 2 GB per slot,
so you get 12 GB for your 6-slot threaded job.
-- Reuti
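The division Reuti describes can be sketched in shell. This is only an illustration under a few assumptions: per_slot_gb is a hypothetical helper, job.sh is a placeholder script name, and the PE name "threaded" is the $pe_slots PE mentioned earlier in the thread. The qsub line is printed rather than submitted.

```shell
#!/bin/sh
# Hypothetical helper: per-slot h_vmem in whole GB, rounded up, for a job
# that needs $1 GB in total spread over $2 slots.
per_slot_gb() {
    # Integer ceiling division: (a + b - 1) / b
    echo $(( ($1 + $2 - 1) / $2 ))
}

total_gb=10   # memory the threaded job needs as a whole
slots=6       # number of threads, i.e. PE slots

per_slot=$(per_slot_gb "$total_gb" "$slots")

# Print the submission line instead of running it; job.sh is a placeholder.
echo "qsub -pe threaded $slots -l h_vmem=${per_slot}G job.sh"
```

With 10 GB over 6 slots the per-slot request rounds up to 2 GB, so the scheduler reserves 6 x 2 GB = 12 GB, which matches the figure in Reuti's reply.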
I have read your ticket and that is exactly my problem, the resources multiply. But, as
far as I know, they solved it with the "consumable JOB" thing? Unfortunately
the links are broken (
http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/devel/rfe/non-multiplied-pe-requests.txt
).
JSVs are not an option in 6.1u4.
William - Yours is my best bet. A long time ago I tried tinkering with the
"slots" attribute, but never thought about adding this threaded one. I only see
one (minor) flaw in your solution: I cannot ask for a range of threads (from 4 to 8)
as with -pe. Any job submitted while our cluster is under some load would therefore
be stuck in the waiting queue, and that would need to be addressed by scheduling manually. But
that will do, thanks.
Thank you very much.
Txema
PS: One last question: as I have no experience with 6.2 and JSV, what should be
my go-to approach once we install our new cluster with an up-to-date version?
On 21/02/12 21:40, Reuti wrote:
Hi,
On 21.02.2012 at 20:20, Txema Heredia Genestar wrote:
Hello all,
I am having some problems running threaded jobs in SGE 6.1u4. In our cluster, h_vmem is defined as a consumable
attribute on all nodes. It is mandatory: all jobs must request it, and it has a default value of 6 GB. That constraint leads
any "parallel" job sent to the cluster to try to reserve a lot of memory (h_vmem * slots). This is OK for
most parallel processes (MPI and the like). But sometimes we need to run "threaded" jobs, where all threads
share a chunk of memory (everything on a single node). This leads to situations where I need to send an 8-threaded job
that requires, say, 10 GB of memory, but it cannot be scheduled because no node can handle an 80 GB request. When a
memory request cannot be fulfilled, the typical message of "cannot run in PE "smp" because it only
offers N slots" appears in qstat (where N is the maximum number of slots I would be able to use given the
requested h_vmem size).
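The arithmetic behind the "only offers N slots" message can be sketched as follows. This is a simplified illustration, assuming the node's whole h_vmem capacity is free and nothing else limits the slot count; offered_slots is a hypothetical helper, and the 48 GB and 56 GB figures are the two test nodes mentioned elsewhere in the thread.

```shell
#!/bin/sh
# Hypothetical helper: number of slots a node can grant when every slot
# consumes $2 GB of a consumable that has $1 GB available on that node.
offered_slots() {
    # Integer division rounds down: a partially covered slot is not granted.
    echo $(( $1 / $2 ))
}

h_vmem_gb=10   # per-slot request, multiplied by the slot count by SGE

for node_gb in 48 56; do
    echo "node with ${node_gb}G: offers $(offered_slots "$node_gb" "$h_vmem_gb") slots at h_vmem=${h_vmem_gb}G"
done

# The 8-thread job from the text would need 8 * 10 = 80 GB on one node.
echo "8 slots at ${h_vmem_gb}G each need $(( 8 * h_vmem_gb ))G"
```

Under these assumptions a 48 GB node can offer 4 slots and a 56 GB node 5 slots for a 10 GB per-slot request, so an 8-slot job cannot fit on either.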
This is the parallel environment I am trying to use:
# qconf -sp smp
pe_name smp
slots 9999
user_lists test_users
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
For SMP mode you will need $pe_slots here, unless you additionally request
exactly one node in the submission command.
I assume that before you simply got more than one node.
==
The answer from Bob changing the complex h_vmem to JOB would help for this type
of job, but not if you have also MPI jobs in the cluster. I had an RFE for
introducing this on a PE level:
https://arc.liv.ac.uk/trac/SGE/ticket/197
To quote from the issue: "Therefore I wrote, that an entry in the PE would still be
advantageous: h_vmem can only be JOBS or YES"
==
For now: you could adjust the memory request in a JSV depending on the
requested PE, but for this you need 6.2 IIRC.
-- Reuti
control_slaves FALSE
job_is_first_task FALSE
urgency_slots min
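For comparison, a sketch of the same PE with allocation_rule switched to $pe_slots, as Reuti suggests for SMP use. Every field is copied from the "smp" listing except pe_name and allocation_rule; the name "threaded" is the one used elsewhere in the thread, and the temporary file is only for illustration. The qconf call is left commented out because it needs manager rights on a real cluster.

```shell
#!/bin/sh
# Write the PE definition to a temporary file. The quoted 'EOF' keeps the
# shell from expanding $pe_slots inside the here-document.
pe_file=$(mktemp)
cat > "$pe_file" <<'EOF'
pe_name            threaded
slots              9999
user_lists         test_users
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  FALSE
urgency_slots      min
EOF

# Show what would be loaded.
cat "$pe_file"

# To register the PE from this file on a real cluster:
#   qconf -Ap "$pe_file"
```

With $pe_slots all slots of a job are placed on a single host, which is what a threaded program needs. The PE also has to be listed in a queue's pe_list before jobs can request it.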
The most annoying part of all this is that this behaviour is not consistent:
This morning I've been able to run a 6-threaded job requesting 10Gb of memory
in a 48Gb node. But, in the afternoon, the same job using the very same command
in the same node could not be run.
Does anyone have any suggestion on how to deal with this?
Thanks in advance,
Txema
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users