Bas van der Vlies wrote:
> Dear Michel,
>
> What I read from the code (it is a while back that I did that, but we are in
> the process of patching some Maui stuff):
> * N->TaskCount is the number of jobs on the node
> * N->DRes.Procs is which resources are given to the consumer
>
> So you can have a node with 16 cores, and if you have a shared node:
> * 4 nodes that consume all cores
You mean 4 jobs, I guess?
> With N->DRes.Procs you can determine if there are slots available for
> other jobs, e.g.:
> * 5 jobs with 2 cores each
> * N->DRes.Procs will be 10
> * and still 6 slots available
>
> That is what I read.
Hmm... Looking at the code, I see for example that functions
MPBSNodeLoad() and MPBSNodeUpdate() in file MPBSI.c both loop on the
comma-separated tokens of the "jobs" attribute of a node by incrementing
N->TaskCount for each token. This is a while loop using
    ptr = MUStrTok(tmpBuffer,", \t",&TokPtr);
to get the first token and
    ptr = MUStrTok(NULL,", \t",&TokPtr);
to get the next one, pretty similar to the C function strtok() and the
POSIX function strtok_r().
This means that with a jobs attribute looking like this:
    0/48.server, 1/48.server, 2/49.server, 3/49.server
N->TaskCount will end up taking the value 4, although only two jobs (48
and 49) run on the node. So it counts the number of processors in use,
not the number of jobs on the node.
Lines 3188 to 3209 of MPBSI.c are where MPBSNodeUpdate() processes one
token by incrementing N->TaskCount and extracting the JobID. It is
after that that things get interesting. N->DRes.Procs is incremented this way:
3211  if (MJobFind(JobID,&J,0) == SUCCESS)
3212    {
3213    if (J->Req[0]->DRes.Procs == -1)
3214      {
3215      tmpProcs = N->CRes.Procs;
3216      }
3217    else
3218      {
3219      tmpProcs = MAX(1,J->Req[0]->DRes.Procs);
3220      }
3221
3222    N->DRes.Procs = MIN(N->DRes.Procs + tmpProcs,N->CRes.Procs);
3223
If I understand correctly, J->Req[0]->DRes.Procs is the number of
processors dedicated to the job. But wait, what tells us that these
processors are on the current node? The only way I think this code might
work is if J->Req[0]->DRes.Procs == 0, making tmpProcs equal to 1. Then
N->DRes.Procs and N->TaskCount are the same...
MPBSNodeLoad() has similar-looking code, but not exactly the same:
2514  if (MJobFind(JobID,&J,0) == SUCCESS)
2515    {
2516    N->DRes.Procs += MAX(1,J->Req[0]->DRes.Procs); /* FIXME */
I will experiment with a debugger on an old cluster running an earlier
version of Torque and see what value J->Req[0]->DRes.Procs has.
>> On 8 dec. 2015, at 23:09, Michel Béland <[email protected]>
>> wrote:
>>
>> Hello,
>>
>> I am trying to modify Maui to understand correctly the "exec_host" job
>> attribute and the "jobs" node attribute. While reading the code, I
>> wondered what was the meaning of parts of the data structure.
>>
>> So what is the difference between N->TaskCount and N->DRes.Procs, where
>> N is a pointer of type mnode_t? The comments do not help a lot.
>>
>> --
>> Michel Béland, analyste en calcul scientifique
>> [email protected]
>> bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
>> téléphone : 514 343-6111 poste 3892 télécopieur : 514 343-2155
>> Calcul Québec (www.calculquebec.ca)
>> Calcul Canada (calculcanada.ca)
>>
>> _______________________________________________
>> mauiusers mailing list
>> [email protected]
>> http://www.supercluster.org/mailman/listinfo/mauiusers
> --
> Bas van der Vlies
> | Operations, Support & Development | SURFsara | Science Park 140 | 1098 XG
> Amsterdam
> | T +31 (0) 20 800 1300 | [email protected] | www.surfsara.nl |
--
Michel Béland, analyste en calcul scientifique
[email protected]
bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
téléphone : 514 343-6111 poste 3892 télécopieur : 514 343-2155
Calcul Québec (www.calculquebec.ca)
Calcul Canada (calculcanada.ca)