Re: Re: [gt-user] How to set cores-per-node in WS job submission?

Jan Ploski Thu, 08 May 2008 04:52:04 -0700

[EMAIL PROTECTED] wrote on 05/08/2008 12:36:23 PM:

> > >   2) generic control of RAM-per-process
> > 
> > We would like to move to JSDL where I would think this would be 
> > covered, but after scanning, it looks like it isn't.  jsdl posix has 
> > MemoryLimit, but that is for the job and not for each process in the 
> > job.  So I don't think even JSDL provides this.


> > 8.1.14.1 Definition
> > This element is a positive integer that describes the maximum amount 
> > of physical memory that the job should use when executing.
> > The amount is given in bytes. If this is not present then the
> > consuming system MAY choose its default value.
> > 
> 
> This would suffice if implemented properly. 
> 
> The memory per process would be 
>    mem_per_process = MemoryLimit / count
> 
> The number of cores to assign per node on cluster with multi-core nodes
> could be calculated as
> 
>    available_RAM_per_node / mem_per_process

Steve,

I'm not sure whether that would be a correct implementation or just another 
quick hack. JSDL says nothing about how a POSIXApplication relates to a 
group of processes launched by MPI. In fact, it is remarkably silent about 
the relationship between jobs and processes, and says nothing at all about 
relationships among processes. Maybe no one familiar with MPI participated 
in writing JSDL or, more likely, the tough issue was put off "until later".

Anyway, one can reason about the execution of an MPI application as a 
scenario involving the execution of n instances of a POSIXApplication. 
This interpretation fits the actual MPI runner implementations quite well: 
their job is always to launch n processes of the user-specified executable 
on m <= n machines, using whatever system-specific means are available. 
Therefore, I would suggest that if JSDL is used, the MemoryLimit in the 
POSIXApplication element should not be read as some aggregate "physical 
memory that the job should use when executing", to be divided among 
processes by a rule of thumb. Instead, treat it as a specification that 
applies to each single process of a multi-process job; it *is*, after all, 
a description of an executable POSIX process. For maximum flexibility, one 
should probably be able to specify a different POSIXApplication element 
for each MPI process.
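
To make the difference between the two readings concrete, here is a rough 
sketch in Python. The function names and the example numbers are invented 
for illustration; neither JSDL nor Globus defines them, and a real 
scheduler would also cap the result at the number of physical cores.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2


def cores_per_node_aggregate(memory_limit, count, ram_per_node):
    """MemoryLimit read as a total for the whole job (the quoted proposal).

    mem_per_process = MemoryLimit / count
    cores_per_node  = available_RAM_per_node / mem_per_process
    """
    if count <= 0 or memory_limit < count:
        raise ValueError("need count > 0 and at least one byte per process")
    mem_per_process = memory_limit // count
    return ram_per_node // mem_per_process


def cores_per_node_per_process(memory_limit, ram_per_node):
    """MemoryLimit read as a limit on each single process."""
    if memory_limit <= 0:
        raise ValueError("memory_limit must be positive")
    return ram_per_node // memory_limit


# Hypothetical job: 16 processes on nodes with 4 GiB of RAM each.
aggregate = cores_per_node_aggregate(8 * GiB, 16, 4 * GiB)
per_process = cores_per_node_per_process(512 * MiB, 4 * GiB)
misread = cores_per_node_per_process(8 * GiB, 4 * GiB)
print(aggregate, per_process, misread)
```

Both readings place 8 processes on a node when the user means 512 MiB per 
process, but the value written into MemoryLimit differs by a factor of 
count. A submitter and a consumer that disagree about the reading end up 
with a job that cannot be placed at all (the last case yields 0).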

Apart from these considerations, I am not sure if your "RAM per process" 
requirement is covered by the intent of MemoryLimit. MemoryLimit basically 
translates to "ulimit -m" in bash (JSDL authors also forgot to mention 
whether the hard or soft limit was meant). Is this what you are looking 
for? Or do you want to guarantee that a certain amount of memory can be 
allocated by a process without incurring paging activity during its entire 
execution? Perhaps both?
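
For reference, the limit behind "ulimit -m" is RLIMIT_RSS, and it can be 
inspected from Python's standard resource module (Unix only). The sketch 
below is only an illustration with helper names of my own choosing. Note 
that bash takes the value in kilobytes while setrlimit takes bytes, and 
that, according to the Linux setrlimit man page, current Linux kernels 
accept this limit without enforcing it.

```python
import resource


def describe_rss_limit():
    """Return the (soft, hard) RLIMIT_RSS pair as readable strings."""
    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value} bytes"

    soft, hard = resource.getrlimit(resource.RLIMIT_RSS)
    return fmt(soft), fmt(hard)


def lower_soft_rss_limit(limit_bytes):
    """Set the soft RLIMIT_RSS, leaving the hard limit untouched.

    An unprivileged process may set its soft limit to any value up to
    the hard limit, so the request is clamped to the hard limit.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_RSS)
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_RSS, (limit_bytes, hard))
    return resource.getrlimit(resource.RLIMIT_RSS)


print("before:", describe_rss_limit())
new_soft, new_hard = lower_soft_rss_limit(512 * 1024 * 1024)
print("after: ", describe_rss_limit())
```

The soft/hard pair is exactly the distinction that a bare MemoryLimit 
value leaves open.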

By the way, here is what the documentation of PBS/TORQUE offers on the 
topic of memory specifications (I bet you can find similar passages in SGE 
documentation, too):

'mem':
"Maximum amount of physical memory used by the job. (Ignored on Darwin, 
Digital Unix, Free BSD, HPUX 11, IRIX, NetBSD, and SunOS. Also ignored on 
Linux if number of nodes is not 1. Not implemented on AIX and HPUX 10.)"

'pmem':
"Maximum amount of physical memory used by any single process of the job. 
(Ignored on Fujitsu. Not implemented on Digital Unix and HPUX.)"

'pvmem':
"Maximum amount of virtual memory used by any single process in the job. 
(Ignored on Unicos.)"

'vmem':
"Maximum amount of virtual memory used by all concurrent processes in the 
job. (Ignored on Unicos.)"

Despite TORQUE being a "low-level" resource manager, this is just as fuzzy 
as JSDL, and remarkably non-portable. How on earth does one, in general, 
measure the "amount of physical memory" used by a process? The meaning of 
such magical parameters only becomes established for a particular 
architecture and particular configuration of TORQUE, and can only really 
be found out through experimentation or source code inspection. Some 
aspects of reality are really difficult to abstract away...

Regards,
Jan Ploski
