On 23.01.2012 at 16:30, Gerard Henry wrote:

> On 01/20/12 07:55 PM, Reuti wrote:
> 
>> Well, the CPU time being spent right now is not reported per process, at 
>> least I've never heard of it. The kernel schedules the processes outside 
>> of SGE, and the nice values are only relative. So it could only be reported 
>> over short time intervals, like: in the last 2 seconds, 50% was spent on 
>> this process and 10% on each of the following processes. But a second later 
>> this might already change, as one process is waiting for I/O. I would 
>> expect that `top` works this way, as the values are more constant when you 
>> specify a longer refresh interval.
>> 
>> As in the outline for the memory above, it should be possible to compute 
>> the difference in used-up CPU time over the last 1/5/15 minutes for each 
>> process, and from that the amount of CPU spent on the jobs in a particular 
>> queue.
>> 
>> ==
>> 
>> Another idea: build a load sensor (i.e. one complex needs to be defined per 
>> queue in question and attached to each exechost) and feed:
>> 
>> top -b -n 2 -d 120 -p 1600,1234 | tail -n 3
>> 
>> to it, where 1600 and 1234 are the PIDs of the jobs whose additional group 
>> was found beforehand in $SGE_SPOOL_DIR/active_jobs. I suggest -n 2, as the 
>> first output doesn't show anything meaningful regarding CPU time because 
>> the interval is too short. This means reading one cycle ahead to avoid a 
>> delay in replying to SGE. The values can then be sorted by queue and 
>> assigned to the appropriate complexes for each queue on each host.
> 
> 
> Thanks for your reply. I'm not sure I understand, certainly because my 
> question was so obscure.
> Here is a solution I tried. Considering that all jobs have ended, and I 
> want to give figures for the CPU and memory consumed by jobs during the 
> last year, I dug into /local/export/sge/default/common/accounting and 
> extracted, with the attached python script, the following results:
> user iusti (51) cpu: 634.871534559 days mem: 0.805913076979 Go
> user irphe (135) cpu: 414.912775315 days mem: 1.4525422628 Go
> user l3m (252) cpu: 567.227461918 days mem: 1.70951371079 Go
> user lma (45) cpu: 1139.86254098 days mem: 1.71787710938 Go
> user latp (106) cpu: 127.5595829 days mem: 1.41270344795 Go
> 
> I'm just summing the cpu and mem values for each user on a queue defined on 
> one host.
> The only problem is that the mem record is wrong due to the "4 GB" bug in 
> SGE 6.2u5.

Yes, after the job has finished you get the integral value over a certain 
time frame. But it's harder to make a statement about the actual CPU 
consumption of each task right now, while it's running. Imagine you have 
only a single-core CPU: over a tiny timeframe each process gets 100% and is 
the only one running, but this is not the output you are looking for. So 
it's necessary to define a timeframe: the share of CPU in the last 5 minutes 
or so for each queue. That's why I used -d 120 (two minutes) in the above 
`top` command (it seems the very first output always comes after about 
3 seconds and can't be avoided [unless you rewrite `top`]).
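
To make the load-sensor idea above concrete, here is a minimal sketch in 
Python instead of `top`: it samples utime+stime from /proc/<pid>/stat twice 
and reports the CPU share as one complex value per report cycle. The PID 
list, the complex name "cpu_dev", and the 2-second interval are all 
placeholders; a real sensor would collect the PIDs per queue from 
$SGE_SPOOL_DIR/active_jobs and report one complex per queue:

```python
#!/usr/bin/env python
# Hypothetical SGE load sensor sketch: reports the CPU share consumed in
# the last interval by a fixed set of PIDs.
import os
import sys
import time

CLK_TCK = os.sysconf("SC_CLK_TCK")  # clock ticks per second


def cpu_seconds(pid):
    """Return utime+stime of a process in seconds, from /proc/<pid>/stat."""
    with open("/proc/%d/stat" % pid) as f:
        data = f.read()
    # The comm field may contain spaces, so split after the closing ')'.
    fields = data.rsplit(")", 1)[1].split()
    utime, stime = int(fields[11]), int(fields[12])  # stat fields 14 and 15
    return (utime + stime) / float(CLK_TCK)


def cpu_percent(pids, interval):
    """CPU share (percent of one core) used by `pids` over `interval` secs."""
    before = sum(cpu_seconds(p) for p in pids)
    time.sleep(interval)
    after = sum(cpu_seconds(p) for p in pids)
    return 100.0 * (after - before) / interval


def main():
    host = os.uname()[1]
    # Load sensor protocol: execd sends a line per cycle; "quit" ends the
    # sensor, anything else triggers one begin/.../end report block.
    while True:
        line = sys.stdin.readline()
        if not line or line.strip() == "quit":
            break
        pids = [os.getpid()]          # placeholder: PIDs of the queue's jobs
        value = cpu_percent(pids, 2)  # short interval to reply quickly
        print("begin")
        print("%s:cpu_dev:%.1f" % (host, value))  # "cpu_dev": made-up complex
        print("end")
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```

Note the same trade-off as with `top -n 2`: measuring a rate needs two 
samples, so either the sensor blocks for the interval before answering, or 
it reports the value computed during the previous cycle.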


> Another point that is obscure to me: can I detect whether a job is an 
> OpenMP job or a serial job by looking at the accounting file?

An OpenMP job should have a higher cpu time than wallclock time.


> For instance, I can write the following command:
> cat /local/export/sge/default/common/accounting | awk -F':' 'BEGIN 
> {printf("owner group job_number granted_pe cpu mem category\n")} /<hostname>/ 
> && /<queuename>/ {printf("%s %s %s %s %s %s %s\n", $4, $3, $6, $34, $37, $38, 
> $40)}' | less
> owner group job_number granted_pe cpu mem category
> user1 iusti 13496 impi 2572922.350000 5628845.960567 -q dev -pe impi 10
> user1 iusti 13925 impi 15400293.770000 15728826.955261 -q dev -pe impi 25
> user2 iusti 13926 NONE 602366.910000 768713.172452 -q dev
> user2 iusti 14088 NONE 606897.320000 779858.199665 -q dev
> user1 iusti 14169 NONE 0.382940 0.003174 -U arusers -ar 2002
> 
> does "NONE" in granted_pe show an OpenMP job ?

Not per se; in fact, that would be an abuse of the field.

In my opinion, an OpenMP job is also a parallel job and should request a PE 
(maybe one named "smp" with "allocation_rule $pe_slots"). Otherwise a node 
might get oversubscribed.

But you can compare the wallclock time and CPU time consumed for the job.
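
That comparison can be sketched in Python, assuming the same 1-based 
':'-separated field numbering as your awk command above (job_number = 6, 
granted_pe = 34, cpu = 37) plus ru_wallclock in field 14; the factor of 1.5 
is an arbitrary threshold you would tune:

```python
# Sketch: flag accounting entries whose CPU time clearly exceeds their
# wallclock time, i.e. candidates for multi-threaded (e.g. OpenMP) jobs.
# Field numbers (1-based, ':'-separated) are assumed from the awk command
# above: job_number=6, ru_wallclock=14, cpu=37.


def looks_parallel(line, factor=1.5):
    """Return (job_number, cpu/wallclock ratio) if cpu > factor * wallclock,
    else None."""
    f = line.rstrip("\n").split(":")
    job, wallclock, cpu = f[5], float(f[13]), float(f[36])
    if wallclock > 0 and cpu > factor * wallclock:
        return job, cpu / wallclock
    return None


def scan(path="/local/export/sge/default/common/accounting"):
    with open(path) as acct:
        for line in acct:
            if line.startswith("#"):  # skip comment lines at the file top
                continue
            hit = looks_parallel(line)
            if hit:
                print("job %s: cpu/wallclock ratio %.1f" % hit)
```

A serial job should give a ratio near (or below) 1, while a job that kept 
several threads busy will sit well above it.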

-- Reuti
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users