Hi,

On 08.01.2014 at 13:11, Kidwai, Hashir Karim wrote:

> I am sure lots of people have asked a similar question, but I couldn't find 
> the exact answer I am looking for.
>  
> I have run the command (qacct -j for a particular user) on a cluster of 18 
> computing nodes with 12 cores each against all the jobs of the past 30 days 
> and compiled the results as follows.
> <snip>
>  
> I am analyzing and comparing the CPU time and wall clock time in hours (from 
> the qacct command) for jobs submitted and finished in the month of December 2013. 
> These are my findings, so please correct me if I am mistaken.
>  
> 1.       Wall clock time is the time from job submission to job finish.

No. It's the time from the start of the job to the end of the job. When the job 
was submitted is unrelated to the measured wall clock time: any time spent 
waiting in the queue is not included.
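
For illustration, the relevant fields of a `qacct -j <jobid>` record look 
roughly like this (the field names are the ones `qacct` prints, the values are 
made up):

    qsub_time    Wed Dec 11 09:00:12 2013    # submission time: not part of the wall clock
    start_time   Wed Dec 11 10:15:03 2013
    end_time     Wed Dec 11 12:15:03 2013
    ru_wallclock 7200                        # end_time - start_time, in seconds
    slots        12
    cpu          84000.000                   # CPU seconds summed over all tracked processes,
                                             # here close to 12 * ru_wallclock = 86400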


> 2.       CPU time is the usage time during the job execution.

Yes.


> Since every node is equipped with 12 cores, one should divide the time 
> (except for a few instances) by 12, which should give the same or close to 
> the same time as the wall clock.

(Not by 12, but by the number of cores requested and used for this particular 
job, and only if the slave tasks are tightly integrated into SGE [see below]; 
otherwise their CPU time is not accounted for at all. Unless you need X11 
forwarding inside the cluster, you can even disable ssh/rsh in the cluster [I do 
this to allow only admin staff to reach the nodes by a direct `ssh`].)

This depends on the implementation of the application. If it scales perfectly 
linearly with an increasing number of cores: yes. But often the execution time 
drops only to something like 66% instead of the ideal 50% for each doubling of 
the number of cores. And due to the algorithm and the communication between the 
processes, some cores might be idle for part of the time, and hence the overall 
CPU time across all cores divided by the number of cores is less than the wall 
clock time. There is nothing you can do about it, except not using too many 
cores for this particular application.

E.g. reaching only 66% instead of 50% per doubling, using 8 cores would lower 
the execution time (from 1.0) to 0.66^3 = 0.287 (the wall clock time needed) 
instead of 0.5^3 = 0.125: roughly half of the computing time is wasted. The real 
used CPU time might lie somewhere between 0.125 * 8 and 0.287 * 8.
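
As a quick check of the arithmetic above (just a sketch of the assumed 
66%-per-doubling scaling, not an SGE command):

    awk 'BEGIN {
        ideal = 0.5 ^ 3     # perfect scaling over three doublings: 1 -> 2 -> 4 -> 8 cores
        real  = 0.66 ^ 3    # only 66% of the previous run time per doubling
        printf "wall clock time: ideal %.3f, real %.3f\n", ideal, real
        printf "CPU time on 8 cores: between %.2f and %.2f\n", ideal * 8, real * 8
    }'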


> But it is in fact the total time of all the cores involved in running the job 
> (??). What exactly is the logic behind it, if my assumption is right?

> 3.       Slots are basically the total # of cores involved in job execution 
> (slots = Cores)??

Yes.
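
E.g. a job submitted with (the PE name "smp" is just an example and must exist 
in your cluster):

    $ qsub -pe smp 12 job.sh

will later show `slots 12` in its `qacct` record.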


> 4.       In some instances (not shown in the above table), although the wall 
> clock time is quite significant, the CPU usage time is close to 0. What could 
> be the logic behind it? Could it be a problem with the job or any other 
> factor?

This might happen if a parallel library is not tightly integrated into SGE. 
I.e.: the main job script starting the `mpiexec` doesn't consume any computing 
time at all (only a fraction for the startup), and the child processes on the 
slave nodes are not tracked by SGE. Which MPI library was used for these jobs 
showing almost no computing time, and what are the settings of the requested PE 
and the submission command itself?
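
You can check the PE with `qconf -sp <pe_name>`. For a tight integration (e.g. 
with an Open MPI built with SGE support) the relevant entries typically look 
like this sketch (the PE name "orte" and the slot count are only examples):

    $ qconf -sp orte
    pe_name            orte
    slots              216
    start_proc_args    /bin/true
    stop_proc_args     /bin/true
    allocation_rule    $fill_up
    control_slaves     TRUE      # slaves are started via `qrsh -inherit` and hence accounted
    job_is_first_task  FALSE
    accounting_summary FALSE

The matching submission would then be something like `qsub -pe orte 24 job.sh`.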


> 5.       While analyzing the jobs, I noticed that there is only one hostname 
> (compute node) associated with the job. Why is that so? What about the other 
> nodes which are running the same job; is there a way to trace them?

For a tightly integrated parallel job where all slave tasks are tracked by SGE, 
you would get an entry in `qacct` for each started remote process (unless the 
parallel environment [PE] has set "accounting_summary TRUE", see `man sge_pe`). 
If you get several entries, all entries must be summed up for this job.
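
E.g. to list the involved nodes and to sum up the CPU time over all entries of 
one job (the job id 12345 is just a placeholder, and the awk line assumes the 
usual `qacct` layout of one "cpu <seconds>" line per entry):

    $ qacct -j 12345 | grep hostname
    $ qacct -j 12345 | awk '$1 == "cpu" { sum += $2 } END { print sum, "CPU seconds in total" }'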

-- Reuti