Hi,

We recently noticed that the CPU time recorded in the accounting file
is wrong (far too high) for parallel SMP jobs (OpenMP, Gaussian09).

SGE version: OGS 2011.11
System: CentOS 6.4
kernel: 2.6.32-220.23.1.el6.x86_64

qacct -j 3559
...
jobnumber    3559
taskid       undefined
account      sge
priority     0
qsub_time    Mon Jan 13 21:47:08 2014
start_time   Mon Jan 13 21:47:09 2014
end_time     Sat Jan 25 21:33:12 2014
granted_pe   openmp
slots        8
failed       0
exit_status  0
ru_wallclock 1035963
ru_utime     12824765754.210
ru_stime     5630063030.787
ru_maxrss    853692
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    187417160
ru_majflt    0
ru_nswap     0
ru_inblock   1948210454
ru_oublock   -1658595332
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     169985434
ru_nivcsw    24673526
cpu          18454828784.997
mem          17468450365.385
io           6282.100
iow          0.000
maxvmem      10.010G
arid         undefined


The job ran for 1035963 seconds (about 12 days) on 8 slots, so how can the
reported CPU time be 18454828784 seconds?
Is this a known bug in recent versions of Linux?
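
To put a number on the discrepancy, here is a minimal sanity check using the values from the qacct output above. It assumes the job cannot use more cores than its 8 granted slots, so the CPU time should not exceed ru_wallclock * slots. The helper name max_cpu_seconds is only for illustration and is not part of SGE.

```python
def max_cpu_seconds(wallclock_s: float, slots: int) -> float:
    """Upper bound on CPU time, assuming every granted slot is busy
    for the whole run and the job uses no more cores than slots."""
    return wallclock_s * slots


# Values copied from the qacct -j 3559 output.
ru_wallclock = 1035963
slots = 8
ru_utime = 12824765754.210
ru_stime = 5630063030.787
cpu = 18454828784.997

bound = max_cpu_seconds(ru_wallclock, slots)
ratio = cpu / bound

print(f"expected upper bound: {bound:.0f} s")
print(f"ru_utime + ru_stime:  {ru_utime + ru_stime:.3f} s")
print(f"reported cpu / bound: {ratio:.0f}x")
```

The bound comes out at 8287704 seconds, so the reported cpu value is more than 2000 times larger than it could plausibly be. The cpu field matches ru_utime + ru_stime, so the inflated figure is already present in the rusage values rather than being introduced when they are summed.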

The PE definition is:

pe_name            openmp
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE

Regards,