[slurm-users] Inconsistencies in CPU time Reporting by sreport and sacct Tools

2024-04-17 Thread KK via slurm-users
I wish to ascertain the CPU core time utilized by users dj1 and dj. I have
tested with sreport cluster UserUtilizationByAccount, sreport job
SizesByAccount, and sacct. It appears that sreport cluster
UserUtilizationByAccount displays the total core time used by the entire
account, rather than the individual user's CPU time. Here are the specifics:

Users dj and dj1 are both under the account mehpc.

Between 2024-04-12 and 2024-04-15, dj1 used approximately 10 minutes of core
time, while dj used about 4 minutes. However, "sreport Cluster
UserUtilizationByAccount user=dj1 start=2024-04-12 end=2024-04-15" shows 14
minutes of usage. Similarly, "sreport Cluster UserUtilizationByAccount
user=dj start=2024-04-12 end=2024-04-15" shows about 14 minutes.
Using "sreport job SizesByAccount Users=dj1 start=2024-04-12
end=2024-04-15" or "sacct -u dj1 -S 2024-04-12 -E 2024-04-15 -o
"jobid,partition,account,user,alloccpus,cputimeraw,state,workdir%60" -X
|awk 'BEGIN{total=0}{total+=$6}END{print total}'" yields the accurate
value, which is around 10 minutes for dj1.

The details are attached.


detail_results
Description: Binary data

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Fwd: sreport cluster UserUtilizationByaccount Used result versus sreport job SizesByAccount or sacct: inconsistencies

2024-04-15 Thread KK via slurm-users
---------- Forwarded message ---------
From: KK
Date: Mon, Apr 15, 2024, 13:25
Subject: sreport cluster UserUtilizationByaccount Used result versus
sreport job SizesByAccount or sacct: inconsistencies
To: 


I wish to ascertain the CPU core hours utilized by users dj1 and dj. I have
tested with sreport cluster UserUtilizationByAccount, sreport job
SizesByAccount, and sacct. It appears that sreport cluster
UserUtilizationByAccount displays the total core hours used by the entire
account, rather than the individual user's CPU time. Here are the specifics:

Users dj and dj1 are both under the account mehpc.

Between 2024-04-12 and 2024-04-15, dj1 used approximately 10 minutes of core
time, while dj used about 4 minutes. However, "*sreport Cluster
UserUtilizationByAccount user=dj1 start=2024-04-12 end=2024-04-15*" shows
14 minutes of usage. Similarly, "*sreport Cluster UserUtilizationByAccount
user=dj start=2024-04-12 end=2024-04-15*" shows about 14 minutes.
Using "*sreport job SizesByAccount Users=dj1 start=2024-04-12
end=2024-04-15*" or "*sacct -u dj1 -S 2024-04-12 -E 2024-04-15 -o
"jobid,partition,account,user,alloccpus,cputimeraw,state,workdir%60" -X
|awk 'BEGIN{total=0}{total+=$6}END{print total}'*" yields the accurate
value, which is around 10 minutes for dj1. Here are the details:

[root@ood-master ~]# sacctmgr list assoc format=cluster,user,account,qos
   Cluster       User    Account        QOS
---------- ---------- ---------- ----------
     mehpc                  root     normal
     mehpc       root       root     normal
     mehpc                 mehpc     normal
     mehpc         dj      mehpc     normal
     mehpc        dj1      mehpc     normal


[root@ood-master ~]# sacct -X -u dj1 -S 2024-04-12 -E 2024-04-15 -o
jobid,ncpus,elapsedraw,cputimeraw
JobID             NCPUS ElapsedRaw CPUTimeRAW
------------ ---------- ---------- ----------
4                     1         60         60
5                     2        120        240
6                     1         61         61
8                     2        120        240
9                     0          0          0

[root@ood-master ~]# sacct -X -u dj -S 2024-04-12 -E 2024-04-15 -o
jobid,ncpus,elapsedraw,cputimeraw
JobID             NCPUS ElapsedRaw CPUTimeRAW
------------ ---------- ---------- ----------
7                     2        120        240
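
For reference, the per-user totals can be cross-checked by summing the
CPUTimeRAW column directly. The following is only a minimal sketch: it
assumes pipe-delimited sacct output (as produced with -n -P) with CPUTimeRAW
as the last field, and the function name sum_cputime is hypothetical. The
sample rows are copied from the dj1 output above.

```shell
# Sum the last field (CPUTimeRAW, in seconds) of pipe-delimited sacct rows
# read from stdin, then print the total in seconds and in minutes.
sum_cputime() {
    awk -F'|' '{ total += $NF } END { printf "%d s = %.1f min\n", total, total / 60 }'
}

# Rows for dj1 in the form JobID|NCPUS|ElapsedRaw|CPUTimeRAW.
printf '%s\n' '4|1|60|60' '5|2|120|240' '6|1|61|61' '8|2|120|240' '9|0|0|0' | sum_cputime
# prints: 601 s = 10.0 min

# On a live cluster the same function could presumably be fed directly:
#   sacct -X -n -P -u dj1 -S 2024-04-12 -E 2024-04-15 \
#       -o jobid,ncpus,elapsedraw,cputimeraw | sum_cputime
```

The single job for dj (240 s) comes to 4.0 minutes the same way, so the two
users together account for the 14 minutes that UserUtilizationByAccount
reports for each of them.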


[root@ood-master ~]# sreport job SizesByAccount Users=dj1 start=2024-04-12
end=2024-04-15

Job Sizes 2024-04-12T00:00:00 - 2024-04-14T23:59:59 (259200 secs)
Time reported in Minutes

  Cluster   Account     0-49 CPUs   50-249 CPUs  250-499 CPUs  500-999 CPUs  >= 1000 CPUs % of cluster
--------- --------- ------------- ------------- ------------- ------------- ------------- ------------
    mehpc      root            10             0             0             0             0      100.00%


[root@ood-master ~]# sreport job SizesByAccount Users=dj start=2024-04-12
end=2024-04-15

Job Sizes 2024-04-12T00:00:00 - 2024-04-14T23:59:59 (259200 secs)
Time reported in Minutes

  Cluster   Account     0-49 CPUs   50-249 CPUs  250-499 CPUs  500-999 CPUs  >= 1000 CPUs % of cluster
--------- --------- ------------- ------------- ------------- ------------- ------------- ------------
    mehpc      root             4             0             0             0             0      100.00%


[root@ood-master ~]# sreport Cluster UserUtilizationByAccount user=dj1
start=2024-04-12 end=2024-04-15

Cluster/User/Account Utilization 2024-04-12T00:00:00 - 2024-04-14T23:59:59
(259200 secs)
Usage reported in CPU Minutes

  Cluster     Login     Proper Name         Account     Used   Energy
--------- --------- --------------- --------------- -------- --------
    mehpc       dj1         dj1 dj1           mehpc       14        0



[root@ood-master ~]# sreport Cluster UserUtilizationByAccount user=dj
start=2024-04-12 end=2024-04-15

Cluster/User/Account Utilization 2024-04-12T00:00:00 - 2024-04-14T23:59:59
(259200 secs)
Usage reported in CPU Minutes

  Cluster     Login     Proper Name         Account     Used   Energy
--------- --------- --------------- --------------- -------- --------
    mehpc        dj           dj dj           mehpc       14        0


[root@ood-master ~]# sacct -u dj1 -S 2024-04-12 -E 2024-04-15 -o