Hi!

> For the same time period, what does the Reserved column say for
> "sreport -T CPU -t MinPer cluster utilization"?  Meaning, not by account.
> If that Reserved is 0 for the cluster overall, then that does explain why
> it's also zero for all accounts.  If there is a discrepancy, then perhaps
> there may be something to investigate.
>
In fact, I've just filed a bug about the cluster-wide sreport output, but only
for the GRES Reserved values:
https://bugs.schedmd.com/show_bug.cgi?id=3187

The CPU Reserved is non-zero (so there is a discrepancy, right?):

$ sreport -T CPU,GRES/gpu -t HourPer cluster utilization \
    Format=TresName,Allocated,Reserved,Idle,Down \
    Start=`date -d "last month" +%D` End=now
--------------------------------------------------------------------------------
Cluster Utilization 2016-09-18T00:00:00 - 2016-10-18T13:59:59
Use reported in TRES Hours/Percentage of Total
--------------------------------------------------------------------------------
     TRES Name         Allocated          Reserved              Idle              Down
-------------- ----------------- ----------------- ----------------- -----------------
           cpu     10718(23.47%)        148(0.33%)     26881(58.87%)      7914(17.33%)
      gres/gpu      2882(36.72%)          0(0.00%)      4570(58.24%)        395(5.04%)
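
As a quick sanity check on the numbers above, here is a minimal Python sketch
(not part of sreport; the values are copied by hand from the output above, and
the helper name `shares` is just something I made up). It assumes the four
displayed columns make up the whole total, which seems plausible because the
reported percentages add up to 100% on each row. The recomputed shares may
differ from sreport's in the last digit, since the hours shown are already
rounded.

```python
# TRES hours copied by hand from the sreport output above.
rows = {
    "cpu": {"Allocated": 10718, "Reserved": 148, "Idle": 26881, "Down": 7914},
    "gres/gpu": {"Allocated": 2882, "Reserved": 0, "Idle": 4570, "Down": 395},
}


def shares(columns):
    """Return (total hours, {column: percentage of total}).

    Assumption: the displayed columns cover the whole reporting total.
    """
    total = sum(columns.values())
    return total, {name: 100.0 * hours / total for name, hours in columns.items()}


for tres, columns in rows.items():
    total, pct = shares(columns)
    print(f"{tres}: total = {total} TRES hours")
    for name, hours in columns.items():
        print(f"  {name:<9} {hours:>6} ({pct[name]:.2f}%)")
```

The same helper could be fed the per-account Reserved values to check whether
they add up to the cluster-wide 148 CPU hours.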


I expected the sum of the by-account CPU Reserved values to equal the
overall CPU Reserved value (as it does for Allocated).

For GRES/GPU, Reserved is always 0 in every case, so it's
consistent... but to me it looked like a bug, so I reported it there... was
that ok?

> Reserved time in sreport is time nodes are held idle (by the backfill
> scheduler) to start the job.  If you aren't using backfill,
>
We are using SchedulerType=sched/backfill.


> or if all job submissions request about the same quantity of hardware
> resources then it may always be zero.  If there were some users submitting
> large jobs and some small, then I would expect there to be some non-zero
> time.
>
Users are requesting different amounts of resources, but I'm not sure about
the value of the "variance"... ;-)
Anyway, I need to read and think about why it should be 0 if all the users
ask for the same number of CPUs... even if they wait in the queue?
I'm sure I'm missing something here...


Thanks!


Albert

-- 
_________________________________________________________

OOO Albert Gil Moreno <https://imatge.upc.edu/web/people/albert-gil-moreno>
OOO Image Processing Group <https://imatge.upc.edu>
OOO Universitat Politècnica de Catalunya <http://www.upc.edu/>
_________________________________________________________
