Thanks, Ole! Your tools and what you do for the community are fantastic; we
all appreciate you!

Of course, I did look at (and use) your script. But I need more info.

And no, this is not something that users would run *ever* (let alone at
every login). This is something I *myself* (the cluster administrator) need
to run once a quarter, or perhaps even just once a year, to keep my managers
apprised of cluster utilization and to justify changes in funding for future
hardware purchases. Sorry for not making this clear, given the initial
message I replied to.
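
To make the request concrete, here is a minimal sketch of the kind of sacct
massaging I have in mind. It is only an illustration under assumptions: the
helper names (fetch_sacct, summarize) are hypothetical, queue wait is
approximated as Start minus Submit (which also counts time a job spent held
or waiting on dependencies, not only on unavailable resources), and CPU-hours
are approximated as ElapsedRaw times AllocCPUS. It groups by partition and
user and does not cover GPU or per-node accounting.

```python
import subprocess
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

# sacct fields requested, in the order summarize() expects them.
FIELDS = "JobID,User,Partition,Submit,Start,ElapsedRaw,AllocCPUS"


def fetch_sacct(start, end):
    """Run sacct for all users between start and end; return pipe-delimited text.

    -a: all users, -X: allocations only (no job steps),
    -n: no header, -P: parsable output with '|' separators.
    """
    cmd = ["sacct", "-a", "-X", "-n", "-P",
           "-S", start, "-E", end, f"--format={FIELDS}"]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def summarize(sacct_text):
    """Aggregate wait time and CPU-hours per (partition, user)."""
    waits = defaultdict(list)
    cpu_seconds = defaultdict(int)
    for line in sacct_text.splitlines():
        parts = line.split("|")
        if len(parts) != 7:
            continue  # skip blank or malformed lines
        _jobid, user, partition, submit, start, elapsed, cpus = parts
        try:
            wait = (datetime.fromisoformat(start)
                    - datetime.fromisoformat(submit)).total_seconds()
        except ValueError:
            continue  # job never started, so Start is not a timestamp
        key = (partition, user)
        waits[key].append(wait)
        cpu_seconds[key] += int(elapsed or 0) * int(cpus or 0)
    return {
        key: {
            "jobs": len(w),
            "mean_wait_s": mean(w),
            "median_wait_s": median(w),
            "cpu_hours": cpu_seconds[key] / 3600,
        }
        for key, w in waits.items()
    }
```

On the cluster this would be called as something like
summarize(fetch_sacct("2024-04-01", "2024-07-01")), with the result printed
or written to a spreadsheet for the quarterly report.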

Thanks for any suggestion you might have.

On Wed, Aug 21, 2024 at 12:19 AM Ole Holm Nielsen via slurm-users <
[email protected]> wrote:

> Hi Davide,
>
> Did you already check out what the slurmacct script can do for you?  See
>
> https://github.com/OleHolmNielsen/Slurm_tools/blob/master/slurmacct/slurmacct
>
> What you're asking for seems like a pretty heavy task regarding system
> resources and Slurm database requests.  You don't imagine this to run
> every time a user makes a login shell?  Some users might run "bash -l"
> inside jobs to emulate a login session, causing a heavy load on your
> servers.
>
> /Ole
>
> On 8/21/24 01:13, Davide DelVento via slurm-users wrote:
> > Thanks Kevin and Simon,
> >
> > The full thing that you do is indeed overkill, however I was able to learn
> > how to collect/parse some of the information I need.
> >
> > What I am still unable to get is:
> >
> > - utilization by queue (or list of node names), to track actual use of
> > expensive resources such as GPUs, high memory nodes, etc
> > - statistics about wait-in-queue for jobs, due to unavailable resources
> >
> > hopefully both in a sreport-like format by user and by overall system
> >
> > I suspect this information is available in sacct, but needs some
> > massaging/consolidation to become useful for what I am looking for.
> > Perhaps either (or both) of your scripts already do that in some place
> > that I did not find? That would be terrific, and I'd appreciate it if you
> > can point me to its place.
> >
> > Thanks again!
> >
> > On Tue, Aug 20, 2024 at 9:09 AM Kevin Broch via slurm-users
> > <[email protected]> wrote:
> >
> >     Heavyweight solution (although if you have grafana and prometheus
> >     going already a little less so):
> >     https://github.com/rivosinc/prometheus-slurm-exporter
> >
> >     On Tue, Aug 20, 2024 at 12:40 AM Simon Andrews via slurm-users
> >     <[email protected]> wrote:
> >
> >         Possibly a bit more elaborate than you want but I wrote a web
> >         based monitoring system for our cluster.  It mostly uses standard
> >         slurm commands for job monitoring, but I've also added storage
> >         monitoring which requires a separate cron job to run every night.
> >         It was written for our cluster, but probably wouldn't take much
> >         work to adapt to another cluster with similar structure.
> >
> >         You can see the code and some screenshots at:
> >
> >         https://github.com/s-andrews/capstone_monitor
> >
> >         ..and there's a video walk through at:
> >
> >         https://vimeo.com/982985174
> >
> >         We've also got more friendly scripts for monitoring current and
> >         past jobs on the command line.  These are in a private repository
> >         as some of the other information there is more sensitive but I'm
> >         happy to share those scripts.  You can see the scripts being used
> >         in https://vimeo.com/982986202
> >
> >         Simon.
> >
> >         -----Original Message-----
> >         From: Paul Edmon via slurm-users <[email protected]>
> >         Sent: 09 August 2024 16:12
> >         To: [email protected]
> >         Subject: [slurm-users] Print Slurm Stats on Login
> >
> >         We are working to make our users more aware of their usage. One of
> >         the ideas we came up with was to have some basic usage stats
> >         printed at login (usage over past day, fairshare, job efficiency,
> >         etc). Does anyone have any scripts or methods that they use to do
> >         this? Before baking my own I was curious what other sites do and
> >         if they would be willing to share their scripts and methodology.
> >
> >         -Paul Edmon-
> >
> >
> >         --
> >         slurm-users mailing list -- [email protected]
> >         <mailto:[email protected]> To unsubscribe send an
> >         email to [email protected]
> >         <mailto:[email protected]>
> >
>
-- 
slurm-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
