[slurm-users] Verbose mode of the 'accel-bind' does not work.

2019-11-26 Thread Uemoto, Tomoki
Hi all,
OS Version: RHEL 7.6
SLURM Version: slurm 18.08.6
I defined the gpu resource as follows:
[test@ohpc137pbsop-c001 ~]$ scontrol show config |grep TaskPlugin
TaskPlugin = task/cgroup
TaskPluginParam = (null type)
[test@ohpc137pbsop-c001 ~]$
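For context, a minimal sketch of the kind of GRES definition and srun invocation the subject line refers to, assuming one NVIDIA GPU per node; node name, device path and binary are purely illustrative:

    # slurm.conf (illustrative)
    GresTypes=gpu
    NodeName=c001 Gres=gpu:1 ...

    # gres.conf on the compute node (illustrative)
    Name=gpu File=/dev/nvidia0

    # request the GPU and ask for verbose accelerator binding ('g' = bind tasks to nearby GPUs, 'v' = verbose)
    srun --gres=gpu:1 --accel-bind=gv ./my_app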

Re: [slurm-users] Filter slurm e-mail notification

2019-11-26 Thread Brian Andrus
I guess you need to decide how to approach it. If you can't educate users on how to use the --mail options appropriately, then you have to assume they will abuse them. In that situation, you need to configure your mail server itself to rate-limit or something similar. That approach depends on the mail

Re: [slurm-users] slurm reporting

2019-11-26 Thread Mark Hahn
> Would Grafana do a similar job as XDMoD? I was wondering whether to pipe up. I work for ComputeCanada, which runs a number of significant clusters. During a major upgrade a few years ago, we looked at XDMoD and decided against it, primarily because we wanted greater flexibility - we have

Re: [slurm-users] slurm reporting

2019-11-26 Thread Renfro, Michael
Once you had added enough to ingest the Slurm logs into Influx or whatever, it could be similar. XDMoD already has the pieces in place to dig through your hierarchy of PIs, users, etc., plus some built-in queries for correlating job size to wait time, for example:

Re: [slurm-users] slurm reporting

2019-11-26 Thread Ricardo Gregorio
Mike, it sounds interesting... In fact I had come across XDMoD this morning while "searching" for further info... Would Grafana do a similar job as XDMoD?

Re: [slurm-users] slurm reporting

2019-11-26 Thread Renfro, Michael
> • Total number of jobs submitted by user (daily/weekly/monthly)
> • Average queue time per user (daily/weekly/monthly)
> • Average job run time per user (daily/weekly/monthly)
Open XDMoD for these three: https://github.com/ubccr/xdmod , plus https://xdmod.ccr.buffalo.edu

[slurm-users] slurm reporting

2019-11-26 Thread Ricardo Gregorio
Hi all, I am new to both HPC and SLURM. I have been trying to run some usage reports (using sreport and sacct), but I cannot find a way to get the following info:
* Total number of jobs submitted by user (daily/weekly/monthly)
* Average queue time per user (daily/weekly/monthly)
* Average job run time per user (daily/weekly/monthly)
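For reference, a rough sketch of how these figures can be pulled out of sacct from the shell, assuming a date range and a user name of 'alice' (both purely illustrative); queue time per job is Start minus Submit, and the averaging is left to whatever post-processing you prefer:

    # number of jobs submitted by a user in a date range (allocations only, no header)
    sacct -u alice -S 2019-11-01 -E 2019-11-30 -X -n -o JobID | wc -l

    # submit, start and elapsed times for the same jobs, pipe-separated for easy parsing
    sacct -u alice -S 2019-11-01 -E 2019-11-30 -X -n --parsable2 -o JobID,Submit,Start,Elapsed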

[slurm-users] scontrol not removing DRAINED

2019-11-26 Thread Rick Van Conant
If a node is marked as DOWN after it has been DRAINED, why is the node still showing DRAINED instead of DOWN? Rick Van Conant Systems Administrator SCD/SCF/HPC Fermi National Accelerator Laboratory 630-840-8747 office www.fnal.gov
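For reference, a few illustrative commands for checking and changing a node's state (the node name is a placeholder):

    # show the node's current state and the recorded reason
    scontrol show node node001
    sinfo -R -n node001

    # explicitly mark the node down, or return it to service
    scontrol update NodeName=node001 State=DOWN Reason="hardware check"
    scontrol update NodeName=node001 State=RESUME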

Re: [slurm-users] good practices

2019-11-26 Thread Eli V
Inline below On Tue, Nov 26, 2019 at 5:50 AM Loris Bennett wrote: > > Hi Nigella, > > Nigella Sanders writes: > > > Thank you all for such interesting replies. > > > > The --dependency option is quite useful but in practice it has some > > inconveniences. Firstly, all 20 jobs are instantly

Re: [slurm-users] good practices

2019-11-26 Thread Loris Bennett
Hi Nigella, Nigella Sanders writes: > Thank you all for such interesting replies. > > The --dependency option is quite useful but in practice it has some > inconveniences. Firstly, all 20 jobs are instantly queued, which some > users may interpret as abusive use of common resources.

Re: [slurm-users] [External] Re: Filter slurm e-mail notification

2019-11-26 Thread Florian Zillner
Hi, I guess you could use a lua script to filter out flags you don't want. I haven't tried it with mail flags, but I'm using a script like the one referenced to enforce accounts/time limits, etc. https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/ Cheers, Florian
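For reference, the job_submit/lua plugin is enabled in slurm.conf and picks up a script placed next to that file; a minimal sketch, with the usual default paths (adjust to your install):

    # slurm.conf
    JobSubmitPlugins=lua

    # the script itself lives alongside slurm.conf, e.g. /etc/slurm/job_submit.lua;
    # after editing, tell the controller to re-read its configuration
    scontrol reconfigure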

Re: [slurm-users] good practices

2019-11-26 Thread Nigella Sanders
Thank you all for such interesting replies. The --dependency option is quite useful but in practice it has some inconveniences. Firstly, all 20 jobs are *instantly queued*, which some users may interpret as abusive use of common resources. Even worse, if a job fails, the remaining ones will stay
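For reference, one common way to soften the second problem is sketched below, assuming a job script called job.sh and a chain of 20 steps (both illustrative): each step depends on the previous one and is removed automatically if its dependency can never be satisfied.

    # submit 20 chained steps; a failed step takes its dependents out of the queue
    jobid=$(sbatch --parsable job.sh)
    for i in $(seq 2 20); do
        jobid=$(sbatch --parsable --dependency=afterok:${jobid} --kill-on-invalid-dep=yes job.sh)
    done

To avoid having all 20 jobs sitting in the queue at once, another option is to have each job script sbatch the next step as its last action, so only one job is ever queued at a time.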