[slurm-users] Re: Munge log-file fills up the file system to 100%
> AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n"
> is per user.

The ulimit is a frontend to rusage limits, which are per-process restrictions (not per-user). The fs.file-max is the kernel's limit on how many file descriptors can be open in aggregate. You'd have to edit that with sysctl:

   $ sysctl fs.file-max
   fs.file-max = 26161449

Check in e.g. /etc/sysctl.conf or /etc/sysctl.d if you have an alternative limit versus the default.

> But if you have ulimit -n == 1024, then no user should be able to hit
> the fs.file-max limit, even if it is 65536. (Technically, 96 jobs from
> 96 users each trying to open 1024 files would do it, though.)

Naturally, since the ulimit is per-process, equating the core count with the multiplier isn't valid. It also assumes Slurm isn't set up to oversubscribe CPU resources :-)

>> I'm not sure how the number 3092846 got set, since it's not defined in
>> /etc/security/limits.conf. The "ulimit -u" varies quite a bit among
>> our compute nodes, so which dynamic service might affect the limits?

If the 1024 is a soft limit, you may have users who are raising it to arbitrary values themselves, for example. Especially as 1024 is somewhat low for the more naively-written data science Python code I see on our systems.

If Slurm is configured to propagate submission shell ulimits to the runtime environment and you allow submission from a variety of nodes/systems, you could be seeing myriad limits reconstituted on the compute nodes despite the /etc/security/limits.conf settings.

The main question needing an answer is _what_ process(es) are opening all the files on the systems that are faltering. It's very likely to be the user jobs opening all of them; I was just hoping to also rule out any bug in munged. Since you're upgrading munged, you'll now get the errno associated with the backlog and can confirm EMFILE vs. ENFILE vs. ENOMEM.
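For anyone wanting to compare the two limits side by side, a quick generic-Linux sketch (the sysctl.d path and the 131072 value in the comment are just example choices, not a recommendation):

```shell
# Per-process limits -- what "ulimit -n" reports; inherited across fork/exec
ulimit -Sn    # soft limit
ulimit -Hn    # hard limit

# Node-wide aggregate limit and current usage
cat /proc/sys/fs/file-max
awk '{ printf "allocated: %s  max: %s\n", $1, $3 }' /proc/sys/fs/file-nr

# To persist a higher node-wide limit across reboots (example path/value):
#   echo 'fs.file-max = 131072' > /etc/sysctl.d/90-file-max.conf
#   sysctl --system
```

The /proc/sys/fs/file-nr fields are: allocated handles, unused handles, and the file-max ceiling.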
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
[slurm-users] Re: Munge log-file fills up the file system to 100%
https://github.com/dun/munge/issues/94

The NEWS file claims this was fixed in 0.5.15. Since your log doesn't show the additional strerror() output, you're definitely running an older version, correct?

If you go onto one of the affected nodes and do an `lsof -p <munged pid>`, I'm betting you'll find a long list of open file descriptors -- that would explain the "Too many open files" situation _and_ indicate that this is something other than external memory pressure or open file limits on the process.

> On Apr 15, 2024, at 08:14, Ole Holm Nielsen via slurm-users wrote:
>
> We have some new AMD EPYC compute nodes with 96 cores/node running
> RockyLinux 8.9. We've had a number of incidents where the Munge log-file
> /var/log/munge/munged.log suddenly fills up the root file system, after a
> while to 100% (tens of GBs), and the node eventually comes to a grinding
> halt! Wiping munged.log and restarting the node works around the issue.
>
> I've tried to track down the symptoms and this is what I found:
>
> 1. In munged.log there are infinitely many lines filling up the disk:
>
>    2024-04-11 09:59:29 +0200 Info: Suspended new connections while
>    processing backlog
>
> 2. The slurmd is not getting any responses from munged, even though we run
> "munged --num-threads 10". The slurmd.log displays errors like:
>
>    [2024-04-12T02:05:45.001] error: If munged is up, restart with
>    --num-threads=10
>    [2024-04-12T02:05:45.001] error: Munge encode failed: Failed to connect
>    to "/var/run/munge/munge.socket.2": Resource temporarily unavailable
>    [2024-04-12T02:05:45.001] error: slurm_buffers_pack_msg: auth_g_create:
>    RESPONSE_ACCT_GATHER_UPDATE has authentication error
>
> 3. The /var/log/messages displays the errors from slurmd as well as
> NetworkManager saying "Too many open files in system".
> The telltale syslog entry seems to be:
>
>    Apr 12 02:05:48 e009 kernel: VFS: file-max limit 65536 reached
>
> where the limit is confirmed in /proc/sys/fs/file-max.
>
> We have never before seen any such errors from Munge. The error may perhaps
> be triggered by certain user codes (possibly star-ccm+) that might be
> opening a lot more files on the 96-core nodes than on nodes with a lower
> core count.
>
> My workaround has been to edit the line in /etc/sysctl.conf:
>
>    fs.file-max = 131072
>
> and update settings by "sysctl -p". We haven't seen any of the Munge errors
> since!
>
> The version of Munge in RockyLinux 8.9 is 0.5.13, but there is a newer
> version in https://github.com/dun/munge/releases/tag/munge-0.5.16
> I can't figure out if 0.5.16 has a fix for the issue seen here?
>
> Questions: Have other sites seen the present Munge issue as well? Are there
> any good recommendations for setting the fs.file-max parameter on Slurm
> compute nodes?
>
> Thanks for sharing your insights,
> Ole
>
> --
> Ole Holm Nielsen
> PhD, Senior HPC Officer
> Department of Physics, Technical University of Denmark
[slurm-users] Re: Restricting local disk storage of jobs
The native job_container/tmpfs would certainly have access to the job record, so modification to it (or a forked variant) would be possible. A SPANK plugin should be able to fetch the full job record [1] and is then able to inspect the "gres" list (as a C string), which means I could modify UD's auto_tmpdir accordingly.

Having a compiled plugin executing xfs_quota to effect the commands illustrated wouldn't be a great idea -- luckily Linux XFS has an API. Seemingly not the simplest one, but xfsprogs is a working example.

[1] https://gitlab.hpc.cineca.it/dcesari1/slurm-msrsafe

> On Feb 7, 2024, at 05:25, Tim Schneider via slurm-users wrote:
>
> Hey Jeffrey,
>
> thanks for this suggestion! This is probably the way to go if one can find
> a way to access GRES in the prolog. I read somewhere that people were
> calling scontrol to get this information, but this seems a bit unclean.
> Anyway, if I find some time I will try it out.
>
> Best,
> Tim
>
> On 2/6/24 16:30, Jeffrey T Frey wrote:
>> Most of my ideas have revolved around creating file systems on-the-fly as
>> part of the job prolog and destroying them in the epilog. The issue with
>> that mechanism is that formatting a file system (e.g. mkfs.<fs-type>) can
>> be time-consuming. E.g. formatting your local scratch SSD as an LVM PV+VG
>> and allocating per-job volumes, you'd still need to run e.g. mkfs.xfs and
>> mount the new file system.
>>
>> ZFS file system creation is much quicker (basically combines the LVM +
>> mkfs steps above) but I don't know of any clusters using ZFS to manage
>> local file systems on the compute nodes :-)
>>
>> One could leverage XFS project quotas. E.g. for Slurm job 2147483647:
>>
>>    [root@r00n00 /]# mkdir /tmp-alloc/slurm-2147483647
>>    [root@r00n00 /]# xfs_quota -x -c 'project -s -p /tmp-alloc/slurm-2147483647 2147483647' /tmp-alloc
>>    Setting up project 2147483647 (path /tmp-alloc/slurm-2147483647)...
>>    Processed 1 (/etc/projects and cmdline) paths for project 2147483647
>>    with recursion depth infinite (-1).
>>    [root@r00n00 /]# xfs_quota -x -c 'limit -p bhard=1g 2147483647' /tmp-alloc
>>    [root@r00n00 /]# cd /tmp-alloc/slurm-2147483647
>>    [root@r00n00 slurm-2147483647]# dd if=/dev/zero of=zeroes bs=5M count=1000
>>    dd: error writing 'zeroes': No space left on device
>>    205+0 records in
>>    204+0 records out
>>    1073741824 bytes (1.1 GB) copied, 2.92232 s, 367 MB/s
>>      :
>>    [root@r00n00 /]# rm -rf /tmp-alloc/slurm-2147483647
>>    [root@r00n00 /]# xfs_quota -x -c 'limit -p bhard=0 2147483647' /tmp-alloc
>>
>> Since Slurm jobids max out at 0x03FFFFFF (and 2147483647 = 0x7FFFFFFF) we
>> have an easy on-demand project id to use on the file system. Slurm tmpfs
>> plugins have to do a mkdir to create the per-job directory; adding two
>> xfs_quota commands (which run in more or less O(1) time) won't extend the
>> prolog by much. Likewise, Slurm tmpfs plugins have to scrub the directory
>> at job cleanup, so adding another xfs_quota command will not do much to
>> change their epilog execution times. The main question is "where does the
>> tmpfs plugin find the quota limit for the job?"
>>
>>> On Feb 6, 2024, at 08:39, Tim Schneider via slurm-users wrote:
>>>
>>> Hi,
>>>
>>> In our SLURM cluster, we are using the job_container/tmpfs plugin to
>>> ensure that each user can use /tmp and it gets cleaned up after them.
>>> Currently, we are mapping /tmp into the node's RAM, which means that the
>>> cgroups make sure that users can only use a certain amount of storage
>>> inside /tmp.
>>>
>>> Now we would like to use the node's local SSD instead of its RAM to hold
>>> the files in /tmp. I have seen people define local storage as GRES, but
>>> I am wondering how to make sure that users do not exceed the storage
>>> space they requested in a job. Does anyone have an idea how to configure
>>> local storage as a proper tracked resource?
>>>
>>> Thanks a lot in advance!
>>>
>>> Best,
>>>
>>> Tim
[slurm-users] Re: Restricting local disk storage of jobs
Most of my ideas have revolved around creating file systems on-the-fly as part of the job prolog and destroying them in the epilog. The issue with that mechanism is that formatting a file system (e.g. mkfs.<fs-type>) can be time-consuming. E.g. formatting your local scratch SSD as an LVM PV+VG and allocating per-job volumes, you'd still need to run e.g. mkfs.xfs and mount the new file system.

ZFS file system creation is much quicker (basically combines the LVM + mkfs steps above) but I don't know of any clusters using ZFS to manage local file systems on the compute nodes :-)

One could leverage XFS project quotas. E.g. for Slurm job 2147483647:

   [root@r00n00 /]# mkdir /tmp-alloc/slurm-2147483647
   [root@r00n00 /]# xfs_quota -x -c 'project -s -p /tmp-alloc/slurm-2147483647 2147483647' /tmp-alloc
   Setting up project 2147483647 (path /tmp-alloc/slurm-2147483647)...
   Processed 1 (/etc/projects and cmdline) paths for project 2147483647 with
   recursion depth infinite (-1).
   [root@r00n00 /]# xfs_quota -x -c 'limit -p bhard=1g 2147483647' /tmp-alloc
   [root@r00n00 /]# cd /tmp-alloc/slurm-2147483647
   [root@r00n00 slurm-2147483647]# dd if=/dev/zero of=zeroes bs=5M count=1000
   dd: error writing 'zeroes': No space left on device
   205+0 records in
   204+0 records out
   1073741824 bytes (1.1 GB) copied, 2.92232 s, 367 MB/s
     :
   [root@r00n00 /]# rm -rf /tmp-alloc/slurm-2147483647
   [root@r00n00 /]# xfs_quota -x -c 'limit -p bhard=0 2147483647' /tmp-alloc

Since Slurm jobids max out at 0x03FFFFFF (and 2147483647 = 0x7FFFFFFF) we have an easy on-demand project id to use on the file system. Slurm tmpfs plugins have to do a mkdir to create the per-job directory; adding two xfs_quota commands (which run in more or less O(1) time) won't extend the prolog by much. Likewise, Slurm tmpfs plugins have to scrub the directory at job cleanup, so adding another xfs_quota command will not do much to change their epilog execution times. The main question is "where does the tmpfs plugin find the quota limit for the job?"
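Stitched together as a prolog, the commands above might look like this sketch. The /tmp-alloc mount point, the fixed bhard=1g limit, and the use of $SLURM_JOB_ID as the project id are all assumptions for illustration -- a real tmpfs plugin would pull the limit from the job record:

```shell
#!/bin/bash
# Hypothetical prolog sketch: per-job scratch directory capped by an XFS
# project quota.  Assumes /tmp-alloc is an XFS mount with project quotas
# (pquota mount option) enabled; the 1g hard limit is a stand-in for a
# per-job value obtained elsewhere.
: "${SLURM_JOB_ID:=2147483647}"   # provided by slurmd in a real prolog
dir="/tmp-alloc/slurm-${SLURM_JOB_ID}"

if command -v xfs_quota >/dev/null; then
    mkdir -p "$dir"
    xfs_quota -x -c "project -s -p $dir ${SLURM_JOB_ID}" /tmp-alloc
    xfs_quota -x -c "limit -p bhard=1g ${SLURM_JOB_ID}" /tmp-alloc
fi

# ...and in the epilog, the reverse:
#   rm -rf "/tmp-alloc/slurm-${SLURM_JOB_ID}"
#   xfs_quota -x -c "limit -p bhard=0 ${SLURM_JOB_ID}" /tmp-alloc
```

Resetting bhard=0 in the epilog releases the project limit so the id can be reused by a future job with the same number.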
> On Feb 6, 2024, at 08:39, Tim Schneider via slurm-users wrote:
>
> Hi,
>
> In our SLURM cluster, we are using the job_container/tmpfs plugin to ensure
> that each user can use /tmp and it gets cleaned up after them. Currently,
> we are mapping /tmp into the node's RAM, which means that the cgroups make
> sure that users can only use a certain amount of storage inside /tmp.
>
> Now we would like to use the node's local SSD instead of its RAM to hold
> the files in /tmp. I have seen people define local storage as GRES, but I
> am wondering how to make sure that users do not exceed the storage space
> they requested in a job. Does anyone have an idea how to configure local
> storage as a proper tracked resource?
>
> Thanks a lot in advance!
>
> Best,
>
> Tim
Re: [slurm-users] Fairshare: Penalising unused memory rather than used memory?
> On the automation part, it would be pretty easy to do regular (daily?)
> stats of jobs for that period of time and dump them into an SQL database.
> Then a select statement where cpu_efficiency is less than the desired
> value gets you the list of not-so-nice users, on which you can apply
> whatever warnings/limits you want.

Assuming the existence of collection and collation of the penalty data (e.g. in an SQL database), you could consider using a site factor plugin to lower scheduling priority without the need to alter an association in the accounting database.

https://slurm.schedmd.com/site_factor.html
Re: [slurm-users] Notify users about job submit plugin actions
In case you're developing the plugin in C and not Lua: behind the scenes, the Lua mechanism concatenates all log_user() strings into a single variable (user_msg). When the Lua code completes, the C code sets the *err_msg argument to the job_submit()/job_modify() function to that string, then NULLs-out user_msg. (There's a mutex around all of that code so slurmctld never executes Lua job submit/modify scripts concurrently.) The slurmctld then communicates that returned string back to sbatch/salloc/srun for display to the user.

Your C plugin would do likewise -- set *err_msg before returning from job_submit()/job_modify() -- and needn't be mutex'ed if the code is reentrant.

> On Jul 19, 2023, at 08:37, Angel de Vicente wrote:
>
> Hello Lorenzo,
>
> Lorenzo Bosio writes:
>
>> I'm developing a job submit plugin to check if some conditions are met
>> before a job runs.
>> I'd need a way to notify the user about the plugin actions (i.e. why its
>> job was killed and what to do), but after a lot of research I could only
>> write to logs and not the user shell.
>> The user gets the output of slurm_kill_job but I can't find a way to add
>> a custom note.
>>
>> Can anyone point me to the right api/function in the code?
>
> In our "job_submit.lua" script we have the following for that purpose:
>
> ,----
> | slurm.log_user("%s: WARNING: [...]", log_prefix)
> `----
>
> --
> Ángel de Vicente
> Research Software Engineer (Supercomputing and BigData)
> Tel.: +34 922-605-747
> Web.: http://research.iac.es/proyecto/polmag/
>
> GPG: 0x8BDC390B69033F52
Re: [slurm-users] How do I set SBATCH_EXCLUSIVE to its default value?
> I get that these correspond
>
>    --exclusive=user    export SBATCH_EXCLUSIVE=user
>    --exclusive=mcs     export SBATCH_EXCLUSIVE=mcs
>
> But --exclusive has a default behavior if I don't assign it a value. What
> do I set SBATCH_EXCLUSIVE to, to get the same default behavior?

Try setting the env var to an empty string:

   export SBATCH_EXCLUSIVE=""
Re: [slurm-users] slurm and singularity
You may need srun to allocate a pty for the command. The InteractiveStepOptions we use (that are handed to srun when no explicit command is given to salloc) are:

   --interactive --pty --export=TERM

E.g. without those flags a bare srun gives a promptless session:

   [(it_nss:frey)@login00.darwin ~]$ salloc -p idle srun /opt/shared/singularity/3.10.0/bin/singularity shell /opt/shared/singularity/prebuilt/postgresql/13.2.simg
   salloc: Granted job allocation 3953722
   salloc: Waiting for resource configuration
   salloc: Nodes r1n00 are ready for job
   ls -l
   total 437343
   -rw-r--r-- 1 frey it_nss   180419 Oct 26 16:56 amd.cache
   -rw-r--r-- 1 frey it_nss       72 Oct 26 16:52 amd.conf
   -rw-r--r-- 1 frey everyone    715 Nov 12 23:39 anaconda-activate.sh
   drwxr-xr-x 2 frey everyone      4 Apr 11  2022 bin
     :

With the --pty flag added:

   [(it_nss:frey)@login00.darwin ~]$ salloc -p idle srun --pty /opt/shared/singularity/3.10.0/bin/singularity shell /opt/shared/singularity/prebuilt/postgresql/13.2.simg
   salloc: Granted job allocation 3953723
   salloc: Waiting for resource configuration
   salloc: Nodes r1n00 are ready for job
   Singularity>

> On Feb 8, 2023, at 09:47, Groner, Rob wrote:
>
> I tried that, and it says the nodes have been allocated, but it never
> comes to an apptainer prompt.
>
> I then tried doing them in separate steps. Doing salloc works, I get a
> prompt on the node that was allocated. I can then run "singularity shell
> <image>" and get the apptainer prompt. If I prefix that command with
> "srun", then it just hangs and I never get the prompt. So that seems to be
> the sticking point. I'll have to do some experiments running singularity
> with srun.
>
> From: slurm-users on behalf of Jeffrey T Frey
> Sent: Tuesday, February 7, 2023 6:16 PM
> To: Slurm User Community List
> Subject: Re: [slurm-users] slurm and singularity
>> The remaining issue then is how to put them into an allocation that is
>> actually running a singularity container. I don't get how what I'm doing
>> now is resulting in an allocation where I'm in a container on the submit
>> node still!
>
> Try prefixing the singularity command with "srun", e.g.
>
>    salloc srun /usr/bin/singularity shell <image>
Re: [slurm-users] slurm and singularity
> The remaining issue then is how to put them into an allocation that is
> actually running a singularity container. I don't get how what I'm doing
> now is resulting in an allocation where I'm in a container on the submit
> node still!

Try prefixing the singularity command with "srun", e.g.

   salloc srun /usr/bin/singularity shell <image>
Re: [slurm-users] Why every job will sleep 100000000
If you examine the process hierarchy, that "sleep 100000000" process is probably the child of a "slurmstepd: [<jobid>.extern]" process. This is a housekeeping step launched for the job by slurmd -- in older Slurm releases it would handle the X11 forwarding, for example. It should have no impact on the other steps of the job.

> On Nov 4, 2022, at 05:26, GHui wrote:
>
> I found a sleep process running by root, when I submit a job. And it
> sleeps 100000000 seconds.
> Sometimes, my job is hung up. The job state is "R". Though it runs
> nothing, the jobscript is like the following,
> --
> #!/bin/bash
> #SBATCH -J sub
> #SBATCH -N 1
> #SBATCH -n 1
> #SBATCH -p vpartition
> --
>
> Is it because of the "sleep 100000000" process? Or how could I debug it?
>
> Any help will be appreciated.
> --GHui
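To confirm the parentage yourself, something like the following sketch works on any Linux node (the sleep duration in the pattern is the one from the report; on a compute node the parent should show up as slurmstepd):

```shell
# Find the long-running sleep and show its parent's command name;
# for the extern step the parent is a slurmstepd process.
sleep_pid=$(pgrep -f 'sleep 100000000' | head -n 1)
if [ -n "$sleep_pid" ]; then
    parent_pid=$(ps -o ppid= -p "$sleep_pid" | tr -d ' ')
    ps -o pid=,comm= -p "$parent_pid"
fi
```

If the parent is slurmstepd, the sleep is harmless bookkeeping; a hung job would need to be debugged elsewhere (e.g. slurmd logs on the node).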
Re: [slurm-users] sacct output in tabular form
You've confirmed my suspicion -- no one seems to care for Slurm's standard output formats :-) At UD we did a Python curses wrapper around the parseable output to turn the terminal window into a navigable spreadsheet of output:

https://gitlab.com/udel-itrci/slurm-output-wrappers

> On Aug 25, 2021, at 01:41, Sternberger, Sven wrote:
>
> Hello!
>
> this is a simple wrapper for sacct which prints the
> output from sacct as table. So you can make a
> "sacctml -j foo --long" even without two 8k displays ;-)
>
> cheers
Re: [slurm-users] Bug: incorrect output directory fails silently
> I understand that there is no output file to write an error message to,
> but it might be good to check the `--output` path during the scheduling,
> just like `--account` is checked.
>
> Does anybody know a workaround to be warned about the error?

I would make a feature request of SchedMD to fix the issue, then I would write a cli_filter plugin to validate the --output/--error/--input paths as desired until Slurm itself handles it.
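Until something like that exists, a site could interpose a thin shell wrapper in front of sbatch; this is a minimal sketch of the check itself (the function name and the wrapper idea are mine, not a Slurm feature):

```shell
# check_output_dir PATH -- fail early if the directory portion of a
# --output/--error path does not exist (sbatch would otherwise accept
# the job and the output file would silently never appear).
check_output_dir() {
    dir=$(dirname "$1")
    if [ ! -d "$dir" ]; then
        echo "warning: directory $dir for '$1' does not exist" >&2
        return 1
    fi
}

# In a wrapper script, after extracting the --output argument:
#   check_output_dir "$output_path" || exit 1
#   exec sbatch "$@"
```

A cli_filter (or job_submit) plugin is still the cleaner long-term answer, since it also covers users who bypass the wrapper.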
Re: [slurm-users] squeue: compact pending job-array in one partition, but not in other
Did those four jobs

   6577272_21 scavenger PD 0:00 1 (Priority)
   6577272_22 scavenger PD 0:00 1 (Priority)
   6577272_23 scavenger PD 0:00 1 (Priority)
   6577272_28 scavenger PD 0:00 1 (Priority)

run before and get requeued? Seems likely with a partition named "scavenger."

> On Feb 23, 2021, at 13:59, Loris Bennett wrote:
>
> Hi,
>
> Does anyone have an idea why pending elements of an array job in one
> partition should be displayed compactly by 'squeue' but those of another
> in a different partition are displayed one element per line? Please see
> below (compact display in 'main', one element per line in 'scavenger').
> This is with version 20.02.6
>
> Cheers,
>
> Loris
>
>              JOBID PARTITION ST TIME NODES NODELIST(REASON)
> ...
>            6755576      main PD 0:00     1 (Priority)
> 6749327_[754-1000]      main PD 0:00     1 (Priority)
>            6748246      main PD 0:00     1 (Priority)
>            6749213      main PD 0:00     1 (Priority)
>            6749309      main PD 0:00     1 (Priority)
>            6750124      main PD 0:00     1 (Priority)
>            6752967      main PD 0:00     1 (Priority)
>            6746767      main PD 0:00     1 (Priority)
>            6755188      main PD 0:00     1 (Priority)
>       6702557_[13]      main PD 0:00     4 (Priority)
>     6702858_[1-10]      main PD 0:00     4 (Priority)
>  6703700_[1-4,6-8]      main PD 0:00     4 (Priority)
>        6703764_[1]      main PD 0:00     4 (Priority)
>   6705324_[1,3,5,9]     main PD 0:00     4 (Priority)
>            6748962      main PD 0:00     4 (Priority)
>            6709963      main PD 0:00     1 (Priority)
>            6709964      main PD 0:00     1 (Priority)
>            6709976      main PD 0:00     1 (Priority)
> 6462709_[1-77,79-8      main PD 0:00     1 (QOSMaxCpuPerUserLimit)
> 6463366_[1-2,28-72      main PD 0:00     1 (QOSMaxCpuPerUserLimit)
>         6577272_21 scavenger PD 0:00     1 (Priority)
>         6577272_22 scavenger PD 0:00     1 (Priority)
>         6577272_23 scavenger PD 0:00     1 (Priority)
>         6577272_28 scavenger PD 0:00     1 (Priority)
>
> --
> Dr. Loris Bennett (Hr./Mr.)
> ZEDAT, Freie Universität Berlin   Email loris.benn...@fu-berlin.de
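One way to test the requeue hypothesis from the accounting records -- sacct's -D/--duplicates flag shows every recorded run of a requeued job rather than just the latest one (the job id is the one from the listing above):

```shell
# Show all recorded runs of the array job, including requeued attempts.
# sacct only exists on a node with Slurm installed, hence the guard.
if command -v sacct >/dev/null; then
    sacct -j 6577272 --duplicates --format=JobID,State,Start,End,NodeList
fi
```

Multiple rows per task id with earlier Start/End times would confirm the tasks ran and were requeued.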
Re: [slurm-users] Exclude Slurm packages from the EPEL yum repository
> ...I would say having SLURM rpms in EPEL could be very helpful for a lot
> of people.
>
> I get that this took you by surprise, but that's not a reason to not have
> them in the repository. I, for one, will happily test if they work for me,
> and if they do, that means that I can stop having to build them. I agree
> it's not hard to do, but if I don't have to do it I'll be very happy about
> that.

There have been plenty of arguments for why having them in EPEL isn't necessarily the best option. Many open source products (e.g. Postgres, Docker) maintain their own YUM repository online -- probably to exercise greater control over what's published, but also to avoid overlap with mainstream package repositories.

If there is value perceived in having pre-built packages available, then perhaps the best solution for all parties is to publish the packages to a unique repository: those who want the pre-built packages explicitly configure their YUM to pull from that repository; those who have EPEL configured (which is a LOT of us) don't get overlapping Slurm packages interfering with their local builds.

::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V & Cluster Management
IT Research Cyberinfrastructure & College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::
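In the meantime, sites that build their own RPMs can simply mask the EPEL ones; a sketch (the repo id "epel" is the stock one, and the dnf config-manager command requires dnf-plugins-core):

```shell
# The one-line fix: tell YUM/DNF never to offer Slurm packages from EPEL.
# Either run (with dnf-plugins-core installed):
#     dnf config-manager --save --setopt=epel.exclude='slurm*'
# or append this under the [epel] section of /etc/yum.repos.d/epel.repo:
echo 'exclude=slurm*'
```

Either form makes locally-built slurm* packages win regardless of EPEL version numbers.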
[slurm-users] Constraint multiple counts not working
On a cluster running Slurm 17.11.8 (cons_res) I can submit a job that requests e.g. 2 nodes with unique features on each:

   $ sbatch --nodes=2 --ntasks-per-node=1 --constraint="[256GB*1&192GB*1]" …

The job is submitted and runs as expected: on 1 node with feature "256GB" and 1 node with feature "192GB."

A similar job on a cluster running 20.11.1 (cons_res OR cons_tres, tested with both) fails to submit:

   sbatch: error: Batch job submission failed: Requested node configuration is not available

I enabled debug5 output with NodeFeatures:

   [2020-12-16T08:53:19.024] debug: JobId=118 feature list: [512GB*1&768GB*1]
   [2020-12-16T08:53:19.025] NODE_FEATURES: _log_feature_nodes: FEAT:512GB COUNT:1 PAREN:0 OP:XAND ACTIVE:r1n[00-47] AVAIL:r1n[00-47]
   [2020-12-16T08:53:19.025] NODE_FEATURES: _log_feature_nodes: FEAT:768GB COUNT:1 PAREN:0 OP:END ACTIVE:r2l[00-31] AVAIL:r2l[00-31]
   [2020-12-16T08:53:19.025] NODE_FEATURES: valid_feature_counts: feature:512GB feature_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] work_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] tmp_bitmap:r1n[00-47] count:1
   [2020-12-16T08:53:19.025] NODE_FEATURES: valid_feature_counts: feature:768GB feature_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] work_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] tmp_bitmap:r2l[00-31] count:1
   [2020-12-16T08:53:19.025] NODE_FEATURES: valid_feature_counts: NODES:r1n[00-47],r2l[00-31],r2x[00-10] HAS_XOR:T status:No error
   [2020-12-16T08:53:19.025] select/cons_tres: _job_test: SELECT_TYPE: test 0 pass: test_only
   [2020-12-16T08:53:19.026] debug2: job_allocate: setting JobId=118_* to "BadConstraints" due to a flaw in the job request (Requested node configuration is not available)
   [2020-12-16T08:53:19.026] _slurm_rpc_submit_batch_job: Requested node configuration is not available

My syntax agrees with the 20.11.1 documentation (online and man pages) so it seems correct -- and it works fine in 17.11.8. Any ideas?
Re: [slurm-users] Slurm versions 20.11.1 is now available
It's in the github commits:

https://github.com/SchedMD/slurm/commit/8e84db0f01ecd4c977c12581615d74d59b3ff995

The primary issue is that any state the client program established on the connection after first making it (e.g. opening a transaction, creating temp tables) won't be present if MySQL automatically reconnects to the server. So the reconnected state won't match the state expected by the client. Better for the client to know the connection failed and reconnect on its own to reestablish state.

> On Dec 11, 2020, at 10:31, Malte Thoma wrote:
>
> Am 11.12.20 um 14:11 schrieb Michael Di Domenico:
>>> -- Disable MySQL automatic reconnection.
>> can you expand on this? seems an 'odd' thing to disable.
>
> same thoughts here :-)
>
>> On Thu, Dec 10, 2020 at 4:44 PM Tim Wickberg wrote:
>>>
>>> We are pleased to announce the availability of Slurm version 20.11.1.
>>>
>>> This includes a number of fixes made in the month since 20.11 was
>>> initially released, including critical fixes to nss_slurm and the Perl
>>> API when used with the newer configless mode of operation.
>>>
>>> Slurm can be downloaded from https://www.schedmd.com/downloads.php .
>>>
>>> - Tim
>>>
>>> --
>>> Tim Wickberg
>>> Chief Technology Officer, SchedMD LLC
>>> Commercial Slurm Development and Support
>>>
>>> * Changes in Slurm 20.11.1
>>> ==========================
>>> -- Fix spelling of "overcomited" to "overcomitted" in sreport's cluster
>>>    utilization report.
>>> -- Silence debug message about shutting down backup controllers if none
>>>    are configured.
>>> -- Don't create interactive srun until PrologSlurmctld is done.
>>> -- Fix fd symlink path resolution.
>>> -- Fix slurmctld segfault on subnode reservation restore after node
>>>    configuration change.
>>> -- Fix resource allocation response message environment allocation size.
>>> -- Ensure that details->env_sup is NULL terminated.
>>> -- select/cray_aries - Correctly remove jobs/steps from blades using NPC.
>>> -- cons_tres - Avoid max_node_gres when entire node is allocated with
>>>    --ntasks-per-gpu.
>>> -- Allow NULL arg to data_get_type().
>>> -- In sreport have usage for a reservation contain all jobs that ran in
>>>    the reservation instead of just the ones that ran in the time
>>>    specified. This matches the report for the reservation is not
>>>    truncated for a time period.
>>> -- Fix issue with sending wrong batch step id to a < 20.11 slurmd.
>>> -- Add a job's alloc_node to lua for job modification and completion.
>>> -- Fix regression getting a slurmdbd connection through the perl API.
>>> -- Stop the extern step terminate monitor right after proctrack_g_wait().
>>> -- Fix removing the normalized priority of assocs.
>>> -- slurmrestd/v0.0.36 - Use correct name for partition field:
>>>    "min nodes per job" -> "min_nodes_per_job".
>>> -- slurmrestd/v0.0.36 - Add node comment field.
>>> -- Fix regression marking cloud nodes as "unexpectedly rebooted" after
>>>    multiple boots.
>>> -- Fix slurmctld segfault in _slurm_rpc_job_step_create().
>>> -- slurmrestd/v0.0.36 - Filter node states against NODE_STATE_BASE to
>>>    avoid the extended states all being reported as "invalid".
>>> -- Fix race that can prevent the prolog for a requeued job from running.
>>> -- cli_filter - add "type" to readily distinguish between the CLI command
>>>    in use.
>>> -- smail - reduce sleep before seff to 5 seconds.
>>> -- Ensure SPANK prolog and epilog run without an explicit
>>>    PlugStackConfig.
>>> -- Disable MySQL automatic reconnection.
>>> -- Fix allowing "b" after memory unit suffixes.
>>> -- Fix slurmctld segfault with reservations without licenses.
>>> -- Due to internal restructuring ahead of the 20.11 release, applications
>>>    calling libslurm MUST call slurm_init(NULL) before any API calls.
>>>    Otherwise the API call is likely to fail due to libslurm's internal
>>>    configuration not being available.
>>> -- slurm.spec - allow custom paths for PMIx and UCX install locations.
>>> -- Use rpath if enabled when testing for Mellanox's UCX libraries.
>>> -- slurmrestd/dbv0.0.36 - Change user query for associations to optional.
>>> -- slurmrestd/dbv0.0.36 - Change account query for associations to
>>>    optional.
>>> -- mpi/pmix - change the error handler error message to be more useful.
>>> -- Add missing connection in acct_storage_p_{clear_stats, reconfig,
>>>    shutdown}.
>>> -- Perl API - fix issue when running in configless mode.
>>> -- nss_slurm - avoid deadlock when stray sockets are found.
>>> -- Display correct value for ScronParameters in 'scontrol show config'.
>
> --
> Malte Thoma    Tel. +49-471-4831-1828
> HSM Documentation: https://spaces.awi.de/x/YF3-Eg (User)
>                    https://spaces.awi.de/x/oYD8B (Admin)
> HPC Documentation:
Re: [slurm-users] Heterogeneous GPU Node MPS
From the NVIDIA docs re: MPS:

   On systems with a mix of Volta / pre-Volta GPUs, if the MPS server is
   set to enumerate any Volta GPU, it will discard all pre-Volta GPUs. In
   other words, the MPS server will either operate only on the Volta GPUs
   and expose Volta capabilities, or operate only on pre-Volta GPUs.

I'd be curious what happens if you change the ordering (RTX then V100) in the gres.conf -- would the RTX cards work with MPS while the V100 would not?

> On Nov 13, 2020, at 07:23, Holger Badorreck wrote:
>
> Hello,
>
> I have a heterogeneous GPU node with one V100 and two RTX cards. When I
> request resources with --gres=mps:100, always the V100 is chosen, and jobs
> are waiting if the V100 is completely allocated, while RTX cards are free.
> If I use --gres=gpu:1, the RTX cards are also used. Is something wrong
> with the configuration or is it another problem?
>
> The node configuration in slurm.conf:
>
>    NodeName=node1 CPUs=48 RealMemory=128530 Sockets=1 CoresPerSocket=24 ThreadsPerCore=2 Gres=gpu:v100:1,gpu:rtx:2,mps:600 State=UNKNOWN
>
> gres.conf:
>
>    Name=gpu Type=v100 File=/dev/nvidia0
>    Name=gpu Type=rtx File=/dev/nvidia1
>    Name=gpu Type=rtx File=/dev/nvidia2
>    Name=mps Count=200 File=/dev/nvidia0
>    Name=mps Count=200 File=/dev/nvidia1
>    Name=mps Count=200 File=/dev/nvidia2
>
> Best regards,
> Holger
Re: [slurm-users] ProfileInfluxDB: Influxdb server with self-signed certificate
Making the certificate globally-available on the host may not always be permissible. If I were you, I'd write/suggest a modification to the plugin to make the CA path (CURLOPT_CAPATH) and verification itself (CURLOPT_SSL_VERIFYPEER) configurable in Slurm. They are both straightforward options in the CURL API (a char* and an int, respectively) that could be set directly from parsed Slurm config options. Many other SSL CURL options would be just as easy (revocation path, etc.).

> On Aug 14, 2020, at 08:55, Stefan Staeglich wrote:
>
> Hi,
>
> everything except /etc/ssl/certs/ca-certificates.crt is ignored. So I've
> copied it to /usr/local/share/ca-certificates/ and ran
> update-ca-certificates.
>
> Now it's working :)
>
> Best,
> Stefan
>
> Am Freitag, 14. August 2020, 11:42:04 CEST schrieb Stefan Staeglich:
>> Hi,
>>
>> I try to set up the acct_gather plugin ProfileInfluxDB. Unfortunately our
>> influxdb server has a self-signed certificate only:
>>
>>    [2020-08-14T09:54:30.007] [46.0] error: acct_gather_profile/influxdb
>>    _send_data: curl_easy_perform failed to send data (discarded). Reason:
>>    SSL peer certificate or SSH remote key was not OK
>>
>> I've copied the certificate to /etc/ssl/certs/ but this doesn't help. But
>> this command is working:
>>
>>    curl 'https://influxdb-server.privat:8086' --cacert /etc/ssl/certs/influxdb.crt
>>
>> Has someone a solution for this issue?
>>
>> Best,
>> Stefan
>
> --
> Stefan Stäglich, Universität Freiburg, Institut für Informatik
> Georges-Köhler-Allee, Geb.74, 79110 Freiburg, Germany
>
> E-Mail : staeg...@informatik.uni-freiburg.de
> WWW    : ml.informatik.uni-freiburg.de
> Telefon: +49 761 203-54216
> Fax    : +49 761 203-74217
Re: [slurm-users] slurm array with non-numeric index values
On our HPC systems we have a lot of users attempting to organize job arrays for varying purposes -- parameter scans, SSMD (Single-Script, Multiple Datasets). We eventually wrote an abstract utility to try to help them with the process: https://github.com/jtfrey/job-templating-tool May be of some use to you. > On Jul 15, 2020, at 16:13 , c b wrote: > > I'm trying to run an embarrassingly parallel experiment, with 500+ tasks that > all differ in one parameter. e.g.: > > job 1 - script.py foo > job 2 - script.py bar > job 3 - script.py baz > and so on. > > This seems like a case where having a slurm array hold all of these jobs > would help, so I could just submit one job to my cluster instead of 500 > individual jobs. It seems like sarray is only set up for varying an integer > index parameter. How would i do this for non-numeric values (say, if the > parameter I'm varying is a string in a given list) ? > >
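For what it's worth, the usual trick without any external tooling is to let the integer array index select from a list inside the batch script -- a minimal sketch (the parameter values are from the example above; replace the echo with the real ./script.py invocation):

```shell
#!/bin/bash
#SBATCH --array=0-2
# Map the numeric array index onto an arbitrary string parameter.
PARAMS=(foo bar baz)
echo "running with parameter: ${PARAMS[$SLURM_ARRAY_TASK_ID]}"
```

The same idea works with a parameter file: read the Nth line with something like `sed -n "$((SLURM_ARRAY_TASK_ID + 1))p" params.txt`.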
Re: [slurm-users] Slurm 20.02.3 error: CPUs=1 match no Sockets, Sockets*CoresPerSocket or Sockets*CoresPerSocket*ThreadsPerCore. Resetting CPUs.
If you check the source up on Github, that's more of a warning produced when you didn't specify a CPU count and it's going to calculate from the socket-core-thread numbers (src/common/read_config.c): /* Node boards are factored into sockets */ if ((n->cpus != n->sockets) && (n->cpus != n->sockets * n->cores) && (n->cpus != n->sockets * n->cores * n->threads)) { error("NodeNames=%s CPUs=%d match no Sockets, Sockets*CoresPerSocket or Sockets*CoresPerSocket*ThreadsPerCore. Resetting CPUs.", n->nodenames, n->cpus); n->cpus = n->sockets * n->cores * n->threads; } This behavior is present beginning in 18.x releases; in 17.x and earlier the inferred n->cpus was done quietly. > On Jun 16, 2020, at 04:12 , Ole Holm Nielsen > wrote: > > Today we upgraded the controller node from 19.05 to 20.02.3, and immediately > all Slurm commands (on the controller node) give error messages for all > partitions: > > # sinfo --version > sinfo: error: NodeNames=a[001-140] CPUs=1 match no Sockets, > Sockets*CoresPerSocket or Sockets*CoresPerSocket*ThreadsPerCore. Resetting > CPUs. > (lines deleted) > slurm 20.02.3 > > In slurm.conf we have defined NodeName like: > > NodeName=a[001-140] Weight=10001 Boards=1 SocketsPerBoard=2 CoresPerSocket=4 > ThreadsPerCore=1 ... > > According to the slurm.conf manual the CPUs should then be calculated > automatically: > > "If CPUs is omitted, its default will be set equal to the product of Boards, > Sockets, CoresPerSocket, and ThreadsPerCore." > > Has anyone else seen this error with Slurm 20.02? > > I wonder if there is a problem with specifying SocketsPerBoard in stead of > Sockets? The slurm.conf manual doesn't seem to prefer one over the other. > > I've opened a bug https://bugs.schedmd.com/show_bug.cgi?id=9241 > > Thanks, > Ole > >
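Since Ole wonders about SocketsPerBoard vs. Sockets: one untested thing to try while the bug report is open is the Sockets= spelling of the same topology, which skips the per-board bookkeeping entirely:

```
# untested alternative spelling of the same 2x4x1 topology:
NodeName=a[001-140] Weight=10001 Boards=1 Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 ...
```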
Re: [slurm-users] unable to start slurmd process.
Is the time on that node too far out-of-sync w.r.t. the slurmctld server? > On Jun 11, 2020, at 09:01 , navin srivastava wrote: > > I tried by executing the debug mode but there also it is not writing anything. > > i waited for about 5-10 minutes > > deda1x1452:/etc/sysconfig # /usr/sbin/slurmd -v -v > > No output on terminal. > > The OS is SLES12-SP4 . All firewall services are disabled. > > The recent change is the local hostname earlier it was with local hostname > node1,node2,etc but we have moved to dns based hostname which is deda > > NodeName=node[1-12] NodeHostname=deda1x[1450-1461] NodeAddr=node[1-12] > Sockets=2 CoresPerSocket=10 State=UNKNOWN > other than this it is fine but after that i have done several time slurmd > process started on the node and it works fine but now i am seeing this issue > today. > > Regards > Navin. > > > > > > > > > > On Thu, Jun 11, 2020 at 6:06 PM Riebs, Andy wrote: > Navin, > > > > As you can see, systemd provides very little service-specific information. > For slurm, you really need to go to the slurm logs to find out what happened. > > > > Hint: A quick way to identify problems like this with slurmd and slurmctld is > to run them with the “-Dvvv” option, causing them to log to your window, and > usually causing the problem to become immediately obvious. > > > > For example, > > > > # /usr/local/slurm/sbin/slurmd -D > > > > Just it ^C when you’re done, if necessary. Of course, if it doesn’t fail when > you run it this way, it’s time to look elsewhere. > > > > Andy > > > > From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of > navin srivastava > Sent: Thursday, June 11, 2020 8:25 AM > To: Slurm User Community List > Subject: [slurm-users] unable to start slurmd process. > > > > Hi Team, > > > > when i am trying to start the slurmd process i am getting the below error. > > > > 2020-06-11T13:11:58.652711+02:00 oled3 systemd[1]: Starting Slurm node > daemon... 
> 2020-06-11T13:13:28.683840+02:00 oled3 systemd[1]: slurmd.service: Start > operation timed out. Terminating. > 2020-06-11T13:13:28.684479+02:00 oled3 systemd[1]: Failed to start Slurm node > daemon. > 2020-06-11T13:13:28.684759+02:00 oled3 systemd[1]: slurmd.service: Unit > entered failed state. > 2020-06-11T13:13:28.684917+02:00 oled3 systemd[1]: slurmd.service: Failed > with result 'timeout'. > 2020-06-11T13:15:01.437172+02:00 oled3 cron[8094]: pam_unix(crond:session): > session opened for user root by (uid=0) > > > > Slurm version is 17.11.8 > > > > The server and slurm is running from long time and we have not made any > changes but today when i am starting it is giving this error message. > > Any idea what could be wrong here. > > > > Regards > > Navin. > > > > > > > > >
Re: [slurm-users] ssh-keys on compute nodes?
An MPI library with tight integration with Slurm (e.g. Intel MPI, Open MPI) can use "srun" to start the remote workers. In some cases "srun" can be used directly for MPI startup (e.g. "srun" instead of "mpirun"). Other/older MPI libraries that start remote processes using "ssh" would, naturally, require keyless ssh logins to work across all compute nodes in the cluster. When we provision user accounts on our Slurm cluster we still add .ssh, .ssh/id_rsa (needed for older X11 tunneling via libssh2), and add the public key to .ssh/authorized_keys. All officially-supported MPIs on the cluster are tightly integrated with Slurm. But there are commercial products and older software our clients use that are not, so having keyless access ready for them helps those users get their workflows working more quickly. > On Jun 8, 2020, at 11:16 , Durai Arasan wrote: > > Hi, > > we are setting up a slurm cluster and are at the stage of adding ssh keys of > the users to the nodes. > > I thought it would be sufficient to add the ssh keys of the users to only the > designated login nodes. But I heard that it is also necessary to add them to > the compute nodes as well for slurm to be able to submit jobs of the users > successfully. Apparently this is true especially for MPI jobs. > > So is it true that ssh keys of the users must be added to the > ~/.ssh/authorized_keys of *all* nodes and not just the login nodes? > > Thanks, > Durai >
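Our provisioning step amounts to something like the following (a simplified sketch of the idea, not our actual tooling; it assumes $HOME is shared across all compute nodes so one key pair covers the whole cluster):

```shell
# Sketch: give a user passphrase-less intra-cluster ssh keys.
mkdir -p -m 700 "$HOME/.ssh"
if [ ! -f "$HOME/.ssh/id_rsa" ]; then
    # Generate an RSA pair with no passphrase...
    ssh-keygen -q -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    # ...and authorize it for logins on every node sharing this $HOME.
    cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
fi
```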
Re: [slurm-users] IPv6 for slurmd and slurmctld
Use netstat to list listening ports on the box (netstat -ln) and see if it shows up as tcp6 or tcp. On our (older) 17.11.8 server:

$ netstat -ln | grep :6817
tcp        0      0 0.0.0.0:6817            0.0.0.0:*               LISTEN
$ nc -6 :: 6817
Ncat: Connection refused.
$ nc -4 localhost 6817
^C

> On May 1, 2020, at 12:44 , William Brown wrote: > > For some services that display of 0.0.0.0 does include IPv6, although it is > counter-intuitive. Try to see if you can connect to it using the IPv6 > address. > > William > > On Fri, 1 May 2020 at 16:35, Thomas Schäfer > wrote: > Hi, > > is there an switch, option, environment variable, configurable key word to > enable IP6 for the slurmd and slurmctld daemons? > > tcp LISTEN 0.0.0.0:6818 > > isn't a good choice, were everything else (nfs, ssh, ntp, dns) runs over IPv6. > > Regards, > Thomas
Re: [slurm-users] How to trap a SIGINT signal in a child process of a batch ?
You could also choose to propagate the signal to the child process of test.slurm yourself:

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=1
#SBATCH --time=00:03:00
#SBATCH --signal=B:SIGINT@30
# This example works, but I need it to work without "B:" in --signal
# options, so I want test.sh to receive the SIGINT signal and not test.slurm

sig_handler() {
    echo "BATCH interrupted"
    if [ -n "$child_pid" ]; then
        kill -INT $child_pid
    fi
}

trap 'sig_handler' SIGINT

/home/user/test.sh &
child_pid=$!
wait $child_pid
exit $?

and

#!/bin/bash

function sig_handler() {
    echo "Executable interrupted"
    exit 2
}

trap 'sig_handler' SIGINT

echo "BEGIN"
sleep 200 &
wait
echo "END"

Having the signal handler in test.slurm do an "exit 2" signals the end of the job, so the child processes will be terminated whether they've hit their own signal handlers yet or not. Signaling the child and then returning control in test.slurm to wait for and reap the child's exit code via "exit $?" actually gives the child time to do cleanup and to influence the final exit code of the job. > On Apr 21, 2020, at 06:13 , Bjørn-Helge Mevik wrote: > > Jean-mathieu CHANTREIN writes: > >> But that is not enough, it is also necessary to use srun in >> test.slurm, because the signals are sent to the child processes only >> if they are also children in the JOB sense. > > Good to know! > > -- > Cheers, > Bjørn-Helge Mevik, dr. scient, > Department for Research Computing, University of Oslo
Re: [slurm-users] Problems calling mpirun in OpenMPI-3.1.6 + slurm and OpenMPI-4.0.3+slurm environments
I just reread your post -- you installed Open MPI 4.0.3 to /home/manumachu/openmpi-4.0.3/OPENMPI_INSTALL then set what's probably a different directory -- /scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin -- on your path. So I bet "which mpirun" won't show you what you're expecting :-) > On Apr 10, 2020, at 12:59 , Jeffrey T Frey wrote: > > Are you certain you're PATH addition is correct? The "-np" flag is still > present in a build of Open MPI 4.0.3 I just made, in fact: > > > $ 4.0.3/bin/mpirun > -- > mpirun could not find anything to do. > > It is possible that you forgot to specify how many processes to run > via the "-np" argument. > -- > > > Note that with the Slurm plugins present in your Open MPI build, there should > be no need to use "-np" on the command line; the Slurm RAS plugin should pull > such information from the Slurm runtime environment variables. If you do use > "-np" to request more CPUs that the job was allocated, you'll receive > oversubscription errors (you know, unless you include mpirun flags to allow > that to happen). > > > What if you add "which mpirun" to your job script ahead of the "mpirun" > command -- does it show you > /scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin/mpirun? > > > > >> On Apr 10, 2020, at 12:12 , Ravi Reddy Manumachu >> wrote: >> >> >> Dear Slurm Users, >> >> I am facing issues with the following combinations of OpenMPI and SLURM. I >> was wondering if you have faced something similar and can help me. >> >> OpenMPI-3.1.6 and slurm 19.05.5 >> OpenMPI-4.0.3 and slurm 19.05.5 >> >> I have the OpenMPI packages configured with "--with-slurm" option and >> installed. 
>> >> Configure command line: >> '--prefix=/home/manumachu/openmpi-4.0.3/OPENMPI_INSTALL' '--with-slurm' >> MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v4.0.3) >> MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3) >> MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3) >> MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v4.0.3) >> >> I am executing the sbatch script shown below: >> >> #!/bin/bash >> #SBATCH --account=x >> #SBATCH --job-name=ompi4 >> #SBATCH --output=ompi4.out >> #SBATCH --error=ompi4.err >> #SBATCH --ntasks-per-node=1 >> #SBATCH --time=00:30:00 >> export PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin:$PATH >> export >> LD_LIBRARY_PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/lib:$LD_LIBRARY_PATH >> mpirun -np 4 ./bcast_timing -t 1 >> >> No matter what option I give to mpirun, I get the following error: >> mpirun: Error: unknown option "-np" >> >> I have used mpiexec also but received the same errors. >> >> To summarize, I am not able to call mpirun from a SLURM script. I can use >> srun but I have no idea how to pass MCA parameters I usually give to mpirun >> such as, "--map-by ppr:1:socket -mca pml ob1 -mca btl tcp,self -mca >> coll_tuned_use_dynamic_rules 1". >> >> Thank you for your help. >> >> -- >> Kind Regards >> Dr. Ravi Reddy Manumachu >> Research Fellow, School of Computer Science, University College Dublin >> Ravi Manumachu on Google Scholar, ResearchGate >
Re: [slurm-users] Problems calling mpirun in OpenMPI-3.1.6 + slurm and OpenMPI-4.0.3+slurm environments
Are you certain your PATH addition is correct? The "-np" flag is still present in a build of Open MPI 4.0.3 I just made, in fact:

$ 4.0.3/bin/mpirun
--
mpirun could not find anything to do.

It is possible that you forgot to specify how many processes to run
via the "-np" argument.
--

Note that with the Slurm plugins present in your Open MPI build, there should be no need to use "-np" on the command line; the Slurm RAS plugin should pull such information from the Slurm runtime environment variables. If you do use "-np" to request more CPUs than the job was allocated, you'll receive oversubscription errors (you know, unless you include mpirun flags to allow that to happen). What if you add "which mpirun" to your job script ahead of the "mpirun" command -- does it show you /scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin/mpirun? > On Apr 10, 2020, at 12:12 , Ravi Reddy Manumachu > wrote: > > > Dear Slurm Users, > > I am facing issues with the following combinations of OpenMPI and SLURM. I > was wondering if you have faced something similar and can help me. > > OpenMPI-3.1.6 and slurm 19.05.5 > OpenMPI-4.0.3 and slurm 19.05.5 > > I have the OpenMPI packages configured with "--with-slurm" option and > installed. 
> > Configure command line: > '--prefix=/home/manumachu/openmpi-4.0.3/OPENMPI_INSTALL' '--with-slurm' > MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v4.0.3) > MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3) > MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3) > MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v4.0.3) > > I am executing the sbatch script shown below: > > #!/bin/bash > #SBATCH --account=x > #SBATCH --job-name=ompi4 > #SBATCH --output=ompi4.out > #SBATCH --error=ompi4.err > #SBATCH --ntasks-per-node=1 > #SBATCH --time=00:30:00 > export PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin:$PATH > export > LD_LIBRARY_PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/lib:$LD_LIBRARY_PATH > mpirun -np 4 ./bcast_timing -t 1 > > No matter what option I give to mpirun, I get the following error: > mpirun: Error: unknown option "-np" > > I have used mpiexec also but received the same errors. > > To summarize, I am not able to call mpirun from a SLURM script. I can use > srun but I have no idea how to pass MCA parameters I usually give to mpirun > such as, "--map-by ppr:1:socket -mca pml ob1 -mca btl tcp,self -mca > coll_tuned_use_dynamic_rules 1". > > Thank you for your help. > > -- > Kind Regards > Dr. Ravi Reddy Manumachu > Research Fellow, School of Computer Science, University College Dublin > Ravi Manumachu on Google Scholar, ResearchGate
Re: [slurm-users] Slurm version 20.02.0 is now available
Did you reuse the 20.02 select/cons_res/Makefile.{in,am} in your plugin's source? You probably will have to re-model your plugin after the select/cray_aries plugin if you need to override those two functions (it also defines its own select_p_job_begin() and doesn't link against libcons_common.la). Naturally, omitting libcons_common.a from your plugin doesn't help if you use other functions defined in select/common. > On Feb 26, 2020, at 00:48 , Dean Schulze wrote: > > There was a major refactoring between the 19.05 and 20.02 code. Most of the > callbacks for select plugins were moved to cons_common. I have a plugin for > 19.05 that depends on two of those callbacks: select_p_job_begin() and > select_p_job_fini(). My plugin is a copy of the select/cons_res plugin, but > when I implement those functions in my plugin I get this error because those > functions already exist in cons_common: > > /home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/cons_common/cons_common.c:1134: > multiple definition of `select_p_job_begin'; > .libs/select_liqid_cons_res.o:/home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/liqid_cons_res/select_liqid_cons_res.c:559: > first defined here > /usr/bin/ld: ../cons_common/.libs/libcons_common.a(cons_common.o): in > function `select_p_job_fini': > /home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/cons_common/cons_common.c:1561: > multiple definition of `select_p_job_fini'; > .libs/select_liqid_cons_res.o:/home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/liqid_cons_res/select_liqid_cons_res.c:607: > first defined here > collect2: error: ld returned 1 exit status > > Since only one select plugin can be used at a time (determined in slurm.conf) > I could put my code in the cons_common implementation of those functions, but > if I ever switch plugins then my plugin code will get executed when it > shouldn't be. 
> > How can I "override" those callbacks in my own plugin? This isn't Java (but > it sure looks like the slurm code tries to do Java in C). > > > On Tue, Feb 25, 2020 at 11:57 AM Tim Wickberg wrote: > After 9 months of development and testing we are pleased to announce the > availability of Slurm version 20.02.0! > > Downloads are available from https://www.schedmd.com/downloads.php. > > Highlights of the 20.02 release include: > > - A "configless" method of deploying Slurm within the cluster, in which > the slurmd and user commands can use DNS SRV records to locate the > slurmctld host and automatically download the relevant configuration files. > > - A new "auth/jwt" authentication mechanism using JWT, which can help > integrate untrusted external systems into the cluster. > > - A new "slurmrestd" command/daemon which translates a new Slurm REST > API into the underlying libslurm calls. > > - Packaging fixes for RHEL8 distributions. > > - Significant performance improvements to the backfill scheduler, as > well as to string construction and processing. > > Thank you to all customers, partners, and community members who > contributed to this release. > > As with past releases, the documentation available at > https://slurm.schedmd.com has been updated to the 20.02 release. Past > versions are available in the archive. This release also marks the end > of support for the 18.08 release. The 19.05 release will remain > supported up until the 20.11 release in November, but will not see as > frequent updates, and bug-fixes will be targeted for the 20.02 > maintenance releases going forward. > > -- > Tim Wickberg > Chief Technology Officer, SchedMD > Commercial Slurm Development and Support >
Re: [slurm-users] Srun not setting DISPLAY with --x11 for one account
> So the answer then is to either kludge the keys by making symlinks to the > cluster and cluster.pub files warewulf makes (I tried this already and I know > it works), or to update to the v19.x release and the new style x11 forwarding. Our answer was to create RSA keys for all users in their ~/.ssh directory if they didn't have that pair already. If Warewulf were to change the key type in cluster{,.pub} to one that libssh2 doesn't support you'll have a different problem to debug :-) > Is the update to v19 fairly straightforward? Is it stable enough at this > point to just give it a try? The way they've implemented X11 forwarding in v19 is to use Slurm's own messaging infrastructure (between salloc and slurmd/slurmstepd) to move the data rather than using a third-party library (libssh2). I'm not clear on whether or not that data is encrypted in transit (it must be...). The release notes do make it clear that requiring salloc on the client side means X11 forwarding no longer works with batch jobs, but I never used it in batch jobs anyway. I can't comment on stability of v19 releases...I'm interested in others' input on that point myself!
Re: [slurm-users] Srun not setting DISPLAY with --x11 for one account
The Slurm-native X11 plugin demands you use ~/.ssh/id_rsa{,.pub} keys. It's hard-coded into the plugin:

/*
 * Ideally these would be selected at run time. Unfortunately,
 * only ssh-rsa and ssh-dss are supported by libssh2 at this time,
 * and ssh-dss is deprecated.
 */
static char *hostkey_priv = "/etc/ssh/ssh_host_rsa_key";
static char *hostkey_pub = "/etc/ssh/ssh_host_rsa_key.pub";
static char *priv_format = "%s/.ssh/id_rsa";
static char *pub_format = "%s/.ssh/id_rsa.pub";

> On Jan 27, 2020, at 09:34 , Simon Andrews > wrote: > > I’ve managed to track down the difference between the accounts which work and > those which don’t – but I still don’t understand the mechanism. > > The accounts which work all had their home directories used on an older > system. The ones which fail were only ever used on the new system. The > relevant difference seems to be the way their ssh keys are set up. On the > old system a standard ssh-keygen was run, creating ~/.ssh/id_rsa and > ~/.ssh/id_rsa.pub files and putting the pub file into authorized_keys. > > On the new warewulf based system ssh-keygen was again run, but the default > key file names was changed. We now have ~/.ssh/cluster and > ~/.ssh/cluster.pub and there is a ~/.ssh/config file which contains: > > # Added by Warewulf 2019-12-10 > Host pebble* >IdentityFile ~/.ssh/cluster >StrictHostKeyChecking=no > > This all works fine, and I can ssh from the head node to the ‘pebble’ compute > nodes just fine, however something in the code for the slurm x11 forwarder is > specifically looking for id_rsa files (or is ignoring the config file), since > the forwarding fails if I don’t have these, and works as soon as I do. > > Any ideas where this might be happening so I can either file a bug for change > whatever setting this needs? > > Simon. 
> > From: slurm-users On Behalf Of > William Brown > Sent: 24 January 2020 17:21 > To: Slurm User Community List > Subject: Re: [slurm-users] Srun not setting DISPLAY with --x11 for one account > > There are differences for X11 between Slurm versions so it may help to know > which version you have. > > I tried some of your commands on our slurm 19.05.3-2 cluster, and > interestingly on the session on the compute node I don't see the cookie for > the login node: This was with MobaXterm: > > [user@prdubrvm005 ~]$ xauth list > prdubrvm005.research.rcsi.com/unix:10 MIT-MAGIC-COOKIE-1 > 2efc5dd851736e3848193f65d038eca8 > [user@prdubrvm005 ~]$ srun --pty --x11 --preserve-env /bin/bash > [user@prdubrhpc1-02 ~]$ xauth list > prdubrhpc1-02.research.rcsi.com/unix:95 MIT-MAGIC-COOKIE-1 > 2efc5dd851736e3848193f65d038eca8 > [user@prdubrhpc1-02 ~]$ echo $DISPLAY > localhost:95.0 > > Any per-user problem would make me suspect the user having a different shell, > or something in their login script. Can you make their .bashrc and > .bash_profile just exit? Or look for hidden configuration files for > in their home directory? > > William > > > > On Fri, 24 Jan 2020 at 16:05, Simon Andrews > wrote: > I have a weird problem which I can’t get to the bottom of. > > We have a cluster which allows users to start interactive sessions which > forward any X11 sessions they generated on the head node. This generally > works fine, but on the account of one user it doesn’t work. The X11 > connection to the head node is fine, but it won’t transfer to the compute > node. 
> > The symptoms are shown below: > > A good user gets this: > > [good@headnode ~]$ xauth list > headnode.babraham.ac.uk/unix:12 MIT-MAGIC-COOKIE-1 > f04a2bf9a921a3357e44373655add14a > > [good@headnode ~]$ echo $DISPLAY > localhost:12.0 > > [good@headnode ~]$ srun --pty -p interactive --x11 --preserve-env /bin/bash > > [good@compute ~]$ xauth list > headnode.babraham.ac.uk/unix:12 MIT-MAGIC-COOKIE-1 > f04a2bf9a921a3357e44373655add14a > compute/unix:25 MIT-MAGIC-COOKIE-1 f04a2bf9a921a3357e44373655add14a > > [good@compute ~]$ echo $DISPLAY > localhost:25.0 > > So the cookie is copied from the head node and forwarded and the DISPLAY > variable is updated. > > The bad user gets this: > > [bad@headnode ~]$ xauth list > headnode.babraham.ac.uk/unix:10 MIT-MAGIC-COOKIE-1 > c39a493a37132d308b37469d363d8692 > > [bad@headnode ~]$ echo $DISPLAY > localhost:10.0 > > [bad@headnode ~]$ srun --pty -p interactive --x11 --preserve-env /bin/bash > > [bad@compute ~]$ xauth list > headnode.babraham.ac.uk/unix:10 MIT-MAGIC-COOKIE-1 > c39a493a37132d308b37469d363d8692 > > [bad@compute ~]$ echo $DISPLAY > localhost:10.0 > > So the cookie isn’t copied and the DISPLAY isn’t updated. I can’t see any > errors in the logs and I can’t see anything different about this account. > > If I do a straight forward ssh -Y from the head node to a compute node from > the bad account then that works fine – it’s only whatever is specific about > the way that srun forwards X which fails. > >
Re: [slurm-users] blastx fails with "Error memory mapping"
Does your Slurm cgroup or node OS cgroup configuration limit the virtual address space of processes? The "Error memory mapping" is thrown by blast when trying to create a virtual address space that exposes the contents of a file on disk (see "man mmap") so the file can be accessed via pointers (with the OS handling paging data in and out of the file on disk) rather than by means of standard file i/o calls (e.g. fread(), fscanf(), read()). It sounds like you don't have enough system RAM, period, or the cgroup "memory.memsw.limit_in_bytes" is set too low for the amount of file content you're attempting to mmap() into the virtual address space (e.g. BIG files). > On Jan 24, 2020, at 07:03 , Mahmood Naderan wrote: > > Hi, > Although I can run the blastx command on terminal on all nodes, I can not use > slurm for that due to a so called "memory map error". > Please see below that I pressed ^C after some seconds when running via > terminal. > > Fri Jan 24 15:29:57 +0330 2020 > [shams@hpc ~]$ blastx -db ~/ncbi-blast-2.9.0+/bin/nr -query > ~/khTrinityfilterless1.fasta -max_target_seqs 5 -outfmt 6 -evalue 1e-5 > -num_threads 2 > ^C > [shams@hpc ~]$ date > Fri Jan 24 15:30:09 +0330 2020 > > > However, the following script fails > > [shams@hpc ~]$ cat slurm_blast.sh > #!/bin/bash > #SBATCH --job-name=blast1 > #SBATCH --output=my_blast.log > #SBATCH --partition=SEA > #SBATCH --account=fish > #SBATCH --mem=38GB > #SBATCH --nodelist=hpc > #SBATCH --nodes=1 > #SBATCH --ntasks-per-node=2 > > export PATH=~/ncbi-blast-2.9.0+/bin:$PATH > blastx -db ~/ncbi-blast-2.9.0+/bin/nr -query ~/khTrinityfilterless1.fasta > -max_target_seqs 5 -outfmt 6 -evalue 1e-5 -num_threads 2 > [shams@hpc ~]$ sbatch slurm_blast.sh > Submitted batch job 284 > [shams@hpc ~]$ cat my_blast.log > Error memory mapping:/home/shams/ncbi-blast-2.9.0+/bin/nr.52.psq > openedFilesCount=151 threadID=0 > Error: NCBI C++ Exception: > T0 > 
"/home/coremake/release_build/build/PrepareRelease_Linux64-Centos_JSID_01_560232_130.14.18.6_9008__PrepareRelease_Linux64-Centos_1552331742/c++/compilers/unix/../../src/corelib/ncbiobj.cpp", > line 981: Critical: ncbi::CObject::ThrowNullPointerException() - Attempt to > access NULL pointer. > Stack trace: > blastx ???:0 ncbi::CStackTraceImpl::CStackTraceImpl() offset=0x77 > addr=0x1d95da7 > blastx ???:0 ncbi::CStackTrace::CStackTrace(std::string const&) > offset=0x25 addr=0x1d98465 > blastx ???:0 ncbi::CException::x_GetStackTrace() offset=0xA0 > addr=0x1ec7330 > blastx ???:0 ncbi::CException::SetSeverity(ncbi::EDiagSev) offset=0x49 > addr=0x1ec2169 > blastx ???:0 ncbi::CObject::ThrowNullPointerException() offset=0x2D2 > addr=0x1f42582 > blastx ???:0 ncbi::blast::CBlastTracebackSearch::Run() offset=0x61C > addr=0xf2929c > blastx ???:0 ncbi::blast::CLocalBlast::Run() offset=0x404 addr=0xed4684 > blastx ???:0 CBlastxApp::Run() offset=0xC9C addr=0x9cbf7c > blastx ???:0 ncbi::CNcbiApplication::x_TryMain(ncbi::EAppDiagStream, > char const*, int*, bool*) offset=0x8E3 addr=0x1da0e13 > blastx ???:0 ncbi::CNcbiApplication::AppMain(int, char const* const*, > char const* const*, ncbi::EAppDiagStream, char const*, std::string const&) > offset=0x782 addr=0x1d9f6b2 > blastx ???:0 main offset=0x5E5 addr=0x9caa05 > /lib64/libc.so.6 ???:0 __libc_start_main offset=0xF5 addr=0x7f9a0fb3e505 > blastx ???:0 blastx() [0x9ca345] offset=0x0 addr=0x9ca345 > > > > Any idea about that? > > > Regards, > Mahmood > >
Re: [slurm-users] Question about networks and connectivity
Open MPI matches available hardware in node(s) against its compiled-in capabilities. Those capabilities are expressed as modular shared libraries (see e.g. $PREFIX/lib64/openmpi). You can use environment variables or command-line flags to influence which modules get used for specific purposes. For example, the Byte Transfer Layer (BTL) framework has openib, tcp, self, shared-memory (sm), and vader implementations. So long as your build of Open MPI knew about Infiniband and the runtime can see the hardware, Open MPI should rank that interface highest-performance and use it. > On Dec 9, 2019, at 08:54 , Sysadmin CAOS wrote: > > Hi mercan, > > OK, I forgot to compile OpenMPI with Infiniband support... But I still have a > doubt: SLURM scheduler assigns (offers) some nodes called "node0x" to my > sbatch job because in my SLURM cluster nodes have been added with "node0x" > name. My OpenMPI application has been (now) compiled with ibverbs support.. > but how I tell to my application or to my SLURM sbatch submit script that my > MPI program MUST use Infiniband network? If SLURM has assigned to me node01 > and node02 (with IP address 192.168.11.1 and 192.168.11.2 in a gigabit > network) and Infiniband is 192.168.13.x, who transform from "clus01" > (192.168.12.1) and "clus02" (192.168.12.2) to "infi01" (192.168.13.1) and > "infi02" (192.168.13.2). > > This step still baffles me... > > Sorry if my question is easy for you... but now I have been entered in a sea > of doubts. > > Thanks. > > El 05/12/2019 a las 14:27, mercan escribió: >> Hi; >> >> Your mpi and NAMD use your second network because of your applications did >> not compiled for infiniband. There are many compiled NAMD versions. the verb >> and ibverb versions are for using infiniband. Also, when you compiling the >> mpi source, you should check configure script detect the infiniband network >> to use infiniband. And even while compiling the slurm too. >> >> Regards; >> >> Ahmet M. 
>> >> >> On 5.12.2019 15:07, sysadmin.caos wrote: >>> Hello, >>> >>> Really, I don't know if my question is for this mailing list... but I will >>> explain my problem and, then, you could answer me whatever you think ;) >>> >>> I manage a SLURM clusters composed by 3 networks: >>> >>> * a gigabit network used for NFS shares (192.168.11.X). In this >>> network, my nodes are "node01, node02..." in /etc/hosts. >>> * a gigabit network used by SLURM (all my nodes are added to SLURM >>> cluster using this network and the hostname assigned via /etc/host >>> to this second network). (192.168.12.X). In this network, my nodes >>> are "clus01, clus02..." in /etc/hosts. >>> * a Infiniband network (192.168.13.X). In this network, my nodes are >>> "infi01, infi02..." in /etc/hosts. >>> >>> When I submit a MPI job, SLURM scheduler offers me "n" nodes called, for >>> example, clus01 and clus02 and, there, my application runs perfectly using >>> second network for SLURM connectivity and first network for NFS (and NIS) >>> shares. By default, as SLURM connectivity is on second network, my nodelist >>> contains nodes called "clus0x". >>> >>> However, now, I'm getting a "new" problem. I want to use third network >>> (Infiniband), but as SLURM offers me "clus0x" (second network), my MPI >>> application runs OK but using second network. This problem also occurs, for >>> example, using NAMD (Charmrun) application. >>> >>> So, my questions are: >>> >>> 1. is this SLURM configuration correct for using both networks? >>> 1. If answer is "no", how do I configure SLURM for my purpose? >>> 2. But if answer is "yes", how can I ensure connections in my >>> SLURM job are going in Infiniband? >>> >>> Thanks a lot!! >>> > >
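Following up on the module selection point above: the choice can be forced at run time with MCA parameters. A hedged sketch (module names vary by Open MPI version and build -- openib is the verbs/Infiniband transport in the 3.x-era releases, and my_mpi_app is a placeholder):

```
# force the BTL selection at run time:
mpirun --mca btl openib,self,vader ./my_mpi_app

# or via the environment, which also carries through srun:
export OMPI_MCA_btl=openib,self,vader
srun ./my_mpi_app
```

With the BTL list restricted like this, the job aborts rather than silently falling back to TCP over gigabit, which makes it easy to verify which network is actually in use.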
Re: [slurm-users] $TMPDIR does not honor "TmpFS"
If you check the applicable code in src/slurmd/slurmstepd/task.c, TMPDIR is set to "/tmp" if it's not already set in the job environment, and then the TMPDIR directory is created if permissible. It's your responsibility to set TMPDIR -- e.g. we have a plugin we wrote (autotmp) that sets TMPDIR to per-job and per-step paths derived from the job id.

> On Nov 21, 2018, at 10:33 , Michael Gutteridge wrote:
>
>
> I don't think that's a bug. As far as I've ever known, TmpFS is only used
> to tell slurmd where to look for available space (reported as TmpDisk for
> the node). The manpage only indicates that, not any additional
> functionality. We set TMPDIR in a task prolog:
>
> #!/bin/bash
> echo "export TMPDIR=/loc/scratch/${SLURM_JOB_ID}"
> echo "export SCRATCH_LOCAL=/loc/scratch/${SLURM_JOB_ID}"
> echo "export SCRATCH=/net/scratch/${SLURM_JOB_ID}"
>
> - Michael
>
>
> On Wed, Nov 21, 2018 at 6:52 AM Shenglong Wang wrote:
> We have TMPDIR set up inside the prolog file. We hope users do not have the
> absolute path /tmp inside their scripts.
>
> #!/bin/bash
>
> SLURM_BIN="/opt/slurm/bin"
>
> SLURM_job_tmp=/state/partition1/job-${SLURM_JOB_ID}
>
> mkdir -m 700 -p $SLURM_job_tmp
> chown $SLURM_JOB_USER $SLURM_job_tmp
>
> echo "export SLURM_JOBTMP=$SLURM_job_tmp"
> echo "export SLURM_JOB_TMP=$SLURM_job_tmp"
> echo "export SLURM_JOB_TMPDIR=$SLURM_job_tmp"
> echo "export TMPDIR=$SLURM_job_tmp"
>
> Best.
> Shenglong
>
>> On Nov 21, 2018, at 9:44 AM, Roger Moye wrote:
>>
>> We are having the exact same problem with $TMPDIR. I wonder if a bug has
>> crept in? I spoke to the SchedMD guys at SC18 last week and they were not
>> aware of a bug, but since more than one person is having this difficulty,
>> something must be wrong somewhere.
>>
>> -Roger
>>
>> From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf
>> Of Douglas Duckworth
>> Sent: Wednesday, November 21, 2018 7:38 AM
>> To: slurm-users@lists.schedmd.com
>> Subject: [slurm-users] $TMPDIR does not honor "TmpFS"
>>
>> Hi
>>
>> We are setting TmpFS=/scratchLocal in /etc/slurm/slurm.conf on the nodes
>> and the controller. However, the $TMPDIR value seems to be /tmp, not
>> /scratchLocal. As a result, users are writing to /tmp, which we do not
>> want.
>>
>> We are not setting $TMPDIR anywhere else, such as /etc/profile.d, nor do
>> users have it defined in their ~/.bashrc or ~/.bash_profile.
>>
>> We do not see any error messages anywhere which could indicate why the
>> default value of /tmp overrides our value of TmpFS.
>>
>> As I understand it, prolog scripts can change this value; if that's the
>> case, then what's the purpose of setting TmpFS in /etc/slurm/slurm.conf?
>>
>>
>> Thanks,
>>
>> Douglas Duckworth, MSc, LFCS
>> HPC System Administrator
>> Scientific Computing Unit
>> Weill Cornell Medicine
>> 1300 York Avenue
>> New York, NY 10065
>> E: d...@med.cornell.edu
>> O: 212-746-6305
>> F: 212-746-8690
>>
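For reference, here is a minimal task-prolog sketch along the lines of the scripts quoted in this thread. The base path and fallback values are illustrative only, not Slurm defaults; a real prolog runs as root and would also chown the directory to the job owner. Slurm injects any "export VAR=value" lines a task prolog prints into the task's environment.

```shell
#!/bin/bash
# Hypothetical task prolog: create a per-job scratch directory and point
# TMPDIR at it.
TMPBASE="${TMPBASE:-/tmp/scratch-demo}"        # a real site would use local scratch, e.g. /loc/scratch
JOBTMP="${TMPBASE}/job-${SLURM_JOB_ID:-demo}"  # fallback id only so the sketch runs standalone
mkdir -m 700 -p "$JOBTMP"
# chown "$SLURM_JOB_USER" "$JOBTMP"            # uncomment in a real prolog (needs root)
echo "export TMPDIR=$JOBTMP"
```

A matching epilog should remove the directory when the job ends, otherwise the scratch filesystem fills up over time.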
Re: [slurm-users] new user simple question re sacct output line2
The identifier after the base numeric job id -- e.g. "batch" -- is the job step. The "batch" step is where your job script itself executes. Each time your job script calls "srun", a new numbered step is created: "82.0", "82.1", et al. Job accounting captures information for the entire job (JobID = "82") as well as for the individual steps, and not every field shown by sacct is meaningful or populated for every record.

> On Nov 14, 2018, at 08:38 , Matthew Goulden wrote:
>
> Hi,
>
> New to Slurm; currently working up to move our system from UGE/SGE.
>
> sacct output, including the default headers, is three lines. What is line 2
> documenting? Most fields are blank.
>
> For most fields with values, these are the same as for line 3:
> AllocCPUS,
> Elapsed,
> State,
> ExitCode,
> ReqMem,
>
> For some fields with values, these are clearly related to those in line 3
> (represented here as line1:line2:line3):
> JobID: 82: 82.batch
> JobIDRaw: 82: 82.batch
>
> For others, the values are unique to line 2:
> JobName: : batch
> Partition: all_slt_limit:
> ReqCPUFreqMin: Unknown: 0
> ReqCPUFreqMax: Unknown: 0
> ReqCPUFreqGov: Unknown: 0
> ReqTRES: billing=1,cpu=1,node=1:
> AllocTRES: billing=1,cpu=1,mem=125000M,node=1: cpu=1,mem=125000M,node=1
>
> I'm sure the documentation - which is excellent - details this, but I've
> not found where; can someone give me the pointer I need?
>
> Many thanks
>
> Matt
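As a hypothetical illustration (the job id and the field lists are examples), the job-level record and the step records can be separated explicitly:

```shell
# One line per record: the job itself (82), plus steps such as 82.batch, 82.0, ...
sacct -j 82 --format=JobID,JobName,Partition,AllocCPUS,State,ExitCode

# -X / --allocations suppresses the step lines and shows only job allocations:
sacct -j 82 -X --format=JobID,AllocTRES,Elapsed,State
```

This is also why some columns are blank on one line and populated on another: in the output quoted above, Partition is recorded only on the job-level line, while the ReqCPUFreq* values are recorded per step.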
Re: [slurm-users] Slurmstepd sleep processes
See: https://github.com/SchedMD/slurm/blob/master/src/slurmd/slurmstepd/mgr.c

Circa line 1072 the comment explains:

/*
 * Need to exec() something for proctrack/linuxproc to
 * work, it will not keep a process named "slurmstepd"
 */
execl(SLEEP_CMD, "sleep", "100000000", NULL);

Basically, proctrack/linuxproc will produce an error if a slurmstepd is running zero subprocesses, so a very long sleep command (100000000 seconds, roughly 3.17 years) is spawned to satisfy that condition, no matter which proctrack plugin is actually being used.

> On Aug 3, 2018, at 17:42 , Christopher Benjamin Coffey wrote:
>
> Hello,
>
> Has anyone observed "sleep 100000000" processes on their compute nodes?
> They seem to be tied to the slurmstepd extern process in Slurm:
>
> 4 S root   136777      1  0 80   0 -  73218 do_wai 05:48 ?  00:00:01 slurmstepd: [13220317.extern]
> 0 S root   136782 136777  0 80   0 -  25229 hrtime 05:48 ?  00:00:00  \_ sleep 100000000
> 4 S root   136784      1  0 80   0 -  73280 do_wai 05:48 ?  00:00:02 slurmstepd: [13220317.batch]
> 4 S tes87  136789 136784  0 80   0 -  26520 do_wai 05:48 ?  00:00:00  \_ /bin/bash /var/spool/slurm/slurmd/job13220317/slurm_script
> 4 S root   136807      1  0 80   0 - 107157 do_wai 05:48 ?  00:00:01 slurmstepd: [13220317.1]
>
> I'm not exactly sure what the extern piece is for. Does anyone know what
> this is all about? Is this normal? We just saw this the other day while
> investigating some issues. Sleeping for 3.17 years seems strange. Any help
> would be appreciated, thanks!
>
> Best,
> Chris
>
> --
> Christopher Coffey
> High-Performance Computing
> Northern Arizona University
> 928-523-1167