[slurm-dev] Disabling partitions by time of day

2014-03-14 Thread Bill Barth
). Is this possible with SLURM? I could turn off one partition with a cron job, but that would allow jobs to schedule at 7:59am and run into the closed window. Any other thoughts on how this might be accomplished? Thanks in advance, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu
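One way to sketch an alternative to the cron approach is a recurring reservation: the scheduler generally will not start a job whose time limit would run into a reservation it cannot use, which addresses the 7:59am case. This is only a sketch, not a tested recipe. The partition name `normal`, the reservation name, and the 08:00 start with a ten-hour duration are all hypothetical, and the `DAILY` and `PART_NODES` flags should be checked against the installed Slurm version. The function prints the command rather than running it.

```shell
#!/bin/sh
# Build a daily reservation command that closes one partition for a window.
# All values passed in below are hypothetical placeholders.
build_close_cmd() {
    partition=$1   # partition to close (hypothetical name)
    start=$2       # daily start time, HH:MM:SS
    duration=$3    # window length, HH:MM:SS or minutes
    printf 'scontrol create reservation ReservationName=%s_close StartTime=%s Duration=%s Flags=DAILY,PART_NODES Users=root PartitionName=%s Nodes=ALL\n' \
        "$partition" "$start" "$duration" "$partition"
}

# Dry run: show the command an administrator would review before running it.
build_close_cmd normal 08:00:00 10:00:00
```

Printing the command first makes it easy to review the reservation before anything changes on a production scheduler.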

[slurm-dev] Re: srun interactive failure after upgrade

2014-05-21 Thread Bill Barth
Yes, that's going to be a big problem for us too. It's good that 2.6 is working well for us. Any chance we can get the old behavior back? Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445

[slurm-dev] More odd scheduler reservation behavior

2014-06-05 Thread Bill Barth
, and if I stop SLURM on the node (/etc/init.d/slurm stop) it doesn't get replaced either. I would have sworn up and down that at least the latter worked. Can anyone provide some feedback? Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office

[slurm-dev] Re: More odd scheduler reservation behavior

2014-06-10 Thread Bill Barth
No thoughts on this from the list? I wouldn't have thought we were the only ones encountering this issue. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 6/5/14 3:09 PM, Bill

[slurm-dev] Re: Enforce to use srun and application logger

2014-06-10 Thread Bill Barth
...@github.com:rtevans/tacc_stats.git (this will eventually move to the main TACC GitHub, but that's a work in progress) Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 6/10/14 6:19 AM

[slurm-dev] Re: Dynamic partitions on Linux cluster

2014-08-14 Thread Bill Barth
Why not make one partition and use fairshare to balance the usage over time? That way both institutes can run large jobs that span the whole machine when others are not using it. Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435

[slurm-dev] Re: Dynamic partitions on Linux cluster

2014-08-14 Thread Bill Barth
in the ratio of their contribution, but is allowed to use the whole machine. If both institutes have periods of down time, then the machine will be less likely to sit idle and more work will get done. I'll get off my soapbox now. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu
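The single-partition idea can be sketched as fairshare values set in proportion to each institute's contribution. The account names `institute_a` and `institute_b` and the 60/40 split are hypothetical, and the sketch assumes accounting is enabled with the multifactor priority plugin (`PriorityType=priority/multifactor` and a non-zero `PriorityWeightFairshare` in slurm.conf). The function only prints the `sacctmgr` commands.

```shell
#!/bin/sh
# Print a sacctmgr command that assigns fairshare shares to an account.
# Account names and share values here are hypothetical.
share_cmd() {
    account=$1   # accounting account for one institute
    shares=$2    # shares proportional to that institute's contribution
    printf 'sacctmgr -i modify account where name=%s set fairshare=%s\n' \
        "$account" "$shares"
}

# A 60/40 contribution split, expressed as shares.
share_cmd institute_a 60
share_cmd institute_b 40
```

With shares like these, either institute can still fill the whole machine when the other is idle; fairshare only affects queue priority once both are competing.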

[slurm-dev] Re: undelivered output

2014-09-10 Thread Bill Barth
into the bit bucket. Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 9/10/14, 8:39 PM, Christopher Samuel sam...@unimelb.edu.au wrote: On 10/09/14 03:10, Erica Riello wrote: I would like

[slurm-dev] Re: HOSTNAME is the same for every node

2014-09-16 Thread Bill Barth
probably use cp. I haven't tested this, but you should probably escape the $ and use cp: srun cp /scratch/Hello /scratch/\$HOSTNAME might work. Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445
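A hedged note on the quoting involved: an unquoted `$HOSTNAME` is expanded by the submitting shell before `srun` starts, so every node sees the same value. Escaping the `$` stops that, but as far as I know `srun` executes the program directly rather than through a shell, so the literal text may reach `cp` unexpanded. One untested sketch is to start a shell on each node and ask it for the host name there. The helper name below is hypothetical; it only prints the command.

```shell
#!/bin/sh
# Print an srun command whose host-name lookup is deferred to each node.
# The single quotes in the output keep $(hostname) literal on the submit host.
per_node_copy_cmd() {
    src=$1        # file to copy
    dest_dir=$2   # directory that receives one copy per node
    printf "srun sh -c 'cp %s %s/\$(hostname)'\n" "$src" "$dest_dir"
}

per_node_copy_cmd /scratch/Hello /scratch
```

Calling `hostname` on the node avoids relying on whether `HOSTNAME` is exported and propagated from the submitting environment.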

[slurm-dev] Re: Logical vs Physical Cores

2014-12-04 Thread Bill Barth
At TACC we solve this problem by disabling HyperThreads/SMT/whatever your CPU vendor calls it. Up to Intel SNB we haven't found additional hardware threads to be useful for most of our codes. Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC

[slurm-dev] Lost nodes in a reservation

2015-05-11 Thread Bill Barth
for the slack caused by the bad one. One can also make the reservation larger by a few nodes to account for bad luck. I'm really wondering if there are any better options or any automated options. What do others do? Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu
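The padding workaround mentioned above can be sketched as a small helper that adds spare nodes to the requested count. The reservation name, user, node counts, and the two-hour duration are hypothetical; the function prints the `scontrol` command instead of running it.

```shell
#!/bin/sh
# Print a reservation command sized for the nodes needed plus a few spares.
# Every argument value used below is a hypothetical placeholder.
padded_reservation_cmd() {
    name=$1     # reservation name
    user=$2     # user allowed to run in the reservation
    needed=$3   # nodes the job actually requires
    spare=$4    # extra nodes to absorb failures
    total=$((needed + spare))
    printf 'scontrol create reservation ReservationName=%s Users=%s StartTime=now Duration=120 NodeCnt=%s\n' \
        "$name" "$user" "$total"
}

padded_reservation_cmd workshop alice 64 4
```

This only automates the arithmetic; it does not replace a failed node inside an existing reservation, which is the open question in the thread.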

[slurm-dev] Re: Lost nodes in a reservation

2015-05-12 Thread Bill Barth
No such luck, Aaron. I specified a node count in my original specification. I would have thought what you suggested was the behavior, but it appears not to be. Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435

[slurm-dev] Re: Question about prologging

2015-04-15 Thread Bill Barth
John, We're doing some work on tracking what codes get used at TACC which I'd be happy to share on-list or off-list if advertising here is frowned upon. Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax

[slurm-dev] partition priorities

2015-06-05 Thread Bill Barth
priorities override job priorities completely. This is consistent with our experience since 7/2 when we upgraded. Was this a recent change? Is there anything we can do to go back to the old behavior? Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069

[slurm-dev] Re: Reservations overlapping on nodes they should not!?

2015-06-25 Thread Bill Barth
in the future due to recurrence? It doesn't seem so to me. Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 6/24/15, 1:35 PM, Bill Barth bba...@tacc.utexas.edu wrote: Thanks

[slurm-dev] Reservations overlapping on nodes they should not!?

2015-06-24 Thread Bill Barth
) State=INACTIVE Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445

[slurm-dev] Re: Reservations overlapping on nodes they should not!?

2015-06-24 Thread Bill Barth
the upgrade. I wonder if something went weird during the upgrade process. Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 6/24/15, 12:00 PM, Jacqueline Scoggins jscogg...@lbl.gov

[slurm-dev] Re: Reservations overlapping on nodes they should not!?

2015-06-24 Thread Bill Barth
CoreCnt=576 Features=(null) PartitionName=normal-mic Flags= Users=bbarth,foo,bar,baz Accounts=(null) Licenses=(null) State=INACTIVE -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 6/24

[slurm-dev] More reservation woes

2015-07-02 Thread Bill Barth
to solve my problem, but no luck. Any advice would be much appreciated. Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445

[slurm-dev] Re: More reservation woes

2015-07-06 Thread Bill Barth
, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445

[slurm-dev] Re: More reservation woes

2015-07-06 Thread Bill Barth
thoughts on preventing this? Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 7/6/15, 11:19 AM, John Desantis desan...@mail.usf.edu wrote: Bill, Has anyone used DAILY

[slurm-dev] Re: timeout issues

2015-07-14 Thread Bill Barth
to launch independent serial tasks in parallel. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 7/14/15, 11:17 AM, John Desantis desan...@mail.usf.edu wrote: Charles, Have you

[slurm-dev] DAILY reservations end up overlapped with other reservations on the same partition

2015-07-16 Thread Bill Barth
this? Thanks, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445

[slurm-dev] RE: DAILY reservations end up overlapped with other reservations on the same partition

2015-07-17 Thread Bill Barth
Thanks, Lyn. That's very interesting. I'm glad that this is reproducible. I was driving myself crazy over here. Thanks again, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 7/17

[slurm-dev] Re: More reservation woes

2015-07-07 Thread Bill Barth
On 7/7/15, 4:40 PM, Bruce Roberts schedulerk...@gmail.com wrote: The reservations in the database are only for historical purposes, they don't get read in from the slurmctld. The DBD should be purging them as configured; you shouldn't have to manually do anything. I would have thought so.

[slurm-dev] Re: What cluster provisioning system do you use?

2016-03-15 Thread Bill Barth
more than 10k nodes under management across a handful of clusters at once with this system. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 3/15/16, 7:39 AM, "Bjørn-Helge

[slurm-dev] Re: Basic question

2016-04-12 Thread Bill Barth
I was thinking of MPI codes primarily since that kind of work is my background. If you are running a serial/threaded program on one node, then you don't need srun, just set OMP_NUM_THREADS and go! Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069
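For the single-node threaded case, a minimal sketch of a job script body might look like the following. It assumes the job requested cores with `--cpus-per-task`, in which case Slurm sets `SLURM_CPUS_PER_TASK`; the fallback of 1 and the program name are hypothetical.

```shell
#!/bin/sh
# Choose a thread count: the per-task CPU count from Slurm if set, else 1.
pick_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

OMP_NUM_THREADS=$(pick_threads)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# The threaded program would then be started directly, without srun:
# ./my_threaded_program   (hypothetical binary)
```

Tying the thread count to the allocation keeps the program from oversubscribing cores if the request changes later.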

[slurm-dev] Re: Basic question

2016-04-08 Thread Bill Barth
. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435 | Fax: (512) 475-9445 On 4/8/16, 2:44 PM, "Craig Yoshioka" <yoshi...@ohsu.edu> wrote: >Personally, I don't enjoy the overhead of creating batch fi

[slurm-dev] Re: Suggestions on node memory cleaning

2017-03-30 Thread Bill Barth
. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435| Fax: (512) 475-9445 On 3/30/17, 10:52 AM, "Chad Cropper" <chad.crop...@genusplc.com> wrote: We would like to clean our memo

[slurm-dev] Re: Tools to retrieve user executable information

2017-03-24 Thread Bill Barth
modules, and libraries used by a job. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435| Fax: (512) 475-9445 On 3/22/17, 10:29 AM, "E.M. Dragowsky" <dragow...@case.edu> wrote

[slurm-dev] Re: Per-job tmp directories and namespaces

2017-08-15 Thread Bill Barth
We don’t use cgroups with our SLURM at this time, though we have some ongoing investigations in that direction. There’s probably a way to get both plugins to cooperate. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435

[slurm-dev] Re: Per-job tmp directories and namespaces

2017-08-10 Thread Bill Barth
Fortunately, once we figured out what systemd was doing, we didn’t need to interact with it besides adding its PAM module configuration line to slurm’s PAM config file. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435

[slurm-dev] Re: Per-job tmp directories and namespaces

2017-08-10 Thread Bill Barth
it mounted. It took a little while to figure out where in the PAM stack to insert the pam_systemd.so configuration line to guarantee that it was working for all our SLURM jobs, but the above method seems to solve the problem. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu

[slurm-dev] Re: Setting up Environment Modules package

2017-10-05 Thread Bill Barth
this way. We also do *everything else* via our own RPMs, so it fits with our local methodology. Best, Bill. -- Bill Barth, Ph.D., Director, HPC bba...@tacc.utexas.edu| Phone: (512) 232-7069 Office: ROC 1.435| Fax: (512) 475-9445 On 10/5/17, 2:07 AM, "Ole Holm Ni