).
Is this possible with SLURM?
I could turn off one partition with a cron job, but that would allow jobs
to schedule at 7:59am and run into the closed window.
Any other thoughts on how this might be accomplished?
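One idea I've considered (untested; names are made up) is a recurring maintenance reservation, which the scheduler should refuse to backfill jobs into, so nothing can start at 7:59am and run past 8:00:

```shell
# Untested sketch: reserve all nodes daily from 8:00 for 60 minutes so
# no job can be scheduled into, or run into, the closed window.
# The reservation name is a placeholder.
scontrol create reservation reservationname=daily_window \
    starttime=08:00:00 duration=60 flags=DAILY,MAINT \
    nodes=ALL users=root
```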
Thanks in advance,
Bill.
--
Bill Barth, Ph.D., Director, HPC
bba...@tacc.utexas.edu
Yes, that's going to be a big problem for us too. It's good that 2.6 is
working well for us.

Any chance we can get the old behavior back?
Best
Bill.
, and if I stop SLURM on the node (/etc/init.d/slurm stop) it
doesn't get replaced either. I would have sworn up and down that at least
the latter worked.
Can anyone provide some feedback?
Thanks,
Bill.
No thoughts on this from the list? I wouldn't have thought we were the
only ones encountering this issue.
Best,
Bill.
On 6/5/14 3:09 PM, Bill
...@github.com:rtevans/tacc_stats.git (this will eventually move to
the main TACC GitHub, but that's a work in progress)
Best,
Bill.
On 6/10/14 6:19 AM
Why not make one partition and use fairshare to balance the usage over
time? That way both institutes can run large jobs that span the whole
machine when others are not using it.
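A rough sketch of the setup I mean (account names and shares are purely illustrative, untested):

```shell
# Give each institute a Slurm account with fairshare in proportion to
# its contribution to the machine (60/40 here, just as an example):
sacctmgr add account inst_a fairshare=60
sacctmgr add account inst_b fairshare=40
# and enable multifactor priority in slurm.conf, e.g.:
#   PriorityType=priority/multifactor
#   PriorityWeightFairshare=100000
```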
Bill.
in the ratio of their contribution, but is allowed to use the whole
machine. If both institutes have periods of down time, then the machine
will be less likely to sit idle and more work will get done.
I'll get off my soapbox now.
Best,
Bill.
into the
bit bucket.
Bill.
On 9/10/14, 8:39 PM, Christopher Samuel sam...@unimelb.edu.au wrote:
On 10/09/14 03:10, Erica Riello wrote:
I would like
probably use cp. I haven't tested this, but you should escape the $ so
it's expanded on the compute node rather than at submission time:
srun cp /scratch/Hello /scratch/\$HOSTNAME
might work.
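A variant that sidesteps the escaping question entirely (also untested) is to let a shell on the compute node do the expansion:

```shell
# Single quotes keep the submitting shell from expanding $HOSTNAME;
# the bash instance launched on the compute node expands it instead.
srun bash -c 'cp /scratch/Hello /scratch/$HOSTNAME'
```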
Bill.
At TACC we solve this problem by disabling HyperThreads/SMT/whatever your
CPU vendor calls it. Up through Intel SNB, we haven't found the additional
hardware threads to be useful for most of our codes.
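Besides the BIOS route, I believe Slurm can also be told not to schedule the extra threads; something like this slurm.conf fragment (untested, node names made up):

```
# slurm.conf fragment: advertise one thread per core so SMT siblings
# are never handed out; or leave HT on and allocate whole cores only.
NodeName=c[001-100] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1
SelectTypeParameters=CR_Core
```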
Bill.
for the
slack caused by the bad one. One can also make the reservation larger by a
few nodes to account for bad luck.
I'm really wondering if there are any better options or any automated
options.
What do others do?
Thanks,
Bill.
No such luck, Aaron. I specified a node count in my original
specification. I would have thought what you suggested was the behavior,
but it appears not to be.
Thanks,
Bill.
John,
We're doing some work on tracking what codes get used at TACC, which I'd be
happy to share on-list or off-list if advertising here is frowned upon.
Bill.
priorities override job priorities completely.
This is consistent with our experience since 7/2 when we upgraded. Was
this a recent change? Is there anything we can do to go back to the old
behavior?
Thanks,
Bill.
in the future due to recurrence? It
doesn't seem to, to me.
Thanks,
Bill.
On 6/24/15, 1:35 PM, Bill Barth bba...@tacc.utexas.edu wrote:
Thanks
) State=INACTIVE
Thanks,
Bill.
the upgrade. I wonder if something went weird
during the upgrade process.
Bill.
On 6/24/15, 12:00 PM, Jacqueline Scoggins jscogg...@lbl.gov
CoreCnt=576 Features=(null)
PartitionName=normal-mic Flags=
Users=bbarth,foo,bar,baz Accounts=(null
) Licenses=(null) State=INACTIVE
On 6/24
to solve my problem, but no luck. Any
advice would be much appreciated.
Thanks,
Bill.
,
Bill.
thoughts on preventing this?
Thanks,
Bill.
On 7/6/15, 11:19 AM, John Desantis desan...@mail.usf.edu wrote:
Bill,
Has anyone used DAILY
to
launch independent serial tasks in parallel.
Best,
Bill.
On 7/14/15, 11:17 AM, John Desantis desan...@mail.usf.edu wrote:
Charles,
Have you
this?
Thanks,
Bill.
Thanks, Lyn. That's very interesting. I'm glad that this is reproducible.
I was driving myself crazy over here.
Thanks again,
Bill.
On 7/17
On 7/7/15, 4:40 PM, Bruce Roberts schedulerk...@gmail.com wrote:
The reservations in the database are only for historical purposes; they
don't get read in by the slurmctld. The DBD should be purging them as
configured; you shouldn't have to manually do anything.
I would have thought so.
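For reference, I believe the purge knobs in question live in slurmdbd.conf; something like this (values purely illustrative, untested):

```
# slurmdbd.conf fragment: age out old reservation and job records.
PurgeResvAfter=12months
PurgeJobAfter=12months
```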
more than 10k nodes under management across a handful of clusters
at once with this system.
Best,
Bill.
On 3/15/16, 7:39 AM, "Bjørn-Helge
I was thinking of MPI codes primarily since that kind of work is my
background. If you are running a serial/threaded program on one node, then
you don't need srun, just set OMP_NUM_THREADS and go!
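For example (untested sketch; the binary name is made up):

```shell
# No srun needed for a single-node threaded run; just set the OpenMP
# thread count and launch the program directly:
export OMP_NUM_THREADS=16
# ./my_threaded_app    # hypothetical OpenMP binary
```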
Bill.
.
On 4/8/16, 2:44 PM, "Craig Yoshioka" <yoshi...@ohsu.edu> wrote:
>Personally, I don't enjoy the overhead of creating batch fi
.
Best,
Bill.
On 3/30/17, 10:52 AM, "Chad Cropper" <chad.crop...@genusplc.com> wrote:
We would like to clean our memo
modules, and libraries
used by a job.
Best,
Bill.
On 3/22/17, 10:29 AM, "E.M. Dragowsky" <dragow...@case.edu> wrote
We don’t use cgroups with our SLURM at this time, though we have some ongoing
investigations in that direction. There’s probably a way to get both plugins to
cooperate.
Best,
Bill.
Fortunately, once we figured out what systemd was doing, we didn’t need to
interact with it besides adding its PAM module configuration line to slurm’s
PAM config file.
Best,
Bill.
it mounted. It took a little while to figure out
where in the PAM stack to insert the pam_systemd.so configuration line to
guarantee that it was working for all our SLURM jobs, but the above method
seems to solve the problem.
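For reference, the line in question is the standard pam_systemd session module; in our case it was something like this (file path and exact placement in the stack will vary by distro):

```
# /etc/pam.d/slurm (fragment): let systemd set up a user session
# (XDG_RUNTIME_DIR etc.) for processes spawned by slurmd.
session    optional    pam_systemd.so
```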
Best,
Bill.
this way. We also do
*everything else* via our own RPMs, so it fits with our local methodology.
Best,
Bill.
On 10/5/17, 2:07 AM, "Ole Holm Ni