[slurm-dev] Re: TMPDIR, clean up and prolog/epilog

2016-08-22 Thread Lachlan Musicman
Marcin,

Thanks for the bindtmp link. Reading the code, I note it looks for
/etc/passwd. We are using sssd for auth - I presume that means this plugin
will not work for us?
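
One quick way to check the difference is to compare getent (which goes through
NSS, and therefore sssd) with a direct read of /etc/passwd; the username below
is only a placeholder:

    #!/bin/bash
    # Compare an NSS lookup (includes sssd) with a direct read of /etc/passwd.
    user=jbloggs   # hypothetical username

    # getent goes through nsswitch.conf, so sssd-provided users show up here:
    getent passwd "$user"

    # A direct read of /etc/passwd only sees local accounts, so an
    # sssd-only user produces no output here:
    grep "^${user}:" /etc/passwd

If the plugin really does parse /etc/passwd itself rather than calling
getpwnam()/getent, sssd-only users would indeed not be found there.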

cheers
L.

--
The most dangerous phrase in the language is, "We've always done it this
way."

- Grace Hopper

On 27 June 2016 at 01:10, Marcin Stolarek  wrote:

> This was discussed a number of times before. You can check the list
> archive, or start for instance with:
> https://github.com/fafik23/slurm_plugins/tree/master/bindtmp
>
> cheers
> marcin
>
> 2016-06-24 7:22 GMT+02:00 Lachlan Musicman :
>
>> We are transitioning from Torque/Maui to SLURM and have only just noticed
>> that SLURM puts all files in /tmp and doesn't create a per job/user TMPDIR.
>>
>> On searching, we have found a number of options for creation of TMPDIR on
>> the fly using SPANK and lua and prolog/epilog.
>>
>> I am looking for something relatively benign, since we are still learning
>> the new paradigm.
>>
>> One thing in particular: our /tmp is on SSDs local to each node (for
>> speed) rather than on a shared filesystem, so we will need to remove the
>> temporary files ourselves afterwards.
>>
>> So I was looking at the --prolog and --task-prolog options, doing a
>> little testing on how I might export TMPDIR
>>
>> I had a very simple
>>
>> srun --prolog=/data/pro.sh --task-prolog=/data/t-pro.sh -l hostname
>>
>>  pro.sh
>>
>>
>>  #!/bin/bash
>>  echo "PROLOG: this is from the prologue. currently on `hostname`"
>>
>>  t-pro.sh
>>
>>
>>  #!/bin/bash
>>  echo "TASK-PROLOG: this is from the task-prologue. currently on
>> `hostname`"
>>
>> /data is a shared file system and is the WORKDIR
>>
>> I'm getting results from --prolog but not from --task-prolog.
>> Running this instead:
>>
>> srun --task-prolog=/data/t-pro.sh -l hostname
>>
>> I can confirm there is still no output from the task-prolog.
>>
>> What am I doing wrong?
>>
>> (both scripts have a+x)
>>
>> cheers
>> L.
>>
>> --
>> The most dangerous phrase in the language is, "We've always done it this
>> way."
>>
>> - Grace Hopper
>>
>
>
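
A possible explanation for the missing task-prolog output, assuming the
TaskProlog output convention also applies to srun's --task-prolog: slurmstepd
parses the script's standard output rather than passing it through, so only
lines of the form "print ..." reach the task's stdout, while "export
NAME=value" lines set environment variables for the task. A minimal t-pro.sh
sketch under that assumption:

    #!/bin/bash
    # Assumes slurmstepd interprets TaskProlog stdout:
    #   "print ..."          -> written to the task's standard output
    #   "export NAME=value"  -> added to the task's environment
    echo "print TASK-PROLOG: running on $(hostname)"
    echo "export TMPDIR=/tmp/${SLURM_JOB_ID}"

For the per-job TMPDIR on local SSD, including cleanup, one common pattern is
a node Prolog/Epilog pair configured in slurm.conf (Prolog= and Epilog=). A
rough sketch, with the paths and the availability of SLURM_JOB_ID and
SLURM_JOB_USER treated as assumptions to verify against the prolog_epilog
documentation:

    #!/bin/bash
    # prolog.sh -- runs as root on each allocated node before the job starts.
    # Create a per-job scratch directory on the node-local SSD.
    mkdir -p "/tmp/slurm_${SLURM_JOB_ID}"
    chown "${SLURM_JOB_USER}" "/tmp/slurm_${SLURM_JOB_ID}"

    #!/bin/bash
    # epilog.sh -- runs as root on each node after the job finishes.
    # Remove the per-job scratch directory so the local SSD does not fill up.
    rm -rf "/tmp/slurm_${SLURM_JOB_ID}"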


[slurm-dev] Re: Backfill scheduler should look at all jobs

2016-08-22 Thread Christopher Samuel

On 23/08/16 01:24, Ulf Markwardt wrote:

> I really want the bf scheduler to step down to the lowest-priority jobs
> every once in a while.

Isn't this what bf_continue is for?

bf_continue
 The backfill scheduler periodically releases locks in order
 to permit other operations to proceed rather than blocking
 all activity for what could be an extended period of time.
 Setting this option will cause the backfill scheduler to
 continue processing pending jobs from its original job list
 after releasing locks even if job or node state changes.
 This can result in lower priority jobs being backfill
 scheduled instead of newly arrived higher priority jobs,
 but will permit more queued jobs to be considered for
 backfill scheduling.

It's part of what we use to cope with many thousands of jobs
pending in our queues (defer is also really useful).
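
For reference, a minimal illustration of how these are set via
SchedulerParameters in slurm.conf (the numeric values are placeholders, not a
recommendation); scontrol reconfigure picks up the change after editing:

    # slurm.conf (illustrative values only)
    SchedulerType=sched/backfill
    SchedulerParameters=defer,bf_continue,bf_interval=60,bf_max_job_test=1000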

cheers,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci


[slurm-dev] Slurm User Group Meeting - Registration Ending Soon

2016-08-22 Thread Jacob Jenson
The standard registration option for the 2016 Slurm User Group
meeting will end on August 31. If you plan on attending the Slurm User
Group meeting, please sign up soon.


 * https://slug2016.eventbrite.com
 * http://slurm.schedmd.com/slurm_ug_agenda.html

Regards,
Jacob




[slurm-dev] Re: Backfill scheduler should look at all jobs

2016-08-22 Thread Ulf Markwardt
Daniel,

> You may want to take a look at adjusting bf_max_job_test and, depending
> on your typical job runtimes, increasing bf_resolution to combat the
> increase in overhead.

I know I can limit the number of jobs the bf scheduler looks at with
bf_max_job_... parameters.

My question is not about responsiveness, I can handle that pretty fine
with bf_yield_interval and bf_yield_sleep.

I really want the bf scheduler to step down to the lowest-priority jobs
every once in a while.

Thanks,
Ulf

PS. Slurm statistics in Grafana proved to be very helpful for us:
(starting point http://giovannitorres.me/graphing-sdiag-with-graphite.html)
-- 
___
Dr. Ulf Markwardt

Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany

Phone: (+49) 351/463-33640  WWW:  http://www.tu-dresden.de/zih





[slurm-dev] Re: Backfill scheduler should look at all jobs

2016-08-22 Thread Daniel M. Weeks

Hi Ulf,

You may want to take a look at adjusting bf_max_job_test and, depending
on your typical job runtimes, increasing bf_resolution to combat the
increase in overhead.
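
For illustration only, that tuning also lives in SchedulerParameters in
slurm.conf; the numbers below are placeholders:

    # bf_resolution is in seconds; larger values reduce scheduling overhead
    SchedulerParameters=bf_max_job_test=500,bf_resolution=300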

On 08/22/2016 10:24 AM, Ulf Markwardt wrote:
> Dear all,
> 
> how can I get the backfill scheduler to examine all low-priority jobs
> for possible execution?
>
> We have situations with >10 jobs and times with just a few hundred,
> so a fixed bf_interval does not make sense.
>
> Is there some option to make the bf scheduler run down to the bottom of
> the queue and then start again at the top, just ignoring bf_interval?
>
> As a workaround, I now have a Python script checking sdiag (via the API). If
> "Last depth cycle" is much smaller than "Last queue length", I increase
> bf_interval by 50% (plus scontrol reconf), and the other way round when
> comparing the bf cycle time with bf_interval.
> This works, but I would have expected something built into Slurm, though I
> haven't found it yet.
> 
> Best,
> Ulf
> 
> 
> 
> 


-- 
Daniel M. Weeks
Senior Systems Administrator
Center for Computational Innovations
Rensselaer Polytechnic Institute
Troy, NY 12180
518-276-4458


[slurm-dev] Backfill scheduler should look at all jobs

2016-08-22 Thread Ulf Markwardt
Dear all,

how can I get the backfill scheduler to examine all low-priority jobs
for possible execution?

We have situations with >10 jobs and times with just a few hundred,
so a fixed bf_interval does not make sense.

Is there some option to make the bf scheduler run down to the bottom of
the queue and then start again at the top, just ignoring bf_interval?

As a workaround, I now have a Python script checking sdiag (via the API). If
"Last depth cycle" is much smaller than "Last queue length", I increase
bf_interval by 50% (plus scontrol reconf), and the other way round when
comparing the bf cycle time with bf_interval.
This works, but I would have expected something built into Slurm, though I
haven't found it yet.
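
The workaround described above is a Python script; purely as a sketch of the
same idea in shell (the slurm.conf path, the "much smaller" threshold and the
50% step are all assumptions, and only the increasing direction is shown):

    #!/bin/bash
    # Run on the slurmctld host as root. Parses the backfill statistics from
    # sdiag and bumps bf_interval in slurm.conf when the backfill scheduler is
    # not reaching the bottom of the queue. Rough sketch only.
    SLURM_CONF=/etc/slurm/slurm.conf

    queue_len=$(sdiag | awk -F: '/Last queue length/ {gsub(/ /,"",$2); print $2}')
    depth=$(sdiag | awk -F: '/Last depth cycle:/ {gsub(/ /,"",$2); print $2}')

    # Assumes a bf_interval=NN entry already exists in SchedulerParameters.
    current=$(grep -o 'bf_interval=[0-9]*' "$SLURM_CONF" | cut -d= -f2)
    [ -z "$current" ] && exit 0

    # "much smaller" taken here as: less than half of the queue length.
    if [ -n "$queue_len" ] && [ -n "$depth" ] && [ "$depth" -lt $((queue_len / 2)) ]; then
        new=$(( current * 3 / 2 ))    # increase bf_interval by 50%
        sed -i "s/bf_interval=${current}/bf_interval=${new}/" "$SLURM_CONF"
        scontrol reconfigure
    fi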

Best,
Ulf




-- 
___
Dr. Ulf Markwardt

Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany

Phone: (+49) 351/463-33640  WWW:  http://www.tu-dresden.de/zih





[slurm-dev] Re: Fully utilizing nodes

2016-08-22 Thread Diego Zuccato

On 12/08/2016 01:42, Christopher Samuel wrote:

> The CR_ONE_TASK_PER_CORE option is a holdover from then; it means that if
> you've got HT/SMT enabled you'll get one MPI rank (Slurm task) per
> physical core, and that rank can then use threads to utilise the hardware
> thread units. Without it you'll get a rank per thread, which may not be
> useful to your code.
I've had to remove CR_ONE_TASK_PER_CORE since it seems our users' jobs
(or the cpuset bindings, which consider every thread an independent CPU)
didn't cope well with it. Having one rank per thread, on the other hand,
seems to work well enough.
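
For jobs that do want one task per physical core even without the global
flag, the per-job options should still work; a couple of illustrative
invocations (the application name is a placeholder):

    # Per-job alternatives to the global CR_ONE_TASK_PER_CORE setting
    srun --ntasks-per-core=1  ./my_mpi_app    # one task per physical core
    srun --hint=nomultithread ./my_mpi_app    # don't place tasks on extra hardware threads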

-- 
Diego Zuccato
Servizi Informatici
Dip. di Fisica e Astronomia (DIFA) - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
mail: diego.zucc...@unibo.it