There's always the --dependency flag for sbatch. So yes, depending on what
you wanted, you could line up another sbatch after the first if you liked.
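A minimal sketch of chaining two jobs that way (job_a.sh and job_b.sh are hypothetical script names; `--parsable` makes sbatch print just the job ID, and the sbatch path is parameterized here only so the sketch can be exercised off-cluster):

```shell
# Submit the first job and capture its ID, then queue a second job that
# starts only once the first one finishes successfully.
chain_jobs() {
  local first_id
  first_id=$("$SBATCH" --parsable job_a.sh) || return 1
  "$SBATCH" --dependency=afterok:"$first_id" job_b.sh
}
# On a real cluster: SBATCH=sbatch chain_jobs
```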
cheers
L.
--
The most dangerous phrase in the language is, "We've always done it this
way."
- Grace Hopper
On 1 February 2017 at 08:38, TO_We
Trivial questions: does the node have the correct time wrt the head node? And is
the node correctly configured in slurm.conf (# of CPUs, amount of memory, etc.)?
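A quick way to check both, run on the compute node itself (`slurmd -C` prints the hardware slurmd detects, in slurm.conf syntax, for comparison against the configured values):

```shell
# Compare the node's clock and detected hardware against the head node
# and slurm.conf; a mismatch in either commonly drains the node.
check_node() {
  echo "local epoch: $(date -u +%s)"   # compare with `date -u +%s` on the head node
  if command -v slurmd >/dev/null 2>&1; then
    slurmd -C                          # detected CPUs, sockets, memory, ...
  fi
}
```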
cheers
L.
On 1 February 2017 at 08:03, E V wrote:
You might use a job submission plugin, but in that case you have to
be aware that the job can still be rejected by Slurm after
the plugin has run.
Another idea might be to wrap sbatch.
Maybe there is a much better way. I'm really not sure.
Could you provide a few details about wh
Still, for maintenance of the primary controller, you should call
"scontrol takeover" because Slurm commands, especially "srun", do not
work during these 60 seconds otherwise.
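A sketch of that maintenance sequence (the scontrol path is parameterized only so the sketch is testable off-cluster):

```shell
# Hand control to the backup explicitly before touching the primary,
# so srun and other client commands don't stall for SlurmctldTimeout.
failover_for_maintenance() {
  "$SCONTROL" show config | grep -i SlurmctldTimeout
  "$SCONTROL" takeover
}
```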
2017-01-31 19:12 GMT+01:00 Andrus, Brian Contractor :
> Aha! That was just far too large for me. Set it down to 60 second
We also still have 16.05 in our production system. We thought about
exchanging the data through files but decided that it is better to use
the comment field of the job (in contrast to the AdminComment field,
this field is also available in 16.05, but it can also be used by the user
[which does not happ
enabling debug5 doesn't show anything more useful. I don't see
anything relevant in slurmd.log just job starts and stops.
slurmctld.log has the takeover output with backup head node
immediately draining itself same as before but with more of the
context before the DRAIN:
[2017-01-31T15:37:38.387]
We are pleased to announce the availability of Slurm versions 16.05.9
and 17.02.0-0rc1 (release candidate 1).
16.05.9 contains around 25 rather minor bug fixes. Please upgrade at
your leisure.
The rc release contains all of the features intended for release 17.02.
Development has ended for
Aha! That was just far too large for me. Set it down to 60 seconds and things
seem happier (along with the users).
Thanks!
Brian
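For reference, the parameter in question (the one checked with `scontrol show config` in the quoted exchange below) is a slurm.conf setting:

```
# slurm.conf: how long Slurm commands wait for the primary slurmctld
# before failing over to the backup controller
SlurmctldTimeout=60
```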
-----Original Message-----
From: TO_Webmaster [mailto:luftha...@gmail.com]
Sent: Tuesday, January 31, 2017 12:26 AM
To: slurm-dev
Subject: [slurm-dev] Re: Backup co
We still have Slurm 16.05. With the wrapper idea, I will try to dump the message to
a file and then display the file contents in the sbatch wrapper.
Thanks a lot.
Yang
From: TO_Webmaster
Sent: Tuesday, January 31, 2017 11:32 AM
To: slurm-dev
Subject: [slurm-dev]
I was told that this is impossible at the moment. We ended up
putting the message in the comment field of the job (in Slurm 17.02,
you can use the AdminComment) and writing a small wrapper around sbatch
that prints the message to the screen.
Anyway, it would be nice if such a feature were added.
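A minimal sketch of such a wrapper, assuming job_submit.lua stores the message in the comment field (readable via squeue's `%k` format specifier; the sbatch and squeue paths are parameterized only so the sketch is testable off-cluster):

```shell
# Wrap sbatch: submit as usual, then fetch and print the job's comment
# field, which the job_submit plugin can use to carry a message back.
sbatch_wrapped() {
  local jobid
  jobid=$("$SBATCH" --parsable "$@") || return 1
  echo "Submitted batch job $jobid"
  "$SQUEUE" -h -j "$jobid" -o '%k'   # %k = the job's comment field
}
```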
In job_submit.lua, slurm.user_msg displays a message only when slurm.ERROR is
returned from job_submit.lua. Is there a way to display a message to users when
slurm.SUCCESS is returned?
Yang
High Performance Research Computing
Texas A&M University
No epilog scripts defined, and access to the saved state is fine, since an
scontrol takeover works, but it does have the side effect of the backup
draining itself. I set SlurmctldDebug to debug3 and didn't get much
more info:
[2017-01-31T09:45:22.329] debug2: node_did_resp hpcc-1
[2017-01-31T09:45:22.329] de
I guess this is a simple question, but I have not found an answer yet:
We would like to run a script just after sbatch was launched.
We are aware of PrologSlurmctld= and Prolog= in slurm.conf,
but AFAIK these are executed when the job is allocated and there might
be several hours between submis
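One workaround, sketched under the assumption that wrapping sbatch is acceptable: run the extra script immediately after submission returns, rather than waiting for the allocation-time Prolog (post_submit.sh is a hypothetical hook; command paths are parameterized only so the sketch is testable off-cluster):

```shell
# Run a hook right after sbatch returns, long before the job is allocated.
sbatch_with_hook() {
  local jobid
  jobid=$("$SBATCH" --parsable "$@") || return 1
  "$HOOK" "$jobid"     # e.g. post_submit.sh, receives the new job ID
  echo "$jobid"
}
```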
Dear Slurm Developers,
I've added the following changes so that the Lua job submit script can
handle jobs submitted with the --immediate flag in the Slurm 16.05.x source
branch.
I'm wondering if this change can be checked in.
$ git status
# On branch slurm-16.05
# Changes not staged for commit:
Hi,
You should take a look at the job_submit plugin. That is the best place to
check whether a job should be queued or rejected.
Regards,
Carlos
On Thu, Jan 26, 2017 at 1:03 PM, Dmitry Chirikov wrote:
> Hi all,
>
> Playing with SPANK, I ran into an issue.
> Seems I can't get
Hi Dmitry,
I can see the confusion, my last mail was poorly worded. When I ran the plugin
in the S_CTX_REMOTE context I was able to retrieve some information. I wasn't
able to retrieve any data when running in the S_CTX_ALLOCATOR context.
Again, apologies for the confusion.
---
Sam Gallop
F
What is the output of
scontrol show config | grep SlurmctldTimeout
?
2017-01-31 6:57 GMT+01:00 Andrus, Brian Contractor :
> Yes, if I do scontrol takeover, it successfully goes to the backup.
>
>
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voic