Hi Ewan,
On Mon, Jun 8, 2015 at 2:39 AM, Roche Ewan ewan.ro...@epfl.ch wrote:
The underlying problem seems to be that SLURM isn’t correctly setting
CUDA_VISIBLE_DEVICES to match the device allowed by the cgroup.
Slurm actually does the right thing. The real culprit here is NVML.
So for
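One way to check whether the mismatch comes from NVML enumerating GPUs in a
different order than the PCI bus order the cgroup uses; this is an assumption,
since the reply above is truncated (CUDA_DEVICE_ORDER needs CUDA 7 or later):

    # compare NVML's enumeration order against the PCI bus order
    nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv
    # forcing PCI bus order often makes CUDA device IDs line up with
    # the devices the cgroup actually allows
    export CUDA_DEVICE_ORDER=PCI_BUS_ID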
Hello,
I have successfully run the production and test slurmctld on our submission
node. How do you actually specify which controller daemon to submit to? By
default it uses the production controller on the default ports, but I
want to submit to my test controller, which is using
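One way to do this, assuming the test controller has its own slurm.conf
(the path below is illustrative), is to point the client commands at that
file via the SLURM_CONF environment variable:

    # submit to the test slurmctld instead of the production one
    export SLURM_CONF=/etc/slurm-test/slurm.conf
    sbatch job.sh
    squeue    # also queries the test controller while SLURM_CONF is set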
Hi,
I saw that, but how do I set the limit on a partition? Do I have to use
sacctmgr to set GrpCPUs for each user, or can I just edit the
PartitionName= line in slurm.conf? Is there an example of how to
configure it?
Thanks, igor.
On 08/06/15 17:44, Moe Jette wrote:
See:
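A minimal sketch of one way to cap per-user CPUs on a partition, assuming a
Slurm version with partition-QOS support (15.08 or later; the names and the
limit are illustrative):

    # create a QOS that caps CPUs per user, then attach it to the partition
    sacctmgr add qos part_limit
    sacctmgr modify qos part_limit set MaxTRESPerUser=cpu=16

    # slurm.conf
    PartitionName=batch Nodes=node[01-16] QOS=part_limit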
Dear All,
I am using GDB to debug a segmentation fault. When I run the backtrace (bt)
command I get the following results. How can I see the values shown as
item=<optimized out>? Any other suggestions are also appreciated ...
at layouts_mgr.c:1975
1975        if (keydef->flags
Hello,
we’re seeing some odd behaviour with version 14.11.4 regarding the interaction
between cgroups and GPUs allocated via GRES.
The underlying problem seems to be that SLURM isn’t correctly setting
CUDA_VISIBLE_DEVICES to match the device allowed by the cgroup.
On one node we run two jobs
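A quick way to see the mismatch from inside a job, assuming cgroup v1 with
the devices subsystem (the exact cgroup path depends on cgroup.conf):

    # compare what CUDA sees with what the cgroup actually allows
    srun --gres=gpu:1 bash -c 'echo $CUDA_VISIBLE_DEVICES; \
        cat /sys/fs/cgroup/devices/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/devices.list'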
Dear All,
I want to use valgrind to check for memory leaks. For that I found the
configure option --enable-memory-leak-debug, but I would like to know more
about how to use it to find and resolve memory leaks. Thanks in advance for
your help and suggestions.
Regards
Dineshkumar RAJAGOPAL
*Grenoble
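A minimal sketch of running the daemon under valgrind, assuming a source
build configured with that option (paths illustrative):

    # build with the leak-debug hooks and debugging symbols
    ./configure --enable-memory-leak-debug CFLAGS="-g -O0"
    make && make install

    # run slurmctld in the foreground under valgrind
    valgrind --leak-check=full --log-file=valgrind-slurmctld.log slurmctld -D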
Hello,
I was searching for an option to configure in slurm.conf a partition that
will limit each user to no more than a specific number of CPUs on that
partition. Is it possible? I want to configure it so that no single user
can end up using all the resources in the
Upon reflection, the NODE_FAIL state that sacct reports, which I noted
earlier, is really just a symptom; the problem (as noted further down) is
that slurmctld reports a node failure for any job that was running at the
time slurmctld went offline, regardless of the state of the job when
slurmctld comes
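To illustrate the symptom, checking the recorded state after the restart
(the job ID is hypothetical; NODE_FAIL is the state described above):

    sacct -j 12345 --format=JobID,State,ExitCode,Elapsed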
See:
http://slurm.schedmd.com/resource_limits.html
Quoting Igor Chebotar ichebo...@univ.haifa.ac.il:
Hello,
I was searching for an option to configure in slurm.conf a partition
that will limit each user to no more than a specific number of
CPUs on that partition. Is it possible? I want
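The association-based route from that page, as a sketch (user name and value
are hypothetical; note that GrpCPUs caps the user's total across the cluster,
not per partition):

    # requires AccountingStorageEnforce=limits in slurm.conf
    sacctmgr modify user where name=alice set GrpCPUs=32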
If the segfault still occurs when you compile with -g,
you should be good. You may also need -O0 to turn off
optimizations.
Bob
On Mon, 8 Jun 2015, Dinesh Kumar wrote:
Dear All,
I am using GDB to debug a segmentation fault. When I run the backtrace (bt)
command I get the following
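A sketch of the rebuild Bob describes, assuming a source tree (the install
prefix is illustrative):

    # rebuild with symbols and no optimization so variables stay visible
    ./configure CFLAGS="-g -O0" --prefix=/opt/slurm-dbg
    make -j4 && make install

    # re-run under gdb; values previously shown as <optimized out>
    # should now print
    gdb --args /opt/slurm-dbg/sbin/slurmctld -D
    (gdb) bt
    (gdb) print keydef->flags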
Resolved the problem. Thanks, Bob.
Regards
Dineshkumar RAJAGOPAL
*Grenoble Institute Of Technology*
*Grenoble,France*
On Mon, Jun 8, 2015 at 5:25 PM, Bob Moench r...@cray.com wrote:
If the segfault still occurs when you compile with -g,
you should be good. You may also need -O0 to
That's exactly what I was looking for, thanks very much.
2015-06-02 16:30 GMT+02:00 Moe Jette je...@schedmd.com:
See the MinJobAge configuration option:
http://slurm.schedmd.com/slurm.conf.html
Quoting Manuel Rodríguez Pascual manuel.rodriguez.pasc...@gmail.com:
Hi all,
I have been
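The relevant knob is a single line in slurm.conf; MinJobAge is the number of
seconds a completed job's record is kept before being purged (the value below
is illustrative):

    # slurm.conf: keep finished jobs visible to squeue for 10 minutes
    MinJobAge=600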