[slurm-dev] Re: Fw:Slurm question for help

2016-02-25 Thread Loris Bennett
温圣召 writes: > Fw: Slurm question for help > > Dear Sir/Madam: > > I'm using Slurm to build a small cluster. My munge, slurmctl, slurmdbd, and slurmd > all run as root. > I use srun to submit jobs with the --uid= option. > r...@yq01-sys-hic-k4007.yq01.baidu.com matrixMulCUBLAS]# srun

[slurm-dev] Re: Kill Signals Sent By SLURM

2016-02-25 Thread Timothy Brown
Hi Mike, We're just starting to look at preemption, so I can't help you there (just yet!). However, I was testing the signals sent by Slurm a while ago. I'm pretty sure it's SIGTERM. I've attached a silly do-nothing program that I was using. Tim From: Mike
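The attachment mentioned above is not part of this listing. As a rough illustration of the idea only, here is a minimal Python sketch of a do-nothing program that records whichever catchable signal it receives. The names `received`, `record_signal`, and `install_handlers` are hypothetical, and the choice of SIGTERM and SIGINT as defaults is an assumption, not something taken from the thread.

```python
import signal

# Signal numbers seen so far, in arrival order (hypothetical helper state).
received = []


def record_signal(signum, frame):
    """Remember which signal arrived instead of letting it end the process."""
    received.append(signum)


def install_handlers(signums=(signal.SIGTERM, signal.SIGINT)):
    """Install record_signal for each signal in signums.

    SIGKILL and SIGSTOP cannot be caught, so they must not be listed here.
    """
    for signum in signums:
        signal.signal(signum, record_signal)


def last_signal_name():
    """Return the name of the most recent recorded signal, or None."""
    if not received:
        return None
    return signal.Signals(received[-1]).name
```

In a job script one might call `install_handlers()`, block in `signal.pause()`, and then print `last_signal_name()` to see what was delivered when the job was cancelled or timed out. This only observes catchable signals; it cannot report a SIGKILL.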

[slurm-dev] Fw:Slurm question for help

2016-02-25 Thread 温圣召
Dear Sir/Madam: I'm using Slurm to build a small cluster. My munge, slurmctl, slurmdbd, and slurmd all run as root. I use srun to submit jobs with the --uid= option. r...@yq01-sys-hic-k4007.yq01.baidu.com matrixMulCUBLAS]# srun --comment=wsz_111 --account="testAccount" -N1

[slurm-dev] Kill Signals Sent By SLURM

2016-02-25 Thread Mike Dacre
Hi All, I am trying to incorporate checkpointing using DMTCP into my SLURM jobs. Specifically, I want a job to be checkpointed when it is killed by SLURM on timeout or memory overuse (or anything else), so that it can be resubmitted from the checkpoint later. I have been talking with the DMTCP devs

[slurm-dev] Obtaining the WorkDir of a completed job

2016-02-25 Thread Jack Challen
Hi, I've got a lot of sbatch jobs being submitted from different directories. If the job is pending or running then I can get the WorkDir with "scontrol show job XXX". Can I get the same WorkDir information from (say) sacct once the job is completed? I've enabled MySQL storage of accounting data

[slurm-dev] Re: sreport bug on SLURM version 15.08.7 ?

2016-02-25 Thread De Giorgi Jean-Claude
Hi all, Here is some more information. If I use the same command, but adding the time, as shown below, it works well: sreport cluster Utilization start=2015-12-01T01:00 end=2016-01-01 --parsable2 format=Cluster,Allocated,Down,Idle,Reserved,Reported -t Hour ...

[slurm-dev] squeue: Collapsing running array jobs?

2016-02-25 Thread Loris Bennett
Hi, I'm using Slurm 15.08.4 and in the man page for 'squeue' it says -r, --array Display one job array element per line. Without this option, the display will be optimized for use with job arrays (pending job array elements will be combined on one line of

[slurm-dev] Bug and suggested fix in testsuite test 14.10

2016-02-25 Thread Bjørn-Helge Mevik
Test 14.10 in the test suite (of slurm 15.08.8, at least) uses $sinfo -tidle -h -o%n to find idle nodes. This only works if NodeHostname == NodeName on the nodes. The following should work regardless of this: $scontrol show hostnames \$($sinfo -tidle -h -o%N) -- Regards, Bjørn-Helge