[slurm-users] Registration for 2018 Slurm User Group Meeting is Open

2018-05-29 Thread Jacob Jenson
Registration for the 2018 Slurm User Group Meeting is open. You can register at https://slug18.eventbrite.com. The meeting will be held on 25-26 September 2018 in Madrid, Spain, at CIEMAT. - *Early registration* - May 29 through July 2 - $300 USD - *Standard registration* -

Re: [slurm-users] Controller / backup controller q's

2018-05-29 Thread Patrick Goetz
On 05/25/2018 11:19 AM, Will Dennis wrote: Not yet time for us... There's problems with U18.04 that render it unusable for our environment. What problems have you run into with 18.04?

Re: [slurm-users] Using free memory available when allocating a node to a job

2018-05-29 Thread PULIDO, Alexandre
Thanks for your input; the automatic reporting is definitely a great idea and seems easy to implement in Slurm. At our site we have an internally developed web portal where users can see in real time everything that is happening on the cluster, and every metric of their own jobs. There is

Re: [slurm-users] Using free memory available when allocating a node to a job

2018-05-29 Thread John Hearns
Alexandre, you have made a very good point here: "Oftentimes users only input 1G as they really have no idea of the memory requirements." At my last job we introduced cgroups (this was in PBSPro). We had to enforce a minimum request for memory. Users then asked us how much memory their jobs

Re: [slurm-users] Using free memory available when allocating a node to a job

2018-05-29 Thread PULIDO, Alexandre
Hello John, this behavior is needed because the memory usage of the codes executed on the nodes is particularly hard to guess. Usually, when the estimate is exceeded, actual usage is between 1.1 and 1.3 times what was expected. Sometimes it is much larger. A) Indeed there is a partition running only exclusive jobs, but

Re: [slurm-users] Using free memory available when allocating a node to a job

2018-05-29 Thread John Hearns
Also regarding memory, there are system tunings you can set for the behaviour of the Out-Of-Memory (OOM) killer and also for VM overcommit. I have seen the VM overcommit parameters being discussed elsewhere, and for HPC people generally advise disabling overcommit
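For readers who want to check how a node is currently configured, here is a minimal sketch in Python. It assumes a Linux node where the kernel exposes the setting at /proc/sys/vm/overcommit_memory; the helper names describe_overcommit and read_overcommit are hypothetical and not part of Slurm or any library. Mode 2 is the setting usually meant by "disabling overcommit".

```python
from pathlib import Path

# Meaning of the Linux vm.overcommit_memory sysctl values.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting (overcommit disabled)",
}


def describe_overcommit(raw: str) -> str:
    """Translate the raw sysctl value (e.g. '2\n') into a description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read the setting from procfs; the default path is a Linux assumption."""
    p = Path(path)
    if not p.exists():
        return "not available on this system"
    return describe_overcommit(p.read_text())


if __name__ == "__main__":
    print(read_overcommit())
```

Run on a compute node, this prints the current mode. Changing the mode is a separate step done by the administrator via sysctl and is not covered here.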

[slurm-users] Job not in squeue and no log file exists

2018-05-29 Thread Mahmood Naderan
Hi, when I submit the following script, I receive a job ID. However, the job doesn't show up in squeue. Moreover, the log file I specified in the script is not created.
hamid@rocks7:scripts$ cat slurm_script.sh
#!/bin/bash
#SBATCH --job-name=hvacSteadyFoam
#SBATCH --output=hvacSteadyFoam.log
#SBATCH

[slurm-users] Using free memory available when allocating a node to a job

2018-05-29 Thread PULIDO, Alexandre
Hi, in the cluster where I'm deploying Slurm, job allocation has to be based on the actual free memory available on the node, not just on the memory allocated by Slurm. This is non-negotiable, and I understand that it's not how Slurm is designed to work, but I'm trying anyway. Among the solutions that