Thanks Janne,

Our “SlurmdSpoolDir” parameter was set to “/tmp/slurmd”. As you say, a 
more appropriate value would be “/var/spool/slurmd”, for example.
Your reply raises a few more questions, though:

- What is the difference between the “SlurmdSpoolDir” and “TmpFS” Slurm parameters?
- Who writes to TmpFS: the Slurm daemon or user processes?
- We have compute nodes with 500 GB HDDs. How much space is recommended for 
the TmpDisk parameter?
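
A minimal sketch of where these three settings live in slurm.conf, for
orientation only. The TmpFS path is Slurm's documented default, and the TmpDisk
figure is a hypothetical placeholder, not a recommendation; TmpDisk is given in
megabytes and describes the total size of the file system at TmpFS.

```
# Spool directory used by slurmd for its own state and batch job scripts
SlurmdSpoolDir=/var/spool/slurmd

# File system that jobs use for temporary storage (assumed: the default)
TmpFS=/tmp

# Per-node size of TmpFS in MB (hypothetical value; other node options omitted)
NodeName=bd-p14-01 TmpDisk=100000
```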




Regards,

Alfonso Pardo Diaz
System Administrator / Researcher
c/ Sola nº 1; 10200 Trujillo, ESPAÑA
Tel: +34 927 65 93 17 Fax: +34 927 32 32 37



On 24/04/2014, at 08:39, Janne Blomqvist <[email protected]> wrote:

> 
> On 2014-04-23T17:26:36 EEST, Alfonso Pardo wrote:
>> Hi,
>> 
>> I had some errors from jobs that terminated prematurely, with this message:
>> 
>> slurmd[bd-p14-01]: error: unlink(/tmp/slurmd/job60560/slurm_script):
>> No such file or directory
>> slurmd[bd-p14-01]: error: rmdir(/tmp/slurmd/job60560): No such file or
>> directory
>> 
>> 
>> Should “TmpFS” location be a shared file system?
> 
> No. Or maybe it's possible, but why? Typically /tmp is considered a 
> machine-local directory.
> 
> That being said, the error messages you quote have nothing to do with the 
> slurm.conf TmpFS setting but rather tell that your SlurmdSpoolDir is set to 
> "/tmp/slurmd". That is likely a bad idea, as there might be various /tmp 
> cleaner scripts such as tmpwatch emptying /tmp regularly, leading to errors 
> like you see (been there, done that). Just leave it at the default value 
> unless you have good reasons to do otherwise. Note that it requires some 
> trickery to move the contents of the SlurmdSpoolDir if you want to do it on 
> the fly without losing track of running jobs.
> 
>> We have not set the TmpDisk parameter (it is at its default value). How much
>> space is reasonable for this parameter?
> 
> Depends on how large the disks on your nodes are, no? However, the trend 
> seems to be that /tmp is a relatively small space, frequently on a ram disk 
> (tmpfs) rather than backed by a real disk [1]. So you might not want to 
> encourage your users to write code assuming a large /tmp is available. A 
> large machine-local space is probably better placed at /var/tmp or 
> something site-specific such as /local.
> 
> 
> [1] http://0pointer.de/blog/projects/tmp.html
> 
> --
> Janne Blomqvist, D.Sc. (Tech.), Scientific Computing Specialist
> Aalto University School of Science, PHYS & BECS
> +358503841576 || [email protected]
