We see something similar on NFS mounts on our CentOS 6 clusters.
Interestingly, the SLURM output files don't show up in "ls" for
a while, but if you blindly "cat" one anyway, it's there and has
content. It hasn't been enough of an issue to warrant investigating,
since the files show up within a minute or so, but I suspect the same
thing as Aaron and Jordan. Some NFS tuning might fix it, but possibly
at some cost of throughput, which is another reason I'm not motivated to
mess with it.
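
For anyone who does want to look at the tuning side, here is a rough sketch of how to see which attribute-cache options an NFS mount is currently using. The option names (noac, actimeo, acregmin, acregmax, acdirmin, acdirmax, lookupcache) are the ones documented in nfs(5). The function name nfs_cache_opts is made up for illustration, and whether changing any of these would actually cure the delayed "ls" is an assumption, not something tested here.

```shell
# Sketch: list cache-related mount options for each NFS mount.
# Reads /proc/mounts-format text from the file given as $1
# (defaults to /proc/mounts). nfs_cache_opts is a hypothetical helper name.
nfs_cache_opts() {
    awk '$3 ~ /^nfs4?$/ {
        out = ""; sep = ""
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            # Keep only the options that control attribute/lookup caching.
            if (opts[i] ~ /^(noac$|actimeo=|acregmin=|acregmax=|acdirmin=|acdirmax=|lookupcache=)/) {
                out = out sep opts[i]
                sep = ","
            }
        }
        if (out == "") out = "(none listed)"
        print $2 ": " out
    }' "${1:-/proc/mounts}"
}
```

Running nfs_cache_opts with no argument on a compute node prints one line per NFS mount. Shorter cache timeouts generally mean more attribute requests to the server, which is the throughput cost mentioned above.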
Regards,
Jason
On 07/14/15 08:33, Carlos Fenoy wrote:
Re: [slurm-dev] Re: Where does standard out go before its copied over
to the control node
Hi Jordan,
See this question and answer on Stack Overflow:
http://stackoverflow.com/questions/25170763/how-to-change-how-frequently-slurm-updates-the-output-file-stdout/25189364#25189364
Regards,
Carlos
On Tue, Jul 14, 2015 at 3:15 PM, Aaron Knister
<[email protected]> wrote:
Hi Jordan,
The answer is, well, it's not copied at all. SLURM open()s the file
in its final resting spot. What I suspect is that this may be an
artifact of NFS caching. If that's it, you can try these commands (as
root) on the node where the file shows as zero length:
sync
echo 2 > /proc/sys/vm/drop_caches
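
To make the before/after comparison explicit, those two commands could be wrapped roughly like this. It is only a sketch: the function names and the path argument are invented for illustration, it assumes GNU stat (as on CentOS), and the write to drop_caches needs root.

```shell
# Print the size in bytes that this node currently reports for a file.
reported_size() {
    stat -c %s "$1"
}

# Compare the reported size before and after flushing and dropping caches.
# Must run as root; 2 asks the kernel to free reclaimable dentries and inodes.
check_after_drop() {
    f=$1
    before=$(reported_size "$f")
    sync
    echo 2 > /proc/sys/vm/drop_caches
    after=$(reported_size "$f")
    echo "before=$before after=$after"
}
```

Calling check_after_drop with the path of the job's output file on the affected node shows whether the size changes once the caches are dropped, which would point at client-side caching.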
Another option could be to log in to the NAS head and see exactly
what it thinks the file looks like.
Hope that helps!
> On Jul 14, 2015, at 12:39 AM, Jordan Willis
> <[email protected]> wrote:
>
> Hi,
>
> When a job is run, the slurm_%j.out is generated where I would
> expect, but remains empty until the job has completed.
>
> This is strange behavior to me since we are using a NAS file
> system on all nodes, including the slurm controller node. So even
> if the file were being written only on the node where the job is
> running, it should show up on the controller node.
>
> On Torque, the output was generally written to a file under the
> /var/spool/ directory and then copied at the end. When I go to the
> spool directory defined in slurm.conf, I see the slurm_script file
> generated but not the output.
>
> Where is the output before it's copied? Is this behavior expected?
>
> Thanks so much,
> Jordan
--
Carles Fenoy
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jason W. Bacon
[email protected]
If a problem can be solved,
there's no need to worry.
If it cannot be solved, then
worrying will do no good.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~