2011/11/3 Ismael Farfán:
> Hello list
>
> I have this little script configured as both the suspend and the resume
> program in Slurm 2.2.7:
>
> $ cat /tmp/prog.py
> #!/usr/bin/python3
> import sys
> print ("doing something with the following hosts: ", sys.argv[1])
> try: # save the log
>     sys.stdout = open("/tmp/log", "a")
>     sys.stderr = sys.stdout
> except: pass
> print ("doing something with the following hosts: ", sys.argv[1])
> f = open("/tmp/log2", "a")
> f.write("doing something with the following hosts: " + str( sys.argv[1]) )
>
> When I execute
> # scontrol update nodename=fg0 state=power_up
> # scontrol update nodename=fg0 state=power_down
>
> I get this message in slurmctld.log:
> [2011-11-03T10:50:15] powering down node fg0
>
>
> But no evidence of the script having been executed at all... any ideas?
>
>
> # scontrol show config | egrep "suspend|resume"
> ResumeProgram = /tmp/prog.py
> ResumeRate = 300 nodes/min
> ResumeTimeout = 30 sec
> SuspendExcNodes = (null)
> SuspendExcParts = (null)
> SuspendProgram = /tmp/prog.py
> SuspendRate = 60 nodes/min
> SuspendTime = 30 sec
> SuspendTimeout = 30 sec
>
>
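A quick sanity check on the two configured programs, as a rough sketch only: the helper below (check_program is a made-up name, not part of Slurm) reports whether a path is a regular file, has an execute bit, and starts with a "#!" line whose interpreter exists. As far as I know slurmctld runs these programs as SlurmUser, so the check is most meaningful when run as that user.

```python
import os
import stat
import sys


def check_program(path):
    """Return a list of problems that would stop `path` from being exec'd."""
    if not os.path.isfile(path):
        return ["%s: not a regular file" % path]
    problems = []
    mode = os.stat(path).st_mode
    # Any of the user/group/other execute bits counts here; whether the
    # calling user may actually execute the file depends on ownership.
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("%s: no execute bit set" % path)
    with open(path, "rb") as handle:
        first_line = handle.readline().strip()
    if not first_line.startswith(b"#!"):
        problems.append("%s: missing #! line" % path)
    else:
        parts = first_line[2:].split()
        interpreter = parts[0].decode() if parts else ""
        if not os.path.exists(interpreter):
            problems.append("%s: interpreter %r not found" % (path, interpreter))
    return problems


if __name__ == "__main__":
    # Example: python3 check.py /tmp/prog.py
    for program in sys.argv[1:]:
        for problem in check_program(program) or ["%s: looks runnable" % program]:
            print(problem)
```

An empty list only means the obvious obstacles are absent; it does not prove that slurmctld can run the program.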
I turned on a node manually and waited for Slurm to try to shut it
down. I got these messages in slurmctld.log:
[2011-11-03T11:42:50] node fg0 returned to service
[2011-11-03T11:43:52] error: power_save: program signalled: Aborted
There is also a "core" file in /var/log/slurm-llnl, but again no other
sign that the script ran (for example, neither /tmp/log nor /tmp/log2
was created).
Any ideas?
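
To tell "the script never started" apart from "the script started and then failed", a stripped-down variant could do nothing except append one record to a file as its very first action. This is only a sketch: the log path and the function names are hypothetical, and it does not by itself explain the "Aborted" signal.

```python
#!/usr/bin/env python3
import os
import sys
import time

LOG_PATH = "/tmp/prog-debug.log"  # hypothetical path; any writable file works


def log_invocation(argv, log_path=LOG_PATH):
    """Append one line describing this invocation and return that line."""
    hosts = argv[1] if len(argv) > 1 else "(no host list given)"
    record = "%s uid=%d cwd=%s hosts=%s\n" % (
        time.strftime("%Y-%m-%dT%H:%M:%S"),
        os.getuid(),   # shows which user the program was started as
        os.getcwd(),   # shows where it was started from
        hosts,
    )
    # The with-block closes the file, so the record is flushed to disk
    # even if the process is killed right afterwards.
    with open(log_path, "a") as log:
        log.write(record)
    return record


def main(argv):
    log_invocation(argv)
    return 0


# Only act when a host list was passed, which is what the original
# script assumes it receives in sys.argv[1].
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

If the log file stays empty after a power_down, the program was most likely never executed; if a record appears, the uid and cwd fields show the environment it ran in.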
> --
> Do not let me induce you to satisfy my curiosity, from an expectation,
> that I shall gratify yours. What I may judge proper to conceal, does
> not concern myself alone.
>