Thanks! I've set up my service script to check at boot whether an
end-user ID exists, and to do nothing if it doesn't. I already have a
generic user (called "user") that is used to make SSH connections.
I invoke the script during vcl_post_reserve:

# Allow user "user" to SSH in (using certificates)
sed -i 's/^AllowUsers.*/& user/' /etc/ssh/external_sshd_config
/etc/rc.d/init.d/ext_sshd restart
/etc/rc.d/init.d/init-openmpi-node-service start

The startup script registers the node with the cluster by creating a
file on an NFS share under the end-user's home directory there.
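
For illustration only, here is a minimal sketch of what such a start/stop
script might look like. The NFS mount point, the nodes/ subdirectory, the
END_USER variable, and the function names are all assumptions made for
this example; they are not taken from the actual
init-openmpi-node-service script.

```shell
#!/bin/sh
# Hypothetical sketch: every path and name below is an assumption.
END_USER="${END_USER:-enduser}"            # end-user account to look for
NFS_ROOT="${NFS_ROOT:-/mnt/cluster/home}"  # assumed NFS mount of home dirs
NODE_NAME="${NODE_NAME:-$(uname -n)}"      # this node's host name

# Succeeds only if the given account exists on this machine.
user_exists() {
    id "$1" >/dev/null 2>&1
}

# Prints the path of this node's registration file on the NFS share.
registration_file() {
    printf '%s/%s/nodes/%s\n' "$NFS_ROOT" "$END_USER" "$NODE_NAME"
}

# "start": do nothing if the end-user is absent, otherwise create the file.
register_node() {
    user_exists "$END_USER" || return 0
    file=$(registration_file)
    mkdir -p "$(dirname "$file")" && : > "$file"
}

# "stop": remove the registration file if it is there.
unregister_node() {
    rm -f "$(registration_file)"
}

case "${1:-}" in
    start) register_node ;;
    stop)  unregister_node ;;
esac
```

The stop action only removes a file, so in a layout like this it could
also be run from any other machine that mounts the same share.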

The only issue at the moment is that I haven't found a proper hook to
invoke the "service stop" routine, which unregisters the node from
the cluster. I can hook "vcl_post_reload", but at that time the
image has already been cleaned up. It doesn't seem that the image
goes through a shutdown or reboot (init 0 or init 6) at the end of the
reservation, so the service script never gets a chance to perform its
stop action. Any hints before I do some further spelunking in the
code?
