On Fri, Feb 06, 2009 at 03:18:55PM +0100, Achim Stumpf wrote:
> Hi,
>
> I have written an OCF sshd RA script. It is based on the proftpd
> script. Feel free to use it, and please commit it.
>
> I have written this script with the special option
> "OCF_RESKEY_killallchilds":
> We have some badly written cron-like jobs here which access our
> cluster via ssh. Most of them run in loops, opening ssh sessions to
> the cluster again and again and, through those, accessing the drbd
> device. Others start a loop on the cluster through ssh, and its
> children access the drbd device.
>
> With the function get_and_stop_pids I am able to get all children of
> a process. If the option is set to 0, sshd is stopped without any of
> the child handling described above.
>
> The workaround with fuser in the Filesystem RA does not solve this
> issue because, for example, the parent process starts new children
> which access the drbd device again.

The workaround solves it fine, if you make your "applications"
"cluster aware" in the following sense.

IIUC, what you do now is basically

    ssh cluster "while true; do some_job_which_uses_the_drbd ; done"

Change that to

    ssh cluster "cd /your/drbd/mount/point ;
        while true; do ( some_job_which_uses_the_drbd ) ; done"

The shell that runs the loop and spawns the new processes now has its
cwd on DRBD, so the "fuser -k" will find and kill it.
I think that would be much easier than modifying the ssh RA.
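
If the same pattern is needed for several jobs, it could be wrapped in a
small helper. The following is only a sketch: run_pinned, its arguments
and the run limit are hypothetical names made up for illustration, not
part of any RA. Unlike the one-liner, it refuses to start when the cd
fails, on the assumption that a job should not loop on a node where the
DRBD device is not mounted.

```shell
# Hypothetical helper: keep the looping shell's cwd on the mount point,
# so that a "fuser -k" against that mount also finds the parent shell,
# not only the short-lived children.
#
#   $1      mount point to pin the cwd to (placeholder)
#   $2      number of runs; 0 means loop forever
#   $3...   the job command and its arguments
run_pinned() {
    mount_point=$1
    max_runs=$2
    shift 2

    # Assumption: if the directory is missing, the device is not mounted
    # on this node and the job should not run at all.
    cd "$mount_point" || return 1

    runs=0
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        # The job runs in a subshell; the parent stays in $mount_point.
        ( "$@" )
        runs=$((runs + 1))
    done
}
```

Assuming the function is available on the cluster node, the loop from
above would become something like
run_pinned /your/drbd/mount/point 0 some_job_which_uses_the_drbd.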
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.