Hi,

Lars Ellenberg wrote:
On Fri, Feb 06, 2009 at 03:18:55PM +0100, Achim Stumpf wrote:
Hi,

I have written an OCF sshd RA script. It is based on the proftpd
script. Please feel free to use it and commit it.

I have written this script with the special option
"OCF_RESKEY_killallchilds":

We have some badly written cron-like jobs here that access our
cluster via ssh. Most of them run in loops and open ssh sessions to
the cluster again and again, and through those sessions access the
drbd device. Others start a loop on the cluster through ssh, and its
children access the drbd device.

With the function get_and_stop_pids I am able to get all children of a
process. If the option is set to 0, sshd is simply terminated,
without stopping its children as described above.
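
For readers without the attached script, the idea of collecting and
stopping all children of a process can be sketched roughly as follows.
This is not the code from the posted RA: the function names
list_descendants and stop_tree are hypothetical, and the sketch assumes
that pgrep (procps) is available. It also does not close the race in
which a child is spawned between collecting the PIDs and sending the
signal.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the posted RA.

# Print every descendant PID of the process given as $1, one per line.
# pgrep -P lists the direct children of a PID; recursing on each child
# yields grandchildren and so on.
list_descendants() {
    for child in $(pgrep -P "$1"); do
        echo "$child"
        list_descendants "$child"
    done
}

# Send SIGTERM to the process given as $1 and to all its descendants.
# The PIDs are collected first, so children that get re-parented when
# their parent dies are still signalled.
stop_tree() {
    pids=$(list_descendants "$1")
    # $pids is intentionally unquoted so each PID becomes one argument.
    kill -TERM "$1" $pids 2>/dev/null
}
```

Calling stop_tree with the PID of one sshd session process would then
signal that session and everything started from it.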

The workaround with fuser in the Filesystem RA does not solve this
issue because, for example, the parent process starts new children,
which access the drbd device again.


The workaround solves it fine, if you make your "applications"
"cluster aware" in the following sense:

 iiuc, what you do now is basically
   ssh cluster "while true; do some_job_which_uses_the_drbd ; done"


 change that to
   ssh cluster "cd /your/drbd/mount/point ;
        while true; do ( some_job_which_uses_the_drbd ) ; done"


I am working for a company in the financial industry, and these jobs
access the clusters via ssh, often in loops, as you and I have shown
above.


As the process (shell) in which the loop spawning new processes runs
now has its cwd on DRBD, "fuser -k" will find and kill it.

I think that would be much easier than modifying the ssh RA.
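
To illustrate why the "cd" matters, here is a minimal sketch that lists
the processes whose current working directory lies under a given
directory. It is only an approximation of what fuser reports, it
assumes a Linux /proc filesystem, and the function name holders_by_cwd
is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the PID of every process whose current
# working directory is $1 or a directory below it.
# Assumes Linux, where /proc/<pid>/cwd is a symlink to the cwd.
holders_by_cwd() {
    for link in /proc/[0-9]*/cwd; do
        # Processes we may not inspect, or that have exited, are skipped.
        target=$(readlink "$link" 2>/dev/null) || continue
        case "$target" in
            "$1"|"$1"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}
```

A looping shell that was started after "cd /your/drbd/mount/point"
appears in this list even while none of its jobs is running, which is
the reason a kill based on the mount point can reach it.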


If you have only a couple of scripts that you can modify yourself,
yes. But I am talking about hundreds of jobs belonging to people in my
company and in other companies, and if I told them to change their
jobs, it would never come to an end.
I also think it is not a good idea to rely on the code that accesses
the drbd device through sshd being written correctly. If someone makes
a mistake, a failover would fail.

Cheers,

Achim


_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/