On Thu, Dec 10, 2009 at 03:01:24AM +0100, Dejan Muhamedagic wrote:
> Hi,
> 
> On Wed, Dec 09, 2009 at 05:22:18PM +0100, Achim Stumpf wrote:
> > Hi,
> > 
> > Why this script is still not committed from the first post in
> > February to your development tree at
> > 
> > http://hg.linux-ha.org/agents/file/e13565f0ea8a/heartbeat
> > 
> > Or did I check at the wrong place?
> 
> You didn't check in the wrong place. I never actually got around
> to reviewing your script, there has been some discussion which
> didn't look conclusive.
> 
> To reiterate, using KILL to remove processes is definitely
> excessive, unless all other means have been exhausted. I still
> don't see why the sequence STOP, TERM, CONT wouldn't fit.
>
> I just reviewed the script and parts of it are unmaintainable.
> In particular the sshd_stop function. That has to be
> significantly simplified. The get_and_stop_pids is recursive but
> there is no explanation what's happening there. There's also one
> echo command in there, probably remnant of debugging.

Yep.

Btw, you hardcode 19 as STOP, which is incorrect: according to kill(1)
the signal number is not constant across systems.
Use the symbolic name instead.
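
A minimal sketch of what I mean by using names instead of numbers.
The helper name send_signal is made up for illustration, and the
sleep process only stands in for a real session:

```shell
# Send a signal by symbolic name, so the script does not depend on
# the numeric value the local system assigns to it.
# $1: signal name (STOP, CONT, TERM, ...), $2: PID
send_signal() {
        kill -s "$1" "$2"
}

# Demo against a throwaway process
sleep 30 &
pid=$!

send_signal STOP "$pid"   # suspend it
send_signal CONT "$pid"   # resume it
send_signal TERM "$pid"   # ask it to exit

wait "$pid"
status=$?
echo "exit status: $status"
```

The exit status reported by wait is above 128 when the process
was terminated by a signal.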

Did you know about process groups?
why not simply

        kill -STOP -$SSH_PID

The negative argument addresses the whole process group with
that ID, which should do the same as you are trying to do
recursively, but using a single syscall.
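
A rough sketch of the STOP, TERM, CONT sequence applied to a process
group. Assumptions: the PID passed in is a process group leader, the
function name is made up, and the demo uses setsid(1) from util-linux
only to get a group to play with. Processes that have moved to a
group of their own (setsid, setpgid) will not be reached this way.

```shell
# Send STOP, TERM, CONT to a whole process group, by signal name.
# $1: PID of the group leader (so PGID == PID)
stop_term_cont_group() {
        pgid=$1
        for sig in STOP TERM CONT; do
                # a negative PID addresses the process group; "--"
                # keeps kill from parsing it as an option
                kill -s "$sig" -- "-$pgid" || return 1
        done
}

# Demo: a leader with two children in a fresh process group
setsid sh -c 'sleep 30 & sleep 30 & wait' &
leader=$!

# give the new group a moment to come into existence
tries=0
until kill -0 -- "-$leader" 2>/dev/null; do
        tries=$((tries + 1))
        [ "$tries" -ge 5 ] && break
        sleep 1
done

stop_term_cont_group "$leader"
wait "$leader"
status=$?
echo "leader exit status: $status"
```

STOP first means nothing in the group can fork or respawn while TERM
is being delivered; TERM stays pending until CONT wakes the
processes up again.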


And I still say the suggested workaround
using a wrapper would be better suited.

You don't even need to modify existing scripts.
You could easily add a forced command to the authorized_keys file.
And yes, of course you still have access to the original command
as $SSH_ORIGINAL_COMMAND, and thus can wrap around it.
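
Something along these lines, as a sketch only. The wrapper path, the
STATE_DIR location and the per-session marker file are all
assumptions of mine; what the wrapper should actually do around the
command depends on your setup:

```shell
# Hypothetical wrapper, installed as a forced command in authorized_keys:
#
#   command="/usr/local/bin/ssh-wrapper" ssh-rsa AAAA... user@host
#
# (path and key are placeholders)
STATE_DIR=${STATE_DIR:-/tmp/ssh-wrapper}   # made-up location

run_wrapped() {
        # leave a marker while the session runs, so something else
        # (e.g. a stop action) can find out which sessions exist
        mkdir -p "$STATE_DIR" || return 1
        marker="$STATE_DIR/session.$$"
        echo "$$" > "$marker"

        if [ -n "${SSH_ORIGINAL_COMMAND:-}" ]; then
                # the client asked for a specific command: run it
                sh -c "$SSH_ORIGINAL_COMMAND"
        else
                # no command given: plain login shell
                "${SHELL:-/bin/sh}" -l
        fi
        rc=$?

        rm -f "$marker"
        return "$rc"
}

# Demo: pretend sshd handed us "echo hello" as the original command
STATE_DIR=$(mktemp -d)
out=$(SSH_ORIGINAL_COMMAND='echo hello'; export SSH_ORIGINAL_COMMAND; run_wrapped)
echo "$out"
```

In the real wrapper script the last line would simply be a call to
run_wrapped, with sshd providing the environment.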



Or add additional calls to fuser -k to the Filesystem RA,
so other situations may benefit as well.

I'm not sure whether a lazy umount might help
in the Filesystem RA for the general case,
like (pseudo code, obviously):

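force_umount() {
        mnt=$1          # mount point, e.g. /mnt/point

        # try a plain umount first
        umount "$mnt" && return 0

        # kill whoever uses the file system, then retry
        fuser -km "$mnt"
        umount "$mnt" && return 0

        # keep killing users in the background; after the cd, "."
        # still refers to the file system once the lazy umount
        # below has detached it from $mnt
        (
                cd "$mnt" || exit 0
                while fuser -km . ; do
                        sleep 1
                done
        ) &

        # lazy umount: detach now, clean up once no longer busy
        umount -l "$mnt"
        wait

        return 0
}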

Note that sometimes fuser is unable to find users,
e.g. unix domain sockets with relative names,
or similar things...


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
