On Wed, Jan 06, 2010 at 03:18:00PM +0100, Dejan Muhamedagic wrote:
> Hi Jiaju,
> 
> On Wed, Jan 06, 2010 at 03:21:53PM +0800, Jiaju Zhang wrote:
> > On Mon, Jan 4, 2010 at 12:44 AM, Jiaju Zhang <[email protected]> 
> > wrote:
> > > Hi,
> > >
> > > Firstly, let me say thank you for all the comment and happy new year
> > > :)
> > > This is some progress so far, it is not the final version, just a
> > > status update.
> > >
> > > About the following patch:
> > > 1) The "tickle ACK" function was integrated in portblock RA.
> > > 2) For now, it doens't support IPv6 address and cluster ip scenario.
> > > But you may notice some code of sending tickle ACK can handle IPv6
> > > address, I keep the code as so is for future enhancement.
> > > 3) Some implementation details:
> > >   - still record "server-ip:port client-ip:port" pair in state file
> > >     is because we not only need the server _ip_ but also the _port_
> > >     when sending tickle ACK.
> > >   - not use "losf" but still "netstat" to collect the established TCP
> > >     connections info is becuase the result of "losf" is not the same
> > >     as "netstat".

lsof _does_ give the same information,
but only when run as root (which it would,
as you need to run tickle_ack as root anyways...)

when run as normal user (and depending on some compile time setting),
it will only show those connections belonging to processes that user
has access to.

but netstat is sufficient, and possibly faster (as it does normally not
walk the whole /proc/*/fd/*, but only reads /proc/net/tcp )

> > Dear all,
> > 
> > I have done some testing to the patch, it works as expected. So I
> > regenerate this patch based on the current tip of resource agents
> > repository. Attched is the "hg export" of it.
> 
> Great! Some comments below.

me too ;)

> > Thanks,
> > Jiaju
> 
> > # HG changeset patch
> > # User Jiaju Zhang <[email protected]>
> > # Date 1262758741 -28800
> > # Node ID 903387343622ecf7f1838eb25ab03a6beaa988da
> > # Parent  c76b4a6eb576feb3b39852aa2349a0716bda1078
> > Dev: portblock: Tickle ACK to TCP connections
> > 
> > diff -r c76b4a6eb576 -r 903387343622 heartbeat/portblock
> > --- a/heartbeat/portblock   Mon Jan 04 14:42:10 2010 +0100
> > +++ b/heartbeat/portblock   Wed Jan 06 14:19:01 2010 +0800
> > @@ -14,6 +14,8 @@
> >  #          OCF_RESKEY_portno
> >  #          OCF_RESKEY_action
> >  #          OCF_RESKEY_ip
> > +#          OCF_RESKEY_tickle_dir
> > +#          OCF_RESKEY_sync_script
> >  #######################################################################
> >  # Initialization:
> >  
> > @@ -26,6 +28,7 @@
> >  : ${OCF_RESKEY_ip=${OCF_RESKEY_ip_default}}
> >  #######################################################################
> >  CMD=`basename $0`
> > +TICKLETCP=$HA_BIN/tickle_tcp
> >  
> >  usage()
> >  {
> > @@ -100,6 +103,23 @@
> >  <content type="string" default="${OCF_RESKEY_ip_default}" />
> >  </parameter>
> >  
> > +<parameter name="tickle_dir" unique="0" required="0">
> > +<longdesc lang="en">
> > +The shared directory which stores the established TCP connections.


MUST be and absolute path!
(also validate that in validate-all)

> > +</longdesc>
> > +<shortdesc lang="en">Tickle directory</shortdesc>
> > +<content type="string" default="" />
> > +</parameter>
> 
> Perhaps to give a short example on how to stack resources when
> using this feature and whatever else is relevant.


also, if you have multiple portblock resources for different ports
on the same ip, you may want to document that specifying tickle_dir
(thus enabling the feature) on _one_ of them is sufficient.
it reduces the overhead, and you can only failover on IP granularity,
anyways. besides, multiple instances on different ports but the same ip
could step on each others toes when updating the state file.

btw. I just notice that in theory the portblock RA would be able to
handle multiple comma separated ports, or port ranges (at least the
iptables commands suggest this).
but the CheckPort part (validate-all) would not allow it.
so that probably should be fixed as well,
though it is independent of the tickle stuff.

> > +
> > +<parameter name="sync_script" unique="0" required="0">
> > +<longdesc lang="en">
> > +The script used for synchronizing TCP connection state file, such as 
> > +csync2, some wrapper of rsync, or whatever.
> > +</longdesc>
> > +<shortdesc lang="en">File sync script</shortdesc>
> > +<content type="string" default="" />
> > +</parameter>
> 
> The metadata for sync_script should contain a more detailed
> explanation on when (no shared storage) and how (the script is
> invoked with a single argument, i.e. statefile) to be used.
> Otherwise, it would be good to actually use parameters which
> would work for rsync/csync2 without need for a wrapper, even in
> case you have to test for the string and adjust invocation. I
> doubt that many people would use anything other than these two.

you could specify "csync2 -xv" as sync script.
because the variable is not quoted on invokation,
word split will to the right thing.

for rsync, you always need a wrapper,
as rsync needs to know which host(s) to sync to.

> > +
> >  <parameter name="action" unique="0" required="1">
> >  <longdesc lang="en">
> >  The action (block/unblock) to be done on the protocol::portno.
> > @@ -149,6 +169,27 @@
> >  {
> >    PAT=`active_grep_pat "$1" "$2" "$3"`
> >    $IPTABLES -n -L INPUT | grep "$PAT" >/dev/null
> > +}
> > +
> > +save_tcp_connections()
> > +{
> > +   [ -z "$OCF_RESKEY_tickle_dir" ] && return
> > +   statefile=$OCF_RESKEY_tickle_dir/$OCF_RESKEY_ip
> > +   netstat -tn |egrep '^tcp.*ESTABLISHED' |awk '{print $4" "$5}' |
> 
> More common to use ',':
> 
> +     netstat -tn |egrep '^tcp.*ESTABLISHED' |awk '{print $4,$5}' |
> 
> > +           while read server client; do
> > +                   ipaddr=${server%:*}
> 
> portblock is /bin/sh, so no bash stuff. Don't know what does this
> exactly mean, perhaps: `echo $server | sed 's/:.*//'`.

this whole shell mess is to filter on local addr = OCF_RESKEY_ip

because the shell is exceptionally slow when used to parse things,
I suggest to replace this with 
netstat -t4n | awk -F '[:[:space:]]+' '
        $8 == "ESTABLISHED" && $4 == "'$OCF_RESKEY_ip'" \
        { printf "%s:%s\t%s:%s\n", $4,$5, $6,$7}'

(careful with line breaks there, don't forget the \ ;)

> > +                   [ "$ipaddr" == "$OCF_RESKEY_ip" ] && echo $server 
> > $client
> > +           done |
> > +           dd of="$statefile".new conv=fsync && mv "$statefile".new 
> > "$statefile"

we could drop the fsync bit when using the sync script, btw.
it would only be necessary if we are on shared storage to avoid
truncated and not yet fsync'd files after crash recovery.

but again this digresses into micro optimization...

> > +   [ -n "$OCF_RESKEY_sync_script" ] && $OCF_RESKEY_sync_script $statefile
> 
> Perhaps to split the last two lines in two after '&&' for better
> clarity (after '&&' no need for '\').

        [ -z "$OCF_RESKEY_sync_script" ] && return
        $OCF_RESKEY_sync_script $statefile

> 
> > +}
> > +
> > +run_tickle_tcp()
> > +{
> > +   [ -z "$OCF_RESKEY_tickle_dir" ] && return
> > +   echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
> > +   f=$OCF_RESKEY_tickle_dir/$OCF_RESKEY_ip
> > +   [ -f $f ] && cat $f | $TICKLETCP -n 3
> >  }
> >  
> >  SayActive()
> > @@ -195,8 +236,9 @@
> >             ;;
> >         
> >         *)
> > -           if ha_pseudo_resource "${OCF_RESOURCE_INSTANCE}" status; then   
> > +           if ha_pseudo_resource "${OCF_RESOURCE_INSTANCE}" status; then
> >                     SayActive $*
> > +                   save_tcp_connections
> >                     rc=$OCF_SUCCESS
> >             else
> >                     SayInactive $*
> > @@ -243,7 +285,10 @@
> >    ha_pseudo_resource "${OCF_RESOURCE_INSTANCE}" start
> >    case $4 in
> >      block) IptablesBLOCK "$@";;
> > -    unblock)       IptablesUNBLOCK "$@";;
> > +    unblock)
> > +           IptablesUNBLOCK "$@"
> > +           run_tickle_tcp
> > +           ;;
> >      *)             usage; return 1;
> >    esac
> >  
> > @@ -256,7 +301,10 @@
> >    ha_pseudo_resource "${OCF_RESOURCE_INSTANCE}" stop
> >    case $4 in
> >      block) IptablesUNBLOCK "$@";;
> > -    unblock)       IptablesBLOCK "$@";;
> > +    unblock)
> > +           save_tcp_connections
> > +           IptablesBLOCK "$@"
> > +           ;;
> >      *)             usage; return 1;;
> >    esac
> 
> As far as I could see, the C code looks good.

good job!

        Lars

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Reply via email to