Hi,

22.02.2011 17:03, Dejan Muhamedagic wrote:
> A few general observations:
>
> - the name (ifstatus) doesn't fit exactly (perhaps ifspeed?)

The main purpose of this RA (at least as I see it) is an interface status
check; speed is used only to reflect that. Any non-zero value means that
the interface is operational (it has some active underlying interfaces).
Yes, the actual value is the speed of the active interface(s), but that is
not the main point of this RA.

> - since this is a stateless agent, you should use the existing
>   functions for that purpose:
>
> http://www.linux-ha.org/doc/dev-guides/_pseudo_resources_literal_ha_pseudo_resource_literal.html

Will look at that, thank you for the pointer. The ping RA seems to be
missing this too, BTW. Not critical anyway, I think.

> - almost all function names start with "ifstatus" which sometimes
>   makes it difficult to follow the code (e.g. ifstatus_iface_get_speed)

I just copy-pasted the skeleton from the ping RA and did
's/ping/ifstatus/g', and then named all other functions accordingly.
This is completely a matter of taste, I think. And an RA is meant to be
used, not re-coded every now and then ;).

>
> Seems like good work.

Thanks,
Vladislav

>
> Cheers,
>
> Dejan
>
>> Best,
>> Vladislav
>>
>> 22.02.2011 15:01, Vladislav Bogdanov wrote:
>>> Hi Dejan,
>>>
>>> 22.02.2011 13:02, Dejan Muhamedagic wrote:
>>>> Hi,
>>>>
>>>> Where can you get STP stuff from? How to interpret it? And then
>>>
>>> Please look at attached RA.
>>>
>>> I decided that today is a good time to finally brace myself to find 5
>>> hours to write it, thanks Frederik ;) .
>>> Tested, "works for me" (c) in that configuration I talked earlier - STP
>>> bridge over 1x10Gbps eth + 2x1Gbps bond.
>>> (Hopefully) Supports any combination of bridges, bonds, vlans and
>>> physical ethernet interfaces.
>>> Tries to guess correct upstream bridge ports, can't test it more
>>> thoroughly due to absence of more switch hardware (I currently have only
>>> one c3570x stack per cluster).
>>> May require some additional checks to be included.
>>> Also it is Linux-specific and requires bash because of my laziness and
>>> the fact that I mainly use Fedora, which has nothing against bash yet.
>>> Can be considered for inclusion in resource-agents (with common license,
>>> GPLv2?).
>>>
>>>> how do you know it's something that won't change within next five
>>>> minutes? Finally, every failover can incur downtime, is it worth
>>>> the trouble because what you want is just more performance?
>>>
>>> This could be controlled by a non-inf location score and e.g. time-based
>>> stickiness.
>>> Anyway, I'd rather have a 10-second lockup than a 10Mb/s per-client
>>> read for a long time when the second cluster node (32 disks in HW RAID10)
>>> is able to easily give another 250-400Mb/s of aggregate throughput.
>>>
>>>> Perhaps you don't even need the extra performance at the time.
>>>
>>> This depends on what SLA I provide services with...
>>>
>>>> Other than that it sounds interesting :-)
>>>
>>> Then, please look at the implementation ;)
>>>
>>> Best,
>>> Vladislav
>>>
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs:
>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>
>
>> #!/bin/bash
>> #
>> # OCF resource agent which monitors the state of a network interface and
>> # records it as a value in the CIB, based on the sum of speeds of its
>> # active underlying interfaces.
>> #
>> # Copyright (c) 2011 Vladislav Bogdanov <bub...@hoster-ok.com>
>> # Partially based on 'ping' RA by Andrew Beekhof
>> #
>> # OCF instance parameters:
>> #   OCF_RESKEY_name: name of attribute to set in CIB
>> #   OCF_RESKEY_iface: network interface to monitor
>> #   OCF_RESKEY_bridge_ports: if not null and OCF_RESKEY_iface is a bridge,
>> #       list of bridge ports to consider.
>> #       Default is all ports which have designated_bridge=root_id
>> #   OCF_RESKEY_weight_base: weight of each 10Mbps in interface speed
>> #       (1Gbps = 100 * 10Mbps) in CIB score points
>> #
>> # Initialization:
>>
>> : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
>> . ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs
>>
>> # Defaults
>> OCF_RESKEY_name_default="ifstatus"
>> OCF_RESKEY_bridge_ports_default="detect"
>> OCF_RESKEY_weight_base_default=10
>> OCF_RESKEY_dampen_default=5
>>
>> : ${OCF_RESKEY_name=${OCF_RESKEY_name_default}}
>> : ${OCF_RESKEY_bridge_ports=${OCF_RESKEY_bridge_ports_default}}
>> : ${OCF_RESKEY_weight_base=${OCF_RESKEY_weight_base_default}}
>> : ${OCF_RESKEY_dampen=${OCF_RESKEY_dampen_default}}
>>
>> meta_data() {
>>     cat <<END
>> <?xml version="1.0"?>
>> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>> <resource-agent name="ifstatus">
>> <version>1.0</version>
>>
>> <longdesc lang="en">
>> Every time the monitor action is run, this resource agent records (in the
>> CIB) speed of active network interfaces from a list.
>> </longdesc>
>> <shortdesc lang="en">Network interface status</shortdesc>
>>
>> <parameters>
>>
>> <parameter name="name" unique="1">
>> <longdesc lang="en">
>> The name of the attributes to set. This is the name to be used in the
>> constraints.
>> </longdesc>
>> <shortdesc lang="en">Attribute name</shortdesc>
>> <content type="string" default="${OCF_RESKEY_name_default}"/>
>> </parameter>
>>
>> <parameter name="iface" unique="0" required="1">
>> <longdesc lang="en">
>> Network interface to monitor.
>> </longdesc>
>> <shortdesc lang="en">Network interface</shortdesc>
>> <content type="string" default=""/>
>> </parameter>
>>
>> <parameter name="bridge_ports" unique="0">
>> <longdesc lang="en">
>> If not null and OCF_RESKEY_iface is a bridge, list of bridge ports to
>> consider.
>> Default is all ports which have designated_bridge=root_id.
>> </longdesc>
>> <shortdesc lang="en">Bridge ports</shortdesc>
>> <content type="string" default="${OCF_RESKEY_bridge_ports_default}"/>
>> </parameter>
>>
>> <parameter name="weight_base" unique="0">
>> <longdesc lang="en">
>> Weight of each 10Mbps in interface speed (1Gbps = 100 * 10Mbps).
>> With the default value a 1Gbps interface will be counted as 1000.
>> </longdesc>
>> <shortdesc lang="en">Weight of 10Mbps interface</shortdesc>
>> <content type="integer" default="${OCF_RESKEY_weight_base_default}"/>
>> </parameter>
>>
>> <parameter name="dampen" unique="0">
>> <longdesc lang="en">
>> The time to wait (dampening) for further changes to occur
>> </longdesc>
>> <shortdesc lang="en">Dampening interval</shortdesc>
>> <content type="integer" default="${OCF_RESKEY_dampen_default}"/>
>> </parameter>
>>
>> <parameter name="debug" unique="0">
>> <longdesc lang="en">
>> Enables the default attrd_updater verbose logging on every call.
>> </longdesc>
>> <shortdesc lang="en">Verbose logging</shortdesc>
>> <content type="string" default="false"/>
>> </parameter>
>>
>> </parameters>
>>
>> <actions>
>> <action name="start" timeout="30" />
>> <action name="stop" timeout="30" />
>> <action name="reload" timeout="30" />
>> <action name="monitor" depth="0" timeout="30" interval="10"/>
>> <action name="meta-data" timeout="5" />
>> <action name="validate-all" timeout="30" />
>> </actions>
>> </resource-agent>
>> END
>> }
>>
>> ifstatus_usage() {
>>     cat <<END
>> usage: $0 {start|stop|monitor|reload|validate-all|meta-data}
>>
>> Expects to have a fully populated OCF RA-compliant environment set.
>> END
>> }
>>
>> ifstatus_start() {
>>     ifstatus_monitor
>>     if [ $? = $OCF_SUCCESS ]; then
>>         return $OCF_SUCCESS
>>     fi
>>     touch ${OCF_RESKEY_pidfile}
>>     ifstatus_update
>> }
>>
>> ifstatus_stop() {
>>     rm -f ${OCF_RESKEY_pidfile}
>>     attrd_updater -D -n $OCF_RESKEY_name -d $OCF_RESKEY_dampen $attrd_options
>>     return $OCF_SUCCESS
>> }
>>
>> ifstatus_monitor() {
>>     if [ -f ${OCF_RESKEY_pidfile} ]; then
>>         ifstatus_update
>>         return $OCF_SUCCESS
>>     fi
>>     return $OCF_NOT_RUNNING
>> }
>>
>> ifstatus_validate() {
>>     # Is the state directory writable?
>>     state_dir=`dirname "$OCF_RESKEY_pidfile"`
>>     touch "$state_dir/$$"
>>     if [ $? != 0 ]; then
>>         ocf_log err "Invalid location for 'state': $state_dir is not writable"
>>         return $OCF_ERR_ARGS
>>     fi
>>     rm "$state_dir/$$"
>>
>>     # Pidfile better be an absolute path
>>     case $OCF_RESKEY_pidfile in
>>         /*) ;;
>>         *) ocf_log warn "You should use an absolute path for pidfile not: $OCF_RESKEY_pidfile" ;;
>>     esac
>>
>>     # Check the check interval
>>     if ocf_is_decimal "$OCF_RESKEY_CRM_meta_interval" && [ $OCF_RESKEY_CRM_meta_interval -gt 0 ]; then
>>         :
>>     else
>>         ocf_log err "Invalid check interval $OCF_RESKEY_CRM_meta_interval. It should be a positive integer!"
>>         exit $OCF_ERR_CONFIGURED
>>     fi
>>
>>     # Check the interfaces list
>>     if [ "x" = "x$OCF_RESKEY_iface" ]; then
>>         ocf_log err "Empty iface parameter. Please specify some network interface to check"
>>         exit $OCF_ERR_CONFIGURED
>>     fi
>>
>>     return $OCF_SUCCESS
>> }
>>
>> ifstatus_iface_get_speed() {
>>     local iface=$1
>>     local operstate
>>     local carrier
>>     local speed
>>
>>     if [ ! -e "/sys/class/net/$iface" ] ; then
>>         echo 0
>>     elif ifstatus_iface_is_bridge $iface ; then
>>         ifstatus_bridge_get_speed $iface
>>     elif ifstatus_iface_is_bond $iface ; then
>>         ifstatus_bond_get_speed $iface
>>     elif ifstatus_iface_is_vlan $iface ; then
>>         ifstatus_iface_get_speed $( ifstatus_vlan_get_phy $iface )
>>     else
>>         read operstate < "/sys/class/net/$iface/operstate"
>>         read carrier < "/sys/class/net/$iface/carrier"
>>         if [ "$operstate" != "up" ] || [ "$carrier" != "1" ] ; then
>>             speed="0"
>>         else
>>             read speed < "/sys/class/net/$iface/speed"
>>         fi
>>         echo $speed
>>     fi
>> }
>>
>> ifstatus_iface_is_vlan() {
>>     local iface=$1
>>     [ -e "/proc/net/vlan/$iface" ] && return 0 || return 1
>> }
>>
>> ifstatus_iface_is_bridge() {
>>     local iface=$1
>>     [ -e "/sys/class/net/$iface/bridge" ] && return 0 || return 1
>> }
>>
>> ifstatus_iface_is_bond() {
>>     local iface=$1
>>     [ -e "/sys/class/net/$iface/bonding" ] && return 0 || return 1
>> }
>>
>> ifstatus_vlan_get_phy() {
>>     local iface=$1
>>     grep "^$iface " "/proc/net/vlan/config" | sed -r 's/.*\| +(.*)/\1/'
>> }
>>
>> ifstatus_bridge_is_stp_enabled() {
>>     local iface=$1
>>     local stp
>>     read stp < "/sys/class/net/$iface/bridge/stp_state"
>>     [ "$stp" = "1" ] && return 0 || return 1
>> }
>>
>> ifstatus_bridge_get_root_ports() {
>>     local bridge=$1
>>     local root_id
>>     local root_ports=""
>>     local bridge_id
>>     local port
>>
>>     read root_id < "/sys/class/net/$bridge/bridge/root_id"
>>
>>     for port in /sys/class/net/$bridge/brif/* ; do
>>         read bridge_id < "$port/designated_bridge"
>>         if [ "$bridge_id" = "$root_id" ] ; then
>>             root_ports="$root_ports ${port##*/}"
>>         fi
>>     done
>>     echo "${root_ports# }"
>> }
>>
>> # From /include/linux/if_bridge.h:
>> #define BR_STATE_DISABLED 0
>> #define BR_STATE_LISTENING 1
>> #define BR_STATE_LEARNING 2
>> #define BR_STATE_FORWARDING 3
>> #define BR_STATE_BLOCKING 4
>>
>> ifstatus_bridge_get_active_ports() {
>>     local bridge=$1
>>     shift 1
>>     local ports="$*"
>>     local active_ports=""
>>     local port
>>     local port_state
>>     local stp_enabled=0
>>     local warn=0
>>
>>     # Remember whether STP is enabled (1) or not (0)
>>     ifstatus_bridge_is_stp_enabled $bridge && stp_enabled=1
>>
>>     if [ -z "$ports" ] || [ "$ports" = "detect" ] ; then
>>         ports=$( ifstatus_bridge_get_root_ports $bridge )
>>     fi
>>
>>     for port in $ports ; do
>>         if [ ! -e "/sys/class/net/$bridge/brif/$port" ] ; then
>>             ocf_log warn "Port $port doesn't belong to bridge $bridge"
>>             continue
>>         fi
>>         read port_state < "/sys/class/net/$bridge/brif/$port/state"
>>         # 3 is BR_STATE_FORWARDING
>>         if [ "$port_state" = "3" ] ; then
>>             if [ -n "$active_ports" ] && [ $stp_enabled -eq 1 ] ; then
>>                 warn=1
>>             fi
>>             active_ports="$active_ports $port"
>>         fi
>>     done
>>     if [ $warn -eq 1 ] ; then
>>         ocf_log warn "More than one upstream port in bridge '$bridge' is in forwarding state while STP is enabled: $active_ports"
>>     fi
>>     echo "${active_ports# }"
>> }
>>
>> ifstatus_bridge_get_speed() {
>>     local iface=$1
>>     local aggregate_speed=0
>>     local port
>>
>>     if ! ifstatus_iface_is_bridge $iface ; then
>>         echo 0
>>         return
>>     fi
>>
>>     local ports=$( ifstatus_bridge_get_active_ports $iface ${OCF_RESKEY_bridge_ports} )
>>     for port in $ports ; do
>>         : $(( aggregate_speed += $( ifstatus_iface_get_speed $port ) ))
>>     done
>>     echo $aggregate_speed
>> }
>>
>> ifstatus_bond_get_slaves() {
>>     local iface=$1
>>     local slaves
>>     read slaves < "/sys/class/net/$iface/bonding/slaves"
>>     echo $slaves
>> }
>>
>> ifstatus_bond_get_active_iface() {
>>     local iface=$1
>>     local active
>>     read active < "/sys/class/net/$iface/bonding/active_slave"
>>     echo $active
>> }
>>
>> ifstatus_bond_is_balancing() {
>>     local iface=$1
>>     local mode mode_index
>>     read mode mode_index < "/sys/class/net/$iface/bonding/mode"
>>     case $mode in
>>         "balance-rr"|"balance-xor"|"802.3ad"|"balance-tlb"|"balance-alb")
>>             return 0
>>             ;;
>>         *)
>>             return 1
>>             ;;
>>     esac
>> }
>>
>> ifstatus_bond_get_speed() {
>>     local iface=$1
>>     local aggregate_speed=0
>>     local slave
>>
>>     if ! ifstatus_iface_is_bond $iface ; then
>>         echo 0
>>         return
>>     fi
>>
>>     local slaves=$( ifstatus_bond_get_slaves $iface )
>>     if ifstatus_bond_is_balancing $iface ; then
>>         for slave in $slaves ; do
>>             : $(( aggregate_speed += $( ifstatus_iface_get_speed $slave ) ))
>>         done
>>         # Bonding is unable to get speed*n
>>         : $(( aggregate_speed = aggregate_speed*8/10 ))
>>     else
>>         : $(( aggregate_speed = $( ifstatus_iface_get_speed $( ifstatus_bond_get_active_iface $iface ) ) ))
>>     fi
>>     echo $aggregate_speed
>> }
>>
>> ifstatus_update() {
>>     local speed=$( ifstatus_iface_get_speed $OCF_RESKEY_iface )
>>     local score
>>     local rc
>>
>>     : $(( score = speed * $OCF_RESKEY_weight_base / 10 ))
>>     attrd_updater -n $OCF_RESKEY_name -v $score -d $OCF_RESKEY_dampen $attrd_options
>>     rc=$?
>>     case $rc in
>>         0)
>>             ocf_is_true ${OCF_RESKEY_debug} && ocf_log debug "Updated $OCF_RESKEY_name = $score"
>>             ;;
>>         *)
>>             ocf_log warn "Could not update $OCF_RESKEY_name = $score: rc=$rc"
>>             ;;
>>     esac
>>     return $rc
>> }
>>
>> if [ `uname` != "Linux" ] ; then
>>     ocf_log err "This RA works only on Linux."
>>     exit $OCF_ERR_INSTALLED
>> fi
>>
>> if ! ocf_is_true ${OCF_RESKEY_CRM_meta_globally_unique} ; then
>>     : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ifstatus-${OCF_RESKEY_name}"}
>> else
>>     : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ifstatus-${OCF_RESOURCE_INSTANCE}"}
>> fi
>>
>> attrd_options='-q'
>> if ocf_is_true ${OCF_RESKEY_debug} ; then
>>     attrd_options=''
>> fi
>>
>> case $__OCF_ACTION in
>>     meta-data)
>>         meta_data
>>         exit $OCF_SUCCESS
>>         ;;
>>     start)
>>         ifstatus_start
>>         ;;
>>     stop)
>>         ifstatus_stop
>>         ;;
>>     monitor)
>>         ifstatus_monitor
>>         ;;
>>     reload)
>>         ifstatus_start
>>         ;;
>>     validate-all)
>>         ifstatus_validate
>>         ;;
>>     usage|help)
>>         ifstatus_usage
>>         exit $OCF_SUCCESS
>>         ;;
>>     *)
>>         ifstatus_usage
>>         exit $OCF_ERR_UNIMPLEMENTED
>>         ;;
>> esac
>> exit $?
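
As a rough illustration of the scoring arithmetic in the script above
(score = speed * weight_base / 10, with balancing bonds scaled to 80% of
the summed slave speeds), here is a minimal standalone sketch. The function
names score_for_speed and bond_balanced_speed are hypothetical and are not
part of the RA; speeds are assumed to be in Mbps, as read from
/sys/class/net/<iface>/speed.

```shell
#!/bin/bash
# Hypothetical helpers that only mirror the arithmetic used by the RA;
# they do not touch sysfs or the CIB.

# Score for a given speed in Mbps and an optional weight_base (default 10).
score_for_speed() {
    local speed=$1
    local weight_base=${2:-10}
    echo $(( speed * weight_base / 10 ))
}

# Aggregate speed of a balancing bond: sum of slave speeds scaled to 80%.
bond_balanced_speed() {
    local total=0
    local s
    for s in "$@" ; do
        total=$(( total + s ))
    done
    echo $(( total * 8 / 10 ))
}

score_for_speed 10000                                 # single 10Gbps link -> 10000
score_for_speed "$( bond_balanced_speed 1000 1000 )"  # 2x1Gbps balancing bond -> 1600
score_for_speed 0                                     # link down -> 0
```

With these numbers a node whose bridge forwards over the 10Gbps port gets a
much higher attribute value than one that has fallen back to the bond, and
a value of 0 means that no underlying interface is active.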