Marian,

I suggest we continue this discussion on the -dev list. :)

On 2010-02-24 01:27, Marian Marinov wrote:
> Hello,
> this is a cleaned up revision of my patch to the mysql RA.
> I'm changing only 2 commands from the old RA simply to remove one unnecessary 
> echo to the mysql command.
> 
> I didn't split it into multiple patches, yet, since I'm sure that there is 
> more work to be done. After I'm finished I'll publish it as several 
> sequential 
> patches.

OK, please do. A few remarks on the patch as is.

>  # Fill in some defaults if no values are specified
>  HOSTOS=`uname`
> +HOSTNAME=`hostname`

You probably want to use "uname -n" here, as that is what everything
else in Linux-HA takes as the canonical host name.

> +<action name="monitor" depth="0"  timeout="20" interval="20" start-delay="0" 
> role="Slave" />
> +<action name="monitor" depth="0"  timeout="20" interval="10" start-delay="0" 
> role="Master" />

Just kill start-delay. :)

> +
> +     if [ ! -z "${OCF_RESKEY_CRM_meta_master_max}" ] && [ 
> "${OCF_RESKEY_CRM_meta_master_max}" > 0 ]; then

We should probably add a generic "is this a m/s resource?" test to the
OCF shellfuncs library.

> +is_slave() {
> +     slave_info=($(mysql \
> +             --user=$OCF_RESKEY_replication_user \
> +             --password=$OCF_RESKEY_replication_passwd \
> +             --socket=$OCF_RESKEY_socket -O connect_timeout=1 \
> +             -e 'SHOW SLAVE STATUS\G'|awk '/Running/ || /Master_[UHP]/{print 
> $2}'))
> +
> +     if [ "$?" != 0 ]; then
> +             ocf_log err "Unable to get local slave status"
> +             return 1
> +     fi
> +
> +     if [ -z "${slave_info[*]}" ]; then
> +             # no slave configuration, can not be slave
> +             return 1;
> +     fi
> +
> +     if [ -z "${slave_info[3]}" ] || [ -z "${slave_info[4]}" ] || [ -z 
> "${slave_info[0]}" ] || [ -z "${slave_info[0]}" ]; then
> +             ocf_log err "Unable to get slave status"
> +             return 1
> +     fi

As Dejan has already pointed out, arrays may not be available in
non-bash shells, and it's a potential regression to rely on bash
features in an RA that was previously Bourne shell clean. Can you come
of with a different way of handling this?

>  mysql_status() {
>       if [ ! -e $OCF_RESKEY_pid ]; then
>               ocf_log debug "MySQL is not running"
> @@ -301,14 +380,37 @@ mysql_monitor() {
>      mysql_status
>      rc=$?
>  
> -    if [ $OCF_CHECK_LEVEL = 0 -o $rc != 0 ]; then
> +     if ! is_slave; then
> +             # if the check came from probe
> +             if [ "$OCF_RESKEY_CRM_meta_interval" = "0" ]; then
> +                     return $OCF_RUNNING_MASTER;

Use ocf_is_probe() please.

> +             fi
> +             # if the check requires a master/slave status and this is the 
> Master node
> +             if [ "$OCF_RESKEY_CRM_meta_role" == "Master" ]; then
> +                     return $OCF_RUNNING_MASTER

IMHO that's pointless. You are supposed to _tell_ the CRM what the state
of the resource is, instead you are _asking_ what it thinks it is.

>      # Do a detailed status check
> -    buf=`echo "SELECT * FROM $OCF_RESKEY_test_table" | mysql 
> --user=$OCF_RESKEY_test_user --password=$OCF_RESKEY_test_passwd 
> --socket=$OCF_RESKEY_socket -O connect_timeout=1 2>&1`
> +    buf=$(mysql --user=$OCF_RESKEY_test_user \
> +             --password=$OCF_RESKEY_test_passwd \
> +             --socket=$OCF_RESKEY_socket \
> +             -O connect_timeout=1 \
> +             -e "SELECT * FROM $OCF_RESKEY_test_table" 2>&1)

Yeah this one should go in a separate patch, as it's cosmetic and
unrelated to replication or M/S capability.

>  mysql_start() {
> +     mysql_validate

Validate should be run for every operation except meta-data and usage.
Don't insert it here. Look at how other RAs do this (and the upstream
mysql RA too).

>      mysql_status
> -    if [ $? = $OCF_SUCCESS ]; then
> -     ocf_log info "MySQL already running"
> -     return $OCF_SUCCESS
> -    fi

Why are you removing this? On start, you _must_ check if the resource is
already running, and return successfully if it is.

> -    ${OCF_RESKEY_binary}  --defaults-file=$OCF_RESKEY_config 
> --pid-file=$OCF_RESKEY_pid --socket=$OCF_RESKEY_socket 
> --datadir=$OCF_RESKEY_datadir --user=$OCF_RESKEY_user 
> $OCF_RESKEY_additional_parameters >/dev/null &
> +    ${OCF_RESKEY_binary} --defaults-file=$OCF_RESKEY_config \
> +             --pid-file=$OCF_RESKEY_pid \
> +             --socket=$OCF_RESKEY_socket \
> +             --datadir=$OCF_RESKEY_datadir \
> +             --user=$OCF_RESKEY_user $OCF_RESKEY_additional_parameters 
> >/dev/null 2>&1 &

Another cosmetic change, let's keep those out of your M/S patch. :)

>  mysql_stop() {
> +     mysql_validate

I repeat my earlier statement.

>      if [ ! -f $OCF_RESKEY_pid ]; then
>       ocf_log info "MySQL is not running"
>          return $OCF_SUCCESS
> @@ -408,60 +510,134 @@ mysql_stop() {
>       return $OCF_ERR_GENERIC
>      fi
>  
> -     # stop waiting
> -     shutdown_timeout=$((($OCF_RESKEY_CRM_meta_timeout/1000)-5))
> -     count=0
> -     while [ $count -lt $shutdown_timeout ]
> -     do

Why are you removing this shutdown escalation?

> -case "$1" in
> -  meta-data) meta_data
> -             exit $OCF_SUCCESS;;
> -  usage|help)        usage
> -             exit $OCF_SUCCESS;;
> -esac

mysql_validate being called right after this, as in the original, is
quite right.

> +mysql_promote() {
> +     if ( ! mysql_status ); then
> +             return $OCF_NOT_RUNNING
> +     fi
> +     if [ "$OCF_RESKEY_CRM_meta_notify_promote_uname" != "$HOSTNAME " ]; then
> +             return $OCF_ERR_GENERIC
> +     fi

That's a "should never happen." You can _exit_ with $OCF_ERR_GENERIC
here, and be sure to scream very loudly before you do ("ocf_log err").

> +mysql_demote() {
> +     search_uname=''
> +     if [[ "$OCF_RESKEY_CRM_meta_notify_master_uname" =~ "^\s*$" ]] &&
> +             [[ "$OCF_RESKEY_CRM_meta_notify_promote_uname" =~ "^\s*$" ]]; 
> then
> +             echo "No master or promote uname found" >> /tmp/replica.log

No. Don't use echo. Use ocf_log.

> +             return $OCF_ERR_GENERIC
> +     fi
> +     if [ "$OCF_RESKEY_CRM_meta_notify_promote_uname" == "$HOSTNAME " ]; then
> +             echo "Should not connect to my self(promote)" >> 
> /tmp/replica.log
> +             return $OCF_ERR_GENERIC
> +     else
> +             search_uname=$OCF_RESKEY_CRM_meta_notify_promote_uname
> +     fi
> +     if [ "$OCF_RESKEY_CRM_meta_notify_master_uname" == "$HOSTNAME " ]; then
> +             echo "Should not connect to my self(master)" >> /tmp/replica.log
> +             return $OCF_ERR_GENERIC
> +     else
> +             search_uname=$OCF_RESKEY_CRM_meta_notify_master_uname
> +     fi
> +
> +     if ( ! mysql_status ); then
> +             return $OCF_NOT_RUNNING
>  fi

If you're not running, then you shouldn't be promoted. So this is
another "shouldn't happen", if I'm not mistaken. In which case I would
error out here. Maybe Dejan or Andrew can correct me if I'm wrong.

> +             if [ -z "$search_uname" ]; then return $OCF_ERR_GENERIC; fi
> +             master_host=$(grep $OCF_RESKEY_CRM_meta_notify_master_uname 
> /etc/hosts|awk '{print $1;exit}')

Why this grep/awk tango? Can't you just use "host"? Or really, why do
you care to resolve the IP address... can't mysql do that for you? At
which point you could ditch the following block:

> +             if [ -z "$master_host" ]; then
> +                     ocf_log err "Unable to get IP address of host 
> $OCF_RESKEY_CRM_meta_notify_master_uname"
> +             return $OCF_ERR_GENERIC;
> +             fi

> +             master_info=($(mysql \
> +                     --password=$OCF_RESKEY_replication_passwd \
> +                     --user=$OCF_RESKEY_replication_user \
> +                     -h $master_host \
> +                     -O connect_timeout=1 \
> +             -e 'SHOW MASTER STATUS\G'|awk '/File/||/Position/{print $2}'))
> +             if [ -z "${master_info[0]}" ] || [ -z "${master_info[1]}" ]; 
> then
> +                     ocf_log err "Empty master file or master position"
> +             return $OCF_ERR_GENERIC;
> +             fi
> +             mysql --socket=$OCF_RESKEY_socket -O connect_timeout=1 -e 'STOP 
> SLAVE';
> +             mysql --socket=$OCF_RESKEY_socket -O connect_timeout=1 \
> +                     -e "CHANGE MASTER TO MASTER_HOST='$master_host', \
> +                             MASTER_USER='$OCF_RESKEY_replication_user', \
> +                             
> MASTER_PASSWORD='$OCF_RESKEY_replication_passwd', \
> +                             MASTER_PORT=3306, \

Maybe add a "replication_port" resource parameter? Conceivably people
might be running their MySQL on a non default port.

While I'm add it, how are you dealing with people who run MySQL
replication over SSL/TLS?

> +                             MASTER_LOG_FILE='${master_info[0]}', \
> +                             MASTER_LOG_POS=${master_info[1]}, \
> +                             MASTER_CONNECT_RETRY=4"
> +             mysql --socket=$OCF_RESKEY_socket -O connect_timeout=1 -e 
> 'START SLAVE';
> +             if [ $? == $OCF_ERR_GENERIC ]; then
> +                     return $OCF_ERR_GENERIC
> +             fi
> +     fi
> +     if is_slave 1; then
> +             return $OCF_SUCCESS
> +     else
> +             return $OCF_ERR_GENERIC
> +     fi
> +}
> +
> +mysql_notify() {
> +     if [ "$OCF_RESKEY_CRM_meta_notify_type" != 'post' ]; then
> +             return $OCF_SUCCESS
> +     fi
> +     case "$OCF_RESKEY_CRM_meta_notify_operation" in
> +             'promote')
> +                     if [ "$OCF_RESKEY_CRM_meta_notify_promote_uname" != 
> "$HOSTNAME " ] &&
> +                             ! is_slave 1; then
> +                             mysql_demote 1
> +                     fi
> +             ;;
> +             'demote')
> +                     if [ "$OCF_RESKEY_CRM_meta_notify_promote_uname" == 
> "$HOSTNAME " ] &&
> +                             ! is_slave 1; then
> +                             mysql_demote 1
> +                     fi
> +             ;;
> +             *) return $OCF_SUCCESS ;;
> +     esac
> +
> +}

One more question: how do you update your master preference? You'll have
to figure out a way how to tell Pacemaker which node to promote in case
the master fails. I have my own ideas about this, but would like to hear
yours so I don't poison your ideas with mine, in case mine are stupid. :)

Cheers,
Florian

Attachment: signature.asc
Description: OpenPGP digital signature

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Reply via email to