Hi,

11.11.2011 16:06, Nick Khamis wrote:
> Hello Andrew,
> 
> I do apologize for this, and really appreciate how far I have got into
> this project thanks to everyone's help. Just as a quick summary:
> 
> the patch that you suggested did in fact fix the following (ais.c:346):

Do you mean the patch that fixes the bug where quorum_type is not passed to the RA?

[snip]

> 
> 
> * The controld RA is using the standard dlm_controld, and this is now working.
> * The o2cb RA is using ocfs2_controld.pcmk, and this is where I am running
> into the runtime error with corosync.c

You should force the system to use the correct controlds.
Can you please try the attached replacement RA? It supports the dlm, gfs2
and ocfs2 controlds. I run it without any additional parameters, and it
works with both the ais and cman stacks with all three controlds (although
Andrew doubts it can work at all ;).
This RA tries to work around the bug mentioned above. It also has some
experimental chrt hacks to help it behave under very high load.
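
To make the daemon selection concrete, here is a minimal sketch of how the
attached RA picks a daemon name when the "daemon" parameter is not set. The
helper names guess_instance and pick_daemon and the sample resource names are
made up for illustration; the RA itself does this inline and also caches the
stack type in a state file.

```shell
#!/bin/sh
# Sketch only: mirrors the name-guessing logic of the attached RA.
# guess_instance and pick_daemon are illustrative names, not part of the RA.

# Map a resource instance name to a controld family (gfs, ocfs2 or dlm).
guess_instance() {
    case "$1" in
        *[gG][fF][sS]*)     echo gfs ;;
        *[oO][cC][fF][sS]*) echo ocfs2 ;;
        *)                  echo dlm ;;   # anything else falls back to dlm
    esac
}

# Build the daemon name from the family and the cluster stack.
pick_daemon() {
    instance=$1
    stack=$2
    case "$stack" in
        cman)
            # Under cman only the ocfs2 daemon carries a suffix
            case "$instance" in
                ocfs2) ext=".cman" ;;
                *)     ext="" ;;
            esac
            ;;
        *)
            # Every other stack uses the .pcmk builds
            ext=".pcmk"
            ;;
    esac
    echo "${instance}_controld${ext}"
}

pick_daemon "$(guess_instance astDLMClone)" cman       # dlm_controld
pick_daemon "$(guess_instance astOCFS2Clone)" cman     # ocfs2_controld.cman
pick_daemon "$(guess_instance astOCFS2Clone)" openais  # ocfs2_controld.pcmk
```

Note that a resource whose name contains neither "gfs" nor "ocfs" (one named
after o2cb, for example) falls through to dlm, so in that case the "daemon"
parameter would have to be set explicitly.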

> 
>>
>> IMO (and as Florian alluded to in another message), you'd probably save
>> yourself a lot of trouble taking prebuilt packages from a distro where
>> the pieces you need are known to work together.
> 
>> Indeed.
> 
> There is no resenting that! But I am so close. Actually, I do have things
> working without the o2cb primitive, i.e., pcmk is starting the dual primary
> drbd, cloned dlm, and mounting the cloned ocfs2 filesystem:
> 
> root@astdrbd1:~# /etc/init.d/cman start
> Starting cluster:
>    Checking if cluster has been disabled at boot... [  OK  ]
>    Checking Network Manager... [  OK  ]
>    Global setup... [  OK  ]
>    Loading kernel modules... [  OK  ]
>    Mounting configfs... [  OK  ]
>    Starting cman... [  OK  ]
>    Waiting for quorum... [  OK  ]
>    Starting fenced... [  OK  ]
>    Starting dlm_controld... [  OK  ]
>    Unfencing self... [  OK  ]
>    Joining fence domain... [  OK  ]
> 

Hmm... You start dlm_controld from cman's initscript and then monitor
it from pacemaker? That doesn't look consistent.
Please look at the cman configuration; newer versions have a variable that
prevents the controlds from being started by the initscript.
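
A quick way to see whether the daemon is already running before pacemaker
gets to it is sketched below. This is only an illustration: is_running is a
made-up helper, and it assumes pgrep (from procps) is installed.

```shell
#!/bin/sh
# Sketch only: report whether a daemon is already running before Pacemaker
# is asked to manage it. is_running is an illustrative helper name.
is_running() {
    # pgrep -x requires the process name to match exactly
    pgrep -x "$1" >/dev/null 2>&1
}

if is_running dlm_controld; then
    echo "dlm_controld is already running (started outside Pacemaker?)"
else
    echo "dlm_controld is not running; Pacemaker can start it"
fi
```

If the first message shows up right after the cman initscript has run and
before pacemaker is started, the initscript is the one starting the daemon.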

> root@astdrbd1:~# /etc/init.d/pacemaker start
> Starting Pacemaker Cluster Manager: touch: missing file operand
> Try `touch --help' for more information.
> [  OK  ]
> 
> 
> ============
> Last updated: Fri Nov 11 07:36:11 2011
> Last change: Fri Nov 11 07:33:06 2011 via crmd on astdrbd1
> Stack: cman
> Current DC: astdrbd1 - partition with quorum
> Version: 1.1.6-2d8fad5
> 2 Nodes configured, 2 expected votes
> 7 Resources configured.
> ============
> 
> Online: [ astdrbd1 astdrbd2 ]
> 
> astIP   (ocf::heartbeat:IPaddr2):       Started astdrbd1
>  Master/Slave Set: msASTDRBD [astDRBD]
>      Masters: [ astdrbd2 astdrbd1 ]
>  Clone Set: astDLMClone [astDLM]
>      Started: [ astdrbd2 astdrbd1 ]
>  Clone Set: astFilesystemClone [astFilesystem]
>      Started: [ astdrbd2 astdrbd1 ]

Is this one an o2cb RA?

> 
> 
> Of course, o2cb is not pcmk cluster aware right now and needs to be
> started manually.

I am really stumped by this. :(

> 
> Vladislav, if you are getting this, I can test for the kernel bug that slows
> down ocfs2, which you reported earlier. Is there any test you would like me
> to perform?

I can't really recall any slowdowns, but I do remember a simultaneous kernel
panic on all nodes. After that, OCFS2 is a no-go for me, at least
until somebody tells me that it has really worked for more than a year.

Best,
Vladislav
#!/bin/sh
#
#        Resource Agent for managing the DLM controld process.
#
# Copyright (c) 2009 Novell, Inc
#                    All Rights Reserved.
#
# Copyright (c) 2011 Vladislav Bogdanov <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#

#######################################################################
# Initialization:
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

#######################################################################

meta_data() {
        cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="controld" version="0.9">
<version>1.0</version>

<longdesc lang="en">
This Resource Agent can control the dlm_controld services needed by gfs2 and ocfs2.
It can also handle ocfs2_controld and gfs_controld.
It assumes that the daemon is in your default PATH.
In most cases, it should be run as an anonymous clone.
</longdesc>
<shortdesc lang="en">DLM control daemon</shortdesc>

<parameters>

<parameter name="args" unique="0">
<longdesc lang="en">
Any additional options to start the service with.
</longdesc>
<shortdesc lang="en">Daemon options</shortdesc>
<content type="string" default="-q 0" />
</parameter>

<parameter name="configdir" unique="1">
<longdesc lang="en">
The location where configfs is or should be mounted
</longdesc>
<shortdesc lang="en">Location of configfs</shortdesc>
<content type="string" default="/sys/kernel/config" />
</parameter>

<parameter name="daemon" unique="1">
<longdesc lang="en">
The daemon to start - supports gfs_controld(.pcmk), ocfs2_controld(.cman|.pcmk)
and dlm_controld(.pcmk).
If unset, the daemon name is chosen automatically based on the clone instance
name and the cluster stack.
</longdesc>
<shortdesc lang="en">The daemon to start</shortdesc>
<content type="string" default="dlm_controld.pcmk" />
</parameter>

</parameters>

<actions>
<action name="start"        timeout="90" />
<action name="stop"         timeout="100" />
<action name="monitor"      timeout="20" interval="10" depth="0" start-delay="0" />
<action name="meta-data"    timeout="5" />
<action name="validate-all" timeout="30" />
</actions>
</resource-agent>
END
}

#######################################################################

controld_usage() {
        cat <<END
usage: $0 {start|stop|monitor|validate-all|meta-data}

Expects to have a fully populated OCF RA-compliant environment set.
END
}

controld_start() {
    controld_monitor; rc=$?

    case $rc in
        $OCF_SUCCESS)
            return $OCF_SUCCESS
            ;;
        $OCF_NOT_RUNNING)
            ;;
        *)
            return $OCF_ERR_GENERIC
            ;;
    esac

    if [ ! -e $OCF_RESKEY_configdir ]; then
        modprobe configfs
        if [ ! -e $OCF_RESKEY_configdir ]; then
            ocf_log err "$OCF_RESKEY_configdir is not available"
            return $OCF_ERR_INSTALLED
        fi
    fi

    mount | grep "type configfs" > /dev/null
    if [ $? != 0 ]; then
        mount -t configfs none $OCF_RESKEY_configdir
    fi

    if [ ! -e $OCF_RESKEY_configdir/dlm ]; then
        # This one is needed by everyone else
        modprobe dlm
        if [ ! -e $OCF_RESKEY_configdir/dlm ]; then
            ocf_log err "$OCF_RESKEY_configdir/dlm is not available"
            return $OCF_ERR_INSTALLED
        fi
    fi
    case ${instance} in
        ocfs2)
            if [ ! -e "/sys/fs/ocfs2/cluster_stack" ]; then
                modprobe ocfs2_stackglue
                if [ ! -e "/sys/fs/ocfs2/cluster_stack" ]; then
                    ocf_log err "/sys/fs/ocfs2/cluster_stack is not available"
                    return $OCF_ERR_INSTALLED
                fi
            fi
            awk '/^user$/{success=1; exit} END {if (success) exit 0; else exit 1}' /sys/fs/ocfs2/loaded_cluster_plugins
            if [ $? -ne 0 ] ; then
                modprobe ocfs2_stack_user
                awk '/^user$/{success=1; exit} END {if (success) exit 0; else exit 1}' /sys/fs/ocfs2/loaded_cluster_plugins
                if [ $? -ne 0 ] ; then
                    ocf_log err "Switch to userspace stack unsuccessful"
                    return $OCF_ERR_INSTALLED
                fi
            fi
            if [ -f "/sys/fs/ocfs2/cluster_stack" ] ; then
                echo ${HA_quorum_type} > /sys/fs/ocfs2/cluster_stack
                if [ $? != 0 ]; then
                    ocf_log err "Userspace stack '${HA_quorum_type}' not supported"
                    return $OCF_ERR_INSTALLED
                fi
            else
                ocf_log err "Switch to userspace stack not supported"
                return $OCF_ERR_INSTALLED
            fi
            # controld tries to determine fs version from ocfs2.ko
            awk '(NF == 1 && $1 ~ /^ocfs2$/) || $2 ~ /^ocfs2$/ {success=1; exit} ; END {if (success) exit 0; else exit 1}' < /proc/filesystems
            if [ $? != 0 ]; then
                modprobe ocfs2
                if [ $? != 0 ]; then
                    ocf_log err "Unable to load ocfs2 module"
                    return $OCF_ERR_INSTALLED
                fi
            fi
            ;;
        gfs)
            : gfs_controld does not need gfs2.ko to be loaded to operate
            ;;
    esac

    # Run control daemon with real-time priority
    chrt -r 99 ${OCF_RESKEY_daemon} $OCF_RESKEY_args

    sleep 1
    # DLM threads are not available before daemon is started
    case ${instance} in
        dlm)
            # Assign real-time priority to DLM kernel threads
            for p in $(ps -e | grep -E '[d]lm_(astd|recoverd|recv/|scand|send)' | awk '{print $1}') ; do
                chrt -p -r 99 $p
            done
            ;;
    esac

    controld_monitor
}

controld_stop() {
    controld_monitor; rc=$?

    if [ $rc = $OCF_NOT_RUNNING ]; then
        return $OCF_SUCCESS
    fi

    killall -TERM ${OCF_RESKEY_daemon}; rc=$?

    if [ $rc != 0 ]; then
        return $OCF_ERR_GENERIC
    fi

    rc=$OCF_SUCCESS
    while [ $rc = $OCF_SUCCESS ]; do
        controld_monitor; rc=$?
        sleep 1
    done

    if [ $rc = $OCF_NOT_RUNNING ]; then
        rc=$OCF_SUCCESS
    fi

    case ${instance} in
        ocfs2)
            modprobe -r ocfs2
            if [ $? -ne 0 ] ; then
                ocf_log err "Unable to unload ocfs2 module"
                return $OCF_ERR_GENERIC
            fi

            while read plugin ; do
                modprobe -r ocfs2_stack_${plugin}
                if [ $? -ne 0 ]; then
                    ocf_log err "Unable to unload ocfs2_stack_${plugin} module"
                    return $OCF_ERR_GENERIC
                fi
            done < /sys/fs/ocfs2/loaded_cluster_plugins

            modprobe -r ocfs2_stackglue
            if [ $? -ne 0 ] ; then
                ocf_log err "Unable to unload ocfs2_stackglue module"
                return $OCF_ERR_GENERIC
            fi
            ;;
    esac

    return $rc
}

controld_monitor() {
    killall -0 ${OCF_RESKEY_daemon}; rc=$?

    case $rc in
        0)
            return $OCF_SUCCESS
            ;;
        1)
            return $OCF_NOT_RUNNING
            ;;
        *)
            return $OCF_ERR_GENERIC
            ;;
    esac
}

controld_validate() {
    check_binary ${OCF_RESKEY_daemon}

    if ocf_is_true ${OCF_RESKEY_CRM_meta_globally_unique} ; then 
        ocf_log err "$OCF_RESOURCE_INSTANCE must be configured with the globally_unique=false meta attribute"
        exit $OCF_ERR_CONFIGURED
    fi

    return $OCF_SUCCESS
}

: ${OCF_RESKEY_configdir=/sys/kernel/config}
: ${OCF_RESKEY_CRM_meta_globally_unique:="false"}
: ${OCF_RESKEY_args=""}

if [ -z "${OCF_RESKEY_daemon}" ] ; then
    case "$OCF_RESOURCE_INSTANCE" in
        *[gG][fF][sS]*)
            instance=gfs
            ;;
        *[oO][cC][fF][sS]*)
            instance=ocfs2
            ;;
        *)
            # Any other name, including one containing "dlm", defaults to dlm
            instance=dlm
            ;;
    esac

else
    instance=${OCF_RESKEY_daemon%%_*}
fi

STATEFILE="${HA_RSCTMP}/controld-${OCF_RESOURCE_INSTANCE}.state"
if [ -n "${OCF_RESOURCE_INSTANCE}" ] && [ "${OCF_RESOURCE_INSTANCE}" != "undef" ] ; then
    # Do not read statefile on probe, this will force rewrite
    [ -f "${STATEFILE}" ] && ! ocf_is_probe && . "${STATEFILE}"
fi

if [ -z "${HA_quorum_type}" ] ; then
    HA_quorum_type=$( crm_attribute --type crm_config --name cluster-infrastructure --query --quiet )
    if [ -n "${OCF_RESOURCE_INSTANCE}" ] && [ "${OCF_RESOURCE_INSTANCE}" != "undef" ] ; then
        echo "HA_quorum_type=${HA_quorum_type}" > "${STATEFILE}"
    fi
fi

case "$HA_quorum_type" in
    cman)
        case ${instance} in
            ocfs2)
                daemon_ext=".cman"
                ;;
            *)
                daemon_ext=""
                ;;
        esac
        ;;
    *)
        daemon_ext=".pcmk"
        # ocfs2 will switch to correct stack, others do not need it anymore
        HA_quorum_type="pcmk"
        ;;
esac

if [ -z "${OCF_RESKEY_daemon}" ] ; then
    OCF_RESKEY_daemon=${instance}_controld${daemon_ext}
fi

case $__OCF_ACTION in
    meta-data)
        meta_data
        exit $OCF_SUCCESS
        ;;
    start)
        controld_validate
        controld_start
        ;;
    stop)
        controld_stop
        rm -f "${STATEFILE}"
        ;;
    monitor)
        controld_validate
        controld_monitor
        ;;
    validate-all)
        controld_validate
        ;;
    usage|help)
        controld_usage
        exit $OCF_SUCCESS
        ;;
    *)
        controld_usage
        exit $OCF_ERR_UNIMPLEMENTED
        ;;
esac
rc=$?

exit $rc

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
