On Fri, Apr 1, 2011 at 8:50 AM, Lars Ellenberg
<[email protected]> wrote:
> On Fri, Apr 01, 2011 at 09:37:09AM -0400, Vadym Chepkov wrote:
>>
>> On Apr 1, 2011, at 3:22 AM, Tim Serong wrote:
>>
>> > On 4/1/2011 at 11:37 AM, Vadym Chepkov <[email protected]> wrote:
>> >> On Mar 31, 2011, at 2:30 PM, Christoph Bartoschek wrote:
>> >>
>> >>> Am 29.03.2011 15:31, schrieb Dejan Muhamedagic:
>> >>>> On Tue, Mar 29, 2011 at 08:13:49AM +0200, Christoph Bartoschek wrote:
>> >>>>> Am 29.03.2011 02:35, schrieb Vadym Chepkov:
>> >>>>>>
>> >>>>>> On Mar 28, 2011, at 10:55 AM, Christoph Bartoschek wrote:
>> >>>>>>
>> >>>>>>> Am 28.03.2011 16:30, schrieb Dejan Muhamedagic:
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> On Mon, Mar 21, 2011 at 11:33:49PM +0100, Christoph Bartoschek 
>> >>>>>>>> wrote:
>> >>>>>>>>> Hi,
>> >>>>>>>>>
>> >>>>>>>>> I am testing a NFS failover setup. During the tests I created a
>> >>>>>>>>> split-brain situation and now node A thinks it is primary and 
>> >>>>>>>>> uptodate
>> >>>>>>>>> while node B thinks that it is Outdated.
>> >>>>>>>>>
>> >>>>>>>>> crm_mon however does not indicate any error to me. Why is this
>> >>>>>>>>> the case? I expect to see something that shows me the degraded
>> >>>>>>>>> status. How can this be fixed?
>> >>>>>>>>
>> >>>>>>>> The cluster relies on the RA (in this case drbd) to report any
>> >>>>>>>> problems. Do you have a monitor operation defined for that
>> >>>>>>>> resource?
>> >>>>>>>
>> >>>>>>> I have the resource defined as:
>> >>>>>>>
>> >>>>>>> primitive p_drbd ocf:linbit:drbd \
>> >>>>>>>    params drbd_resource="home-data" \
>> >>>>>>>    op monitor interval="15" role="Master" \
>> >>>>>>>    op monitor interval="30" role="Slave"
>> >>>>>>>
>> >>>>>>> Is this a correct monitor operation?
>> >>>>
>> >>>> Yes, though you should also add timeout specs.
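(For completeness, a version with timeout specs might look like the fragment below. The timeout values are illustrative, not authoritative; the RA's advertised defaults can be checked with `crm ra info ocf:linbit:drbd`.)

```
primitive p_drbd ocf:linbit:drbd \
    params drbd_resource="home-data" \
    op monitor interval="15" role="Master" timeout="20" \
    op monitor interval="30" role="Slave" timeout="20" \
    op start interval="0" timeout="240" \
    op stop interval="0" timeout="100"
```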
>> >>>>
>> >>>>>> Just out of curiosity, you do have ms resource defined?
>> >>>>>>
>> >>>>>> ms ms_p_drbd p_drbd \
>> >>>>>>         meta master-max="1" master-node-max="1" clone-max="2" \
>> >>>>>>         clone-node-max="1" notify="true"
>> >>>>>>
>> >>>>>> Because if you do, and the cluster is not aware of the split-brain,
>> >>>>>> the drbd RA has a serious flaw.
>> >>>>>>
>> >>>>>
>> >>>>> I'm sorry. Yes, the ms resource is also defined.
>> >>>>
>> >>>> Well, I'm really confused. You basically say that the drbd disk
>> >>>> gets into a degraded mode (i.e. it detects split brain), but the
>> >>>> cluster (pacemaker) never learns about that. Perhaps you should
>> >>>> open a bugzilla for this and supply hb_report. Though it's
>> >>>> really hard to believe. It's like basic functionality failing.
>> >>>
>> >>>
>> >>> What would you expect to see?
>> >>>
>> >>> Currently I see the following in crm_mon:
>> >>>
>> >>> Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
>> >>>    Masters: [ ries ]
>> >>>    Slaves: [ laplace ]
>> >>>
>> >>>
>> >>> At the same time "cat /proc/drbd" on ries says:
>> >>>
>> >>> ries:~ # cat /proc/drbd
>> >>> version: 8.3.9 (api:88/proto:86-95)
>> >>> srcversion: A67EB2D25C5AFBFF3D8B788
>> >>> 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>> >>>    ns:0 nr:0 dw:4 dr:1761 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
>> >>>
>> >>>
>> >>> And on node laplace it says:
>> >>>
>> >>> laplace:~ # cat /proc/drbd
>> >>> version: 8.3.9 (api:88/proto:86-95)
>> >>> srcversion: A67EB2D25C5AFBFF3D8B788
>> >>> 0: cs:StandAlone ro:Secondary/Unknown ds:Outdated/DUnknown   r-----
>> >>>    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> Yes, and according to the RA script, everything is perfect:
>> >>
>> >> drbd_status() {
>> >>        local rc
>> >>        rc=$OCF_NOT_RUNNING
>> >>
>> >>        if ! is_drbd_enabled || ! [ -b "$DRBD_DEVICE" ]; then
>> >>                return $rc
>> >>        fi
>> >>
>> >>        # ok, module is loaded, block device node exists.
>> >>        # lets see its status
>> >>        drbd_set_status_variables
>> >>        case "${DRBD_ROLE_LOCAL}" in
>> >>        Primary)
>> >>                rc=$OCF_RUNNING_MASTER
>> >>                ;;
>> >>        Secondary)
>> >>                rc=$OCF_SUCCESS
>> >>                ;;
>> >>        Unconfigured)
>> >>                rc=$OCF_NOT_RUNNING
>> >>                ;;
>> >>        *)
>> >>                ocf_log err "Unexpected role ${DRBD_ROLE_LOCAL}"
>> >>                rc=$OCF_ERR_GENERIC
>> >>        esac
>> >>
>> >>        return $rc
>> >> }
>> >>
>> >> Staggering.
>> >>
>> >> The drbd_set_status_variables subroutine does set DRBD_CSTATE.
>> >>
>> >> I think the RA needs to be modified to something like this:
>> >>
>> >> Secondary)
>> >>    if [[ $DRBD_CSTATE == Connected ]]; then
>> >>            rc=$OCF_SUCCESS
>> >>    else
>> >>            rc=$OCF_NOT_RUNNING
>> >>    fi
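(Stripped of the RA plumbing, the proposed check reduces to the standalone sketch below. The function name and the argument-passing are illustrative only; the real RA reads `$DRBD_CSTATE` as set by `drbd_set_status_variables`.)

```shell
#!/bin/sh
# Illustrative sketch of the proposed Secondary check: only a Connected
# secondary is reported as healthy; any other connection state
# (StandAlone, WFConnection, ...) is reported as not running.
slave_status() {
    cstate=$1
    if [ "$cstate" = "Connected" ]; then
        echo "OCF_SUCCESS"
    else
        echo "OCF_NOT_RUNNING"
    fi
}

slave_status Connected     # prints OCF_SUCCESS
slave_status StandAlone    # prints OCF_NOT_RUNNING
```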
>> >
>> > That wouldn't strictly be correct - DRBD *is* currently running on
>> > both nodes, Primary (master) on one and Secondary (slave) on the
>> > other.  This state is correctly reported in crm_mon.  The thing
>> > that crm_mon can't tell you is that *third* piece of information,
>> > i.e. that there's some sort of communication breakdown between
>> > the two instances.
>> >
>>
>> Well, it is definitely not doing its "Slave" job when it is not connected.
>
> Well, maybe it was connected up to now, and the Primary just failed?
> So maybe it is about to be promoted,
> but you chose to fail the Secondary as well,
> just to be sure that service will go down properly?
>
>> > That being said, I'll defer to the DRBD crew as to whether or not
>> > returning $OCF_NOT_RUNNING in this case is technically safe and/or
>> > desirable.
>> >
>> > (I know it's administratively highly desirable to see these failures,
>> > of course; I'm just not clear on how best to expose them.)
>> >
>>
>> Well, the current situation is unacceptable, at least for me.
>> I shut everything down, disconnected the direct link, started the cluster
>> back up, and there is no indication whatsoever in the cluster status that
>> drbd is in trouble, except the location constraint added by
>> crm-fence-peer.sh.
>> Even the score attributes for the master resource are not negative on the
>> disconnected secondary.
>>
>> After I applied my fix, all is kosher - I get the Slave as stopped, and I
>> get a fail-count.
>
> And you won't ever be able to promote an unconnected Secondary,
> or recover from replication link hiccups.

Do you think that an attempt to promote an outdated resource:

0: cs:StandAlone ro:Secondary/Unknown ds:Outdated/DUnknown   r-----

is a better solution? Will it succeed?
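(For reference: DRBD will normally refuse to promote an Outdated disk. Getting out of a split brain is a manual procedure roughly along the lines below, per the standard DRBD recovery steps - to be run against a live DRBD, with the resource name here purely illustrative.)

```
# On the node whose changes are to be discarded (the split-brain victim):
drbdadm secondary home-data
drbdadm connect --discard-my-data home-data

# On the surviving node (only needed if it also fell back to StandAlone):
drbdadm connect home-data
```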

> How are you going to do failovers?
> How are you going to do a reboot of a degraded cluster?
>
> But of course you are free to deploy any hacks you want.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>



-- 
Serge Dubrouski.
