On Fri, Apr 01, 2011 at 09:37:09AM -0400, Vadym Chepkov wrote:
> 
> On Apr 1, 2011, at 3:22 AM, Tim Serong wrote:
> 
> > On 4/1/2011 at 11:37 AM, Vadym Chepkov <[email protected]> wrote: 
> >> On Mar 31, 2011, at 2:30 PM, Christoph Bartoschek wrote: 
> >> 
> >>> Am 29.03.2011 15:31, schrieb Dejan Muhamedagic: 
> >>>> On Tue, Mar 29, 2011 at 08:13:49AM +0200, Christoph Bartoschek wrote: 
> >>>>> Am 29.03.2011 02:35, schrieb Vadym Chepkov: 
> >>>>>> 
> >>>>>> On Mar 28, 2011, at 10:55 AM, Christoph Bartoschek wrote: 
> >>>>>> 
> >>>>>>> Am 28.03.2011 16:30, schrieb Dejan Muhamedagic: 
> >>>>>>>> Hi, 
> >>>>>>>> 
> >>>>>>>> On Mon, Mar 21, 2011 at 11:33:49PM +0100, Christoph Bartoschek 
> >>>>>>>> wrote: 
> >>>>>>>>> Hi, 
> >>>>>>>>> 
> >>>>>>>>> I am testing a NFS failover setup. During the tests I created a 
> >>>>>>>>> split-brain situation and now node A thinks it is Primary and 
> >>>>>>>>> UpToDate 
> >>>>>>>>> while node B thinks that it is Outdated. 
> >>>>>>>>> 
> >>>>>>>>> crm_mon however does not indicate any error to me. Why is this 
> >>>>>>>>> the case? I expect to see something that shows me the degraded 
> >>>>>>>>> status. How can this be fixed? 
> >>>>>>>> 
> >>>>>>>> The cluster relies on the RA (in this case drbd) to report any 
> >>>>>>>> problems. Do you have a monitor operation defined for that 
> >>>>>>>> resource? 
> >>>>>>> 
> >>>>>>> I have the resource defined as: 
> >>>>>>> 
> >>>>>>> primitive p_drbd ocf:linbit:drbd \ 
> >>>>>>>    params drbd_resource="home-data" \ 
> >>>>>>>    op monitor interval="15" role="Master" \ 
> >>>>>>>    op monitor interval="30" role="Slave" 
> >>>>>>> 
> >>>>>>> Is this a correct monitor operation? 
> >>>> 
> >>>> Yes, though you should also add timeout specs. 
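
For completeness, a version of that primitive with explicit timeouts might look like the sketch below. The timeout values here are illustrative, not authoritative; the linbit drbd RA's metadata (`crm ra info ocf:linbit:drbd`) lists the advised minimums for your version.

```
primitive p_drbd ocf:linbit:drbd \
        params drbd_resource="home-data" \
        op monitor interval="15" role="Master" timeout="20" \
        op monitor interval="30" role="Slave" timeout="20" \
        op start timeout="240" \
        op stop timeout="100"
```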
> >>>> 
> >>>>>> Just out of curiosity, you do have ms resource defined? 
> >>>>>> 
> >>>>>> ms ms_p_drbd p_drbd \ 
> >>>>>>         meta master-max="1" master-node-max="1" clone-max="2" \
> >>>>>>         clone-node-max="1" notify="true" 
> >>>>>> 
> >>>>>> Because if you do, and the cluster is not aware of the split-brain, 
> >>>>>> the drbd RA has a serious flaw. 
> >>>>>> 
> >>>>> 
> >>>>> I'm sorry. Yes, the ms resource is also defined. 
> >>>> 
> >>>> Well, I'm really confused. You basically say that the drbd disk 
> >>>> gets into a degraded mode (i.e. it detects split brain), but the 
> >>>> cluster (pacemaker) never learns about that. Perhaps you should 
> >>>> open a bugzilla for this and supply hb_report. Though it's 
> >>>> really hard to believe. It's like basic functionality failing. 
> >>> 
> >>> 
> >>> What would you expect to see? 
> >>> 
> >>> Currently I see the following in crm_mon: 
> >>> 
> >>> Master/Slave Set: ms_drbd_nfs [p_drbd_nfs] 
> >>>    Masters: [ ries ] 
> >>>    Slaves: [ laplace ] 
> >>> 
> >>> 
> >>> At the same time "cat /proc/drbd" on ries says: 
> >>> 
> >>> ries:~ # cat /proc/drbd 
> >>> version: 8.3.9 (api:88/proto:86-95) 
> >>> srcversion: A67EB2D25C5AFBFF3D8B788 
> >>> 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----- 
> >>>    ns:0 nr:0 dw:4 dr:1761 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4 
> >>> 
> >>> 
> >>> And on node laplace it says: 
> >>> 
> >>> laplace:~ # cat /proc/drbd 
> >>> version: 8.3.9 (api:88/proto:86-95) 
> >>> srcversion: A67EB2D25C5AFBFF3D8B788 
> >>> 0: cs:StandAlone ro:Secondary/Unknown ds:Outdated/DUnknown   r----- 
> >>>    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4 
> >>> 
> >>> 
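
As an aside, the piece of information the cluster never sees is right there in the `cs:` field of those lines. A minimal, self-contained sketch of extracting it (a sample line is hard-coded here so the sketch runs anywhere; a real check would read /proc/drbd):

```shell
#!/bin/sh
# Minimal sketch: pull the connection state (cs:) out of a
# /proc/drbd device status line.

drbd_cstate() {
    # $1: one /proc/drbd device status line
    echo "$1" | sed -n 's/.*cs:\([A-Za-z]*\).*/\1/p'
}

line=" 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"
state=$(drbd_cstate "$line")
echo "cstate=$state"
# Anything other than Connected means the replication link is degraded.
[ "$state" = "Connected" ] || echo "replication degraded"
```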
> >> 
> >> 
> >> 
> >> yes, and according to the RA script everything is perfect: 
> >> 
> >> drbd_status() { 
> >>        local rc 
> >>        rc=$OCF_NOT_RUNNING 
> >> 
> >>        if ! is_drbd_enabled || ! [ -b "$DRBD_DEVICE" ]; then 
> >>                return $rc 
> >>        fi 
> >> 
> >>        # ok, module is loaded, block device node exists. 
> >>        # lets see its status 
> >>        drbd_set_status_variables 
> >>        case "${DRBD_ROLE_LOCAL}" in 
> >>        Primary) 
> >>                rc=$OCF_RUNNING_MASTER 
> >>                ;; 
> >>        Secondary) 
> >>                rc=$OCF_SUCCESS 
> >>                ;; 
> >>        Unconfigured) 
> >>                rc=$OCF_NOT_RUNNING 
> >>                ;; 
> >>        *) 
> >>                ocf_log err "Unexpected role ${DRBD_ROLE_LOCAL}" 
> >>                rc=$OCF_ERR_GENERIC 
> >>        esac 
> >> 
> >>        return $rc 
> >> } 
> >> 
> >> Staggering. 
> >> 
> >> The drbd_set_status_variables subroutine does set DRBD_CSTATE. 
> >> 
> >> I think the RA needs to be modified to something like this: 
> >> 
> >> Secondary) 
> >>    if [[ $DRBD_CSTATE == Connected ]]; then 
> >>            rc=$OCF_SUCCESS 
> >>    else 
> >>            rc=$OCF_NOT_RUNNING 
> >>    fi 
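
That proposal, reduced to a self-contained sketch (the OCF exit codes are hard-coded so it runs outside the cluster; this is an illustration of the suggested change, not the shipped RA):

```shell
#!/bin/sh
# Sketch of the proposed change: a Secondary only counts as running
# when its replication link is Connected.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

monitor_secondary() {
    # $1: DRBD connection state (cstate), e.g. "Connected" or "StandAlone"
    if [ "$1" = "Connected" ]; then
        return $OCF_SUCCESS
    else
        return $OCF_NOT_RUNNING
    fi
}

monitor_secondary Connected  && echo "slave ok"
monitor_secondary StandAlone || echo "slave reported stopped"
```

As the replies point out, returning $OCF_NOT_RUNNING here has real side effects (an unconnected Secondary can never be promoted), so treat this strictly as an illustration of the proposal.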
> > 
> > That wouldn't strictly be correct - DRBD *is* currently running on
> > both nodes, Primary (master) on one and Secondary (slave) on the
> > other.  This state is correctly reported in crm_mon.  The thing
> > that crm_mon can't tell you is that *third* piece of information,
> > i.e. that there's some sort of communication breakdown between
> > the two instances.
> > 
> 
> Well, it is definitely not doing its "Slave" job when it is not connected.

Well, maybe it was connected up to now, and the Primary just failed?
So maybe it is about to be promoted,
but you chose to fail the Secondary as well,
just to be sure that the service will go down properly?

> > That being said, I'll defer to the DRBD crew as to whether or not
> > returning $OCF_NOT_RUNNING in this case is technically safe and/or
> > desirable.
> > 
> > (I know its administratively highly desirable to see these failures,
> > of course, I'm just not clear on how best to expose them).
> > 
> 
> Well, the current situation is unacceptable, at least for me.
> I shut everything down, disconnected the direct link, started the cluster
> back up, and there is no indication whatsoever in the cluster status that
> drbd is in trouble, except the location constraint added by
> crm-fence-peer.sh.
> Even the score attributes for the master resource are not negative on the
> disconnected secondary.
> 
> After I applied my fix, all is kosher - I get the Slave as stopped, and I 
> get a fail-count.

And you won't ever be able to promote an unconnected Secondary,
or recover from replication link hiccups.
How are you going to do failovers?
How are you going to do a reboot of a degraded cluster?

But of course you are free to deploy any hacks you want.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
