On Fri, Apr 1, 2011 at 8:50 AM, Lars Ellenberg <[email protected]> wrote:
> On Fri, Apr 01, 2011 at 09:37:09AM -0400, Vadym Chepkov wrote:
>>
>> On Apr 1, 2011, at 3:22 AM, Tim Serong wrote:
>>
>> > On 4/1/2011 at 11:37 AM, Vadym Chepkov <[email protected]> wrote:
>> >> On Mar 31, 2011, at 2:30 PM, Christoph Bartoschek wrote:
>> >>
>> >>> Am 29.03.2011 15:31, schrieb Dejan Muhamedagic:
>> >>>> On Tue, Mar 29, 2011 at 08:13:49AM +0200, Christoph Bartoschek wrote:
>> >>>>> Am 29.03.2011 02:35, schrieb Vadym Chepkov:
>> >>>>>>
>> >>>>>> On Mar 28, 2011, at 10:55 AM, Christoph Bartoschek wrote:
>> >>>>>>
>> >>>>>>> Am 28.03.2011 16:30, schrieb Dejan Muhamedagic:
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> On Mon, Mar 21, 2011 at 11:33:49PM +0100, Christoph Bartoschek wrote:
>> >>>>>>>>> Hi,
>> >>>>>>>>>
>> >>>>>>>>> I am testing an NFS failover setup. During the tests I created a
>> >>>>>>>>> split-brain situation, and now node A thinks it is Primary and
>> >>>>>>>>> UpToDate while node B thinks that it is Outdated.
>> >>>>>>>>>
>> >>>>>>>>> crm_mon, however, does not indicate any error to me. Why is this
>> >>>>>>>>> the case? I expect to see something that shows me the degraded
>> >>>>>>>>> status. How can this be fixed?
>> >>>>>>>>
>> >>>>>>>> The cluster relies on the RA (in this case drbd) to report any
>> >>>>>>>> problems. Do you have a monitor operation defined for that
>> >>>>>>>> resource?
>> >>>>>>>
>> >>>>>>> I have the resource defined as:
>> >>>>>>>
>> >>>>>>> primitive p_drbd ocf:linbit:drbd \
>> >>>>>>>     params drbd_resource="home-data" \
>> >>>>>>>     op monitor interval="15" role="Master" \
>> >>>>>>>     op monitor interval="30" role="Slave"
>> >>>>>>>
>> >>>>>>> Is this a correct monitor operation?
>> >>>>
>> >>>> Yes, though you should also add timeout specs.
>> >>>>
>> >>>>>> Just out of curiosity, you do have an ms resource defined?
>> >>>>>>
>> >>>>>> ms ms_p_drbd p_drbd \
>> >>>>>>     meta master-max="1" master-node-max="1" clone-max="2" \
>> >>>>>>     clone-node-max="1" notify="true"
>> >>>>>>
>> >>>>>> Because if you do, and the cluster is not aware of the split-brain,
>> >>>>>> the drbd RA has a serious flaw.
>> >>>>>
>> >>>>> I'm sorry. Yes, the ms resource is also defined.
>> >>>>
>> >>>> Well, I'm really confused. You basically say that the drbd disk
>> >>>> gets into a degraded mode (i.e. it detects split brain), but the
>> >>>> cluster (Pacemaker) never learns about that. Perhaps you should
>> >>>> open a bugzilla for this and supply an hb_report. Though it's
>> >>>> really hard to believe. It's like basic functionality failing.
>> >>>
>> >>> What would you expect to see?
>> >>>
>> >>> Currently I see the following in crm_mon:
>> >>>
>> >>> Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
>> >>>     Masters: [ ries ]
>> >>>     Slaves: [ laplace ]
>> >>>
>> >>> At the same time "cat /proc/drbd" on ries says:
>> >>>
>> >>> ries:~ # cat /proc/drbd
>> >>> version: 8.3.9 (api:88/proto:86-95)
>> >>> srcversion: A67EB2D25C5AFBFF3D8B788
>> >>>  0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
>> >>>     ns:0 nr:0 dw:4 dr:1761 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
>> >>>
>> >>> And on node laplace it says:
>> >>>
>> >>> laplace:~ # cat /proc/drbd
>> >>> version: 8.3.9 (api:88/proto:86-95)
>> >>> srcversion: A67EB2D25C5AFBFF3D8B788
>> >>>  0: cs:StandAlone ro:Secondary/Unknown ds:Outdated/DUnknown r-----
>> >>>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
>> >>
>> >> Yes, and according to the RA script everything is perfect:
>> >>
>> >> drbd_status() {
>> >>     local rc
>> >>     rc=$OCF_NOT_RUNNING
>> >>
>> >>     if ! is_drbd_enabled || ! [ -b "$DRBD_DEVICE" ]; then
>> >>         return $rc
>> >>     fi
>> >>
>> >>     # ok, module is loaded, block device node exists.
>> >>     # let's see its status
>> >>     drbd_set_status_variables
>> >>     case "${DRBD_ROLE_LOCAL}" in
>> >>     Primary)
>> >>         rc=$OCF_RUNNING_MASTER
>> >>         ;;
>> >>     Secondary)
>> >>         rc=$OCF_SUCCESS
>> >>         ;;
>> >>     Unconfigured)
>> >>         rc=$OCF_NOT_RUNNING
>> >>         ;;
>> >>     *)
>> >>         ocf_log err "Unexpected role ${DRBD_ROLE_LOCAL}"
>> >>         rc=$OCF_ERR_GENERIC
>> >>         ;;
>> >>     esac
>> >>
>> >>     return $rc
>> >> }
>> >>
>> >> Staggering.
>> >>
>> >> The drbd_set_status_variables subroutine does set DRBD_CSTATE.
>> >>
>> >> I think the RA needs to be modified to something like this:
>> >>
>> >>     Secondary)
>> >>         if [[ $DRBD_CSTATE == Connected ]]; then
>> >>             rc=$OCF_SUCCESS
>> >>         else
>> >>             rc=$OCF_NOT_RUNNING
>> >>         fi
>> >>         ;;
>> >
>> > That wouldn't strictly be correct - DRBD *is* currently running on
>> > both nodes, Primary (master) on one and Secondary (slave) on the
>> > other. This state is correctly reported in crm_mon. The thing
>> > that crm_mon can't tell you is that *third* piece of information,
>> > i.e. that there's some sort of communication breakdown between
>> > the two instances.
>>
>> Well, it is definitely not doing its "Slave" job when it is not connected.
>
> Well, maybe it was connected up to now, and the Primary just failed?
> So maybe it is about to be promoted,
> but you chose to fail the Secondary as well,
> just to be sure that service will go down properly?
>
>> > That being said, I'll defer to the DRBD crew as to whether or not
>> > returning $OCF_NOT_RUNNING in this case is technically safe and/or
>> > desirable.
>> >
>> > (I know it's administratively highly desirable to see these failures,
>> > of course, I'm just not clear on how best to expose them).
>>
>> Well, the current situation is unacceptable, at least for me.
>> I shut everything down, disconnected the direct link, started the
>> cluster back up, and there is no indication whatsoever in the cluster
>> status that drbd is in trouble, except the location constraint added
>> by crm-fence-peer.sh.
>> Even the score attributes for the master resource are not negative on
>> the disconnected secondary.
>>
>> After I applied my fix, all is kosher - I get the Slave as stopped,
>> and I get a fail-count.
>
> And you won't ever be able to promote an unconnected Secondary,
> or recover from replication link hiccups.
Do you think that an attempt to promote an outdated resource:

 0: cs:StandAlone ro:Secondary/Unknown ds:Outdated/DUnknown r-----

is a better solution? Will it succeed?

> How are you going to do failovers?
> How are you going to do a reboot of a degraded cluster?
>
> But of course you are free to deploy any hacks you want.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

--
Serge Dubrouski.
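[Editor's note] Dejan's advice earlier in the thread to "add timeout specs" to the monitor operations would look roughly like this in crm shell syntax. The timeout values shown are illustrative placeholders, not values stated anywhere in this thread; consult the agent's metadata for its advertised minimums:

```
primitive p_drbd ocf:linbit:drbd \
    params drbd_resource="home-data" \
    op monitor interval="15" role="Master" timeout="20" \
    op monitor interval="30" role="Slave" timeout="20" \
    op start timeout="240" \
    op stop timeout="100"
```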
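[Editor's note] The disagreement can be made concrete. Below is a minimal, self-contained sketch, not the shipped ocf:linbit:drbd agent, of the monitor mapping Vadym proposes: an unconnected Secondary is reported as stopped. The helper names `parse_cstate` and `drbd_status_sketch` are hypothetical. As Lars points out, the shipped agent deliberately does not behave this way, since returning "not running" would block promotion of an unconnected Secondary during failover:

```shell
#!/bin/sh
# Standard OCF exit codes used by resource agents.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8

# Extract the cs: (connection state) field from a /proc/drbd device line.
parse_cstate() {
    printf '%s\n' "$1" | sed -n 's/.*cs:\([A-Za-z]*\).*/\1/p'
}

# Map local DRBD role ($1) and connection state ($2) to a monitor result.
drbd_status_sketch() {
    case "$1" in
    Primary)
        echo "$OCF_RUNNING_MASTER" ;;
    Secondary)
        # The contested change: report an unconnected Secondary as
        # stopped, so Pacemaker records a failure and a fail-count.
        if [ "$2" = "Connected" ]; then
            echo "$OCF_SUCCESS"
        else
            echo "$OCF_NOT_RUNNING"
        fi ;;
    Unconfigured)
        echo "$OCF_NOT_RUNNING" ;;
    *)
        echo "$OCF_ERR_GENERIC" ;;
    esac
}

# The status line laplace reported earlier in the thread:
line=' 0: cs:StandAlone ro:Secondary/Unknown ds:Outdated/DUnknown r-----'
drbd_status_sketch Secondary "$(parse_cstate "$line")"   # prints 7
```

Fed the laplace status line from the thread, this yields OCF_NOT_RUNNING (7): exactly the fail-count-producing behavior Vadym wants, and exactly the promotion-blocking behavior Lars warns against.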
