Hello all,

After 524 successful continuous migrations I am confident enough to report that the above patch from Apollon works just fine. If we weren't both living in Athens I'd promise to buy him a beer at the next GanetiCon, but he'll probably get it a lot earlier than that...
Thanks a lot!

On Tue, Nov 5, 2013 at 4:30 PM, Apollon Oikonomopoulos <[email protected]> wrote:

> DrbdAttachNet supports both normal primary/secondary node operation and
> (during live migration) dual-primary operation. When resources are newly
> attached, we poll until we find all of them in connected or syncing
> operation.
>
> Although aggressive, this is enough for primary/secondary operation,
> because the primary/secondary role is not changed from within
> DrbdAttachNet. However, in the dual-primary (“multimaster”) case, both
> peers are subsequently upgraded to the primary role. If - for unspecified
> reasons - both disks are not UpToDate, then a resync may be triggered
> after both peers have switched to primary, causing the resource to
> disconnect:
>
>   kernel: [1465514.164009] block drbd2: I shall become SyncTarget, but I am primary!
>   kernel: [1465514.171562] block drbd2: ASSERT( os.conn == C_WF_REPORT_PARAMS ) in /build/linux-rrsxby/linux-3.2.51/drivers/block/drbd/drbd_receiver.c:3245
>
> This seems to be extremely racy and is possibly triggered by some
> underlying network issues (e.g. high latency), but it has been observed
> in the wild. By logging the DRBD resource state on the old secondary, we
> managed to see a resource getting promoted to primary while it was:
>
>   WFSyncUUID Secondary/Primary Outdated/UpToDate
>
> We fix this by explicitly waiting for a “Connected” cstate and
> “UpToDate/UpToDate” disks, as advised in [1]:
>
>   “For this purpose and scenario, you only want to promote once you are
>   Connected UpToDate/UpToDate.”
>
> [1] http://lists.linbit.com/pipermail/drbd-user/2013-July/020173.html
>
> Signed-off-by: Apollon Oikonomopoulos <[email protected]>
> ---
>  lib/backend.py | 16 ++++++++++++++--
>  lib/bdev.py    |  1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/lib/backend.py b/lib/backend.py
> index a75432b..9e12639 100644
> --- a/lib/backend.py
> +++ b/lib/backend.py
> @@ -3622,8 +3622,20 @@ def DrbdAttachNet(nodes_ip, disks, instance_name, multimaster):
>      for rd in bdevs:
>        stats = rd.GetProcStatus()
>
> -      all_connected = (all_connected and
> -                       (stats.is_connected or stats.is_in_resync))
> +      if multimaster:
> +        # In the multimaster case we have to wait explicitly until
> +        # the resource is Connected and UpToDate/UpToDate, because
> +        # we promote *both nodes* to primary directly afterwards.
> +        # Being in resync is not enough, since there is a race during which we
> +        # may promote a node with an Outdated disk to primary, effectively
> +        # tearing down the connection.
> +        all_connected = (all_connected and
> +                         stats.is_connected and
> +                         stats.is_disk_uptodate and
> +                         stats.peer_disk_uptodate)
> +      else:
> +        all_connected = (all_connected and
> +                         (stats.is_connected or stats.is_in_resync))
>
>        if stats.is_standalone:
>          # peer had different config info and this node became
> diff --git a/lib/bdev.py b/lib/bdev.py
> index 7623869..acc18ec 100644
> --- a/lib/bdev.py
> +++ b/lib/bdev.py
> @@ -1135,6 +1135,7 @@ class DRBD8Status(object):
>
>      self.is_diskless = self.ldisk == self.DS_DISKLESS
>      self.is_disk_uptodate = self.ldisk == self.DS_UPTODATE
> +    self.peer_disk_uptodate = self.rdisk == self.DS_UPTODATE
>
>      self.is_in_resync = self.cstatus in self.CSET_SYNC
>      self.is_in_use = self.cstatus != self.CS_UNCONFIGURED
> --
> 1.7.10.4

--
Καργιωτάκης Γιώργος
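For anyone skimming the thread, the heart of the fix is the stricter readiness predicate used before promotion in the multimaster branch. A minimal standalone sketch of that check, where `DrbdStatus` and `all_ready` are hypothetical stand-ins for Ganeti's real `DRBD8Status` fields and the polling loop in `DrbdAttachNet`:

```python
class DrbdStatus:
  """Hypothetical stand-in mirroring the DRBD8Status flags the patch uses."""
  def __init__(self, is_connected, is_in_resync,
               is_disk_uptodate, peer_disk_uptodate):
    self.is_connected = is_connected          # cstate is Connected
    self.is_in_resync = is_in_resync          # cstate is a sync state
    self.is_disk_uptodate = is_disk_uptodate  # local disk is UpToDate
    self.peer_disk_uptodate = peer_disk_uptodate  # peer disk is UpToDate


def all_ready(statuses, multimaster):
  """Return True iff every resource may proceed past the wait loop.

  In multimaster mode both nodes are promoted to primary right after this
  check, so being merely in resync is not enough: we require Connected and
  UpToDate/UpToDate on every resource.  In primary/secondary mode, connected
  or syncing suffices, since no role change happens here.
  """
  if multimaster:
    return all(s.is_connected and s.is_disk_uptodate and s.peer_disk_uptodate
               for s in statuses)
  return all(s.is_connected or s.is_in_resync for s in statuses)
```

Note how a resource that is still syncing passes the relaxed check but fails the strict one, which is exactly the race window the patch closes.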
