Hello,

I ran into some undesired behavior from DRBD earlier today -- I'm running 8.4.4 
on CentOS 6.5 (kernel 2.6.32-431.17.1.el6.x86_64), with a fairly vanilla 
two-node, single-primary config replicating over a back-to-back 10GbE link.

The DRBD resource in question holds a backup volume that we rotate a disk from 
periodically to take offsite, replacing it with the previously-offsite disk and 
then resyncing to that.  On the node (let's call it node0, and the other node1) 
that I was swapping the drive into/out of, I ran 'drbdadm detach' on the 
resource, leaving it temporarily diskless, and then 'drbdadm attach' with the 
newly-swapped-in disk once it was up and ready.  Given that the 
freshly-attached backing device would be from an older state of the same 
resource, I was hoping that DRBD would recognize this and do a bitmap-based 
resync of only the data that had been modified since that drive was most 
recently rotated out (I later realized this part of it wasn't likely to work 
out as I'd hoped).

Instead what happened when I re-attached was that node1 (the peer, not the one 
the detach/attach was done on) reported observing unrelated data and aborted, 
while node0 remained stuck in dstate Negotiating, where it appeared happy to 
remain indefinitely.

Full kernel output from the relevant period of time on node0:

Jul  2 11:11:09 block drbd5: disk( UpToDate -> Failed ) 
Jul  2 11:11:09 block drbd5: bitmap WRITE of 0 pages took 1 jiffies
Jul  2 11:11:09 block drbd5: 0 KB (0 bits) marked out-of-sync by on disk 
bit-map.
Jul  2 11:11:09 block drbd5: disk( Failed -> Diskless ) 
Jul  2 11:11:09 block drbd5: receiver updated UUIDs to effective data uuid: 
7D883041AC11CDF4
Jul  2 11:11:10 sd 7: 0:6:0: [sdg] Synchronizing SCSI cache
Jul  2 11:11:10 sd 7: 0:6:0: [sdg] Stopping disk
Jul  2 11:12:20 node0 ata18.00: ATA-8: ST3000VX000-9YW166, CV13, max UDMA/133
Jul  2 11:12:20 node0 ata18.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 
31/32)
Jul  2 11:12:20 node0 ata18.00: configured for UDMA/133
Jul  2 11:12:20 scsi 7: 0:7:0: Direct-Access     ATA      ST3000VX000-9YW1 CV13 
PQ: 0 ANSI: 5
Jul  2 11:12:20 sd 7: 0:7:0: [sdg] 5860533168 512-byte logical blocks: (3.00 
TB/2.72 TiB)
Jul  2 11:12:20 sd 7: 0:7:0: Attached scsi generic sg6 type 0
Jul  2 11:12:20 sd 7: 0:7:0: [sdg] 4096-byte physical blocks
Jul  2 11:12:20 sd 7: 0:7:0: [sdg] Write Protect is off
Jul  2 11:12:20 sd 7: 0:7:0: [sdg] Write cache: enabled, read cache: enabled, 
doesn't support DPO or FUA
Jul  2 11:12:20 node0  sdg:
Jul  2 11:12:20 node0  unknown partition table
Jul  2 11:12:20 sd 7: 0:7:0: [sdg] Attached SCSI disk
Jul  2 11:12:24 block drbd5: disk( Diskless -> Attaching ) 
Jul  2 11:12:25 block drbd5: bitmap READ of 22356 pages took 980 jiffies
Jul  2 11:12:25 block drbd5: recounting of set bits took additional 56 jiffies
Jul  2 11:12:25 block drbd5: 13 GB (3470336 bits) marked out-of-sync by on disk 
bit-map.
Jul  2 11:12:25 block drbd5: disk( Attaching -> Negotiating ) 
Jul  2 11:12:25 block drbd5: attached to UUIDs 
F7D979074ACFA28E:0000000000000000:773EDC111473E72E:773DDC111473E72E
Jul  2 12:22:50 block drbd5: disk( Negotiating -> Failed ) 

'drbdadm detach' at 11:11:09, sdg deleted at 11:11:10, new disk recognized at 
11:12:20, then 'drbdadm attach' at 11:12:24 -- note the ~70 minute gap between 
the last two lines; the Negotiating->Failed transition was me manually 
intervening.

And the same window of time on node1:

Jul  2 11:11:09 block drbd5: pdsk( UpToDate -> Failed ) 
Jul  2 11:11:09 block drbd5: new current UUID 
7D883041AC11CDF4:3D2B593201711886:AB54E07C096FBAAC:AB53E07C096FBAAC
Jul  2 11:11:09 block drbd5: pdsk( Failed -> Diskless ) 
Jul  2 11:12:25 block drbd5: real peer disk state = Consistent
Jul  2 11:12:25 block drbd5: drbd_sync_handshake:
Jul  2 11:12:25 block drbd5: self 
7D883041AC11CDF4:3D2B593201711886:AB54E07C096FBAAC:AB53E07C096FBAAC bits:0 
flags:0
Jul  2 11:12:25 block drbd5: peer 
F7D979074ACFA28F:0000000000000000:773EDC111473E72E:773DDC111473E72E 
bits:3470336 flags:2
Jul  2 11:12:25 block drbd5: uuid_compare()=-1000 by rule 100
Jul  2 11:12:25 block drbd5: Unrelated data, aborting!
Jul  2 11:12:25 block drbd5: Disk attach process on the peer node was aborted.
Jul  2 12:22:50 block drbd5: pdsk( Diskless -> Failed ) 

Between 11:12:25 and 12:22:50, /proc/drbd looks like this on node0:

version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 
2013-10-14 15:33:06
...
 5: cs:Connected ro:Primary/Secondary ds:Negotiating/UpToDate C r---d-
    ns:0 nr:0 dw:0 dr:0 al:108344 bm:182159 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f 
oos:13881344


And like this on node1:

version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 
2013-10-14 15:33:06
...
 5: cs:Connected ro:Secondary/Primary ds:UpToDate/Diskless C r-----
    ns:824333588 nr:1053169504 dw:1053169504 dr:824334636 al:9 bm:366654 lo:0 
pe:0 ua:0 ap:0 ep:1 wo:f oos:0


I tried detaching and re-attaching again (without any disk-swapping) thinking 
perhaps I'd just hit some narrow race window, but got the same behavior on both 
sides.  At that point I gave up and just did a 'create-md' on node0 and 
resynced from scratch, which it (thankfully) did without fuss.

So overall, two things:

1) Getting stuck in dstate=Negotiating definitely looks like a DRBD bug to me; 
is this a known issue, and perhaps something fixed in 8.4.5?  I scanned the 
changelog summary in the announcement email but didn't see anything that looked 
(to me) obviously related.

2) Reading (and perhaps more importantly *thinking*) a bit about DRBD's 
metadata, I guess my hopes of magically bitmap-optimized resync were a bit 
unrealistic, given the somewhat nontrivial orchestration of what bitmaps would 
have to get written to which disks and when, and the fact that I don't think 
DRBD is really designed to support this use case efficiently.  Just in case I'm 
wrong about that, is there any simple way to leverage DRBD's bitmaps for more 
efficient resyncs in this scenario, or am I going to have to just do a 
full-device sync on each disk-swap?


Thanks,
Zev Weiss

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user

Reply via email to