Re: [DRBD-user] Disk Corruption = DRBD Failure?

Charles Kozler Wed, 12 Oct 2011 11:02:04 -0700

This was 100% spot on, exactly the answer I was looking for. Thanks guys!

Also, do any white papers exist on how DRBD works on the inside? From what you told me it looks like it's:


Write to DRBD Block Device -> Write to TCP buffer -> Write to host disks

I thought it was

Write to DRBD Block Device -> Write to disk -> Write to TCP buffer -> Write to host disks (like a push method almost)

Which is why I wanted to know about disk corruption, but from what it seems, I should be more concerned about corruption in the network stack, right?



Regards,
Chuck Kozler
/Lead Infrastructure & Systems Administrator/
---
*Office*: 1-646-290-6267 | *Mobile*: 1-646-385-3684
FIX Flyer

Notice to Recipient: This e-mail is meant only for the intended recipient(s) of the transmission, and contains confidential information which is proprietary to FIX Flyer LLC. Any unauthorized use, copying, distribution, or dissemination is strictly prohibited. All rights to this information are reserved by FIX Flyer LLC. If you are not the intended recipient, please contact the sender by reply e-mail and please delete this e-mail from your system and destroy any copies.


On 10/12/2011 3:04 AM, Florian Haas wrote:

On 2011-10-11 17:09, Charles Kozler wrote:

Hi,

I have been reading the docs and still seem to be unclear as to some things-

Assume I have a two node setup with DRBD in Primary/Primary with Xen
writing to /dev/drbd0 on node1. I use Primary/Primary for live migration
and in my Xen DomU configuration file I use phy: and not drbd: handler.

Now, what happens if the disk on node1 begins to fail and the blocks
where /dev/drbd0 resides are corrupted while we continue to write to
it? Will these bad/corrupted blocks be replicated to node2?

If the underlying _disk_ fails in weird ways and that is why you get
corruption, then the corruption occurs below the DRBD level and there's
no corruption for DRBD to replicate.

If however you have one of your Xen domUs writing garbage to that device
(so the corruption occurs in a layer above DRBD), then of course DRBD
will happily replicate that corruption.

Example aside, in short, I am wondering if a failing disk on a node will
result in DRBD replicating bad block data to the secondary node. I know
there is a place in the docs describing an integrity checker using the
kernel's crypto algorithms (like md5), so maybe that's an option to prevent it?

Nope, that will only prevent corruption that may occur *within* DRBD due
to a fishy network layer, or bit flips on your PCI bus, or broken
checksum offloading on your NICs.

For preventing corruption in the disk I/O layer, DRBD would have to
support DIF/DIX, which it currently doesn't do (very few applications do).

In either case, is there any way to prevent bad block data from node 1
being replicated to node 2?

Corruption rooted in the network stack, yes -- use data-integrity-alg.

Corruption rooted in your Xen domU, nope.

For corruption rooted in the I/O layer, you can't prevent the
replication from happening but you can detect the corruption after the
fact -- use verify-alg and run device verification.
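
The three cases above can be sketched as a toy Python model. This is not DRBD code, only an illustration of the reasoning: the function names (`digest`, `replicate`, `verify`) are hypothetical, and SHA-256 from `hashlib` stands in for whatever kernel digest data-integrity-alg or verify-alg would name.

```python
import hashlib


def digest(block: bytes) -> str:
    # Stand-in for the digest configured via data-integrity-alg / verify-alg.
    return hashlib.sha256(block).hexdigest()


def flip_first_bit(block: bytes) -> bytes:
    # Simulate a single bit flip (bad NIC, PCI bus, or failing disk).
    return bytes([block[0] ^ 0x01]) + block[1:]


def replicate(block: bytes, corrupt_in_transit: bool = False):
    """Toy replication with an integrity digest sent alongside the data.

    Returns the block as accepted by the peer, or None if the peer
    rejects it because the digest does not match.
    """
    sent_digest = digest(block)  # computed on the data DRBD was handed
    received = flip_first_bit(block) if corrupt_in_transit else block
    if digest(received) != sent_digest:
        return None  # corruption between the nodes is detected
    return received


def verify(local_blocks, peer_blocks):
    """Toy device verification: indices of blocks whose digests differ."""
    return [
        i
        for i, (a, b) in enumerate(zip(local_blocks, peer_blocks))
        if digest(a) != digest(b)
    ]


if __name__ == "__main__":
    # Case 1: garbage written from above (e.g. a domU) is replicated faithfully.
    print(replicate(b"garbage from domU"))

    # Case 2: corruption in the network path is caught by the digest.
    print(replicate(b"good data", corrupt_in_transit=True))

    # Case 3: node1's disk corrupts a block after the write; the peer copy
    # is still good, and a later verify run reports the mismatch.
    node1_disk = [b"block zero", flip_first_bit(b"block one")]
    node2_disk = [b"block zero", b"block one"]
    print(verify(node1_disk, node2_disk))
```

Note that in this model a verify run only reports that the two copies differ; deciding which copy is the good one is left to the administrator.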

Hope this helps,
Florian

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
