AFAIK this should not affect data integrity at rest (related to “verify-alg”) 
but only in flight (csum-alg), and even then at most a few blocks (those 
currently in flight) should be affected? (BTW, shouldn’t stable_pages_required 
be enabled?)
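For reference, the relevant knobs all live in the resource’s net section in DRBD 8.4 — a sketch only, with sha1 as an example algorithm choice:

```
resource r0 {
  net {
    verify-alg sha1;           # used by online verify (checks data at rest)
    csum-alg sha1;             # checksum-based resync
    data-integrity-alg sha1;   # per-packet check of in-flight data (costly)
  }
}
```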

I think it’s more likely he’s hitting a number of bugs that are getting fixed 
in DRBD, where it would simply not resync data while appearing 
Consistent/UpToDate etc. I urge you to run drbdsetup status --verbose 
--statistics $resource and look for an out-of-sync counter > 0.
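To sweep all resources at once you could grep that output for the counter. The sample below is an assumption about the drbd-utils output format — adjust the pattern to whatever your version actually prints:

```shell
#!/bin/sh
# Real usage would be something like:
#   sample=$(drbdsetup status --verbose --statistics "$resource")
# Here we use a canned sample so the parsing can be shown standalone.
sample='r0 role:Primary suspended:no
  volume:0 minor:7 disk:UpToDate
      size:10485760 read:123 written:456 out-of-sync:2048
  peer connection:Connected role:Secondary
    volume:0 replication:Established peer-disk:UpToDate
        received:0 sent:456 out-of-sync:0'

# Pull every out-of-sync counter and flag any non-zero value (KiB).
oos=$(printf '%s\n' "$sample" | grep -o 'out-of-sync:[0-9]*' | cut -d: -f2)
for n in $oos; do
  if [ "$n" -gt 0 ]; then
    echo "WARNING: $n KiB out of sync"
  fi
done
```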

We used cache=none with qemu and switched to cache=writeback with no corruption 
- you just need to take care that the resource is primary on only one node 
(live migration still works if you know what you’re doing, though).
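The “if you know what you’re doing” part boils down to roughly the sequence below. Everything here is a sketch — resource name, peer node and URI are made up, and it needs allow-two-primaries enabled in the resource’s net section for the duration of the migration:

```shell
#!/bin/sh
# Sketch only: resource name, peer node and URI are hypothetical.
# Requires "allow-two-primaries yes;" in the resource's net section
# while the migration is running.
RES=vm01
PEER=qemu+ssh://node2/system

# Print each command; only execute it when DRY_RUN is unset.
run() { echo "+ $*"; [ -n "$DRY_RUN" ] || "$@"; }

DRY_RUN=1    # remove this line to actually execute the commands

run ssh node2 drbdadm primary "$RES"      # promote on the target node first
run virsh migrate --live "$RES" "$PEER"   # move the running VM
run drbdadm secondary "$RES"              # demote the source once it has moved
```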

Jan


> On 4 Aug 2017, at 09:55, Veit Wahlich <[email protected]> wrote:
> 
> Hi Luke,
> 
> I assume you are experiencing the results of data inconsistency caused
> by in-flight writes. This means that a process (here your VM's qemu) can
> change a block that is already waiting to be written to disk.
> Whether this happens (undetected) or not depends on how the data is
> accessed for writing and synced to disk.
> 
> For qemu, you have to consider two factors: the guest OS's file system
> configuration and qemu's disk caching configuration.
> On Linux guests, this usually only happens with file systems that are
> NOT mounted either sync or with barriers, and with block-backed swap.
> On Windows guests it always happens.
> For qemu it depends on how the disk caching strategy is configured and
> thus whether it allows in-flight writes or not.
> 
> The common position is to configure qemu for writethrough caching for
> all disks and leave your guests' OS unchanged. You will also have to
> ignore/override libvirt's warning about unsafe migration with this cache
> setting, as it only applies to file-backed VM disks, not
> blockdev-backed.
> I use this for hundreds of both Linux and Windows VMs backed by DRBD
> block devices and have no inconsistency problems at all since this
> change.
> 
> Changing qemu's caching strategy might affect performance.
> For performance reasons you are advised to use a hardware RAID
> controller with battery-backed write-back cache.
> 
> For consistency reasons you are advised to use real hardware RAID, too,
> as the in-flight block changing problem described above might also
> affect mdraid, dmraid/FakeRAID, LVM mirroring, etc. (depending on
> configuration).
> 
> Best regards,
> // Veit
> 
> 
> Am Freitag, den 04.08.2017, 11:11 +1200 schrieb Luke Pascoe:
>> Hello everyone.
>> 
>> I have a fairly simple 2-node CentOS 7 setup running KVM virtual
>> machines, with DRBD 8.4.9 between them.
>> 
>> There is one DRBD resource per VM, with at least 1 volume each,
>> totalling 47 volumes.
>> 
>> There's no clustering or heartbeat or other complexity. DRBD has its
>> own Gig-E interface to sync over.
>> 
>> I recently migrated a host between nodes and it crashed. During
>> diagnostics I did a verification on the drbd volume for the host and
>> found that it had _a lot_ of out of sync blocks.
>> 
>> This led me to run a verification on all volumes, and while I didn't
>> find any other volumes with large numbers of out of sync blocks, there
>> were several with a few. I have disconnected and reconnected all these
>> volumes, to force them to resync.
>> 
>> I have now set up a nightly cron which will verify as many volumes as
>> it can in a 2 hour window, this means I get through the whole lot in
>> about a week.
>> 
>> Almost every night, it reports at least 1 volume which is out-of-sync,
>> and I'm trying to understand why that would be.
>> 
>> I did some research and the only likely candidate I could find was
>> related to TCP checksum offloading on the NICs, which I have now
>> disabled, but it has made no difference.
>> 
>> Any suggestions what might be going on here?
>> 
>> Thanks.
>> 
>> Luke Pascoe
>> _______________________________________________
>> drbd-user mailing list
>> [email protected]
>> http://lists.linbit.com/mailman/listinfo/drbd-user
> 
> 
