On 2018-10-16 16:27, Chris Murphy wrote:
On Tue, Oct 16, 2018 at 9:42 AM, Austin S. Hemmelgarn
<[email protected]> wrote:
On 2018-10-16 11:30, Anton Shepelev wrote:

Hello, all

What may be the reason for a CRC mismatch on a BTRFS file in
a virtual machine:

     csum failed ino 175524 off 1876295680 csum 451760558
     expected csum 1446289185

Shall I seek the culprit in the host machine or in the guest
one?  Supposing the host machine healthy, what operations on
the guest might have caused a CRC mismatch?
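
For anyone triaging a report like this, a minimal sketch of pulling the numeric fields out of the message quoted above. The regular expression is an assumption based only on the wording shown in this thread; other kernel versions may format the message differently. The function name parse_csum_failure is hypothetical.

```python
import re

# Pattern modelled on the message quoted in this thread (an assumption:
# the exact wording can differ between kernel versions).
CSUM_RE = re.compile(
    r"csum failed ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<csum>\d+)\s+expected csum (?P<expected>\d+)"
)

def parse_csum_failure(text):
    """Return the numeric fields of a 'csum failed' message, or None."""
    match = CSUM_RE.search(text)
    if match is None:
        return None
    # Convert every captured group from string to int.
    return {name: int(value) for name, value in match.groupdict().items()}

# The message from the report, including its line wrap.
message = """csum failed ino 175524 off 1876295680 csum 451760558
     expected csum 1446289185"""

fields = parse_csum_failure(message)
print(fields)
```

The inode number can then be mapped to a path inside the guest with `btrfs inspect-internal inode-resolve 175524 <mountpoint>`, which shows which file holds the damaged block.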

Possible causes include:

* On the guest side:
   - Unclean shutdown of the guest system (not likely even if this did
happen).
   - A kernel bug in the guest.
   - Something directly modifying the block device (also not very likely).

* On the host side:
   - Unclean shutdown of the host system without properly flushing data from
the guest.  Not likely unless you're using an actively unsafe caching mode
for the guest's storage back-end.
   - At-rest data corruption in the storage back-end.
   - A bug in the host-side storage stack.
   - A transient error in the host-side storage stack.
   - A bug in the hypervisor.
   - Something directly modifying the back-end storage.

Of these, the statistically most likely location for the issue is probably
the storage stack on the host.

Is there still that O_DIRECT related "bug" (or more of a limitation)
if the guest is using cache=none on the block device?
I had actually forgotten about this, and I'm not quite sure if it's fixed or not.

Anton, what virtual machine tech are you using? qemu/kvm managed with
virt-manager? The configuration affects host behavior, but the
negative effect manifests inside the guest as corruption, if I
remember correctly.
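
If the guest does turn out to be managed through libvirt (an assumption; that is exactly what is being asked above), the cache mode of each disk can be read from the domain XML that `virsh dumpxml <guest>` prints. A rough sketch follows. The XML fragment, the image path, and the function name disk_cache_modes are made up for illustration; only the `cache` attribute on the disk `<driver>` element is taken from libvirt's domain format.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML fragment, shaped like `virsh dumpxml` output.
# The image path and device name are invented for this example.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/guest.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

def disk_cache_modes(domain_xml):
    """Map each disk's target device name to its configured cache mode."""
    root = ET.fromstring(domain_xml)
    modes = {}
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        dev = target.get("dev") if target is not None else "?"
        # No cache attribute means the hypervisor default is in effect.
        cache = driver.get("cache", "default") if driver is not None else "default"
        modes[dev] = cache
    return modes

print(disk_cache_modes(DOMAIN_XML))
```

A disk that reports "none" here is the case the O_DIRECT question above refers to; "unsafe" would be the actively unsafe caching mode mentioned in the list of host-side causes.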

