On 2018-10-16 16:27, Chris Murphy wrote:
On Tue, Oct 16, 2018 at 9:42 AM, Austin S. Hemmelgarn
<[email protected]> wrote:
On 2018-10-16 11:30, Anton Shepelev wrote:
Hello, all
What may be the reason for a CRC mismatch on a BTRFS file in
a virtual machine:
csum failed ino 175524 off 1876295680 csum 451760558
expected csum 1446289185
Shall I seek the culprit in the host machine or in the guest
one? Supposing the host machine is healthy, what operations on
the guest might have caused a CRC mismatch?
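For reference, the fields of the kernel message quoted above can be pulled apart with a few lines of Python. This is only a sketch: the pattern matches the message form shown in this thread, and the function name parse_csum_failure is made up for illustration. Other kernel versions may print extra fields or hexadecimal checksums, which this pattern does not handle.

```python
import re

# Matches only the message form quoted in this thread (decimal fields).
# Assumption: other kernels may format the line differently.
CSUM_RE = re.compile(
    r"csum failed ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<found>\d+)\s+expected csum (?P<expected>\d+)"
)


def parse_csum_failure(line):
    """Return the message fields as ints, or None if the line does not match."""
    m = CSUM_RE.search(line)
    if m is None:
        return None
    return {key: int(value) for key, value in m.groupdict().items()}


msg = ("csum failed ino 175524 off 1876295680 csum 451760558 "
       "expected csum 1446289185")
info = parse_csum_failure(msg)
print(info)
```

The inode number can then be mapped to a path with `btrfs inspect-internal inode-resolve 175524 <mountpoint>`, which tells you which file inside the guest is affected.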
Possible causes include:
* On the guest side:
- Unclean shutdown of the guest system (not likely even if this did
happen).
- A kernel bug in the guest.
- Something directly modifying the block device (also not very likely).
* On the host side:
- Unclean shutdown of the host system without properly flushing data from
the guest. Not likely unless you're using an actively unsafe caching mode
for the guest's storage back-end.
- At-rest data corruption in the storage back-end.
- A bug in the host-side storage stack.
- A transient error in the host-side storage stack.
- A bug in the hypervisor.
- Something directly modifying the back-end storage.
Of these, the statistically most likely location for the issue is probably
the storage stack on the host.
Is there still that O_DIRECT related "bug" (or more of a limitation)
if the guest is using cache=none on the block device?
I had actually forgotten about this, and I'm not quite sure if it's
fixed or not.
Anton, what virtual machine tech are you using? qemu/kvm managed with
virt-manager? The configuration affects host behavior, but the
negative effect manifests inside the guest as corruption, if I
remember correctly.
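
If the guest is managed through libvirt (an assumption, since the setup has not been confirmed yet), the cache mode of each disk is recorded in the `cache` attribute of the `<driver>` element in the domain XML. A rough sketch for listing it follows. The sample XML, the guest name and the image paths are invented for illustration; in practice the XML would come from `virsh dumpxml <domain>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML, for illustration only.
SAMPLE_XML = """
<domain type='kvm'>
  <name>guest1</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/guest1.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/data.qcow2'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disk_cache_modes(domain_xml):
    """Map each disk's target device to its configured cache mode."""
    root = ET.fromstring(domain_xml.strip())
    modes = {}
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        dev = target.get("dev") if target is not None else "?"
        cache = driver.get("cache") if driver is not None else None
        # No cache attribute means the hypervisor's default is used.
        modes[dev] = cache or "hypervisor default"
    return modes


modes = disk_cache_modes(SAMPLE_XML)
for dev, cache in modes.items():
    print(dev, cache)
```

A disk that reports "none" here is opened with O_DIRECT on the host, which is the case the question above is about.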