[ovirt-users] Re: VM disk corruption with LSM on Gluster
This is needed to prevent any inconsistencies stemming from buffered writes/caching of file data during live VM migration. Besides, for Gluster to truly honor direct-I/O behavior in qemu's 'cache=none' mode (which is what oVirt uses), one needs to turn on performance.strict-o-direct and disable network.remote-dio.

-Krutika

On Wed, Mar 27, 2019 at 12:24 PM Leo David wrote:
> Hi,
> I can confirm that after setting these two options, I haven't encountered
> disk corruption anymore.
> The downside is that, at least for me, it had a pretty big impact on
> performance: the IOPS really went down when running fio tests inside the VMs.
>
> On Wed, Mar 27, 2019, 07:03 Krutika Dhananjay wrote:
>
>> Could you enable strict-o-direct and disable remote-dio on the src volume
>> as well, restart the VMs on "old" and retry migration?
>>
>> # gluster volume set <VOLNAME> performance.strict-o-direct on
>> # gluster volume set <VOLNAME> network.remote-dio off
>>
>> -Krutika
>>
>> On Tue, Mar 26, 2019 at 10:32 PM Sander Hoentjen wrote:
>>
>>> On 26-03-19 14:23, Sahina Bose wrote:
>>> > +Krutika Dhananjay and gluster ml
>>> >
>>> > On Tue, Mar 26, 2019 at 6:16 PM Sander Hoentjen wrote:
>>> >> Hello,
>>> >>
>>> >> tl;dr We have disk corruption when doing live storage migration on oVirt
>>> >> 4.2 with gluster 3.12.15. Any idea why?
>>> >>
>>> >> We have a 3-node oVirt cluster that is both compute and gluster storage.
>>> >> The manager runs on separate hardware. We were running out of space on
>>> >> this volume, so we added another, bigger Gluster volume, put a
>>> >> storage domain on it, and then migrated VMs to it with LSM. After
>>> >> some time, we noticed that (some of) the migrated VMs had corrupted
>>> >> filesystems. After moving everything back with export/import to the old
>>> >> domain where possible, and recovering from backups where needed, we set
>>> >> off to investigate this issue.
>>> >>
>>> >> We are now at the point where we can reproduce this issue within a day.
>>> >> What we have found so far:
>>> >> 1) The corruption occurs at the very end of the replication step, most
>>> >> probably between START and FINISH of diskReplicateFinish, before the
>>> >> START merge step.
>>> >> 2) In the corrupted VM, in some places where data should be, that data
>>> >> is replaced by zeros. This can be file contents, a directory structure,
>>> >> or anything else.
>>> >> 3) The source gluster volume has different settings than the destination
>>> >> (mostly because the defaults were different at creation time):
>>> >>
>>> >> Setting                       old (src)  new (dst)
>>> >> cluster.op-version            30800      30800 (the same)
>>> >> cluster.max-op-version        31202      31202 (the same)
>>> >> cluster.metadata-self-heal    off        on
>>> >> cluster.data-self-heal        off        on
>>> >> cluster.entry-self-heal       off        on
>>> >> performance.low-prio-threads  16         32
>>> >> performance.strict-o-direct   off        on
>>> >> network.ping-timeout          42         30
>>> >> network.remote-dio            enable     off
>>> >> transport.address-family      -          inet
>>> >> performance.stat-prefetch     off        on
>>> >> features.shard-block-size     512MB      64MB
>>> >> cluster.shd-max-threads       1          8
>>> >> cluster.shd-wait-qlength      1024       1
>>> >> cluster.locking-scheme        full       granular
>>> >> cluster.granular-entry-heal   no         enable
>>> >>
>>> >> 4) To test, we migrate some VMs back and forth. The corruption does not
>>> >> occur every time. So far it only occurs from old to new, but we
>>> >> don't have enough data points to be sure about that.
>>> >>
>>> >> Does anybody have an idea what is causing the corruption? Is this the
>>> >> best list to ask, or should I ask on a Gluster list? I am not sure
>>> >> whether this is oVirt specific or Gluster specific.
>>> > Do you have logs from old and new gluster volumes? Any errors in the
>>> > new volume's fuse mount logs?
>>>
>>> Around the time of corruption I see the message:
>>> The message "I [MSGID: 133017] [shard.c:4941:shard_seek]
>>> 0-ZoneA_Gluster1-shard: seek called on
>>> 7fabc273-3d8a-4a49-8906-b8ccbea4a49f. [Operation not supported]" repeated
>>> 231 times between [2019-03-26 13:14:22.297333] and [2019-03-26
>>> 13:15:42.912170]
>>>
>>> I also see this message at other times, though, when I don't see the
>>> corruption occur.
>>>
>>> --
>>> Sander
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/M3T2VGGGV6DE643ZKKJUAF274VSWTJFH/
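[Editor's note] To make Krutika's point about 'cache=none' concrete: with that mode, qemu opens the disk image with O_DIRECT, so every layer below it has to honor direct I/O for the guarantee to hold. A rough, illustrative sketch of the drive definition for a gluster-backed disk follows; the host, volume, and image names are made up, and the real command line oVirt generates is far longer.

```shell
# Illustrative only -- not taken from the thread. cache=none makes qemu
# open the image with O_DIRECT (aio=native pairs with it). If the volume
# has performance.strict-o-direct off or network.remote-dio on, gluster
# can still buffer these writes, defeating the point of cache=none.
qemu-system-x86_64 \
    -drive file=gluster://gluster-host1/ZoneA_Gluster1/images/vm1.img,format=raw,cache=none,aio=native
```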
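[Editor's note] Leo's performance regression can be quantified with a quick fio run inside a guest before and after flipping the two options. This is a generic random-write job, not the exact invocation from the thread; the file name, size, and runtime are arbitrary.

```shell
# Generic 4k random-write test with direct I/O -- the workload that
# strict-o-direct affects most. Compare the reported IOPS with the
# options off vs. on. Requires fio installed inside the guest.
fio --name=randwrite --filename=/var/tmp/fio.test --size=1G \
    --rw=randwrite --bs=4k --direct=1 --ioengine=libaio \
    --iodepth=32 --runtime=60 --time_based --group_reporting
```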
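[Editor's note] The old-vs-new comparison in Sander's item 3 can be automated. A small sketch, not from the thread: feed it two "option value" dumps, in practice produced with something like `gluster volume get <VOLNAME> all | awk '{print $1, $2}'` on each volume, and it prints only the options whose values differ. The file names and inlined sample values here are illustrative stand-ins.

```shell
# Compare two volumes' option dumps and print only the differences.
# The here-docs stand in for real `gluster volume get <VOLNAME> all`
# output captured from the old and new volumes.
cat > /tmp/old_opts.txt <<'EOF'
cluster.op-version 30800
features.shard-block-size 512MB
network.remote-dio enable
performance.strict-o-direct off
EOF

cat > /tmp/new_opts.txt <<'EOF'
cluster.op-version 30800
features.shard-block-size 64MB
network.remote-dio off
performance.strict-o-direct on
EOF

# join the dumps on the option name, then keep rows whose values differ
join <(sort /tmp/old_opts.txt) <(sort /tmp/new_opts.txt) \
    | awk '$2 != $3 { printf "%-30s %-8s -> %s\n", $1, $2, $3 }'
```

Run against real dumps, this would have flagged strict-o-direct and remote-dio (and the shard-block-size mismatch) at a glance.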