I tried to measure IO using "gluster volume top", but its results seem very cryptic to me (they would need a deep analysis and I don't have the time right now).
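As a side note, a possibly more readable alternative to "top" is Gluster's built-in per-brick profiler. A minimal sketch of how it could be used (the volume name gv1 is taken from my setup; the exact output columns may vary by Gluster version):

```shell
# Enable per-brick IO statistics collection on the volume
# (adds some measurement overhead while running)
gluster volume profile gv1 start

# ... reproduce the workload, or wait for the hang window ...

# Dump cumulative and per-interval statistics: per-FOP latency
# (avg/min/max in microseconds) and read/write throughput per brick.
# Consistently high WRITE/FSYNC latencies would point at the
# underlying disks or network rather than at data integrity.
gluster volume profile gv1 info

# Disable collection when done
gluster volume profile gv1 stop
```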
Thank you very much for your analysis. If I understood correctly, the problem is that the consumer SSD cache is too weak to keep up even under a small number (~15) of not particularly IO-intensive VMs, so IO stalls as performance degrades, and this hangs the VMs. The VM kernel thinks the CPU has hung and so it crashes. This seems to be the case...

If possible, it would be very useful to have a sort of profiler in the Gluster environment that surfaces evidence of issues related to the speed of the underlying storage infrastructure, whether the problem lies in the disks or in the network. In any case, the errors reported to the user are quite misleading, as they suggest a data integrity issue ("cannot read..." or something like that).

Only for reference, these are the first lines of the "open" top command (currently I am not experiencing problems):

[root@ovirt-node2 ~]# gluster volume top gv1 open
Brick: ovirt-node2.ovirt:/brickgv1/gv1
Current open fds: 15, Max open fds: 38, Max openfd time: 2022-09-19 07:27:20.033304 +0000
Count           filename
=======================
331763          /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/inbox
66284           /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/leases
53939           /45b4f14c-8323-482f-90ab-99d8fd610018/dom_md/metadata.new
169             /45b4f14c-8323-482f-90ab-99d8fd610018/images/910fa026-d30b-4be2-9111-3c9f4f646fde/b7d6f39a-1481-4f5c-84fd-fc43f9e14d71
[...]

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/F7FKIJHYOANZM657KDZMIKC23CHXKRDS/