[ovirt-users] Gluster issue with brick going down

Chris Adams Mon, 21 Mar 2022 06:17:36 -0700

I have a hyper-converged cluster running oVirt 4.4.10 and Gluster 8.6.
Periodically, one brick of one volume drops out, and which volume and
brick is affected appears to be random.  All I see in the brick log is:


[2022-03-19 13:27:36.360727] W [MSGID: 113075] [posix-helpers.c:2135:posix_fs_health_check] 0-vmstore-posix: aio_read_cmp_buf() on /gluster_bricks/vmstore/vmstore/.glusterfs/health_check returned ret is -1 error is Structure needs cleaning
[2022-03-19 13:27:36.361160] M [MSGID: 113075] [posix-helpers.c:2214:posix_health_check_thread_proc] 0-vmstore-posix: health-check failed, going down
[2022-03-19 13:27:36.361395] M [MSGID: 113075] [posix-helpers.c:2232:posix_health_check_thread_proc] 0-vmstore-posix: still alive! -> SIGTERM
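
To see whether the failures really are random across bricks, it may help
to collect every failed health check from each brick log.  The following
is only a minimal sketch: the function name health_check_failures is made
up for illustration, and it assumes the log records look like the excerpt
above (timestamp, level letter, MSGID, then the message).  It also
re-joins records that have been wrapped across several lines.

```python
import re
from typing import Iterable, List, Tuple

# Start of a Gluster log record as seen in the excerpt above:
# "[timestamp] LEVEL [MSGID: number]".  Assumed layout, not an official spec.
RECORD_START = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) \[MSGID: (?P<msgid>\d+)\]"
)


def health_check_failures(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, error text) for each failed posix health check."""
    # Re-join records that were wrapped across several lines.
    records: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line

    failures: List[Tuple[str, str]] = []
    for rec in records:
        m = RECORD_START.match(rec)
        if m and "posix_fs_health_check" in rec and "error is" in rec:
            # Everything after "error is" is the errno text from the brick.
            error_text = rec.split("error is", 1)[1].strip()
            failures.append((m.group("ts"), error_text))
    return failures
```

Calling health_check_failures(open(path)) on each brick log (the path is
whatever your brick log location is) gives a list of timestamps and error
strings that can be compared across hosts and volumes.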

Searching around, I see references to similar issues, but no real
solutions.  One suggestion is that raising the health-check interval
(storage.health-check-interval) from 10 to 30 seconds helps, but 30
seconds appears to already be the default with this version of Gluster
(and I don't see it explicitly set for any of my volumes).
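
The effective value can be read per volume with "gluster volume get
VOLNAME storage.health-check-interval".  Below is a rough sketch of
wrapping that in Python.  The helper names are hypothetical, and the
parser assumes the usual two-column "Option / Value" table with a dashed
separator line; adjust it if your version prints something different.

```python
import subprocess
from typing import Dict

OPTION = "storage.health-check-interval"


def parse_volume_get(output: str) -> Dict[str, str]:
    """Parse a two-column 'Option / Value' table into a dict.

    Assumes a header line, a dashed separator line, then one
    'name  value' pair per line.
    """
    options: Dict[str, str] = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts[0], parts[1].strip()
        # Skip the header and the dashed separator.
        if name == "Option" or set(name) == {"-"}:
            continue
        options[name] = value
    return options


def health_check_interval(volume: str) -> str:
    """Ask the gluster CLI for the interval of one volume (needs gluster)."""
    out = subprocess.run(
        ["gluster", "volume", "get", volume, OPTION],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_volume_get(out)[OPTION]
```

Running health_check_interval("vmstore") on one of the hosts would then
report the interval actually in effect for that volume, whether or not it
was ever set explicitly.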

While "Structure needs cleaning" appears to be an XFS filesystem error,
I don't see any XFS errors from the kernel.
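
For reference, the text in the brick log is just the C library's string
for whatever errno the read returned.  A small sketch for mapping such a
message back to its symbolic errno name follows; errno_names_for is a
made-up helper name, and the result depends on the platform's C library.
On Linux with glibc I would expect "Structure needs cleaning" to map to
EUCLEAN, which is the code XFS uses to report on-disk corruption.

```python
import errno
import os
from typing import List


def errno_names_for(message: str) -> List[str]:
    """Return the symbolic errno names whose strerror() text equals message."""
    return sorted(
        name
        for code, name in errno.errorcode.items()
        if os.strerror(code) == message
    )


if __name__ == "__main__":
    # Look up the message seen in the brick log.
    print(errno_names_for("Structure needs cleaning"))
```

This only identifies the error code; it says nothing about which layer
(filesystem, block device, or something else) produced it.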

This is a low-I/O cluster: the storage network runs on two 10-gigabit
switches with a two-port LAG to each server, but it typically carries
only a few tens of megabits per second.

-- 
Chris Adams <c...@cmadams.net>
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ACE5G25RRGOE4MADK4MYJJFAIDP5BZCJ/
