Sorry for necrobumping this thread, but this morning I suffered the same thing on my Proxmox + GlusterFS cluster. In the log I can see this:
[2022-11-21 07:38:00.213620 +0000] I [MSGID: 133017] [shard.c:7275:shard_seek] 11-vmdata-shard: seek called on fbc063cb-874e-475d-b585-f89f7518acdd. [Operation not supported]
pending frames:
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
...
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
frame : type(1) op(FSYNC)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2022-11-21 07:38:00 +0000
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 10.3
/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x28a54)[0x7f74f286ba54]
/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x700)[0x7f74f2873fc0]
/lib/x86_64-linux-gnu/libc.so.6(+0x38d60)[0x7f74f262ed60]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x37a14)[0x7f74ecfcea14]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x19414)[0x7f74ecfb0414]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x16373)[0x7f74ecfad373]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x21d59)[0x7f74ecfb8d59]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x22815)[0x7f74ecfb9815]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x377d9)[0x7f74ecfce7d9]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x19414)[0x7f74ecfb0414]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x16373)[0x7f74ecfad373]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x170f9)[0x7f74ecfae0f9]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/cluster/disperse.so(+0x313bb)[0x7f74ecfc83bb]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/protocol/client.so(+0x48e3a)[0x7f74ed06ce3a]
/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xfccb)[0x7f74f2816ccb]
/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x26)[0x7f74f2812646]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/rpc-transport/socket.so(+0x64c8)[0x7f74ee15f4c8]
/usr/lib/x86_64-linux-gnu/glusterfs/10.3/rpc-transport/socket.so(+0xd38c)[0x7f74ee16638c]
/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7971d)[0x7f74f28bc71d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7f74f27d2ea7]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f74f26f2aef]
---------

The mount point wasn't accessible ("Transport endpoint is not connected") and it was shown like this:

d????????? ? ? ? ? ? vmdata

I had to stop all the VMs on that Proxmox node, then stop the Gluster daemon to unmount the directory; after starting the daemon and re-mounting, everything was working again.
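In case it helps anyone else hitting this, the recovery sequence was roughly the following (a minimal sketch: the mount point /mnt/pve/vmdata and the g01:/vmdata mount source are assumptions based on my setup, adjust them to your storage layout):

# Stop every VM running on the affected Proxmox node first, e.g.:
qm stop <vmid>

# Stop the Gluster daemon so the stale FUSE mount can be released:
systemctl stop glusterd

# Lazy-unmount the dead mount point; -l helps when processes still
# hold open file handles on the disconnected mount:
umount -l /mnt/pve/vmdata

# Start the daemon again and re-mount the volume:
systemctl start glusterd
mount -t glusterfs g01:/vmdata /mnt/pve/vmdata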
My gluster volume info returns this:

Volume Name: vmdata
Type: Distributed-Disperse
Volume ID: cace5aa4-b13a-4750-8736-aa179c2485e1
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: g01:/data/brick1/brick
Brick2: g02:/data/brick2/brick
Brick3: g03:/data/brick1/brick
Brick4: g01:/data/brick2/brick
Brick5: g02:/data/brick1/brick
Brick6: g03:/data/brick2/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
features.shard: enable
features.shard-block-size: 256MB
performance.read-ahead: off
performance.quick-read: off
performance.io-cache: off
server.event-threads: 2
client.event-threads: 3
performance.client-io-threads: on
performance.stat-prefetch: off
dht.force-readdirp: off
performance.force-readdirp: off
network.remote-dio: on
features.cache-invalidation: on
performance.parallel-readdir: on
performance.readdir-ahead: on

Xavi, do you think setting open-behind to off could help here somehow? I did try to understand what it does (with no luck) and whether it could impact the performance of my VMs (I have the setup you know so well ;)). I would like to avoid more crashes like this; Gluster 10.3 had been running quite well for the last two weeks, until this morning. (I've put a quick check/disable sketch for open-behind after the quoted thread below.)

*Angel Docampo*
<angel.doca...@eoniantec.com>
<+34-93-1592929>

On Fri, 19 Mar 2021 at 2:10, David Cunningham (<dcunning...@voisonics.com>) wrote:

> Hi Xavi,
>
> Thank you for that information. We'll look at upgrading it.
>
> On Fri, 12 Mar 2021 at 05:20, Xavi Hernandez <jaher...@redhat.com> wrote:
>
>> Hi David,
>>
>> with so little information it's hard to tell, but given that there are
>> several OPEN and UNLINK operations, it could be related to an already
>> fixed bug (in recent versions) in open-behind.
>>
>> You can try disabling open-behind with this command:
>>
>> # gluster volume set <volname> open-behind off
>>
>> But given the version you are using is very old and unmaintained, I
>> would recommend you to upgrade to 8.x at least.
>>
>> Regards,
>>
>> Xavi
>>
>> On Wed, Mar 10, 2021 at 5:10 AM David Cunningham <
>> dcunning...@voisonics.com> wrote:
>>
>>> Hello,
>>>
>>> We have a GlusterFS 5.13 server which also mounts itself with the
>>> native FUSE client. Recently the FUSE mount crashed and we found the
>>> following in the syslog. There isn't anything logged in
>>> mnt-glusterfs.log for that time. After killing all processes with a
>>> file handle open on the filesystem we were able to unmount and then
>>> remount the filesystem successfully.
>>>
>>> Would anyone have advice on how to debug this crash? Thank you in
>>> advance!
>>>
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: pending frames:
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(0) op(0)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(0) op(0)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(1) op(UNLINK)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(1) op(UNLINK)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(1) op(OPEN)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: message repeated 3355 times: [ frame : type(1) op(OPEN)]
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(1) op(OPEN)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: message repeated 6965 times: [ frame : type(1) op(OPEN)]
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(1) op(OPEN)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: message repeated 4095 times: [ frame : type(1) op(OPEN)]
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: frame : type(0) op(0)
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: patchset: git://git.gluster.org/glusterfs.git
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: signal received: 11
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: time of crash:
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: 2021-03-09 03:12:31
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: configuration details:
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: argp 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: backtrace 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: dlfcn 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: libpthread 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: llistxattr 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: setfsid 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: spinlock 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: epoll.h 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: xattr.h 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: st_atim.tv_nsec 1
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: package-string: glusterfs 5.13
>>> Mar 9 05:12:31 voip1 mnt-glusterfs[2932]: ---------
>>> ...
>>> Mar 9 05:13:50 voip1 systemd[1]: glusterfssharedstorage.service: Main process exited, code=killed, status=11/SEGV
>>> Mar 9 05:13:50 voip1 systemd[1]: glusterfssharedstorage.service: Failed with result 'signal'.
>>> ...
>>> Mar 9 05:13:54 voip1 systemd[1]: glusterfssharedstorage.service: Service hold-off time over, scheduling restart.
>>> Mar 9 05:13:54 voip1 systemd[1]: glusterfssharedstorage.service: Scheduled restart job, restart counter is at 2.
>>> Mar 9 05:13:54 voip1 systemd[1]: Stopped Mount glusterfs sharedstorage.
>>> Mar 9 05:13:54 voip1 systemd[1]: Starting Mount glusterfs sharedstorage...
>>> Mar 9 05:13:54 voip1 mount-shared-storage.sh[20520]: ERROR: Mount point does not exist
>>> Mar 9 05:13:54 voip1 mount-shared-storage.sh[20520]: Please specify a mount point
>>> Mar 9 05:13:54 voip1 mount-shared-storage.sh[20520]: Usage:
>>> Mar 9 05:13:54 voip1 mount-shared-storage.sh[20520]: man 8 /sbin/mount.glusterfs
>>>
>>> --
>>> David Cunningham, Voisonics Limited
>>> http://voisonics.com/
>>> USA: +1 213 221 1092
>>> New Zealand: +64 (0)28 2558 3782
>
> --
> David Cunningham, Voisonics Limited
> http://voisonics.com/
> USA: +1 213 221 1092
> New Zealand: +64 (0)28 2558 3782
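Coming back to the open-behind question above, this is a minimal sketch of how I'd check and change the option, based on Xavi's suggestion earlier in the thread (vmdata is my volume name; whether disabling it actually helps with this disperse-volume crash is exactly what I'm asking):

# Show the current (or default) value of open-behind on the volume:
gluster volume get vmdata open-behind

# Disable it, as suggested earlier in the thread:
gluster volume set vmdata open-behind off

# Revert to the default if VM performance degrades:
gluster volume reset vmdata open-behind

As far as I understand, open-behind is a performance translator that lets open() return immediately and performs the real open on the first operation that needs it, which mainly benefits open-heavy small-file workloads; for a few large VM images the impact of disabling it should be small, but I'd be glad to be corrected.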
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users