[Gluster-users] VM freeze issue on simple gluster setup.
I have a replica 2 + arbiter setup that is used for VMs. IP #.1 is the arbiter; IPs #.2 and #.3 are the KVM hosts. Two volumes are involved, and it's Gluster 6.5 / Ubuntu 18.04 / FUSE. The Gluster networking uses a two-Ethernet-card teamd/round-robin setup, which *should* have stayed up if one of the ports had failed.

I just had a number of VMs go read-only due to the communication failure below at 22:00, but only on KVM host #2. VMs on the same Gluster volumes on KVM host #3 were unaffected. The logs on host #2 show the following:

[2019-12-05 22:00:43.739804] C [rpc-clnt-ping.c:155:rpc_clnt_ping_timer_expired] 0-GL1image-client-2: server 10.255.1.1:49153 has not responded in the last 21 seconds, disconnecting.
[2019-12-05 22:00:43.757095] C [rpc-clnt-ping.c:155:rpc_clnt_ping_timer_expired] 0-GL1image-client-1: server 10.255.1.3:49152 has not responded in the last 21 seconds, disconnecting.
[2019-12-05 22:00:43.757191] I [MSGID: 114018] [client.c:2323:client_rpc_notify] 0-GL1image-client-2: disconnected from GL1image-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2019-12-05 22:00:43.757246] I [MSGID: 114018] [client.c:2323:client_rpc_notify] 0-GL1image-client-1: disconnected from GL1image-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2019-12-05 22:00:43.757266] W [MSGID: 108001] [afr-common.c:5608:afr_notify] 0-GL1image-replicate-0: Client-quorum is not met
[2019-12-05 22:00:43.790639] E [rpc-clnt.c:346:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x139)[0x7f030d045f59] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcbb0)[0x7f030cdf0bb0] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xccce)[0x7f030cdf0cce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x95)[0x7f030cdf1c45] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xe890)[0x7f030cdf2890] ) 0-GL1image-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(FXATTROP(34)) called at 2019-12-05 22:00:19.736456 (xid=0x825bffb)
[2019-12-05 22:00:43.790655] W [MSGID: 114031] [client-rpc-fops_v2.c:1614:client4_0_fxattrop_cbk] 0-GL1image-client-2: remote operation failed
[2019-12-05 22:00:43.790686] E [rpc-clnt.c:346:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x139)[0x7f030d045f59] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcbb0)[0x7f030cdf0bb0] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xccce)[0x7f030cdf0cce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x95)[0x7f030cdf1c45] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xe890)[0x7f030cdf2890] ) 0-GL1image-client-1: forced unwinding frame type(GlusterFS 4.x v1) op(FXATTROP(34)) called at 2019-12-05 22:00:19.736428 (xid=0x89fee01)
[2019-12-05 22:00:43.790703] W [MSGID: 114031] [client-rpc-fops_v2.c:1614:client4_0_fxattrop_cbk] 0-GL1image-client-1: remote operation failed
[2019-12-05 22:00:43.790774] E [MSGID: 114031] [client-rpc-fops_v2.c:1393:client4_0_finodelk_cbk] 0-GL1image-client-1: remote operation failed [Transport endpoint is not connected]
[2019-12-05 22:00:43.790777] E [rpc-clnt.c:346:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x139)[0x7f030d045f59] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcbb0)[0x7f030cdf0bb0] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xccce)[0x7f030cdf0cce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x95)[0x7f030cdf1c45] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xe890)[0x7f030cdf2890] ) 0-GL1image-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(FXATTROP(34)) called at 2019-12-05 22:00:19.736542 (xid=0x825bffc)
[2019-12-05 22:00:43.790794] W [MSGID: 114029] [client-rpc-fops_v2.c:4873:client4_0_finodelk] 0-GL1image-client-1: failed to send the fop
[2019-12-05 22:00:43.790806] W [MSGID: 114031] [client-rpc-fops_v2.c:1614:client4_0_fxattrop_cbk] 0-GL1image-client-2: remote operation failed
[2019-12-05 22:00:43.790825] E [MSGID: 114031] [client-rpc-fops_v2.c:1393:client4_0_finodelk_cbk] 0-GL1image-client-2: remote operation failed [Transport endpoint is not connected]
[2019-12-05 22:00:43.790842] W [MSGID: 114029] [client-rpc-fops_v2.c:4873:client4_0_finodelk] 0-GL1image-client-2: failed to send the fop

The "failed to send the fop" / "Transport endpoint is not connected" errors just repeat for another 50 lines or so until 22:00:46, at which point the volumes appear to be fine (though the VMs were still read-only until I rebooted).

[2019-12-05 22:00:46.987242] W [fuse-bridge.c:2827:fuse_readv_cbk] 0-glusterfs-fuse: 91701328: READ => -1 gfid=d883b7c4-97f5-4f12-9373-7987cfc7dee4 fd=0x7f02f005b708 (Transport endpoint is not connected)
[2019-12-05 22:00:47.029947] W [fuse-bridge.c:2827:fuse_readv_cbk] 0-glusterfs-fuse: 91701329: READ => -1 gfid=d883b7c4-97f5-4f12-9373-7987cfc7dee4
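When comparing what the two KVM hosts saw, it can help to pull only the ping-timeout disconnects out of each client log. The following is a minimal sketch, not a Gluster tool: the regular expression is written against the two rpc_clnt_ping_timer_expired lines quoted in this message only, and the names PingTimeout and find_ping_timeouts are made up for illustration.

```python
import re
from typing import Iterable, List, NamedTuple


class PingTimeout(NamedTuple):
    """One ping-timeout disconnect (hypothetical helper type)."""
    timestamp: str
    client: str
    server: str
    seconds: int


# Pattern derived only from the log lines quoted in this thread.
PING_TIMEOUT_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] C "
    r"\[rpc-clnt-ping\.c:\d+:rpc_clnt_ping_timer_expired\] "
    r"(?P<client>\S+): server (?P<server>\S+) "
    r"has not responded in the last (?P<seconds>\d+) seconds"
)

# The two disconnect lines from the message, copied verbatim.
SAMPLE_LOG = [
    "[2019-12-05 22:00:43.739804] C [rpc-clnt-ping.c:155:rpc_clnt_ping_timer_expired] "
    "0-GL1image-client-2: server 10.255.1.1:49153 has not responded in the last "
    "21 seconds, disconnecting.",
    "[2019-12-05 22:00:43.757095] C [rpc-clnt-ping.c:155:rpc_clnt_ping_timer_expired] "
    "0-GL1image-client-1: server 10.255.1.3:49152 has not responded in the last "
    "21 seconds, disconnecting.",
]


def find_ping_timeouts(lines: Iterable[str]) -> List[PingTimeout]:
    """Return every ping-timeout disconnect found in the given log lines."""
    events = []
    for line in lines:
        match = PING_TIMEOUT_RE.search(line)
        if match:
            events.append(
                PingTimeout(
                    timestamp=match["ts"],
                    client=match["client"],
                    server=match["server"],
                    seconds=int(match["seconds"]),
                )
            )
    return events


if __name__ == "__main__":
    for event in find_ping_timeouts(SAMPLE_LOG):
        print(event.timestamp, event.client, event.server, event.seconds)
```

Running the same function over the client logs of hosts #2 and #3 would show whether host #3 logged any ping timeouts in the same window.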
Re: [Gluster-users] In-place volume type conversion
On Thu, Dec 5, 2019 at 7:37 AM Dmitry Antipov wrote:

> Is it technically possible/feasible to implement an in-place volume
> conversion, at least for volumes with the same number of bricks (say,
> from 'replica 3' to 'disperse 3')? If so, any thoughts on initial steps?

The backend layouts for replicate and disperse are different. Hence it is not recommended to try out an in-place volume conversion for these types.

Regards,
Vijay

Community Meeting Calendar:

APAC Schedule - Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule - Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] In-place volume type conversion
Is it technically possible/feasible to implement an in-place volume conversion, at least for volumes with the same number of bricks (say, from 'replica 3' to 'disperse 3')? If so, any thoughts on initial steps?

Dmitry
[Gluster-users] glusterfind can't find module utils.py
Dear Gluster Community,

I tried to run glusterfind on my SLES15-SP1 machine using GlusterFS v5.10, but I got this error when running it as root:

fs-davids-c1-n1:~ # glusterfind create --help
Traceback (most recent call last):
  File "/usr/bin/glusterfind", line 14, in <module>
    from glusterfind.main import main
  File "/usr/lib/glusterfs/glusterfind/main.py", line 26, in <module>
    from utils import execute, is_host_local, mkdirp, fail
ModuleNotFoundError: No module named 'utils'

The missing module does not seem to be a default python3 utils module, because I found a utils.py in this path that defines the functions execute, is_host_local, mkdirp and fail:

fs-davids-c1-n1:~ # ll /usr/lib/glusterfs/glusterfind/
total 112
-rwxr-xr-x 1 root root  1846 Oct 11 04:53 S57glusterfind-delete-post.py
-rw-r--r-- 1 root root   381 Oct 11 04:53 __init__.py
drwxr-xr-x 2 root root  4096 Dec  4 10:03 __pycache__
-rwxr-xr-x 1 root root  3737 Oct 11 04:53 brickfind.py
-rwxr-xr-x 1 root root 15342 Oct 11 04:53 changelog.py
-rw-r--r-- 1 root root 16079 Oct 11 04:53 changelogdata.py
-rw-r--r-- 1 root root   851 Oct 11 04:53 conf.py
-rw-r--r-- 1 root root  2507 Oct 11 04:53 libgfchangelog.py
-rw-r--r-- 1 root root 33317 Oct 11 04:53 main.py
-rwxr-xr-x 1 root root  5092 Oct 11 04:53 nodeagent.py
-rw-r--r-- 1 root root   365 Oct 11 04:53 tool.conf
-rw-r--r-- 1 root root  7195 Oct 11 04:53 utils.py

Why can't glusterfind find this utils.py?

Regards
David Spisla
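One possible explanation, not confirmed in this thread: under Python 3, "from utils import ..." inside a package is an absolute import, so Python searches sys.path for a top-level module and does not look in the directory of the importing file. A sibling module is only found through an explicit relative import ("from .utils import ...") or when the package directory itself is on sys.path. The sketch below reproduces that behaviour with a throwaway package. All names in it (demo_pkg_implicit, demo_pkg_explicit, demo_helpers) are hypothetical stand-ins and not part of glusterfind.

```python
import importlib
import os
import sys
import tempfile


def build_package(root, pkg_name, import_line):
    """Create a throwaway package containing demo_helpers.py and main.py."""
    pkg_dir = os.path.join(root, pkg_name)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "demo_helpers.py"), "w") as f:
        f.write("def execute():\n    return 'ok'\n")
    # main.py imports its sibling module using the given import statement.
    with open(os.path.join(pkg_dir, "main.py"), "w") as f:
        f.write(import_line + "\n")


def import_result(import_line, pkg_name):
    """Import <pkg_name>.main and report whether the sibling import worked."""
    with tempfile.TemporaryDirectory() as root:
        build_package(root, pkg_name, import_line)
        # Only the parent directory goes on sys.path, as for an installed package.
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module(pkg_name + ".main")
            return "ok"
        except ModuleNotFoundError as exc:
            return "ModuleNotFoundError: " + str(exc)
        finally:
            sys.path.remove(root)
            # Forget the throwaway modules so repeated calls start clean.
            stale = [m for m in sys.modules
                     if m == pkg_name or m.startswith(pkg_name + ".")]
            for name in stale:
                del sys.modules[name]


if __name__ == "__main__":
    # Python 2 style sibling import: treated as absolute in Python 3.
    print(import_result("from demo_helpers import execute", "demo_pkg_implicit"))
    # Explicit relative import: resolved inside the package.
    print(import_result("from .demo_helpers import execute", "demo_pkg_explicit"))
```

If this is what happens here, the installed main.py would have been written for Python 2 style imports while /usr/bin/glusterfind runs under python3. That is an assumption to check against the packaged sources, not a confirmed diagnosis.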
Re: [Gluster-users] Trying to fix files that don't want to heal
On Monday, 02.12.2019, at 02:15 -0500, Ashish Pandey wrote:
>
> From: "Gudrun Mareike Amedick"
> To: "Ashish Pandey"
> Cc: "Gluster-users"
> Sent: Friday, November 29, 2019 8:45:13 PM
> Subject: Re: [Gluster-users] Trying to fix files that don't want to heal
>
> Hi Ashish,
>
> thanks for your reply. To fulfill the "no IO" requirement, I'll have to wait
> until the second week of December (9th – 14th).
>
> We originally planned to update GlusterFS from 4.1.7 to 5 and then to 6 in
> December. Should we do that upgrade before or after running those
> scripts?
> The best
>
> >> It will be best if you could do it before upgrading to a newer version.
> BTW, why are you not planning to upgrade to gluster 7?

Okay, will do. We have a ~400 TB volume. Restoring it is no fun. As a result, we have a pretty conservative updating schedule.

>
> Kind regards
>
> Gudrun
> On Friday, 29.11.2019, at 00:38 -0500, Ashish Pandey wrote:
> > Hey Gudrun,
> >
> > Could you please try to use the scripts to resolve it?
> > We have written some scripts, and they are in the final phase of getting merged -
> > https://review.gluster.org/#/c/glusterfs/+/23380/
> >
> > You can find the steps to use these scripts in the README.md file.
> >
> > ---
> > Ashish
> >
> > From: "Gudrun Mareike Amedick"
> > To: "Gluster-users"
> > Sent: Thursday, November 28, 2019 3:57:18 PM
> > Subject: [Gluster-users] Trying to fix files that don't want to heal
> >
> > Hi,
> >
> > I have a distributed dispersed volume with files that don't want to heal.
> > I'm trying to fix them manually.
> >
> > I'm currently working on a file that is present on all bricks, GFID exists
> > in .glusterfs-structure and getfattr shows identical attributes for all
> > files. They look like this:
> >
> > # getfattr -m. -d -e hex $brick/somepath/libssl.so.1.1
> > getfattr: Removing leading '/' from absolute path names
> > # file: $brick/$somepath/libssl.so.1.1
> > trusted.ec.config=0x080602000200
> > trusted.ec.dirty=0x0001
> > trusted.ec.size=0x000a
> > trusted.ec.version=0x00040005
> > trusted.gfid=0xdd7dd64f6bb34b5f891a5e32fe83874f
> > trusted.gfid2path.0c3a5b76c518ef60=0x34663064396234332d343730342d343634352d613834342d3338303532336137346632662f6c696273736c2e736f2e312e31
> > trusted.gfid2path.578ce2ec37aa0f9d=0x31636136323433342d396132642d343039362d616265352d6463353065613131333066632f6c696273736c2e736f2e312e31
> > trusted.glusterfs.quota.1ca62434-9a2d-4096-abe5-dc50ea1130fc.contri.3=0x00029201
> > trusted.glusterfs.quota.4f0d9b43-4704-4645-a844-380523a74f2f.contri.3=0x00029201
> > trusted.pgfid.1ca62434-9a2d-4096-abe5-dc50ea1130fc=0x0001
> > trusted.pgfid.4f0d9b43-4704-4645-a844-380523a74f2f=0x0001
> >
> > pgfid is "parent gfid", right? Both GFIDs refer to a dir in my volume, and
> > both of those dirs contain a file named libssl.so.1.1. They seem to be
> > hardlinks:
> >
> > find $brick/$somepath -samefile $brick/$someotherpath/libssl.so.1.1
> > $brick/$somepath/libssl.so.1
> >
> > This exceeds the limits of my GlusterFS knowledge. Is that something that
> > can/should happen? If not, is it the reason that file won't heal, and how
> > do I fix that?
> >
> > Kind regards
> >
> > Gudrun Amedick
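The two trusted.gfid2path values in the getfattr output above are hex-encoded ASCII, so they can be decoded to see which parent directory and file name each link records. The sketch below assumes the decoded value has the form "<parent gfid>/<basename>", which is how the two values from this thread read once decoded; decode_gfid2path is a hypothetical helper name, not a Gluster command.

```python
def decode_gfid2path(hex_value):
    """Decode a hex-encoded gfid2path xattr value into (parent_gfid, basename).

    Assumes the value is ASCII of the form "<parent gfid>/<basename>".
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    text = bytes.fromhex(digits).decode("ascii")
    parent_gfid, _, basename = text.partition("/")
    return parent_gfid, basename


# Values copied verbatim from the getfattr output in this thread.
XATTRS = {
    "trusted.gfid2path.0c3a5b76c518ef60":
        "0x34663064396234332d343730342d343634352d613834342d"
        "3338303532336137346632662f6c696273736c2e736f2e312e31",
    "trusted.gfid2path.578ce2ec37aa0f9d":
        "0x31636136323433342d396132642d343039362d616265352d"
        "6463353065613131333066632f6c696273736c2e736f2e312e31",
}

if __name__ == "__main__":
    for name, value in XATTRS.items():
        parent_gfid, basename = decode_gfid2path(value)
        print(name, "->", parent_gfid, basename)
```

Both values decode to libssl.so.1.1 under the two parent GFIDs that also appear in the trusted.pgfid.* keys, which is consistent with the file having two hard links in different directories. Whether that is related to the heal problem is not answered in this thread.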