Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Strahil Nikolov
If you can afford the extra space, set the logs to TRACE and, after a reasonable timeframe, lower them back. Although RH's gluster versioning is different, this thread should help:
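As an illustration, raising and later restoring the log level could look like the following sketch (VOLNAME is a placeholder; TRACE output grows very fast):

    # raise verbosity on bricks and clients
    gluster volume set VOLNAME diagnostics.brick-log-level TRACE
    gluster volume set VOLNAME diagnostics.client-log-level TRACE
    # ... reproduce the problem, then restore the defaults
    gluster volume set VOLNAME diagnostics.brick-log-level INFO
    gluster volume set VOLNAME diagnostics.client-log-level INFO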

Re: [Gluster-users] missing files on FUSE mount

2020-10-27 Thread Strahil Nikolov
Yes, common sense suggests that any issues should be observed on the nodes that did not perform the operation. As you see the issue constantly on a single client, maybe you can reinstall the packages there and reconnect. Also, consider updating to the latest 7.X version as soon as possible and then the
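On a CentOS client, reinstalling and reconnecting might look roughly like this sketch (mount point and volume name are assumptions, not from the thread):

    umount /mnt/gluster
    yum reinstall glusterfs glusterfs-fuse glusterfs-libs
    mount -t glusterfs node1:/VOLNAME /mnt/gluster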

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Strahil Nikolov
It could be a "simple" bug - software has bugs and regressions. I would recommend pinging the Debian mailing list - at least it won't hurt. Best Regards, Strahil Nikolov On Tuesday, 27 October 2020 at 20:10:39 GMT+2, Gilberto Nunes wrote: [SOLVED] Well... It seems to me

Re: [Gluster-users] missing files on FUSE mount

2020-10-27 Thread Strahil Nikolov
Have you tried reducing the cache timeouts? I can't find your gluster version in the thread - can you share the OS + gluster version again? Best Regards, Strahil Nikolov On Tuesday, 27 October 2020 at 19:23:28 GMT+2, Martín Lorenzo wrote: Hi Strahil, today we have the same
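Lowering the timeouts could look like this sketch (the 1-second value is only an illustration, not a recommendation):

    gluster volume set VOLNAME performance.md-cache-timeout 1
    gluster volume set VOLNAME performance.nl-cache-timeout 1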

Re: [Gluster-users] Gluster monitoring

2020-10-27 Thread Alvin Starr
We have been using Zabbix for tracking gluster, but that works because we are using Zabbix for the rest of our monitoring of things like network and disk IO. One thing to track that is not part of the usual suspects is the heal counts. They should always be 0 unless you have a problem
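A minimal sketch of such a heal-count check, e.g. as a Zabbix user parameter (volume name and the parsing approach are assumptions):

    # sum the 'Number of entries' across all bricks; a healthy volume prints 0
    gluster volume heal VOLNAME info | awk '/Number of entries:/ {sum += $NF} END {print sum+0}'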

Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)

2020-10-27 Thread Strahil Nikolov
If you use the same block device for the arbiter, I would recommend running 'mkfs' again. For example, an XFS brick will be done via 'mkfs.xfs -f -i size=512 /dev/DEVICE'. Reusing a brick without recreating the FS is error-prone. Also, don't forget to create your brick dir once the device is
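Put together, preparing the brick might look like this sketch (device, mount point and brick dir are placeholders):

    mkfs.xfs -f -i size=512 /dev/DEVICE      # recreate the FS, wiping stale xattrs
    mkdir -p /bricks/arbiter
    mount /dev/DEVICE /bricks/arbiter
    mkdir /bricks/arbiter/brick              # the brick dir itself, created after mounting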

Re: [Gluster-users] missing files on FUSE mount

2020-10-27 Thread Martín Lorenzo
Hi Strahil, The versions are: CentOS Linux release 7.7.1908, glusterfs 7.3. I am setting performance.md-cache-timeout and performance.nl-cache-timeout to 120s. The weird thing about it: it always happens on the same mount where the operation (copy, mv) ran. My common sense is that any cache-related problem

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Gilberto Nunes
Not so fast with my solution! After shutting down the other node, I get the FAULTY status again... The only failure I saw in this thing regards the xattr value... [2020-10-27 19:20:07.718897] E [syncdutils(worker /DATA/vms):110:gf_mount_ready] : failed to get the xattr value Don't know if I am
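To inspect the xattrs the geo-replication worker fails on, something like this may help (getfattr comes from the attr package; the path is taken from the log line above):

    getfattr -d -m . -e hex /DATA/vms        # dump all extended attributes on the brick root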

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Gilberto Nunes
[SOLVED] Well... It seems to me that pure Debian Linux 10 has some problem with XFS, which is the FS that I used. It does not accept the attr2 mount option. Interestingly enough, having now switched to Proxmox 6.x, which is Debian based, I am able to use the attr2 mount option. Then the Faulty
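For reference, an /etc/fstab entry using attr2 might look like this sketch (device and mount point are placeholders):

    /dev/sdb1  /DATA  xfs  defaults,attr2,inode64  0 0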

Re: [Gluster-users] missing files on FUSE mount

2020-10-27 Thread Martín Lorenzo
Hi Strahil, today we have the same number of clients on all nodes, but the problem persists. I have the impression that it gets more frequent as the server capacity fills up; now we are having at least one incident per day. Regards, Martin On Mon, Oct 26, 2020 at 8:09 AM Martín Lorenzo wrote: > HI

Re: [Gluster-users] Gluster monitoring

2020-10-27 Thread WK
Sorry, I didn't notice you had already looked at gstatus. Nonetheless, with its JSON output you can certainly cover the issues you described, i.e. "When Brick went down (crash, failure, shutdown), node failure, peering issue, on-going healing", which is how we use it. -wk On 10/27/2020 9:33 AM,

Re: [Gluster-users] Gluster monitoring

2020-10-27 Thread WK
https://github.com/gluster/gstatus We run this from an Ansible-driven cronjob and check for the healthy signal in the status output, as well as looking for healing files that seem to persist. We have a number of gluster clusters and we have found its warnings both useful and timely. -wk On
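A rough sketch of such a cron check (the exact gstatus flags and output fields vary by version, so treat the -o json switch and the grep pattern as assumptions):

    #!/bin/sh
    # alert if gstatus does not report a healthy cluster
    if ! gstatus -a -o json | grep -qi healthy; then
        echo "gluster unhealthy on $(hostname)" | mail -s "gluster ALERT" admin@example.com
    fi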

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Gilberto Nunes
>> IIUC you're begging for split-brain ... Not at all! I have used this configuration and there isn't any split-brain at all! But if I do not use it, then I get a split-brain. Regarding quorum-count 2, I will look into it! Thanks --- Gilberto Nunes Ferreira On Tue, Oct 27, 2020 at 09:37, Diego

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Diego Zuccato
On 27/10/20 13:15, Gilberto Nunes wrote: > I have applied these parameters to the 2-node gluster: > gluster vol set VMS cluster.heal-timeout 10 > gluster volume heal VMS enable > gluster vol set VMS cluster.quorum-reads false > gluster vol set VMS cluster.quorum-count 1 Urgh! IIUC you're
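For contrast, a commonly suggested safer setup keeps quorum decisions automatic rather than forcing quorum-count to 1 (a sketch, not advice confirmed in this thread):

    gluster volume set VMS cluster.quorum-type auto          # majority of replicas required for writes
    gluster volume set VMS cluster.server-quorum-type server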

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Gilberto Nunes
Hi Aravinda, Let me thank you for that nice tool... It helps me a lot. And yes! Indeed I think this is the case, but why does gluster03 (which is the backup server) not continue, since gluster02 is still online?? That puzzles me... --- Gilberto Nunes Ferreira On Tue, Oct 27, 2020 at

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Gilberto Nunes
Dear Felix, I have applied these parameters to the 2-node gluster: gluster vol set VMS cluster.heal-timeout 10 gluster volume heal VMS enable gluster vol set VMS cluster.quorum-reads false gluster vol set VMS cluster.quorum-count 1 gluster vol set VMS network.ping-timeout 2 gluster volume set VMS

Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)

2020-10-27 Thread mabi
First, to answer your question how this first happened: I reached this issue simply by rebooting my arbiter node yesterday morning in order to do some maintenance, which I do on a regular basis and which was never a problem before GlusterFS 7.8. I have now removed the arbiter brick from all of
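Dropping an arbiter brick from a replica 3 arbiter volume generally takes this form (volume name and brick path are placeholders):

    gluster volume remove-brick VOLNAME replica 2 arbiter-node:/bricks/arb/brick force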

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Aravinda VK
Hi Gilberto, Happy to see the georepsetup tool is useful for you. The repo has moved to https://github.com/aravindavk/gluster-georep-tools (renamed as "gluster-georep-setup"). I think the georep command failure is due to the respective node's (peer's)
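For context, the plain gluster geo-replication commands such a tool wraps look roughly like this (volume and host names are placeholders):

    gluster volume geo-replication MASTERVOL slavehost::SLAVEVOL create push-pem
    gluster volume geo-replication MASTERVOL slavehost::SLAVEVOL start
    gluster volume geo-replication MASTERVOL slavehost::SLAVEVOL status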

Re: [Gluster-users] Geo-replication status Faulty

2020-10-27 Thread Felix Kölzow
Dear Gilberto, If I am right, you ran into server quorum when you started a 2-node replica and shut down one host. From my perspective, it's fine. Please correct me if I am wrong here. Regards, Felix On 27/10/2020 01:46, Gilberto Nunes wrote: Well, I do not reboot the host. I shut down the
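One way to check whether server quorum is in play on the volume from the thread (a sketch; both are standard gluster options):

    gluster volume get VMS cluster.server-quorum-type
    gluster volume get VMS cluster.server-quorum-ratio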

Re: [Gluster-users] Upgrade from 6.9 to 7.7 stuck (peer is rejected)

2020-10-27 Thread Diego Zuccato
On 27/10/20 07:40, mabi wrote: > First, to answer your question how this first happened: I reached that issue > simply by rebooting my arbiter node yesterday morning in order to do > some maintenance, which I do on a regular basis and was never a problem before > GlusterFS 7.8. In my