Re: [Gluster-users] Right way to use community Gluster on genuine RHEL?

2022-07-18 Thread Péter Károly JUHÁSZ
The best would be officially pre-built RPMs for RHEL.


Kaleb Keithley wrote on Mon, Jul 18, 2022 at 14:58:

>
>
> On Sat, Jul 16, 2022 at 5:42 PM Thomas Cameron <
> thomas.came...@camerontech.com> wrote:
>
>> All -
>>
>> Is there a way to install community packages on genuine RHEL? ... It
>> seems like I need to install
>> centos-release-gluster9-1.0-1.el8.noarch.rpm,
>> centos-release-storage-common-2-2.el8.noarch.rpm, and maybe
>> centos-release?
>>
>>
>
> Péter Károly JUHÁSZ wrote:
> > I don't know what the correct way is, but here is what I did on my RHEL 7 (I assume
> > 8 and 9 are more or less the same):
> >
> >  * Added this repo:
> >    http://mirror.centos.org/centos/7/storage/x86_64/gluster-9/
> >  * Then yum install glusterfs-server
>
> Strahil Nikolov wrote:
> > You can built the rpms from source.
> > https://docs.gluster.org/en/main/Developer-guide/Building-GlusterFS/
>
> Those all work. Building from source is maybe the hardest, but it's not
> that hard.
>
> Packages are nice because they're easy to install, update, and remove.
>
> What is the "correct" way of using gluster on RHEL 8 or, preferably, 9?
>>
>
> There isn't any one correct or official way. If packages make sense to
> you, use them. If building from source works for you, do that. Building
> your own RPMs gives you the best of both.
>
> The glusterfs.spec (and related files) is at
> https://git.centos.org/rpms/glusterfs/ if you want to build your own rpms.
>
> --
>
> Kaleb




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Right way to use community Gluster on genuine RHEL?

2022-07-18 Thread Péter Károly JUHÁSZ
I can confirm it works. I just started the migration of our RHEL clusters
to CentOS packages since Redhat is disgustinued the product and left us
behind.

Kaleb Keithley wrote on Mon, Jul 18, 2022 at 17:51:

>
>
> On Mon, Jul 18, 2022 at 11:46 AM Yaniv Kaul  wrote:
>
>>
>>
>> On Mon, Jul 18, 2022 at 6:34 PM Thomas Cameron <
>> thomas.came...@camerontech.com> wrote:
>>
>>> On 7/18/22 09:18, Péter Károly JUHÁSZ wrote:
>>> > The best would be officially pre-built RPMs for RHEL.
>>>
>>> Where are there official Red Hat Gluster 10 RPMs for RHEL?
>>>
>>
>> There's no such thing. Let's not confuse the upstream Gluster project and
>> the Red Hat product - RHGS (Red Hat Gluster Storage), which has a different
>> version[1] and lifecycle[2] than the project.
>> Red Hat does not build upstream project official RPMs for RHEL.
>>
>> That being said, I'm somewhat surprised the CentOS RPMs don't work on
>> RHEL - is that indeed the case?
>> Y.
>>
>>
>  No.  The CentOS packages work fine on RHEL.
>
> --
>
> Kaleb




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Right way to use community Gluster on genuine RHEL?

2022-07-18 Thread Péter Károly JUHÁSZ
[...] Redhat is disgustinued the product [...]

I meant "discontinued"... Stupid spell checker.:)

>
> Kaleb Keithley wrote on Mon, Jul 18, 2022 at 17:51:
>
>>
>>
>> On Mon, Jul 18, 2022 at 11:46 AM Yaniv Kaul  wrote:
>>
>>>
>>>
>>> On Mon, Jul 18, 2022 at 6:34 PM Thomas Cameron <
>>> thomas.came...@camerontech.com> wrote:
>>>
>>>> On 7/18/22 09:18, Péter Károly JUHÁSZ wrote:
>>>> > The best would be officially pre-built RPMs for RHEL.
>>>>
>>>> Where are there official Red Hat Gluster 10 RPMs for RHEL?
>>>>
>>>
>>> There's no such thing. Let's not confuse the upstream Gluster project
>>> and the Red Hat product - RHGS (Red Hat Gluster Storage), which has a different
>>> version[1] and lifecycle[2] than the project.
>>> Red Hat does not build upstream project official RPMs for RHEL.
>>>
>>> That being said, I'm somewhat surprised the CentOS RPMs don't work on
>>> RHEL - is that indeed the case?
>>> Y.
>>>
>>>
>>  No.  The CentOS packages work fine on RHEL.
>>
>> --
>>
>> Kaleb
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Right way to use community Gluster on genuine RHEL?

2022-07-17 Thread Péter Károly JUHÁSZ
Hi,

I don't know what the correct way is, but here is what I did on my RHEL 7 (I
assume 8 and 9 are more or less the same):

  * Added this repo:
    http://mirror.centos.org/centos/7/storage/x86_64/gluster-9/
  * Then yum install glusterfs-server

It works for me.
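
In shell form, those two steps are roughly the following sketch (the repo file
name, gpgcheck=0 and the service commands are my own shorthand, not anything
the repo requires; the baseurl is the one above):

  # /etc/yum.repos.d/centos-gluster9.repo
  [centos-gluster9]
  name=CentOS Storage packages for GlusterFS 9
  baseurl=http://mirror.centos.org/centos/7/storage/x86_64/gluster-9/
  enabled=1
  gpgcheck=0

  # install the server bits and start the management daemon
  yum install glusterfs-server
  systemctl enable glusterd
  systemctl start glusterd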

Regards,
Stone


Thomas Cameron wrote on Sat, Jul 16, 2022 at 23:42:

> All -
>
> Is there a way to install community packages on genuine RHEL? I mean, I
> *can* install CentOS 8 I suppose, but I kind of want to play with real
> RHEL but with community packages. It seems like I need to install
> centos-release-gluster9-1.0-1.el8.noarch.rpm,
> centos-release-storage-common-2-2.el8.noarch.rpm, and maybe centos-release?
>
> What is the "correct" way of using gluster on RHEL 8 or, preferably, 9?
>
> Thanks!
> Thomas




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] need input on configuration

2022-08-24 Thread Péter Károly JUHÁSZ
Hi Karl,

I think you should check out Syncthing too.

Karl Kleinpaste wrote on Wed, Aug 24, 2022 at 20:21:

> Apologies for the previous incomplete message. It seems an unintended
> Alt-Ret told Thunderbird to send prematurely.  So this time I'm composing
> outside Tbird so that it doesn't get that opportunity.
>
> I'm new to this world and trying to find my way around; I could use a bit
> of advice on how not to bump into corners.  I'm working a contract in which
> the client has one main office plus a remote office with inconsistent
> net.connectivity.  There will also be some very mobile laptops, sometimes a
> long way off and entirely disconnected, but when in the office it would be
> good if the results of whatever they were doing while elsewhere would be
> readily (automatically) imparted to storage there.  They have interest in
> gluster in a probable configuration based on a set of 3 servers at the main
> office and 1 at the remote.  Questions arise around how to involve the
> remote laptops (all Linux).  There is not a huge amount of data involved
> here at the moment, on the order of a few Tbytes, but it will surely grow;
> local concern in the main office is redundancy, plus making data available
> to the remote office + laptops.
>
> The data generation and usage model tends to be that a good amount of
> material is generated, but very local to each user, so that a lot of
> locality is present for who writes where.  But then everybody tends to read
> from everybody else's area.  It's almost like users have a $HOME within the
> volume, and people peruse others' $HOMEs frequently.
>
> So far, I'm just playing with configuration, to see what's possible.  At
> the moment, I've defined a 1x3 at the mains plus a 1-node volume at the
> remote. geo-replication is active mains -> remote, but I decided to see
> what would happen if I also set it up in the other direction, remote ->
> mains.  This has had surprisingly good effect, in that anybody using either
> volume gets their content replicated to the other, and everyone gets volume
> access on a local fast network.  An odd downside is that geo-rep apparently
> induces the target volume to go read-only, but I am able to turn
> features.read-only off; that seems to stick, and geo-rep continues.
>
> The laptops are a stickier problem especially in being often
> disconnected.  A straightforward-but-dumb solution is to define a 1-node
> volume on a laptop, just to get it involved in the mechanism, and then
> again geo-rep to the mains...and possibly geo-rep the mains back to the
> laptop as well.
>
> Am I taking a wrong turn, or going off a deep end?  Is gluster overkill
> for the entire question?  The choices seem obvious so far, but I'm so new
> that such a level of obviousness also seems to look as much like naïveté.
> If anyone might have a sentence or three of observation or suggestion about
> this sort of situation, I'd appreciate it.
>
> --karl




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] random disconnects of peers

2022-08-18 Thread Péter Károly JUHÁSZ
Did you try to tcpdump the connections to see who closes the connection, and
how? A normal FIN-ACK, or a timeout? Maybe some network device in between?
(The latter is less probable, since you said you can trigger the error with
high load.)
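
A minimal capture sketch for that, using the host names from the volume info
below and glusterd's management port (24007); the interface and file name are
just examples:

  # on storage-001, run until the next disconnect shows up in glusterd.log
  tcpdump -i any -nn -w /tmp/glusterd-peer.pcap \
      host storage-002.my.domain and tcp port 24007

  # afterwards, check which side sent the FIN or RST
  tcpdump -nn -r /tmp/glusterd-peer.pcap \
      'tcp[tcpflags] & (tcp-fin|tcp-rst) != 0'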

wrote on Thu, Aug 18, 2022 at 12:38:

> I just niced all glusterfsd processes on all nodes to a value of -10.
> The problem just occurred, so it seems nicing the processes didn't help.
>
> Am 18.08.2022 09:54 schrieb Péter Károly JUHÁSZ:
> > What if you renice the gluster processes to some negative value?
> >
> > wrote on Thu, Aug 18, 2022 at 09:45:
> >
> >> Hi folks,
> >>
> >> i am running multiple GlusterFS servers in multiple datacenters.
> >> Every
> >> datacenter is basically the same setup: 3x storage nodes, 3x kvm
> >> hypervisors (oVirt) and 2x HPE switches which are acting as one
> >> logical
> >> unit. The NICs of all servers are attached to both switches with a
> >> bonding of two NICs, in case one of the switches has a major
> >> problem.
> >> In one datacenter i have strange problems with the glusterfs for
> >> nearly
> >> half of a year now and i'm not able to figure out the root cause.
> >>
>> Environment
> >> - glusterfs 9.5 running on a centos 7.9.2009 (Core)
> >> - three gluster volumes, all options equally configured
> >>
> >> root@storage-001# gluster volume info
> >> Volume Name: g-volume-domain
> >> Type: Replicate
> >> Volume ID: ffd3baa5-6125-48da-a5a4-5ee3969cfbd0
> >> Status: Started
> >> Snapshot Count: 0
> >> Number of Bricks: 1 x 3 = 3
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: storage-003.my.domain:/mnt/bricks/g-volume-domain
> >> Brick2: storage-002.my.domain:/mnt/bricks/g-volume-domain
> >> Brick3: storage-001.my.domain:/mnt/bricks/g-volume-domain
> >> Options Reconfigured:
> >> client.event-threads: 4
> >> performance.cache-size: 1GB
> >> server.event-threads: 4
> >> server.allow-insecure: On
> >> network.ping-timeout: 42
> >> performance.client-io-threads: off
> >> nfs.disable: on
> >> transport.address-family: inet
> >> cluster.quorum-type: auto
> >> network.remote-dio: enable
> >> cluster.eager-lock: enable
> >> performance.stat-prefetch: off
> >> performance.io-cache: off
> >> performance.quick-read: off
> >> cluster.data-self-heal-algorithm: diff
> >> storage.owner-uid: 36
> >> storage.owner-gid: 36
> >> performance.readdir-ahead: on
> >> performance.read-ahead: off
> >> client.ssl: off
> >> server.ssl: off
> >> auth.ssl-allow:
> >>
> >
> storage-001.my.domain,storage-002.my.domain,storage-003.my.domain,hv-001.my.domain,hv-002.my.domain,hv-003.my.domain
> >> ssl.cipher-list: HIGH:!SSLv2
> >> cluster.shd-max-threads: 4
> >> diagnostics.latency-measurement: on
> >> diagnostics.count-fop-hits: on
> >> performance.io-thread-count: 32
> >>
> >> Problem
>> The glusterd on one storage node seems to lose its connection to
>> another storage node. If the problem occurs, the first message in
>> /var/log/glusterfs/glusterd.log is always the following (variable values
>> are filled with "x"):
> >> [2022-08-16 05:01:28.615441 +] I [MSGID: 106004]
> >> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management:
> >> Peer
> >>  (),
> >> in
> >> state , has disconnected from glusterd.
> >>
> >> I will post a filtered log for this specific error on each of my
> >> storage
> >> nodes below.
> >> storage-001:
> >> root@storage-001# tail -n 10 /var/log/glusterfs/glusterd.log |
> >> grep
> >> "has disconnected from" | grep "2022-08-16"
> >> [2022-08-16 05:01:28.615441 +] I [MSGID: 106004]
> >> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management:
> >> Peer
> >>  (<8bb466f6-01d6-42f2-ba75-b7a1eebc5ac6>),
> >> in
> >> state , has disconnected from glusterd.
> >> [2022-08-16 05:34:47.721060 +] I [MSGID: 106004]
> >> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management:
> >> Peer
> >>  (),
> >> in
> >> state , has disconnected from glusterd.
> >> [2022-08-16 06:01:22.472973 +] I [MSGID: 106004]
> >> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management:
> >> Peer

Re: [Gluster-users] How Does Gluster Failover

2022-08-31 Thread Péter Károly JUHÁSZ
You can also add the mount option backupvolfile-server to let the client
know about the other nodes.
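
With the names used later in this thread (gfs1/gfs2/gfs3, volume gv1, mount
point /data/gv1), the fstab line could look roughly like this;
backup-volfile-servers is the list form of that option, and _netdev is just
the usual network-filesystem hint, not something gluster requires:

  # fetch the volfile from gfs1, fall back to gfs2 or gfs3 if it is down
  gfs1:gv1  /data/gv1  glusterfs  defaults,_netdev,backup-volfile-servers=gfs2:gfs3  0 0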

Matthew J Black wrote on Wed, Aug 31, 2022 at 17:21:

> Ah, it all now falls into place: I was unaware that the client receives
> that file upon initial contact with the cluster, and thus has that
> information at hand independently of the cluster nodes.
>
> Thank you for taking the time to educate a poor newbie - it is very much
> appreciated.
>
> Cheers
>
> Dulux-Oz
> On 01/09/2022 01:16, Joe Julian wrote:
>
> You know when you do a `gluster volume info` and you get the whole volume
> definition, the client graph is built from the same info. In fact, if you
> look in /var/lib/glusterd/vols/$volume_name you'll find some ".vol" files.
> `$volume_name.tcp-fuse.vol` is the configuration that the clients receive
> from whichever server they initially connect to. You'll notice that file
> has multiple "type/client" sections, each establishing a tcp connection to
> a server.
>
> Sidenote: You can also see in that file how the microkernels are used to
> build all the logic that forms the volume, which is kinda cool. Back when I
> first started using gluster, there was no glusterd and you had to build
> those .vol files by hand.
> On 8/31/22 8:04 AM, Matthew J Black wrote:
>
> Hi Joe,
>
> Thanks for getting back to me about this, it was helpful, and I really
> appreciate it.
>
> I am, however, still (slightly) confused - *how* does the client "know"
> the addresses of the other servers in the cluster (for read or write
> purposes), when all the client has is the line in the fstab file: "gfs1:gv1
> /data/gv1  glusterfs  defaults  0 2"? I'm missing something, somewhere,
> in all of this, and I can't work out what that "something" is.  :-)
>
> Your help truely is appreciated
>
> Cheers
>
> Dulux-Oz
> On 01/09/2022 00:55, Joe Julian wrote:
>
> With a replica volume the client connects and writes to all the replicas
> directly. For reads, when a filename is looked up the client checks with
> all the replicas and, if the file is healthy, opens a read connection to
> the first replica to respond (by default).
>
> If a server is shut down, the client receives the tcp messages that close
> the connection. For read operations, it chooses the next server. Writes
> will just continue to the remaining replicas (metadata is stored in
> extended attributes to inform future lookups and the self-healer of file
> health).
>
> If a server crashes (no tcp finalization) the volume will pause for
> ping-timeout seconds (42 by default). Then continue as above. BTW, that 42
> second timeout shouldn't be a big deal. The MTBF should be sufficiently far
> apart that this should still easily get you five or six nines.
> On 8/30/22 11:55 PM, duluxoz wrote:
>
> Hi Guys & Gals,
>
> A Gluster newbie question for sure, but something I just don't "get" (or
> I've missed in the doco, mailing lists, etc):
>
> What happens to a Gluster Client when a Gluster Cluster Node goes off-line
> / fails-over?
>
> How does the Client "know" to use (connect to) another Gluster Node in the
> Gluster Cluster?
>
> Let me elaborate.
>
> I've got four hosts: gfs1, gfs2, gfs3, and client4 sitting on
> 192.168.1.1/24, .2, .3, and .4 respectively.
>
> DNS is set up and working correctly.
>
> gfs1, gs2, and gfs3 form a "Gluster Cluster" with a Gluster Volume (gv1)
> replicated across all three nodes. This is all working correctly (ie a file
> (file1) created/modified on gfs1:/gv1 is replicated correctly to gfs2:/gv1
> and gfs3:/gv1).
>
> client4 has an entry in its /etc/fstab file which reads: "gfs1:gv1
> /data/gv1  glusterfs  defaults  0 2". This is also all working correctly
> (ie client4:/data/gv1/file1 is accessible and replicated).
>
> So, (and I haven't tested this yet) what happens to client4:/data/gv1/file1
> when gfs1 fails (ie is turned off, crashes, etc)?
>
> Does client4 "automatically" switch to using one of the other two Gluster
> Nodes, or do I have something wrong in clients4's /etc/fstab file, or an
> error/mis-configuration somewhere else?
>
> I thought about setting some DNS entries along the lines of:
>
> ~~~
>
> glustercluster  IN  A  192.168.0.1
>
> glustercluster  IN  A  192.168.0.2
>
> glustercluster  IN  A  192.168.0.3
>
> ~~~
>
> and having clients4's /etc/fstab file read: "glustercluster:gv1
> /data/gv1  glusterfs  defaults  0 2", but this is a Round-Robin DNS
> config and I'm not sure how Gluster treats this situation.
>
> So, if people could comment / point me in the correct direction I would
> really appreciate it - thanks.
>
> Dulux-Oz
>

Re: [Gluster-users] Directory in split brain does not heal - Gfs 9.2

2022-08-12 Thread Péter Károly JUHÁSZ
This has always helped me in this kind of situation:
http://docs.gluster.org/Troubleshooting/resolving-splitbrain/
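
For a directory entry like the one below, the CLI route from that page looks
roughly like this (volume and path names taken from the report); when the
second command keeps failing with "Transport endpoint is not connected", the
same page also covers fixing the entry manually on the bricks, which is close
to what Joe describes below:

  # list the entries gluster itself considers to be in split-brain
  gluster volume heal gfsVol info split-brain

  # resolve the entry by keeping the copy with the newest mtime
  gluster volume heal gfsVol split-brain latest-mtime /folder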

Joe Julian wrote on Fri, Aug 12, 2022 at 18:33:

> It could work, but I never imagined, back then, that *directories* could
> get in split-brain.
>
> The most likely reason for that split is that there's a gfid mismatch on
> one of the replicas. I'd go to the brick with the odd gfid, move that
> directory out of the brick path, then do a "find folder" on the client
> mount to rebuild the directory structure. Check the directory to make sure
> all the files are right before deleting the moved odd one.
>
> If you need to fix anything, just copy from the moved directory to a
> client-mount on the same machine.
> On 8/12/22 8:12 AM, Ilias Chasapakis forumZFD wrote:
>
> Dear fellow gluster users,
>
> we are facing a problem with our replica 3 setup. Glusterfs version is 9.2.
>
> We have a problem with a directory that is in split-brain and we cannot
> manage to heal with:
>
> gluster volume heal gfsVol split-brain latest-mtime /folder
>
> The command throws the following error: "failed:Transport endpoint is not
> connected."
>
> So the split-brain directory entry remains, the whole healing
> process does not complete, and other entries get stuck.
>
> I saw there is a Python script available at
> https://github.com/joejulian/glusterfs-splitbrain - would that be a good
> solution to try? To be honest we are a bit concerned with deleting the gfid
> and the files from the brick manually as it seems it can create
> inconsistencies and break things... I can of course give you more
> information about our setup and situation, but if you already have some
> tip, that would be fantastic.
>
> Best regards
>
> Ilias
>
> --
> forumZFD
> Entschieden für Frieden | Committed to Peace
>
> Ilias Chasapakis
> Referent IT | IT Consultant
>
> Forum Ziviler Friedensdienst e.V. | Forum Civil Peace Service
> Am Kölner Brett 8 | 50825 Köln | Germany
>
> Tel 0221 91273243 | Fax 0221 91273299 | http://www.forumZFD.de
>
> Vorstand nach § 26 BGB, einzelvertretungsberechtigt | Executive Board:
> Oliver Knabe (Vorsitz | Chair), Jens von Bargen, Alexander Mauz
> VR 17651 Amtsgericht Köln
>
> Spenden | Donations: IBAN DE37 3702 0500 0008 2401 01 BIC BFSWDE33XXX
>
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] random disconnects of peers

2022-08-18 Thread Péter Károly JUHÁSZ
What if you renice the gluster processes to some negative value?
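
For example, something along these lines on each storage node, as root (-10 is
just a value to try):

  # raise the priority of the brick and management daemons on this node
  renice -n -10 -p $(pgrep -x glusterfsd) $(pgrep -x glusterd)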

wrote on Thu, Aug 18, 2022 at 09:45:

> Hi folks,
>
> i am running multiple GlusterFS servers in multiple datacenters. Every
> datacenter is basically the same setup: 3x storage nodes, 3x kvm
> hypervisors (oVirt) and 2x HPE switches which are acting as one logical
> unit. The NICs of all servers are attached to both switches with a
> bonding of two NICs, in case one of the switches has a major problem.
> In one datacenter i have strange problems with the glusterfs for nearly
> half of a year now and i'm not able to figure out the root cause.
>
> Environment
> - glusterfs 9.5 running on a centos 7.9.2009 (Core)
> - three gluster volumes, all options equally configured
>
> root@storage-001# gluster volume info
> Volume Name: g-volume-domain
> Type: Replicate
> Volume ID: ffd3baa5-6125-48da-a5a4-5ee3969cfbd0
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: storage-003.my.domain:/mnt/bricks/g-volume-domain
> Brick2: storage-002.my.domain:/mnt/bricks/g-volume-domain
> Brick3: storage-001.my.domain:/mnt/bricks/g-volume-domain
> Options Reconfigured:
> client.event-threads: 4
> performance.cache-size: 1GB
> server.event-threads: 4
> server.allow-insecure: On
> network.ping-timeout: 42
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.quick-read: off
> cluster.data-self-heal-algorithm: diff
> storage.owner-uid: 36
> storage.owner-gid: 36
> performance.readdir-ahead: on
> performance.read-ahead: off
> client.ssl: off
> server.ssl: off
> auth.ssl-allow:
>
> storage-001.my.domain,storage-002.my.domain,storage-003.my.domain,hv-001.my.domain,hv-002.my.domain,hv-003.my.domain
> ssl.cipher-list: HIGH:!SSLv2
> cluster.shd-max-threads: 4
> diagnostics.latency-measurement: on
> diagnostics.count-fop-hits: on
> performance.io-thread-count: 32
>
> Problem
> The glusterd on one storage node seems to lose its connection to
> another storage node. If the problem occurs, the first message in
> /var/log/glusterfs/glusterd.log is always the following (variable values
> are filled with "x"):
> [2022-08-16 05:01:28.615441 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (), in
> state , has disconnected from glusterd.
>
> I will post a filtered log for this specific error on each of my storage
> nodes below.
> storage-001:
> root@storage-001# tail -n 10 /var/log/glusterfs/glusterd.log | grep
> "has disconnected from" | grep "2022-08-16"
> [2022-08-16 05:01:28.615441 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (<8bb466f6-01d6-42f2-ba75-b7a1eebc5ac6>), in
> state , has disconnected from glusterd.
> [2022-08-16 05:34:47.721060 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (), in
> state , has disconnected from glusterd.
> [2022-08-16 06:01:22.472973 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (<8bb466f6-01d6-42f2-ba75-b7a1eebc5ac6>), in
> state , has disconnected from glusterd.
> root@storage-001#
>
> storage-002:
> root@storage-002# tail -n 10 /var/log/glusterfs/glusterd.log | grep
> "has disconnected from" | grep "2022-08-16"
> [2022-08-16 05:01:34.502322 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (), in
> state , has disconnected from glusterd.
> [2022-08-16 05:19:16.898406 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (), in
> state , has disconnected from glusterd.
> [2022-08-16 06:01:22.462676 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (), in
> state , has disconnected from glusterd.
> [2022-08-16 10:17:52.154501 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (), in
> state , has disconnected from glusterd.
> root@storage-002#
>
> storage-003:
> root@storage-003# tail -n 10 /var/log/glusterfs/glusterd.log | grep
> "has disconnected from" | grep "2022-08-16"
> [2022-08-16 05:24:18.225432 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (<8bb466f6-01d6-42f2-ba75-b7a1eebc5ac6>), in
> state , has disconnected from glusterd.
> [2022-08-16 05:27:22.683234 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (<8bb466f6-01d6-42f2-ba75-b7a1eebc5ac6>), in
> state , has disconnected from glusterd.
> [2022-08-16 10:17:50.624775 +] I [MSGID: 106004]
> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer
>  (<8bb466f6-01d6-42f2-ba75-b7a1eebc5ac6>), in
> 

Re: [Gluster-users] poor performance

2022-12-14 Thread Péter Károly JUHÁSZ
We did this with WordPress too. It uses tons of static files, and executing
them is the slow part. You can rsync those to local disk and keep using the
upload dir from glusterfs.
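
A rough sketch of that split (the gluster mount and document root paths are
placeholders):

  # copy code and static assets from the gluster mount to local disk,
  # leaving the upload directory out of the copy
  rsync -a --delete --exclude 'wp-content/uploads/' /mnt/gluster/site/ /var/www/site/

  # keep serving uploads from the shared volume via a symlink
  ln -sfn /mnt/gluster/site/wp-content/uploads /var/www/site/wp-content/uploads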

Jaco Kroon wrote on Wed, Dec 14, 2022 at 13:20:

> Hi,
>
> The problem is files generated by wordpress, and uploads etc ... so
> copying them to frontend hosts, whilst making perfect sense, assumes I have
> control over the code to not write to the local front-end, else we could
> have relied on something like lsync.
>
> As it stands, performance is acceptable with nl-cache enabled, but the
> fact that we get those ENOENT errors are highly problematic.
>
>
> Kind Regards,
> Jaco Kroon
>
>
> On 2022/12/14 14:04, Péter Károly JUHÁSZ wrote:
>
> When we used glusterfs for websites, we copied the web dir from gluster to
> local on frontend boots, then served it from there.
>
> Jaco Kroon wrote on Wed, Dec 14, 2022 at 12:49:
>
>> Hi All,
>>
>> We've got a glusterfs cluster that houses some php web sites.
>>
>> This is generally considered a bad idea and we can see why.
>>
>> With performance.nl-cache on it actually turns out to be very
>> reasonable; however, with this turned off, performance is roughly 5x
>> worse, meaning a request that would take sub-500ms now takes 2500ms.
>> In other cases we see far, far worse results, e.g., with nl-cache it takes
>> ~1500ms, without it ~30s (20x worse).
>>
>> So why not use nl-cache?  Well, it results in readdir reporting files
>> which then fail to open with ENOENT.  The cache also never clears, even
>> though the configuration says nl-cache entries should only be cached for
>> 60s.  Even for "ls -lah" in affected folders you'll notice question mark
>> entries for the attributes of files.  If this recovered in a reasonable time
>> (say, a few seconds), sure.
>>
>> # gluster volume info
>> Type: Replicate
>> Volume ID: cbe08331-8b83-41ac-b56d-88ef30c0f5c7
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Options Reconfigured:
>> performance.nl-cache: on
>> cluster.readdir-optimize: on
>> config.client-threads: 2
>> config.brick-threads: 4
>> config.global-threading: on
>> performance.iot-pass-through: on
>> storage.fips-mode-rchecksum: on
>> cluster.granular-entry-heal: enable
>> cluster.data-self-heal-algorithm: full
>> cluster.locking-scheme: granular
>> client.event-threads: 2
>> server.event-threads: 2
>> transport.address-family: inet
>> nfs.disable: on
>> cluster.metadata-self-heal: off
>> cluster.entry-self-heal: off
>> cluster.data-self-heal: off
>> cluster.self-heal-daemon: on
>> server.allow-insecure: on
>> features.ctime: off
>> performance.io-cache: on
>> performance.cache-invalidation: on
>> features.cache-invalidation: on
>> performance.qr-cache-timeout: 600
>> features.cache-invalidation-timeout: 600
>> performance.io-cache-size: 128MB
>> performance.cache-size: 128MB
>>
>> Are there any other recommendations, short of abandoning all hope of
>> redundancy and reverting to a single-server setup (for the web code at
>> least).  Currently the cost of the redundancy seems to outweigh the
>> benefit.
>>
>> Glusterfs version 10.2.  With patch for --inode-table-size, mounts
>> happen with:
>>
>> /usr/sbin/glusterfs --acl --reader-thread-count=2 --lru-limit=524288
>> --inode-table-size=524288 --invalidate-limit=16 --background-qlen=32
>> --fuse-mountopts=nodev,nosuid,noexec,noatime --process-name fuse
>> --volfile-server=127.0.0.1 --volfile-id=gv_home
>> --fuse-mountopts=nodev,nosuid,noexec,noatime /home
>>
>> Kind Regards,
>> Jaco
>>
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] poor performance

2022-12-14 Thread Péter Károly JUHÁSZ
When we used glusterfs for websites, we copied the web dir from gluster to
local on frontend boots, then served it from there.

Jaco Kroon wrote on Wed, Dec 14, 2022 at 12:49:

> Hi All,
>
> We've got a glusterfs cluster that houses some php web sites.
>
> This is generally considered a bad idea and we can see why.
>
> With performance.nl-cache on it actually turns out to be very
> reasonable; however, with this turned off, performance is roughly 5x
> worse, meaning a request that would take sub-500ms now takes 2500ms.
> In other cases we see far, far worse results, e.g., with nl-cache it takes
> ~1500ms, without it ~30s (20x worse).
>
> So why not use nl-cache?  Well, it results in readdir reporting files
> which then fail to open with ENOENT.  The cache also never clears, even
> though the configuration says nl-cache entries should only be cached for
> 60s.  Even for "ls -lah" in affected folders you'll notice question mark
> entries for the attributes of files.  If this recovered in a reasonable time
> (say, a few seconds), sure.
>
> # gluster volume info
> Type: Replicate
> Volume ID: cbe08331-8b83-41ac-b56d-88ef30c0f5c7
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Options Reconfigured:
> performance.nl-cache: on
> cluster.readdir-optimize: on
> config.client-threads: 2
> config.brick-threads: 4
> config.global-threading: on
> performance.iot-pass-through: on
> storage.fips-mode-rchecksum: on
> cluster.granular-entry-heal: enable
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> client.event-threads: 2
> server.event-threads: 2
> transport.address-family: inet
> nfs.disable: on
> cluster.metadata-self-heal: off
> cluster.entry-self-heal: off
> cluster.data-self-heal: off
> cluster.self-heal-daemon: on
> server.allow-insecure: on
> features.ctime: off
> performance.io-cache: on
> performance.cache-invalidation: on
> features.cache-invalidation: on
> performance.qr-cache-timeout: 600
> features.cache-invalidation-timeout: 600
> performance.io-cache-size: 128MB
> performance.cache-size: 128MB
>
> Are there any other recommendations, short of abandoning all hope of
> redundancy and reverting to a single-server setup (for the web code at
> least).  Currently the cost of the redundancy seems to outweigh the
> benefit.
>
> Glusterfs version 10.2.  With patch for --inode-table-size, mounts
> happen with:
>
> /usr/sbin/glusterfs --acl --reader-thread-count=2 --lru-limit=524288
> --inode-table-size=524288 --invalidate-limit=16 --background-qlen=32
> --fuse-mountopts=nodev,nosuid,noexec,noatime --process-name fuse
> --volfile-server=127.0.0.1 --volfile-id=gv_home
> --fuse-mountopts=nodev,nosuid,noexec,noatime /home
>
> Kind Regards,
> Jaco
>




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users