Below is the beginning of the output of running "top -bHd d" on one of the nodes;
maybe that can help show what that glusterfsd process is doing?

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 4375 root      20   0 2856784 120492   8360 D 61.1  0.4 117:09.29 glfs_iotwr001
 4385 root      20   0 2856784 120492   8360 R 61.1  0.4 117:12.92 glfs_iotwr003
 4387 root      20   0 2856784 120492   8360 R 61.1  0.4 117:32.19 glfs_iotwr005
 4388 root      20   0 2856784 120492   8360 R 61.1  0.4 117:28.87 glfs_iotwr006
 4391 root      20   0 2856784 120492   8360 D 61.1  0.4 117:20.71 glfs_iotwr008
 4395 root      20   0 2856784 120492   8360 D 61.1  0.4 117:17.22 glfs_iotwr009
 4405 root      20   0 2856784 120492   8360 R 61.1  0.4 117:19.52 glfs_iotwr00d
 4406 root      20   0 2856784 120492   8360 R 61.1  0.4 117:29.51 glfs_iotwr00e
 4366 root      20   0 2856784 120492   8360 D 55.6  0.4 117:27.58 glfs_iotwr000
 4386 root      20   0 2856784 120492   8360 D 55.6  0.4 117:22.77 glfs_iotwr004
 4390 root      20   0 2856784 120492   8360 D 55.6  0.4 117:26.49 glfs_iotwr007
 4396 root      20   0 2856784 120492   8360 R 55.6  0.4 117:23.68 glfs_iotwr00a
 4376 root      20   0 2856784 120492   8360 D 50.0  0.4 117:36.17 glfs_iotwr002
 4397 root      20   0 2856784 120492   8360 D 50.0  0.4 117:11.09 glfs_iotwr00b
 4403 root      20   0 2856784 120492   8360 R 50.0  0.4 117:26.34 glfs_iotwr00c
 4408 root      20   0 2856784 120492   8360 D 50.0  0.4 117:27.47 glfs_iotwr00f
 9814 root      20   0 2043684  75208   8424 D 22.2  0.2  50:15.20 glfs_iotwr003
28131 root      20   0 2043684  75208   8424 R 22.2  0.2  50:07.46 glfs_iotwr004
 2208 root      20   0 2043684  75208   8424 R 22.2  0.2  49:32.70 glfs_iotwr008
 2372 root      20   0 2043684  75208   8424 R 22.2  0.2  49:52.60 glfs_iotwr009
 2375 root      20   0 2043684  75208   8424 D 22.2  0.2  49:54.08 glfs_iotwr00c
  767 root      39  19       0      0      0 R 16.7  0.0  67:50.83 dbuf_evict
 4132 onadmin   20   0   45292   4184   3176 R 16.7  0.0   0:00.04 top
28484 root      20   0 2043684  75208   8424 R 11.1  0.2  49:41.34 glfs_iotwr005
 2376 root      20   0 2043684  75208   8424 R 11.1  0.2  49:49.49 glfs_iotwr00d
 2719 root      20   0 2043684  75208   8424 R 11.1  0.2  49:58.61 glfs_iotwr00e
 4384 root      20   0 2856784 120492   8360 S  5.6  0.4   4:01.27 glfs_rpcrqhnd
 3842 root      20   0 2043684  75208   8424 S  5.6  0.2   0:30.12 glfs_epoll001
    1 root      20   0   57696   7340   5248 S  0.0  0.0   0:03.59 systemd
    2 root      20   0       0      0      0 S  0.0  0.0   0:09.57 kthreadd
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.16 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S  0.0  0.0   0:07.36 rcu_sched
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh
    9 root      rt   0       0      0      0 S  0.0  0.0   0:00.03 migration/0
   10 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 lru-add-drain
   11 root      rt   0       0      0      0 S  0.0  0.0   0:00.01 watchdog/0
   12 root      20   0       0      0      0 S  0.0  0.0   0:00.00 cpuhp/0
   13 root      20   0       0      0      0 S  0.0  0.0   0:00.00 cpuhp/1
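
From that listing it is mostly the io-thread worker threads (glfs_iotwr*) of the
two brick processes that are busy, so the load looks like it comes from actual
file operations on the bricks rather than from self-heal. One thing I thought of
trying next is the volume profiler to see which file operations dominate; a
minimal sketch, assuming the volume is called "myvol" (that name is just a
placeholder, adjust to the real volume):

  gluster volume profile myvol start
  # let it collect statistics for a minute or two while the load is high
  gluster volume profile myvol info
  gluster volume profile myvol stop

If I understand the docs correctly, "gluster volume top myvol write" should also
list the files with the most write operations per brick, which could show
whether a few files are being hammered or the writes are spread out.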

Any clues, anyone?

The load is now really high, around 20, on both nodes...
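
In the meantime I am also planning to double-check the io-threads setting and
the general brick status, in case that points to something; a small example,
again assuming the placeholder volume name "myvol":

  gluster volume get myvol performance.io-thread-count
  gluster volume status myvol

The 16 glfs_iotwr000..00f threads per brick in the top output above look like
the default io-thread-count of 16, so that part at least seems normal.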


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, November 5, 2020 11:50 AM, mabi <m...@protonmail.ch> wrote:

> Hello,
>
> I have a 3-node GlusterFS 7.8 replica setup (including an arbiter) with 3
> volumes, and the two data nodes (not the arbiter) seem to have a high load
> because the glusterfsd brick process is taking all CPU resources (12 cores).
>
> Checking these two servers with the iostat command shows that the disks are
> not very busy and that they are mostly doing write activity. On the FUSE
> clients there is not much activity, so I was wondering how to find out or
> explain why GlusterFS is currently generating such a high load on these two
> servers (the arbiter does not show any high load). There are no files
> currently healing either. This volume is the only one which has quota
> enabled, if that might be a hint. So does anyone know how to see why
> GlusterFS is so busy on a specific volume?
>
> Here is a sample "vmstat 60" of one of the nodes:
>
> onadmin@gfs1b:~$ vmstat 60
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 9 2 0 22296776 32004 260284 0 0 33 301 153 39 2 60 36 2 0
> 13 0 0 22244540 32048 260456 0 0 343 2798 10898 367652 2 80 16 1 0
> 18 0 0 22215740 32056 260672 0 0 308 2524 9892 334537 2 83 14 1 0
> 18 0 0 22179348 32084 260828 0 0 169 2038 8703 250351 1 88 10 0 0
>
> I already tried rebooting but that did not help and there is nothing special 
> in the log files either.
>
> Best regards,
> Mabi
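
P.S. Since this is the only volume with quota enabled (as mentioned in my
original message above), I am also wondering whether the quota accounting
(quotad and the marker translator) contributes to the load. A couple of things
I plan to look at, again with "myvol" as a placeholder volume name and
<quotad-pid> being the PID shown by the status command:

  gluster volume status myvol      # also lists the Quota Daemon and its PID
  gluster volume quota myvol list  # configured limits and current usage
  top -Hp <quotad-pid>             # how much CPU quotad itself is using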


