Background: four identical gluster servers with 15 TB each in a 2x2 setup, running:
CentOS Linux release 7.3.1611 (Core)
glusterfs-server-3.9.1-1.el7.x86_64
The client systems are using:
glusterfs-client         3.5.2-2+deb8u3

The cluster has ~12 TB in use with 21 million files.  Lots of jpgs.  About 12 
clients are mounting gluster volumes.  

Network load is light: iftop shows each server at 10-15 Mbit/s of reads and 
about half that in writes.

What I’m seeing that concerns me is that one box, gluster4, has roughly twice 
the CPU utilization and twice or more the load average of the other three 
servers.  gluster4 has a 24 hour average of about 30% CPU utilization, 
something that seems to me to be way out of line for a couple MB/sec of traffic.

In running volume top, the odd thing I see is that for gluster1-3 I get latency 
summaries like this:
Brick: gluster1.publicinteractive.com:/gluster/drupal_prod
----------------------------------------------------------
%-latency   Avg-latency   Min-Latency     Max-Latency   No. of calls       Fop
---------   -----------   -----------     -----------   ------------       ---
     9.96     675.07 us      15.00 us   1067793.00 us         205060   INODELK
    15.85    3414.20 us      16.00 us    773621.00 us          64494      READ
    51.35    2235.96 us      12.00 us   1093609.00 us         319120    LOOKUP

… but my problem server has far more inodelk latency:

    12.01    4712.03 us      17.00 us   1773590.00 us          47214      READ
    27.50    2390.27 us      14.00 us   1877571.00 us         213121   INODELK
    28.70    1643.65 us      12.00 us   1837696.00 us         323407    LOOKUP
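
To put the two summaries side by side, here is a rough sketch that multiplies 
average latency by call count to estimate the total time each brick spent in 
each operation. The numbers are copied from the output above; the server 
labels, the helper name total_seconds, and the assumption that the second 
table belongs to gluster4 are mine, and the two tables may not cover exactly 
the same time window.

```python
# Figures copied from the two latency summaries quoted above:
# (average latency in microseconds, number of calls) per file operation.
# Labelling the second table "gluster4" is an assumption; the post only
# calls it "my problem server".
profiles = {
    "gluster1": {
        "INODELK": (675.07, 205060),
        "READ": (3414.20, 64494),
        "LOOKUP": (2235.96, 319120),
    },
    "gluster4": {
        "READ": (4712.03, 47214),
        "INODELK": (2390.27, 213121),
        "LOOKUP": (1643.65, 323407),
    },
}


def total_seconds(avg_latency_us, calls):
    """Estimate total time spent in one operation, in seconds."""
    return avg_latency_us * calls / 1_000_000


for server, fops in profiles.items():
    for fop, (avg_us, calls) in fops.items():
        print(f"{server:9s} {fop:8s} {total_seconds(avg_us, calls):8.1f} s")

# How much slower the average INODELK call is on the problem server.
inodelk_ratio = profiles["gluster4"]["INODELK"][0] / profiles["gluster1"]["INODELK"][0]
print(f"INODELK avg latency ratio: {inodelk_ratio:.2f}x")
```

By this estimate the INODELK call counts are nearly the same on both bricks, 
but the problem server spends several times longer in them overall.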

The servers are intended to be identical, and are indeed identical hardware.

Suggestions on where to look or which FM to RT are very welcome indeed.

Thanks,

David



