Dear All-
The excessive CPU load problem seems to have been caused by the problematic upgrade and subsequent downgrade I reported in the following thread:

http://www.gluster.org/pipermail/gluster-users/2012-November/034643.html

When downgrading to 3.3.0 with yum, removal of the glusterfs-server-3.3.1-1 package failed because of an RPM script error. On the CentOS-5 servers "yum remove glusterfs-3.3.1-1.el5" did the trick, but on CentOS-6 I had to forcibly remove the package with "rpm -e --noscripts glusterfs-server-3.3.1-1.el6.x86_64". I later discovered that the UUID value in /var/lib/glusterd/glusterd.info on the CentOS-6 servers had changed, and that those servers were listing themselves in the output of "gluster peer status".

I found the original UUIDs for the CentOS-6 servers by looking at the file names in /var/lib/glusterd/peers on other servers, like this:

[root@remus peers]# grep romulus /var/lib/glusterd/peers/*
/var/lib/glusterd/peers/cb21050d-05c2-42b3-8660-230954bab324:hostname1=romulus.nerc-essc.ac.uk
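To collect the original UUIDs for several servers in one go, a helper along these lines could work. This is only a sketch: uuid_for is my own name, not a Gluster command, and the optional directory argument exists just to make it easy to try out; the real peers directory is /var/lib/glusterd/peers.

```shell
# Hypothetical helper (not a Gluster tool): print the UUID of the peer
# that records a given hostname. In /var/lib/glusterd/peers each file is
# named after the peer's UUID, so the file name is the answer.
uuid_for() {
    local host=$1 dir=${2:-/var/lib/glusterd/peers}
    grep -l "hostname1=${host}" "$dir"/* 2>/dev/null | xargs -rn1 basename
}

# Example, using the hostname from the grep above:
# uuid_for romulus.nerc-essc.ac.uk
```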

With glusterd stopped on all the servers, I changed the "UUID=" line in /var/lib/glusterd/glusterd.info back to the original value on each server. With glusterd running again everywhere, everything seemed to go back to normal, except for a lot of self-heal activity on the servers that had been suffering from the excessive load problem. I presume a lot of xattr errors had been caused by those servers not talking to the others properly while the load was so high.
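For the record, the per-server edit can be scripted rather than done by hand. This is just a sketch of the step described above: fix_uuid is my own helper name, the UUID shown is only the example from earlier, and glusterd must be stopped on all servers before the file is touched.

```shell
# Hypothetical helper, not a Gluster tool: rewrite the UUID= line in a
# glusterd.info-style file. Only run this with glusterd stopped everywhere.
fix_uuid() {
    local file=$1 uuid=$2
    sed -i "s/^UUID=.*/UUID=${uuid}/" "$file"
}

# On each affected server, with its original UUID recovered from the
# peers files on another server:
# fix_uuid /var/lib/glusterd/glusterd.info cb21050d-05c2-42b3-8660-230954bab324
```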

While looking back at what I did in order to write this message, I have just discovered another UUID-related problem. On some servers the files in /var/lib/glusterd/peers contain the wrong UUID: the "UUID=" line in each of those files should match the file name, but on some servers it doesn't. I haven't noticed any adverse effects yet, except that "gluster volume status" fails on the CentOS-6 servers that were messed up by the problematic downgrade to 3.3.0. I suppose I will have to stop glusterd on all the servers again and correct these errors by hand. I have 21 servers, so it will take a while, but it could be worse I suppose. I would be interested to know if there is a quicker way to recover from a mess like this; any suggestions?
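For what it's worth, the mismatch check across all 21 servers could be scripted rather than done by eye. A minimal sketch, with two assumptions worth checking first: check_peers is my own name, and the key inside each peers file is spelled lowercase "uuid=" here (adjust if yours differs). As before, glusterd should be stopped everywhere before any file is rewritten.

```shell
# Hypothetical helper: report peers files whose uuid= line does not match
# the file name, and rewrite them when "fix" is passed as the second
# argument. Assumes the key is lowercase "uuid=" in the peers files.
check_peers() {
    local dir=${1:-/var/lib/glusterd/peers} mode=${2:-report} f name uuid
    for f in "$dir"/*; do
        name=$(basename "$f")
        uuid=$(sed -n 's/^uuid=//p' "$f")
        if [ "$uuid" != "$name" ]; then
            echo "mismatch: $name contains uuid=$uuid"
            [ "$mode" = fix ] && sed -i "s/^uuid=.*/uuid=${name}/" "$f"
        fi
    done
    return 0
}

# Report first, then re-run with "fix" once glusterd is stopped:
# check_peers /var/lib/glusterd/peers
# check_peers /var/lib/glusterd/peers fix
```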

-Dan.

On 10/25/2012 04:34 PM, Dan Bretherton wrote:
Dear All-
I'm not sure this excessive server load has anything to do with the bricks having been full. I noticed the full bricks while investigating the excessive load and assumed the two were related. However, despite there being plenty of room on all the bricks, the load on this particular pair of servers has been consistently between 60 and 80 all week, and this is causing serious problems for users, who are getting repeated I/O errors. The servers are responding so slowly that GlusterFS isn't working properly, and CLI commands like "gluster volume stop" simply time out when issued on any server. Restarting glusterd on all the servers has no effect.

Is there any way to limit the load imposed by GlusterFS on a server? I desperately need to reduce it to a level where GlusterFS can work properly and talk to the other servers without timing out.

-Dan.


On 10/22/2012 02:03 PM, Dan Bretherton wrote:
Dear All-
A replicated pair of servers in my GlusterFS 3.3.0 cluster has been experiencing extremely high load for the past few days, after a replicated brick pair became 100% full. The GlusterFS-related load on one of the servers was fluctuating at around 60, and this high load would periodically swap to the other server. When I noticed the full bricks I quickly extended the volume by creating new bricks on another server, and manually moved some data off the full bricks to create space for write operations. The fix-layout operation seemed to start normally, but the load then increased even further. The server with the high load (by then up to about 80) became very slow to respond, and I noticed a lot of errors like the following in the VOLNAME-rebalance.log files:

[2012-10-22 00:35:52.070364] W [socket.c:1512:__socket_proto_state_machine] 0-atmos-client-10: reading from socket failed. Error (Transport endpoint is not connected), peer (192.171.166.92:24052)
[2012-10-22 00:35:52.070446] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xe7) [0x2b3fd905c547] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb2) [0x2b3fd905bf42] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b3fd905bbfe]))) 0-atmos-client-10: forced unwinding frame type(GlusterFS 3.1) op(INODELK(29)) called at 2012-10-22 00:35:45.454529 (xid=0x285951x)

There have also been occasional errors like the following, referring to the pair of bricks that became 100% full:

[2012-10-22 01:32:52.827044] W [client3_1-fops.c:5517:client3_1_readdir] 0-atmos-client-15: (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
[2012-10-22 09:49:21.103066] W [client3_1-fops.c:5628:client3_1_readdirp] 0-atmos-client-14: (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD

The log files from the bricks that were 100% full contain a lot of the following errors, dating from the period after I freed up some space on them:

[2012-10-22 00:40:56.246075] E [server.c:176:server_submit_reply] (-->/usr/lib64/libglusterfs.so.0(default_inodelk_cbk+0xa4) [0x361da23e84] (-->/usr/lib64/glusterfs/3.3.0/xlator/debug/io-stats.so(io_stats_inodelk_cbk+0xd8) [0x2aaaabd74d48] (-->/usr/lib64/glusterfs/3.3.0/xlator/protocol/server.so(server_inodelk_cbk+0x10b) [0x2aaaabf9742b]))) 0-: Reply submission failed
[2012-10-22 00:40:56.246117] I [server-helpers.c:629:server_connection_destroy] 0-atmos-server: destroyed connection of bdan10.nerc-essc.ac.uk-13609-2012/10/21-23:04:53:323865-atmos-client-15-0

All these errors have only occurred on the replicated pair of servers that had suffered from 100% full bricks. I don't know whether the errors are being caused by the high load (resulting in poor communication with the other peers, for example) or whether the high load is the result of replication and/or distribution errors. I have tried various things to bring the load down, including unmounting the volume and stopping the fix-layout operation, but the only thing that works is stopping the volume. Obviously I can't do that for long because people need to use the data, but with the load as high as it is, data access is very slow and users are experiencing a lot of temporary I/O errors.

Bricks from several volumes are on those servers, so everybody in the department is being affected by this problem. I thought at first that the load was being caused by self-heal operations fixing errors left behind by write failures while the bricks were full, but it is the glusterfs threads that are causing the high load, not glustershd.

Can anyone suggest a way to bring the load down so people can access the data properly again? Also, can I trust GlusterFS to eventually self-heal the errors causing the above error messages?

Regards,
-Dan.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
