Dear All-
The excessive CPU load problem seems to have been caused by the problematic upgrade and subsequent downgrade I reported in the following thread:

http://www.gluster.org/pipermail/gluster-users/2012-November/034643.html

When downgrading to 3.3.0 with yum, removal of the glusterfs-server-3.3.1-1 package failed because of an RPM script error. On the CentOS-5 servers "yum remove glusterfs-3.3.1-1.el5" did the trick, but on CentOS-6 I had to forcibly remove the package with "rpm -e --noscripts glusterfs-server-3.3.1-1.el6.x86_64". I later discovered that the UUID value in /var/lib/glusterd/glusterd.info on the CentOS-6 servers had changed, and that those servers were listing themselves in the output of "gluster peer status".

I found the original UUIDs for the CentOS-6 servers by looking at the file names in /var/lib/glusterd/peers on other servers, like this:

[root@remus peers]# grep romulus /var/lib/glusterd/peers/*
/var/lib/glusterd/peers/cb21050d-05c2-42b3-8660-230954bab324:hostname1=romulus.nerc-essc.ac.uk
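To collect the original UUIDs for several servers in one go, a helper along these lines could work. This is only a sketch: uuid_for is my own name, not a Gluster command, and the optional directory argument exists just to make it easy to try out; the real peers directory is /var/lib/glusterd/peers.

```shell
# Hypothetical helper (not a Gluster tool): print the UUID of the peer
# that records a given hostname. In /var/lib/glusterd/peers each file is
# named after the peer's UUID, so the file name is the answer.
uuid_for() {
    local host=$1 dir=${2:-/var/lib/glusterd/peers}
    grep -l "hostname1=${host}" "$dir"/* 2>/dev/null | xargs -rn1 basename
}

# Example, using the hostname from the grep above:
# uuid_for romulus.nerc-essc.ac.uk
```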

With glusterd stopped on all the servers, I changed the "UUID=" line in /var/lib/glusterd/glusterd.info back to the original value on each server. With glusterd running again everywhere, everything seemed to go back to normal, except for a lot of self-heal activity on the servers that had been suffering from the excessive load problem. I presume a lot of xattr errors had been caused by those servers not talking to the others properly while the load was so high.
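For the record, the per-server edit can be scripted rather than done by hand. This is just a sketch of the step described above: fix_uuid is my own helper name, the UUID shown is only the example from earlier, and glusterd must be stopped on all servers before the file is touched.

```shell
# Hypothetical helper, not a Gluster tool: rewrite the UUID= line in a
# glusterd.info-style file. Only run this with glusterd stopped everywhere.
fix_uuid() {
    local file=$1 uuid=$2
    sed -i "s/^UUID=.*/UUID=${uuid}/" "$file"
}

# On each affected server, with its original UUID recovered from the
# peers files on another server:
# fix_uuid /var/lib/glusterd/glusterd.info cb21050d-05c2-42b3-8660-230954bab324
```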

While looking back at what I did in order to write this message, I have just discovered another UUID-related problem. On some servers the files in /var/lib/glusterd/peers contain the wrong UUID: the "UUID=" line in each of those files should match the file name, but on some servers it doesn't. I haven't noticed any adverse effects yet, except that "gluster volume status" fails on the CentOS-6 servers that were messed up by the problematic downgrade to 3.3.0. I suppose I will have to stop glusterd on all the servers again and correct these errors by hand. I have 21 servers, so it will take a while, but it could be worse I suppose. I would be interested to know if there is a quicker way to recover from a mess like this; any suggestions?
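For what it's worth, the mismatch check across all 21 servers could be scripted rather than done by eye. A minimal sketch, with two assumptions worth checking first: check_peers is my own name, and the key inside each peers file is spelled lowercase "uuid=" here (adjust if yours differs). As before, glusterd should be stopped everywhere before any file is rewritten.

```shell
# Hypothetical helper: report peers files whose uuid= line does not match
# the file name, and rewrite them when "fix" is passed as the second
# argument. Assumes the key is lowercase "uuid=" in the peers files.
check_peers() {
    local dir=${1:-/var/lib/glusterd/peers} mode=${2:-report} f name uuid
    for f in "$dir"/*; do
        name=$(basename "$f")
        uuid=$(sed -n 's/^uuid=//p' "$f")
        if [ "$uuid" != "$name" ]; then
            echo "mismatch: $name contains uuid=$uuid"
            [ "$mode" = fix ] && sed -i "s/^uuid=.*/uuid=${name}/" "$f"
        fi
    done
    return 0
}

# Report first, then re-run with "fix" once glusterd is stopped:
# check_peers /var/lib/glusterd/peers
# check_peers /var/lib/glusterd/peers fix
```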

-Dan.

On 10/25/2012 04:34 PM, Dan Bretherton wrote:
Dear All-
I'm not sure this excessive server load has anything to do with the bricks having been full. I noticed the full bricks while investigating the excessive load and assumed the two were related. However, despite there being plenty of room on all the bricks, the load on this particular pair of servers has been consistently between 60 and 80 all week, and this is causing serious problems for users, who are getting repeated I/O errors. The servers are responding so slowly that GlusterFS isn't working properly, and CLI commands like "gluster volume stop" simply time out when issued on any server. Restarting glusterd on all the servers has no effect.

Is there any way to limit the load imposed by GlusterFS on a server? I desperately need to reduce it to a level where GlusterFS can work properly and talk to the other servers without timing out.

-Dan.


On 10/22/2012 02:03 PM, Dan Bretherton wrote:
Dear All-
A replicated pair of servers in my GlusterFS 3.3.0 cluster has been experiencing extremely high load for the past few days, after a replicated brick pair became 100% full. The GlusterFS-related load on one of the servers was fluctuating at around 60, and this high load would periodically swap to the other server. When I noticed the full bricks I quickly extended the volume by creating new bricks on another server, and manually moved some data off the full bricks to create space for write operations. The fix-layout operation seemed to start normally, but the load then increased even further. The server with the high load (by then up to about 80) became very slow to respond, and I noticed a lot of errors like the following in the VOLNAME-rebalance.log files:

[2012-10-22 00:35:52.070364] W [socket.c:1512:__socket_proto_state_machine] 0-atmos-client-10: reading from socket failed. Error (Transport endpoint is not connected), peer (192.171.166.92:24052)
[2012-10-22 00:35:52.070446] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xe7) [0x2b3fd905c547] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb2) [0x2b3fd905bf42] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x2b3fd905bbfe]))) 0-atmos-client-10: forced unwinding frame type(GlusterFS 3.1) op(INODELK(29)) called at 2012-10-22 00:35:45.454529 (xid=0x285951x)

There have also been occasional errors like the following, referring to the pair of bricks that became 100% full:

[2012-10-22 01:32:52.827044] W [client3_1-fops.c:5517:client3_1_readdir] 0-atmos-client-15: (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
[2012-10-22 09:49:21.103066] W [client3_1-fops.c:5628:client3_1_readdirp] 0-atmos-client-14: (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD

The log files from the bricks that were 100% full contain a lot of the following errors, dating from the period after I freed up some space on them:

[2012-10-22 00:40:56.246075] E [server.c:176:server_submit_reply] (-->/usr/lib64/libglusterfs.so.0(default_inodelk_cbk+0xa4) [0x361da23e84] (-->/usr/lib64/glusterfs/3.3.0/xlator/debug/io-stats.so(io_stats_inodelk_cbk+0xd8) [0x2aaaabd74d48] (-->/usr/lib64/glusterfs/3.3.0/xlator/protocol/server.so(server_inodelk_cbk+0x10b) [0x2aaaabf9742b]))) 0-: Reply submission failed
[2012-10-22 00:40:56.246117] I [server-helpers.c:629:server_connection_destroy] 0-atmos-server: destroyed connection of bdan10.nerc-essc.ac.uk-13609-2012/10/21-23:04:53:323865-atmos-client-15-0

All these errors have only occurred on the replicated pair of servers that had suffered from 100% full bricks. I don't know whether the errors are being caused by the high load (resulting in poor communication with the other peers, for example) or whether the high load is the result of replication and/or distribution errors. I have tried various things to bring the load down, including unmounting the volume and stopping the fix-layout operation, but the only thing that works is stopping the volume. Obviously I can't do that for long because people need to use the data, but with the load as high as it is, data access is very slow and users are experiencing a lot of temporary I/O errors.

Bricks from several volumes are on those servers, so everybody in the department is being affected by this problem. I thought at first that the load was being caused by self-heal operations fixing errors left behind by write failures while the bricks were full, but it is the glusterfs threads that are causing the high load, not glustershd.

Can anyone suggest a way to bring the load down so people can access the data properly again? Also, can I trust GlusterFS to eventually self-heal the errors causing the above error messages?

Regards,
-Dan.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
