Dear All-
The excessive CPU load problem seems to have been caused by the
problematic upgrade and subsequent downgrade I reported in the following
thread:
http://www.gluster.org/pipermail/gluster-users/2012-November/034643.html
When downgrading to 3.3.0 using yum, removal of the glusterfs-server-3.3.1-1
packages failed because of an RPM script error. On the CentOS-5
servers "yum remove glusterfs-3.3.1-1.el5" did the trick, but on
CentOS-6 I had to remove the package forcibly with "rpm -e --noscripts
glusterfs-server-3.3.1-1.el6.x86_64". I later discovered that the UUID
value in /var/lib/glusterd/glusterd.info on the CentOS-6 servers had
changed, and that those servers were listing themselves in the output of
"gluster peer status".
I found the original UUIDs for the CentOS-6 servers by looking at the
file names in /var/lib/glusterd/peers on other servers, like this:
[root@remus peers]# grep romulus /var/lib/glusterd/peers/*
/var/lib/glusterd/peers/cb21050d-05c2-42b3-8660-230954bab324:hostname1=romulus.nerc-essc.ac.uk
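To get the whole UUID-to-hostname mapping in one go, something like this on
any server whose peer files are believed to be correct should do the job:

# print each peer file name (i.e. the peer's UUID) next to the hostname(s) recorded inside it
for f in /var/lib/glusterd/peers/*; do
    echo "$(basename "$f"): $(grep hostname "$f" | cut -d= -f2)"
done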
With glusterd stopped on all servers, I changed the "UUID=" line in
/var/lib/glusterd/glusterd.info back to the original value for each
server. Once glusterd was running again on all the servers, everything
seemed to go back to normal, except for a lot of self-heal activity on
the servers that had been suffering from the excessive load problem. I
presume a lot of xattr errors had been caused by those servers not
talking to the others properly while the load was so high.
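For the record, the fix on each affected server amounted to something like
the following, with <original-uuid> being the value recovered from the peers
files on the other servers:

service glusterd stop
# put the original UUID back into glusterd.info
sed -i 's/^UUID=.*/UUID=<original-uuid>/' /var/lib/glusterd/glusterd.info
service glusterd start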
While looking back at what I did in order to write this message, I have
just discovered another UUID-related problem. On some servers the files
in /var/lib/glusterd/peers contain the wrong UUID: the "UUID=" line in
each of those files should match the file name, but on some servers it
doesn't. I haven't noticed any adverse effects yet, except for not being
able to run "gluster volume status" on any of the CentOS-6 servers that
were messed up by the problematic downgrade to 3.3.0. It looks as though
I will have to stop glusterd everywhere again and correct these errors
manually on each server. I have 21 of them, so it will take a while, but
I suppose it could be worse. I would be interested to know if there is a
quicker way to recover from a mess like this; any suggestions?
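The best I have come up with so far for finding the bad peer files is
something like this (untested), run on each server in turn:

# report any peer file whose UUID= line does not match its file name
for f in /var/lib/glusterd/peers/*; do
    fname=$(basename "$f")
    inner=$(grep -i '^uuid=' "$f" | cut -d= -f2)
    [ "$fname" != "$inner" ] && echo "$f: file name and UUID line disagree ($inner)"
done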
-Dan.
On 10/25/2012 04:34 PM, Dan Bretherton wrote:
Dear All-
I'm not sure this excessive server load has anything to do with the
bricks having been full. I noticed the full bricks while I was
investigating the excessive load, and assumed the two were related.
However, despite there being plenty of room on all the bricks, the load
on this particular pair of servers has been consistently between 60
and 80 all week, and this is causing serious problems for users, who
are getting repeated I/O errors. The servers are responding so slowly
that GlusterFS isn't working properly, and CLI commands like "gluster
volume stop" just time out when issued on any server. Restarting
glusterd on all servers has no effect.
Is there any way to limit the load imposed by GlusterFS on a server?
I desperately need to reduce it to a level where GlusterFS can work
properly and talk to the other servers without timing out.
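The only idea I have had myself is lowering the priority of the glusterfs
processes with something like the commands below, but I have no idea whether
that is safe or whether it would actually help:

# reduce CPU and I/O priority of all glusterfs-related processes on one server
pgrep -f glusterfs | xargs -r renice +10 -p
pgrep -f glusterfs | xargs -r ionice -c2 -n7 -p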
-Dan.
On 10/22/2012 02:03 PM, Dan Bretherton wrote:
Dear All-
A replicated pair of servers in my GlusterFS 3.3.0 cluster has been
experiencing extremely high load for the past few days, after a
replicated brick pair became 100% full. The GlusterFS-related load
on one of the servers was fluctuating at around 60, and this high
load would periodically shift to the other server. When I noticed the
full bricks I quickly extended the volume by creating new bricks on
another server, and manually moved some data off the full bricks to
create space for write operations. The fix-layout operation that I
then started seemed to begin normally, but the load increased even
further. The server with the high load (by then up to about 80)
became very slow to respond, and I noticed a lot of errors like the
following in the VOLNAME-rebalance.log files:
[2012-10-22 00:35:52.070364] W
[socket.c:1512:__socket_proto_state_machine] 0-atmos-client-10:
reading from socket failed. Error (Transport endpoint is not
connected), peer (192.171.166.92:24052)
[2012-10-22 00:35:52.070446] E [rpc-clnt.c:373:saved_frames_unwind]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xe7) [0x2b3fd905c547]
(-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb2)
[0x2b3fd905bf42]
(-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x2b3fd905bbfe]))) 0-atmos-client-10: forced unwinding frame
type(GlusterFS 3.1) op(INODELK(29)) called at 2012-10-22
00:35:45.454529 (xid=0x285951x)
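For reference, the volume was extended and the fix-layout started with
commands along these lines (the new server name and brick paths are just
placeholders):

gluster volume add-brick atmos newserver:/path/to/brick1 newserver:/path/to/brick2
gluster volume rebalance atmos fix-layout start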
There have also been occasional errors like the following, referring
to the pair of bricks that became 100% full:
[2012-10-22 01:32:52.827044] W
[client3_1-fops.c:5517:client3_1_readdir] 0-atmos-client-15:
(00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
[2012-10-22 09:49:21.103066] W
[client3_1-fops.c:5628:client3_1_readdirp] 0-atmos-client-14:
(00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
The log files from the bricks that were 100% full contain a lot of the
following errors, from the period after I freed up some space on them:
[2012-10-22 00:40:56.246075] E [server.c:176:server_submit_reply]
(-->/usr/lib64/libglusterfs.so.0(default_inodelk_cbk+0xa4)
[0x361da23e84]
(-->/usr/lib64/glusterfs/3.3.0/xlator/debug/io-stats.so(io_stats_inodelk_cbk+0xd8)
[0x2aaaabd74d48]
(-->/usr/lib64/glusterfs/3.3.0/xlator/protocol/server.so(server_inodelk_cbk+0x10b)
[0x2aaaabf9742b]))) 0-: Reply submission failed
[2012-10-22 00:40:56.246117] I
[server-helpers.c:629:server_connection_destroy] 0-atmos-server:
destroyed connection of
bdan10.nerc-essc.ac.uk-13609-2012/10/21-23:04:53:323865-atmos-client-15-0
All these errors have only occurred on the replicated pair of servers
that had suffered from 100% full bricks. I don't know if the errors
are being caused by the high load (resulting in poor communication
with other peers for example) or if the high load is the result of
replication and/or distribution errors. I have tried various things
to bring the load down, including unmounting the volume and stopping
the fix-layout operation, but the only thing that works is stopping
the volume (the commands I used are listed at the end of this message).
Obviously I can't stop the volume for long because people need to
use the data, but with the load as high as it is, data access is very
slow and users are experiencing a lot of temporary I/O errors.
Bricks from several volumes are on those servers so everybody in the
department is being affected by this problem. I thought at first
that the load was being caused by self-heal operations fixing errors
left behind by write failures when the bricks were full, but it is the
glusterfs threads that are causing the high load, not glustershd.
Can anyone suggest a way to bring the load down so people can access
the data properly again? Also, can I trust GlusterFS to eventually
self-heal the errors causing the above error messages?
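For completeness, the mitigation attempts mentioned above were roughly the
following (the client mount point is a placeholder):

umount /path/to/mountpoint           # on the clients
gluster volume rebalance atmos stop  # stop the fix-layout/rebalance
gluster volume stop atmos            # the only thing that brings the load down
gluster volume start atmos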
Regards,
-Dan.
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users