On 02/07/2011 11:49 PM, Raghavendra G wrote:
Hi Steve,

Are the back-end file systems working correctly? I am seeing lots of errors in 
the server log files when accessing the back-end filesystem.

gluster-01-brick.log.1:[2011-01-26 03:43:07.353445] E [posix.c:2193:posix_open] post-posix: open on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.cmd: Read-only file system
gluster-01-brick.log.1:[2011-01-26 03:43:07.353857] E [posix.c:678:posix_setattr] post-posix: setattr (utimes) on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.cmd failed: Read-only file system
gluster-01-brick.log.1:[2011-01-26 03:43:07.354827] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
gluster-01-brick.log.1:[2011-01-26 03:43:07.357396] E [posix.c:2193:posix_open] post-posix: open on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.ps: Read-only file system
gluster-01-brick.log.1:[2011-01-26 03:43:07.357794] E [posix.c:678:posix_setattr] post-posix: setattr (utimes) on /gluster/01/brick/home/lev/deltah/aadimers/serd/converge/0..75000/serd_phi-psi_hist.4deg.0..75000_map.ps failed: Read-only file system
gluster-01-brick.log.1:[2011-01-26 03:43:07.358865] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
gluster-01-brick.log.1:[2011-01-26 03:43:07.359264] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
gluster-01-brick.log.1:[2011-01-26 03:43:07.359548] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f28e50dc1c8: Input/output error
gluster-01-brick.log.1:[2011-01-26 03:43:07.367163] E [posix.c:2318:posix_readv] post-posix: read failed on fd=0x7f

I am also seeing other errors that indicate the backend is a read-only 
filesystem. Because of this, distribute and replicate are not able to store 
their metadata (using xattrs), which in turn results in lots of split-brain and 
layout NULL errors. Can you please check the backend file system?
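
One quick way to check for this on a server is to look at the mount options the kernel currently reports in /proc/mounts. The following is only a sketch: the function name read_only_mounts and the "/gluster" prefix are assumptions for illustration (the prefix is guessed from the brick paths in the log lines above), and the helper only reports mounts whose option list contains a literal "ro".

```python
def read_only_mounts(mounts_text, prefix="/gluster"):
    """Return mount points at or under `prefix` that are mounted read-only.

    `mounts_text` is expected to be in /proc/mounts format:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    read_only = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        under_prefix = mount_point == prefix or mount_point.startswith(prefix + "/")
        # Compare whole options so "errors=remount-ro" is not mistaken for "ro".
        if under_prefix and "ro" in options:
            read_only.append(mount_point)
    return read_only
```

On a brick server it could be called as read_only_mounts(open("/proc/mounts").read()); an empty list means none of the matching mounts is currently flagged read-only.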

regards,


Yes, the filesystem was read-only for a time when a disk failed. We then rebuilt the brick on that disk from the corresponding brick in the second server (with the volume stopped, of course) using:
    rsync -aXv brick/ stanley:/gluster/06/brick/

Following some instructions we found on the mailing list we then:
    1)  deleted the volume
    2)  ran "find /gluster -exec setfattr -x trusted.gfid \{\} \;" on the bricks
    3)  created the volume again
    4)  mounted the volume
    5)  ran "find . -print0 | xargs --null stat > /dev/null" on the mounted volume
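
For reference, step 5 can also be sketched in Python. This is an illustrative equivalent, not what we actually ran: the name stat_tree is made up here, and the point of the variant is only that it collects the paths that fail to stat instead of leaving them scattered in the terminal output.

```python
import os


def stat_tree(root):
    """lstat `root` and every entry below it.

    Returns (count, failures): the number of successful lstat calls and a
    list of (path, error message) pairs for entries that could not be read.
    """
    count = 0
    failures = []

    def on_walk_error(err):
        # Called by os.walk when a directory cannot be listed.
        failures.append((err.filename, err.strerror))

    def try_lstat(path):
        nonlocal count
        try:
            os.lstat(path)  # like stat(1) without -L: does not follow symlinks
            count += 1
        except OSError as err:
            failures.append((path, err.strerror))

    try_lstat(root)
    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            try_lstat(os.path.join(dirpath, name))
    return count, failures
```

Run from the top of the mounted volume as stat_tree("."), the second element of the result lists any entries that return errors such as "Invalid argument".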

This returned us to what seemed to be a stable state (i.e., no errors from running "ls -alR" from the top of the volume). After we put the volume back into service, however, the errors started occurring again. Turning off "performance.stat-prefetch" has brought about a great improvement, but we continue to see errors like these on one of the servers:

   [2011-02-08 14:22:08.360799] I [dht-common.c:369:dht_revalidate_cbk]
   post-dht: subvolume post-replicate-1 returned -1 (Invalid argument)
   [2011-02-08 14:22:08.836672] I [dht-common.c:369:dht_revalidate_cbk]
   post-dht: subvolume post-replicate-4 returned -1 (Invalid argument)
   [2011-02-08 14:22:39.468388] I [dht-common.c:369:dht_revalidate_cbk]
   post-dht: subvolume post-replicate-0 returned -1 (Invalid argument)
   [2011-02-08 14:22:39.468436] W [fuse-bridge.c:184:fuse_entry_cbk]
   glusterfs-fuse: 22465136: LOOKUP() /home/lev/.Xauthority => -1
   (Invalid argument)
   [2011-02-08 14:22:40.462910] I [dht-common.c:369:dht_revalidate_cbk]
   post-dht: subvolume post-replicate-5 returned -1 (Invalid argument)
   [2011-02-08 14:22:40.462958] W [fuse-bridge.c:184:fuse_entry_cbk]
   glusterfs-fuse: 22466110: LOOKUP() /home/lev/.viminfo => -1 (Invalid
   argument)
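
To see whether particular subvolumes account for most of these messages, the dht_revalidate_cbk lines can be tallied. The following is a rough sketch that assumes the message layout shown in the excerpt above; the function name count_revalidate_failures is made up for illustration, and whitespace is collapsed first so that entries wrapped across lines (as in this mail) still match.

```python
import re
from collections import Counter

# Matches e.g. "...dht_revalidate_cbk] post-dht: subvolume post-replicate-1
# returned -1 (Invalid argument)" after whitespace has been collapsed.
REVALIDATE_RE = re.compile(
    r"dht_revalidate_cbk\]\s+\S+: subvolume (\S+) returned -1 \(([^)]*)\)"
)


def count_revalidate_failures(log_text):
    """Count failed revalidations per (subvolume, error message)."""
    collapsed = " ".join(log_text.split())  # undo line wrapping
    return Counter(REVALIDATE_RE.findall(collapsed))
```

Feeding it the client log and calling most_common() on the result gives a quick view of which post-replicate-N subvolumes return errors most often.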

And the user sees:

   root@stanley:/net/post/lev# ls -al .viminfo .Xauthority
   ls: cannot access .viminfo: Invalid argument
   ls: cannot access .Xauthority: Invalid argument

This happens only from one client (which also happens to be the server logging the errors above). Another client (the other server) shows these same files without a problem:

   root@pablo:/net/post/lev# ls -al .viminfo .Xauthority
   -rw------- 1 lev post 9400 2011-02-07 22:52 .viminfo
   -rw------- 1 lev post 7401 2011-02-08 00:27 .Xauthority


Steve

----- Original Message -----
From: "Steve Wilson"<ste...@purdue.edu>
To: "Lakshmipathi"<lakshmipa...@gluster.com>
Cc: "Raghavendra G"<raghaven...@gluster.com>
Sent: Thursday, February 3, 2011 7:21:36 PM
Subject: Re: [Gluster-users] 3.1.2 with "No such file" and "Invalid argument" errors
Hi,

Thanks for looking into this. Any ideas so far? Or anything you'd like
me to try?

Here's some other perhaps relevant information:
* all bricks are formatted ext4 and mounted with the noatime option
in addition to default options
* servers and clients are running Ubuntu 10.04
* I did try mounting the GlusterFS volume with direct-io-mode
disabled but that didn't fix the problem

Thanks!

Steve

On 02/01/2011 07:35 AM, Lakshmipathi wrote:
Hi,
Could you please send us the client and server log files?


--
Steven M. Wilson, Systems and Network Manager
Markey Center for Structural Biology
Purdue University
(765) 496-1946

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
