Hi Rafi KC,

Thanks for replying. I've attached a tar.gz file with the code and a repro.txt to help reproduce the issue.

One point to note as well: our original mount options were as follows:

  mount -t glusterfs -o acl <host>:/volume1 /mnt/...

Introducing the entry-timeout option,

  mount -t glusterfs -o acl,entry-timeout=0

(entry_timeout, from the Ubuntu FUSE documentation: "The timeout in seconds for which name lookups will be cached.") does away with the stale file handle error. I suppose this comes with a performance hit; any insight on this from gluster's point of view is appreciated as well.

Log snippet from brick/data-glusterfs-volume1-brick1-brick.log (please let me know if there is another log file that I can get information from):

[2016-03-24 00:23:53.718403] W [server-resolve.c:437:resolve_anonfd_simple] 0-server: inode for the gfid (2af5d8c7-2321-4d77-bda9-ad883ae8d230) is not found. anonymous fd creation failed
[2016-03-24 00:23:53.718543] W [server-resolve.c:437:resolve_anonfd_simple] 0-server: inode for the gfid (2af5d8c7-2321-4d77-bda9-ad883ae8d230) is not found. anonymous fd creation failed
[2016-03-24 00:23:53.718573] I [server-rpc-fops.c:1235:server_fstat_cbk] 0-volume1-server: 15309: FSTAT -2 (2af5d8c7-2321-4d77-bda9-ad883ae8d230) ==> (No such file or directory)
[2016-03-24 00:24:09.679523] W [server-resolve.c:437:resolve_anonfd_simple] 0-server: inode for the gfid (02b2e084-261b-45c5-8283-d7babd219a4d) is not found. anonymous fd creation failed
[2016-03-24 00:24:09.679602] W [server-resolve.c:437:resolve_anonfd_simple] 0-server: inode for the gfid (02b2e084-261b-45c5-8283-d7babd219a4d) is not found. anonymous fd creation failed
[2016-03-24 00:24:09.679618] I [server-rpc-fops.c:1235:server_fstat_cbk] 0-volume1-server: 23548: FSTAT -2 (02b2e084-261b-45c5-8283-d7babd219a4d) ==> (No such file or directory)
[2016-03-24 00:24:53.789620] W [server-resolve.c:437:resolve_anonfd_simple] 0-server: inode for the gfid (74b2fd0e-966a-40fd-9554-bf30e2c0b7ea) is not found. anonymous fd creation failed
[2016-03-24 00:24:53.789694] W [server-resolve.c:437:resolve_anonfd_simple] 0-server: inode for the gfid (74b2fd0e-966a-40fd-9554-bf30e2c0b7ea) is not found. anonymous fd creation failed
[2016-03-24 00:24:53.789709] I [server-rpc-fops.c:1235:server_fstat_cbk] 0-volume1-server: 38694: FSTAT -2 (74b2fd0e-966a-40fd-9554-bf30e2c0b7ea) ==> (No such file or directory)

===

Log snippet from /var/log/glusterfs/mnt-repovolume1.log (on the client side):

[2016-03-24 00:24:53.747930] I [dht-rename.c:1344:dht_rename] 0-volume1-dht: renaming /hhh/master.lock (hash=volume1-replicate-0/cache=volume1-replicate-0) => /hhh/master (hash=volume1-replicate-0/cache=volume1-replicate-0)
[2016-03-24 00:24:53.760365] I [dht-rename.c:1344:dht_rename] 0-volume1-dht: renaming /hhh/master.lock (hash=volume1-replicate-0/cache=volume1-replicate-0) => /hhh/master (hash=volume1-replicate-0/cache=volume1-replicate-0)
[2016-03-24 00:24:53.778746] W [client-rpc-fops.c:1472:client3_3_fstat_cbk] 0-volume1-client-1: remote operation failed: No such file or directory
[2016-03-24 00:24:53.779381] W [MSGID: 108008] [afr-read-txn.c:225:afr_read_txn] 0-volume1-replicate-0: Unreadable subvolume -1 found with event generation 2. (Possible split-brain)
[2016-03-24 00:24:53.779816] E [dht-helper.c:940:dht_migration_complete_check_task] 0-volume1-dht: (null): failed to get the 'linkto' xattr Stale file handle

Log snippet from my program:

[master] Size: 64 bytes. nlink 1, Inode: 8be6731fd9f9fbe4 File Permissions:-rw-r--r--
commit() filename = master, lock_filename = master.lock
read(master) :: Stale file handle

Thanks
Rama

On Tue, Mar 22, 2016 at 11:25 PM, Mohammed Rafi K C <[email protected]> wrote:

>
>
> On 03/23/2016 04:10 AM, Rama Shenai wrote:
>
> Hi, We had some questions with respect to expectations of atomicity of
> rename in gluster.
>
> To elaborate :
>
> We have setup with two machines (lets call them M1 and M2) which
> essentially access a file (F) on a gluster volume (mounted by M1 and M2)
> A program does the following steps sequentially on each of the two
> machines (M1 & M2) in an infinite loop
> 1) Opens the file F in O_RDWR|O_EXCL mode, reads some data and closes (F)
> 2) Renames some other file F' => F
>
> Periodically either M1 or M2 sees a "Stalefile handle error" when it tries
> to read the file (step (1)) after opening the file in O_RDWR|O_EXCL (the
> open is successful)
>
> The specific error reported the client volume logs
> (/var/log/glusterfs/mnt-repos-volume1.log)
> [2016-03-21 16:53:17.897902] I [dht-rename.c:1344:dht_rename]
> 0-volume1-dht: renaming master.lock
> (hash=volume1-replicate-0/cache=volume1-replicate-0) => master
> (hash=volume1-replicate-0/cache=<nul>)
> [2016-03-21 16:53:18.735090] W [client-rpc-fops.c:504:client3_3_stat_cbk]
> 0-volume1-client-0: remote operation failed: Stale file handle
>
>
> Hi Rama,
>
> ESTALE error in rename normally generated when either the source file is
> not resolvable (deleted or inaccessible) or when the parent of destination
> is not resolvable. It can happen when let's say file F' was present when
> your application did a lookup before rename, but if it is got renamed by
> Node M1 before M2 could rename it. Basically a race between two rename on
> the same file can result in ESTALE for either of one.
>
> To confirm this, Can you please paste the log message from brick
> "0-volume1-client-0". You can find out the brick name from the graph.
>
> Also if you can share the program or snippet that used to reproduce this
> issue, that would be great.
>
> Rafi KC
>
>
>
>
> We see no error when: have two processes of the above program running on
> the same machine (say on M1) accessing the file F on the gluster volume,
> for which we want to understand the expectations of atomicity in gluster
> specifically specifically for rename, and if the above is a bug.
>
> Also glusterfs --version => glusterfs 3.6.9 built on Mar 2 2016 18:21:14
>
> We also would like to know if there any parameter in the one translators
> that we can tweak to prevent this problem
>
> Any help or insights here is appreciated
>
> Thanks
> Rama
>
>
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
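For readers without the attachment, one iteration of the loop described in steps 1) and 2) of the quoted message can be sketched roughly as below. This is not the attached program; it is a minimal sketch under stated assumptions. The function name read_then_rename and its parameters are hypothetical, and writing F' just before the rename is an assumption (the thread only says that some other file F' is renamed to F). On the setup in this thread, target and lock_path would be paths such as master and master.lock on the gluster mount, with the function called in a loop on both M1 and M2.

```python
import errno
import os


def read_then_rename(target, lock_path, payload, read_size=64):
    """Run one iteration: read `target` (F), then rename `lock_path` (F') over it.

    Returns the bytes read in step 1, or None if the read failed with ESTALE.
    All names here are hypothetical and chosen for illustration only.
    """
    # Step 1: open F with O_RDWR|O_EXCL, read some data, close.
    fd = os.open(target, os.O_RDWR | os.O_EXCL)
    try:
        try:
            data = os.read(fd, read_size)
        except OSError as exc:
            if exc.errno != errno.ESTALE:
                raise
            # The "Stale file handle" case reported in the thread.
            data = None
    finally:
        os.close(fd)

    # Step 2 (assumption: F' is written fresh each time), then rename F' => F.
    with open(lock_path, "wb") as lock_file:
        lock_file.write(payload)
    os.rename(lock_path, target)
    return data
```

A None return marks an iteration where the open succeeded but the read reported a stale file handle, which is the symptom described above. On a local POSIX filesystem the rename replaces F atomically and the read is not expected to fail this way.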
stale-repro.tar.gz
Description: GNU Zip compressed data
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
