[Gluster-devel] inode linking in GlusterFS NFS server

2014-07-07 Thread Raghavendra Bhat


Hi,

As per my understanding, the nfs server is not doing inode linking in the
readdirp callback. Because of this there can be errors while dealing with
virtual inodes (or gfids). As of now the meta, gfid-access and snapview-server
(used for user serviceable snapshots) xlators make use of virtual inodes
with random gfids. The situation is this:


Say the user serviceable snapshots feature has been enabled and there are
two snapshots (snap1 and snap2). Let /mnt/nfs be the nfs mount. The
snapshots can be accessed by entering the .snaps directory. Now if the snap1
directory is entered and *ls -l* is done (i.e. cd /mnt/nfs/.snaps/snap1
and then ls -l), the readdirp fop is sent to the snapview-server xlator
(which is part of a daemon running for the volume). It talks to the
corresponding snapshot volume and gets the dentry list, and before unwinding
it generates random gfids for those dentries.
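
For illustration, here is a minimal sketch (not the actual snapview-server
code) of how a virtual-inode xlator can stamp a freshly generated random gfid
on each readdirp entry before unwinding. The real xlator also keeps the
bookkeeping that maps these virtual gfids back to objects in the snapshot
volume, which is omitted here; the gf_uuid_* names are the current
libglusterfs wrappers, older trees call libuuid's uuid_generate() directly.

```c
#include "glusterfs.h"
#include "gf-dirent.h"   /* gf_dirent_t, list_for_each_entry */

/* Give every entry in the dentry list a random, purely virtual gfid before
 * the readdirp reply is unwound.  Nothing on disk will ever match these
 * gfids, which is exactly why the regular graph cannot resolve them later. */
static void
virt_assign_random_gfids (gf_dirent_t *entries)
{
        gf_dirent_t *entry = NULL;

        list_for_each_entry (entry, &entries->list, list) {
                gf_uuid_generate (entry->d_stat.ia_gfid);
        }
}
```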


Upon getting the readdirp reply, the nfs server associates the gfid with the
filehandle it creates for each entry, but it sends the readdirp reply back
to the nfs client without linking the inode. The next time the nfs client
issues an operation on one of those filehandles, the nfs server tries to
resolve it by finding the inode for the gfid present in the filehandle.
Since the inode was not linked in readdirp, the inode_find operation fails,
and the server falls back to hard resolution by sending a lookup on that
gfid to the normal main graph. (The information on whether the call should
be sent to the main graph or to snapview-server would be present in the
inode context, but here the lookup has come on a gfid with a newly created
inode whose context is not set yet, so the call is sent to the main graph
itself.) Because the gfid is a randomly generated virtual gfid, not present
on disk, the lookup fails with an error.
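
The resolution path described above can be sketched roughly as follows. This
is simplified illustration of the flow, not the actual nfs resolver code:
the helper name resolve_entry_fh, the fh_gfid parameter and the
resolve_lookup_cbk callback (assumed to be defined elsewhere) are made up,
and gf_uuid_copy is the current wrapper around libuuid's uuid_copy. The key
point is that a lookup wound with loc.name left NULL is a nameless
(gfid-only) lookup.

```c
#include "xlator.h"
#include "inode.h"

static int32_t resolve_lookup_cbk (call_frame_t *, void *, xlator_t *,
                                   int32_t, int32_t, inode_t *, struct iatt *,
                                   dict_t *, struct iatt *);

/* Illustrative resolver flow: first try the inode table, then fall back to a
 * nameless (gfid-only) lookup on the active graph.  Because readdirp did not
 * link the inode, inode_find() misses and the fallback path is always taken
 * for the virtual gfids. */
static int
resolve_entry_fh (call_frame_t *frame, xlator_t *active_subvol,
                  inode_table_t *itable, uuid_t fh_gfid)
{
        loc_t    loc   = {0, };
        inode_t *inode = NULL;

        inode = inode_find (itable, fh_gfid);
        if (inode) {
                loc.inode = inode;            /* fast path: already linked */
                /* ... continue with the actual fop ... */
                return 0;
        }

        /* hard resolution: fresh inode with no context set on it yet, so any
         * xlator deciding "main graph vs snapview-server" sends the call to
         * the main graph, where a virtual gfid does not exist */
        loc.inode = inode_new (itable);
        gf_uuid_copy (loc.gfid, fh_gfid);

        STACK_WIND (frame, resolve_lookup_cbk, active_subvol,
                    active_subvol->fops->lookup, &loc, NULL);
        return 0;
}
```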


As per my understanding, this can happen with any xlator that deals with
virtual inodes (i.e. generates random gfids).


I can think of two ways to handle this:
1)  Do inode linking for readdirp in the nfs server as well.
2)  If a lookup operation fails, the snapview-client xlator (which redirects
fops on the snapshot world to snapview-server by looking into the inode
context) should check whether the failed lookup was a nameless lookup. If
so, AND the gfid of the inode is NULL, AND the lookup was sent to the main
graph, then instead of unwinding the lookup with failure, send it to
snapview-server, which might be able to find the inode for that gfid (since
it generated the gfid itself, it should be able to find the corresponding
inode unless it has been purged from its inode table). A rough sketch of
this idea follows below.
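
The sketch below illustrates idea (2) under the assumption that the
snapview-client lookup callback keeps the original loc in frame->local and
that SECOND_CHILD(this) is the snapview-server side of the graph. The names
svc_lookup_cbk, svc_local_t and the retried field are illustrative, not the
real snapview-client structures, and gf_uuid_is_null is the current wrapper
around uuid_is_null.

```c
#include "xlator.h"

/* Hypothetical local struct; the real snapview-client keeps more state. */
typedef struct {
        loc_t        loc;        /* loc of the original lookup */
        gf_boolean_t retried;    /* already retried on snapview-server? */
} svc_local_t;

int32_t
svc_lookup_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                int32_t op_ret, int32_t op_errno, inode_t *inode,
                struct iatt *buf, dict_t *xdata, struct iatt *postparent)
{
        svc_local_t *local    = frame->local;
        gf_boolean_t nameless = (local->loc.name == NULL);

        /* A failed nameless lookup on a fresh inode (gfid still null) that
         * went to the main graph may actually be for a virtual gfid that
         * snapview-server generated, so retry it there once instead of
         * unwinding the error. */
        if (op_ret < 0 && nameless && !local->retried &&
            gf_uuid_is_null (local->loc.inode->gfid)) {
                local->retried = _gf_true;
                STACK_WIND (frame, svc_lookup_cbk, SECOND_CHILD (this),
                            SECOND_CHILD (this)->fops->lookup,
                            &local->loc, xdata);
                return 0;
        }

        STACK_UNWIND_STRICT (lookup, frame, op_ret, op_errno, inode, buf,
                             xdata, postparent);
        return 0;
}
```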



Please let me know if I have missed anything. Please provide feedback.

Regards,
Raghavendra Bhat


Re: [Gluster-devel] inode linking in GlusterFS NFS server

2014-07-07 Thread Anand Avati
On Mon, Jul 7, 2014 at 12:48 PM, Raghavendra Bhat rab...@redhat.com wrote:


That's right. The NFS server should be linking readdirp_cbk inodes just like
FUSE or protocol/server do. It has been OK so far only because there were no
virtual gfids.
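
For completeness, a minimal sketch of the kind of linking being agreed on
here, modelled on what fuse-bridge and protocol/server do in their readdirp
callbacks; it is not the actual NFS-server patch and the helper name is made
up. inode_link() hashes the entry's inode by gfid in the inode table so that
a later inode_find() on the filehandle's gfid succeeds, and inode_lookup()
bumps the lookup count so the freshly linked inode is not pruned right away.

```c
#include <string.h>

#include "xlator.h"
#include "inode.h"
#include "gf-dirent.h"

/* Link every entry returned by readdirp under the directory's inode, the
 * same way fuse-bridge/protocol-server do, so that later gfid-based
 * filehandle resolution can find the inode in the table. */
static void
nfs_link_readdirp_entries (fd_t *dirfd, gf_dirent_t *entries)
{
        gf_dirent_t *entry  = NULL;
        inode_t     *linked = NULL;

        list_for_each_entry (entry, &entries->list, list) {
                if (!entry->inode ||
                    !strcmp (entry->d_name, ".") ||
                    !strcmp (entry->d_name, ".."))
                        continue;

                linked = inode_link (entry->inode, dirfd->inode,
                                     entry->d_name, &entry->d_stat);
                if (linked) {
                        inode_lookup (linked);   /* keep it resolvable */
                        inode_unref (linked);
                }
        }
}
```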


Re: [Gluster-devel] inode linking in GlusterFS NFS server

2014-07-07 Thread Raghavendra Bhat

On Tuesday 08 July 2014 01:21 AM, Anand Avati wrote:

That's right. The NFS server should be linking readdirp_cbk inodes just like
FUSE or protocol/server do. It has been OK so far only because there were no
virtual gfids.


I made the changes to link inodes in readdirp_cbk in the nfs server, and it
seems to work fine. Do we also need the second change (i.e. the change in
snapview-client to redirect fresh nameless lookups to snapview-server)? With
the nfs server linking the inodes in readdirp, I think the second change
might not be needed.


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel