Re: [Nfs-ganesha-devel] Readdir with nfs4err_serverfault
I don't think it's the check for type==DIRECTORY. If that fails, you'll get the error message "Failed to get root for..." which is not in the log snippet. The callback is only called if the getattrs() on the junction obj succeeds, so maybe it's failing?

A reproducer would help a lot, if that's possible.

Daniel

On 03/02/2017 10:54 AM, Supriti Singh wrote:
> Thanks for reply.
>
> I investigated further and looking at log it seems that error occurs
> only at Junction point somehow.
>
> nfs4_readdir_callback :EXPORT :DEBUG :Need to cross junction to Export_Id 1 Path /
> nfs4_readdir_callback :RW LOCK :F_DBG :Unlocked 0x1f8ec08 (>state_hdl->state_lock) at /Protocols/NFS/nfs4_op_readdir.c:256
> populate_dirent :RW LOCK :F_DBG :Got read lock on 0x1f8ec08 (>state_hdl->state_lock) at /FSAL/fsal_helper.c:1330
> populate_dirent :RW LOCK :F_DBG :Unlocked 0x1f8ec08 (>state_hdl->state_lock) at /FSAL/fsal_helper.c:1341
> nfs_export_get_root_entry :RW LOCK :F_DBG :Got read lock on 0x1ed6d00 (>lock) at support/exports.c:2118
> nfs_export_get_root_entry :RW LOCK :F_DBG :Unlocked 0x1ed6d00 (>lock) at support/exports.c:2123
> mdcache_getattrs :RW LOCK :F_DBG :Got read lock on 0x1f64680 (>attr_lock) at /FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:979
> mdcache_getattrs :RW LOCK :F_DBG :Unlocked 0x1f64680 (>attr_lock) at /FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1026
> mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce NO_FILE_TYPE
>
> In function populate_dirent --> call to nfs_export_get_root_entry
> checks that the junction mount is a directory.
> Only if its a directory call is made to mdcache_getattrs.
> I guess somehow in mdcache_refattrs, we get the wrong FILE TYPE.
>
> But sadly I don't have a reproducer. I will try to write a script that
> can reproduce it.
>
> Thanks,
> Supriti
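
The ordering of checks that Daniel describes in this reply can be read as three gated steps. The sketch below is a toy model in Python, not nfs-ganesha code: every name in it (`cross_junction`, `get_root`, `getattrs`, `callback`) is hypothetical, and only the order of the checks and the quoted error-message prefix come from the message itself.

```python
DIRECTORY = "DIRECTORY"
NO_FILE_TYPE = "NO_FILE_TYPE"


def cross_junction(get_root, getattrs, callback, log):
    """Toy model of the described ordering; all names are hypothetical."""
    root = get_root()
    # Step 1: the export root must be a directory, otherwise the
    # "Failed to get root for..." message would appear in the log.
    if root is None or root.get("type") != DIRECTORY:
        log.append("Failed to get root for...")
        return False
    # Step 2: fetch the attributes of the junction object.
    attrs = getattrs(root)
    if attrs is None:
        # getattrs failed, so step 3 is skipped and the callback never runs.
        return False
    # Step 3: only now is the readdir callback invoked with the attributes.
    callback(attrs)
    return True


if __name__ == "__main__":
    log, seen = [], []
    # Root passes the directory check, but getattrs hands back a record
    # without a usable type: the callback still runs and receives it.
    cross_junction(lambda: {"type": DIRECTORY},
                   lambda root: {"type": NO_FILE_TYPE},
                   seen.append, log)
    print(log, seen)
```

In this model the directory check and the attribute fetch are separate gates, which is why a missing "Failed to get root for..." line points away from the first gate.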
Re: [Nfs-ganesha-devel] Readdir with nfs4err_serverfault
Thanks for reply.

I investigated further and looking at log it seems that error occurs only at Junction point somehow.

nfs4_readdir_callback :EXPORT :DEBUG :Need to cross junction to Export_Id 1 Path /
nfs4_readdir_callback :RW LOCK :F_DBG :Unlocked 0x1f8ec08 (>state_hdl->state_lock) at /Protocols/NFS/nfs4_op_readdir.c:256
populate_dirent :RW LOCK :F_DBG :Got read lock on 0x1f8ec08 (>state_hdl->state_lock) at /FSAL/fsal_helper.c:1330
populate_dirent :RW LOCK :F_DBG :Unlocked 0x1f8ec08 (>state_hdl->state_lock) at /FSAL/fsal_helper.c:1341
nfs_export_get_root_entry :RW LOCK :F_DBG :Got read lock on 0x1ed6d00 (>lock) at support/exports.c:2118
nfs_export_get_root_entry :RW LOCK :F_DBG :Unlocked 0x1ed6d00 (>lock) at support/exports.c:2123
mdcache_getattrs :RW LOCK :F_DBG :Got read lock on 0x1f64680 (>attr_lock) at /FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:979
mdcache_getattrs :RW LOCK :F_DBG :Unlocked 0x1f64680 (>attr_lock) at /FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1026
mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce NO_FILE_TYPE

In function populate_dirent --> call to nfs_export_get_root_entry checks that the junction mount is a directory.
Only if its a directory call is made to mdcache_getattrs.
I guess somehow in mdcache_refattrs, we get the wrong FILE TYPE.

But sadly I don't have a reproducer. I will try to write a script that can reproduce it.

Thanks,
Supriti

--
Supriti Singh
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

>>> Daniel Gryniewicz 03/02/17 2:29 PM >>>
That's much better. A few fixes have gone in since 2.4.1 relating to readdir, mostly relating to large directories, or concurrent access from multiple clients, or readdir in the presence of add/delete/rename that turned up during stress testing for RHGS. It might be worth it to attempt to recreate on 2.4.3, and see if those fixes helped.

There's also quite a bit of work that has gone into 2.5-dev related to cephfs, so that may have fixed issues here. Unfortunately, 2.5 isn't released yet; it will be at least a few weeks before that is ready. But if the issue can be reproduced in a lab, then checking with 2.5 may be a useful data point as well.

Daniel
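
A reproducer script of the kind mentioned in this message could start from something like the following. It is a minimal sketch and not a known trigger for the bug: it simply lists one directory on the NFS client many times and counts the listings that fail. The mount path `/mnt/ganesha`, the iteration count, and the function name `hammer_readdir` are assumptions made for illustration.

```python
import errno
import os
import sys
from collections import Counter


def hammer_readdir(path, iterations=1000):
    """List `path` repeatedly; return a Counter of errno names for failed listings."""
    failures = Counter()
    for _ in range(iterations):
        try:
            # Drain the iterator so the whole directory is actually read.
            with os.scandir(path) as entries:
                for _entry in entries:
                    pass
        except OSError as exc:
            # On Linux the client-side "Remote I/O error" is EREMOTEIO.
            failures[errno.errorcode.get(exc.errno, str(exc.errno))] += 1
    return failures


if __name__ == "__main__":
    # Hypothetical mount point; pass the real client-side path as argv[1].
    target = sys.argv[1] if len(sys.argv) > 1 else "/mnt/ganesha"
    result = hammer_readdir(target)
    print(dict(result) or "no readdir failures observed")
```

Client-side caching may satisfy repeated listings without contacting the server, so a real reproducer would probably need several clients, or concurrent create/delete/rename activity in the directory, as in the stress scenarios Daniel mentions.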
Re: [Nfs-ganesha-devel] pNFS with CephFS and RGW
Hi Supriti,

Neither FSAL currently supports pNFS. The Ceph fsal has vestigial bits of a pNFS files layout that aren't currently supported.

Matt

- Original Message -
> From: "Supriti Singh"
> To: nfs-ganesha-devel@lists.sourceforge.net
> Sent: Thursday, March 2, 2017 9:06:27 AM
> Subject: [Nfs-ganesha-devel] pNFS with CephFS and RGW
>
> Hi,
>
> Is it possible to use pNFS protocol for CephFS and RGW FSAL?
> If yes, how to specify it in the config file?
>
> I could not find any documentation regarding the same.
>
> Thanks,
> Supriti

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
[Nfs-ganesha-devel] pNFS with CephFS and RGW
Hi,

Is it possible to use pNFS protocol for CephFS and RGW FSAL?
If yes, how to specify it in the config file?

I could not find any documentation regarding the same.

Thanks,
Supriti

--
Supriti Singh
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
Re: [Nfs-ganesha-devel] Readdir with nfs4err_serverfault
That's much better. A few fixes have gone in since 2.4.1 relating to readdir, mostly relating to large directories, or concurrent access from multiple clients, or readdir in the presence of add/delete/rename that turned up during stress testing for RHGS. It might be worth it to attempt to recreate on 2.4.3, and see if those fixes helped.

There's also quite a bit of work that has gone into 2.5-dev related to cephfs, so that may have fixed issues here. Unfortunately, 2.5 isn't released yet; it will be at least a few weeks before that is ready. But if the issue can be reproduced in a lab, then checking with 2.5 may be a useful data point as well.

Daniel

On 03/01/2017 04:12 AM, Supriti Singh wrote:
> My mistake. The package corresponds to tag 2.4.1.
>
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
> Daniel Gryniewicz 02/28/17 7:13 PM >>>
> Okay, that's a very old tag, and lots of changes have gone in since. We
> won't be able to easily nail down what changed, but -dev27 isn't
> necessarily expected to work properly.
>
> Daniel
>
> On 02/28/2017 01:06 PM, Supriti Singh wrote:
>> This package was created from the tag 2.4-dev-27.
>>
>> --
>> Supriti Singh
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nürnberg)
>>
>> Daniel Gryniewicz 02/28/17 6:44 PM >>>
>> Which exact version of 2.4 are they using? If it's 2.4.0.3 or earlier,
>> then attribute access was reworked in 2.4.0.4 to fix a lot of races.
>>
>> Daniel
>>
>> On 02/28/2017 12:18 PM, Supriti Singh wrote:
>>> Hello All,
>>>
>>> For one of our client, using nfs-ganesha v2.4 and ceph v10.2.4, readdir
>>> fails with error:
>>>
>>> On nfs-client: "Remote I/O error"
>>>
>>> In nfs-ganesha server log (Removed some part of readability)
>>>
>>> mdcache_readdir :NFS READDIR :F_DBG :About to readdir in mdcache_readdir: directory=0x1f8dc10 cookie=0 collisions 0
>>> mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce NO_FILE_TYPE
>>> Encode FAILED for attr 1, name = FATTR4_TYPE
>>> NFS READDIR :F_DBG :Returning NFS4ERR_SERVERFAULT
>>>
>>> But for the same directory=0x1f8dc10 , readdir works sometime.
>>> In the log, it prints the correct attr "nfs4_FSALattr_To_Fattr :NFS4
>>> :F_DBG :Encoded attr 1, name = FATTR4_TYPE"
>>>
>>> Could there be a possible race condition somewhere in accessing mdcache,
>>> because of which for the same directory it works sometime?
>>>
>>> I have not been able to reproduce it. Looking at code, it seems somehow
>>> mdcache and attributes are not in sync.
>>> Client is also using cephfs cache tiering, but I think that should not
>>> have any effect on nfs-ganesha mdcache.
>>>
>>> Any hints on how to debug it further?
>>>
>>> Thanks,
>>> Supriti
>>>
>>> --
>>> Supriti Singh
>>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>>> HRB 21284 (AG Nürnberg)
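
On the question of how to debug this further, one low-cost step is to check whether the NO_FILE_TYPE records in the debug log line up with junction crossings. The sketch below scans log lines for the message fragments quoted in this thread and counts them. It assumes each log record sits on a single line in the real log file (the excerpts in the emails are wrapped), and the function name `summarize` and the summary keys are hypothetical. With several worker threads logging at once the "after junction" count is only approximate.

```python
import re

# Message fragments copied from the log excerpts quoted in this thread.
JUNCTION = re.compile(r"Need to cross junction to Export_Id (\d+)")
NO_TYPE = re.compile(r"attributes Mask = ([0-9a-fA-F]+) NO_FILE_TYPE")
ENCODE_FAIL = re.compile(r"Encode FAILED for attr (\d+), name = (\w+)")


def summarize(lines):
    """Count junction crossings, NO_FILE_TYPE records, and encode failures."""
    summary = {
        "junction": 0,
        "no_file_type": 0,
        "no_file_type_after_junction": 0,
        "encode_failed": 0,
    }
    pending_junction = False
    for line in lines:
        if JUNCTION.search(line):
            summary["junction"] += 1
            pending_junction = True
        elif NO_TYPE.search(line):
            summary["no_file_type"] += 1
            # Attribute the record to the most recent unmatched crossing.
            if pending_junction:
                summary["no_file_type_after_junction"] += 1
            pending_junction = False
        elif ENCODE_FAIL.search(line):
            summary["encode_failed"] += 1
    return summary


if __name__ == "__main__":
    sample = [
        "nfs4_readdir_callback :EXPORT :DEBUG :Need to cross junction to Export_Id 1 Path /",
        "mdcache_getattrs :RW LOCK :F_DBG :Got read lock on 0x1f64680",
        "mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce NO_FILE_TYPE",
        "Encode FAILED for attr 1, name = FATTR4_TYPE",
    ]
    print(summarize(sample))
```

If most NO_FILE_TYPE records follow a junction crossing, that supports the observation that the failure is tied to the junction point rather than to ordinary directory entries.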