Re: [Nfs-ganesha-devel] Readdir with nfs4err_serverfault

2017-03-02 Thread Daniel Gryniewicz
I don't think it's the check for type==DIRECTORY.  If that fails, you'll 
get the error message "Failed to get root for..." which is not in the 
log snippet.  The callback is only called if the getattrs() on the 
junction obj succeeds, so maybe it's failing?  A reproducer would help a 
lot, if that's possible.

Daniel
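
For readers following along, the two outcomes described above can be told apart with a toy model. This is only a sketch built from the description in this message, not the nfs-ganesha source: `Obj`, `cross_junction`, and the `getattrs_ok` flag are hypothetical names, and the log text is kept truncated exactly as quoted.

```python
from dataclasses import dataclass

DIRECTORY = "DIRECTORY"
NO_FILE_TYPE = "NO_FILE_TYPE"


@dataclass
class Obj:
    """Hypothetical stand-in for the junction's root object."""
    type: str
    getattrs_ok: bool = True  # assumption: models whether getattrs() succeeds


def cross_junction(root, callback, log):
    """Toy model of the two checks described in the message above."""
    # Check 1: a failed type check is visible in the log.
    if root.type != DIRECTORY:
        log("Failed to get root for ...")  # message text left truncated as quoted
        return False
    # Check 2: the callback only runs if getattrs() on the junction object succeeds.
    if not root.getattrs_ok:
        return False
    callback(root)
    return True
```

In this model a failed type check always leaves the "Failed to get root for ..." line behind, while a failed getattrs() is silent and simply skips the callback, which is why the absence of that message in the snippet points away from the type check.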

On 03/02/2017 10:54 AM, Supriti Singh wrote:
> Thanks for the reply.
>
> I investigated further and, looking at the log, it seems that the error
> occurs only at the junction point.
>
> /*nfs4_readdir_callback :EXPORT :DEBUG :Need to cross junction to
> Export_Id 1 Path /
> */
> /**/
> /*nfs4_readdir_callback :RW LOCK :F_DBG :Unlocked 0x1f8ec08
> (>state_hdl->state_lock) at /Protocols/NFS/nfs4_op_readdir.c:256
> populate_dirent :RW LOCK :F_DBG :Got read lock on 0x1f8ec08
> (>state_hdl->state_lock) at /FSAL/fsal_helper.c:1330
> populate_dirent :RW LOCK :F_DBG :Unlocked 0x1f8ec08
> (>state_hdl->state_lock) at /FSAL/fsal_helper.c:1341
> nfs_export_get_root_entry :RW LOCK :F_DBG :Got read lock on 0x1ed6d00
> (>lock) at support/exports.c:2118
> nfs_export_get_root_entry :RW LOCK :F_DBG :Unlocked 0x1ed6d00
> (>lock) at support/exports.c:2123
> mdcache_getattrs :RW LOCK :F_DBG :Got read lock on 0x1f64680
> (>attr_lock) at
> /FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:979
> mdcache_getattrs :RW LOCK :F_DBG :Unlocked 0x1f64680 (>attr_lock)
> at /FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1026
> mdcache_getattrs :INODE :F_DBG :attrs  obj attributes Mask = 0005dfce
> NO_FILE_TYPE
>
> */In populate_dirent, the call to nfs_export_get_root_entry checks that
> the junction mount is a directory.
> Only if it is a directory is the call to mdcache_getattrs made.
> I guess that somehow, in mdcache_refattrs, we get the wrong file type.
>
> But sadly I don't have a reproducer. I will try to write a script that
> can reproduce it.
>
> Thanks,
> Supriti
>
>
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
> >>> Daniel Gryniewicz  03/02/17 2:29 PM >>>
> That's much better. A few fixes have gone in since 2.4.1 relating to
> readdir, mostly relating to large directories, or concurrent access from
> multiple clients, or readdir in the presence of add/delete/rename that
> turned up during stress testing for RHGS. It might be worth it to
> attempt to recreate on 2.4.3, and see if those fixes helped.
>
> There's also quite a bit of work that has gone into 2.5-dev related to
> cephfs, so that may have fixed issues here. Unfortunately, 2.5 isn't
> released yet; it will be at least a few weeks before that is ready. But
> if the issue can be reproduced in a lab, then checking with 2.5 may be a
> useful data point as well.
>
> Daniel
>
> On 03/01/2017 04:12 AM, Supriti Singh wrote:
>> My mistake. The package corresponds to tag 2.4.1.
>>
>>
>> --
>> Supriti Singh
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nürnberg)
>>
>> >>> Daniel Gryniewicz  02/28/17 7:13 PM >>>
>> Okay, that's a very old tag, and lots of changes have gone in since. We
>> won't be able to easily nail down what changed, but -dev27 isn't
>> necessarily expected to work properly.
>>
>> Daniel
>>
>> On 02/28/2017 01:06 PM, Supriti Singh wrote:
>>> This package was created from the tag 2.4-dev-27.
>>>
>>>
>>>
>>> --
>>> Supriti Singh
>>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>>> HRB 21284 (AG Nürnberg)
>>>
>>> >>> Daniel Gryniewicz  02/28/17 6:44 PM >>>
>>> Which exact version of 2.4 are they using? If it's 2.4.0.3 or earlier,
>>> then attribute access was reworked in 2.4.0.4 to fix a lot of races.
>>>
>>> Daniel
>>>
>>> On 02/28/2017 12:18 PM, Supriti Singh wrote:
>>>> Hello All,
>>>>
>>>> For one of our clients, using nfs-ganesha v2.4 and ceph v10.2.4, readdir
>>>> fails with error:
>>>>
>>>> On nfs-client: "/*Remote I/O error*/"
>>>>
>>>> In nfs-ganesha server log (removed some parts for readability)
>>>>
>>>> */mdcache_readdir :NFS READDIR :F_DBG :About to readdir in
>>>> mdcache_readdir: directory=0x1f8dc10 cookie=0 collisions 0
>>>> mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce
>>>> NO_FILE_TYPE
>>>> Encode FAILED for attr 1, name = FATTR4_TYPE
>>>> NFS READDIR :F_DBG :Returning NFS4ERR_SERVERFAULT /*
>>>>
>>>>
>>>> But for the same directory=0x1f8dc10, readdir works sometimes.
>>>> In the log, it prints the correct attr "/*nfs4_FSALattr_To_Fattr :NFS4
>>>> :F_DBG :Encoded attr 1, name = FATTR4_TYPE*/"
>>>>
>>>> Could there be a race condition somewhere in accessing mdcache,
>>>> because of which it only sometimes works for the same directory?
>>>>
>>>> I have not been able to reproduce it. Looking at the code, it seems that
>>>> somehow mdcache and the attributes are not in sync.
>>>> The client is also using cephfs cache tiering, but I think that should not
>>>> have any effect on nfs-ganesha mdcache.
>>>>
>>>> Any hints on how to debug it further?

Re: [Nfs-ganesha-devel] Readdir with nfs4err_serverfault

2017-03-02 Thread Supriti Singh
Thanks for the reply.

I investigated further and, looking at the log, it seems that the error occurs
only at the junction point.

nfs4_readdir_callback :EXPORT :DEBUG :Need to cross junction to Export_Id 1 
Path /

nfs4_readdir_callback :RW LOCK :F_DBG :Unlocked 0x1f8ec08 
(>state_hdl->state_lock) at
/Protocols/NFS/nfs4_op_readdir.c:256
populate_dirent :RW LOCK :F_DBG :Got read lock on 0x1f8ec08 
(>state_hdl->state_lock) at /FSAL/fsal_helper.c:1330
populate_dirent :RW LOCK :F_DBG :Unlocked 0x1f8ec08 
(>state_hdl->state_lock) at /FSAL/fsal_helper.c:1341
nfs_export_get_root_entry :RW LOCK :F_DBG :Got read lock on 0x1ed6d00 
(>lock) at support/exports.c:2118
nfs_export_get_root_entry :RW LOCK :F_DBG :Unlocked 0x1ed6d00 (>lock) 
at support/exports.c:2123
mdcache_getattrs :RW LOCK :F_DBG :Got read lock on 0x1f64680 
(>attr_lock) at
/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:979
mdcache_getattrs :RW LOCK :F_DBG :Unlocked 0x1f64680 (>attr_lock) at
/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1026
mdcache_getattrs :INODE :F_DBG :attrs  obj attributes Mask = 0005dfce 
NO_FILE_TYPE
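
To see how often this pattern shows up in a full log, a small scanner along the following lines may help. It is a sketch under assumptions: the marker strings are copied from the excerpts in this thread, the real log is assumed to hold one event per line (the wrapping above comes from the mail client), and `scan_log`, `failures_after_junction`, and the 50-line window are hypothetical choices, not part of nfs-ganesha.

```python
from collections import defaultdict

# Substrings copied from the log excerpts quoted in this thread.
MARKERS = {
    "junction": "Need to cross junction",
    "no_file_type": "NO_FILE_TYPE",
    "encode_failed": "Encode FAILED for attr 1, name = FATTR4_TYPE",
    "serverfault": "Returning NFS4ERR_SERVERFAULT",
}


def scan_log(lines):
    """Return {marker_name: [1-based line numbers]} for every marker seen."""
    hits = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        for name, needle in MARKERS.items():
            if needle in line:
                hits[name].append(lineno)
    return dict(hits)


def failures_after_junction(hits, window=50):
    """Line numbers of SERVERFAULT returns within `window` lines after a junction crossing."""
    junctions = hits.get("junction", [])
    return [
        fail
        for fail in hits.get("serverfault", [])
        if any(0 < fail - junction <= window for junction in junctions)
    ]
```

Passing an open log file to `scan_log` and the result to `failures_after_junction` gives a rough count of failures that follow a junction crossing; failures outside that window would suggest the junction is not the only trigger.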

In populate_dirent, the call to nfs_export_get_root_entry checks that the
junction mount is a directory.
Only if it is a directory is the call to mdcache_getattrs made.
I guess that somehow, in mdcache_refattrs, we get the wrong file type.

But sadly I don't have a reproducer. I will try to write a script that can 
reproduce it. 
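
A possible starting point for such a script is sketched below. It assumes the export is mounted on a Linux client and simply lists a directory from several threads at once, recording any errno returned; the "Remote I/O error" seen on the client corresponds to EREMOTEIO. The names `list_once` and `hammer`, and the thread and iteration counts, are hypothetical, and nothing here is confirmed to trigger the bug.

```python
import errno
import os
import threading

# EREMOTEIO is Linux-specific; fall back to its Linux value (121) elsewhere.
EREMOTEIO = getattr(errno, "EREMOTEIO", 121)


def list_once(path):
    """List `path` and stat every entry; return None on success, else the errno."""
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                entry.stat(follow_symlinks=False)
    except OSError as exc:
        return exc.errno
    return None


def hammer(path, iterations=1000, threads=4):
    """List `path` concurrently; return a list of (iteration, errno) failures."""
    failures = []
    lock = threading.Lock()

    def worker():
        for i in range(iterations):
            err = list_once(path)
            if err is not None:
                with lock:
                    failures.append((i, err))

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return failures
```

Running `hammer` on the directory that contains the junction and filtering the result for `EREMOTEIO` would show whether the failure can be provoked from the client side. Client-side caching may absorb repeated listings, so the server may see far fewer READDIR calls than the loop issues.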

Thanks,
Supriti 


--
Supriti Singh
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
 



>>> Daniel Gryniewicz  03/02/17 2:29 PM >>>
That's much better.  A few fixes have gone in since 2.4.1 relating to 
readdir, mostly relating to large directories, or concurrent access from 
multiple clients, or readdir in the presence of add/delete/rename that 
turned up during stress testing for RHGS.  It might be worth it to 
attempt to recreate on 2.4.3, and see if those fixes helped.

There's also quite a bit of work that has gone into 2.5-dev related to 
cephfs, so that may have fixed issues here.  Unfortunately, 2.5 isn't 
released yet; it will be at least a few weeks before that is ready.  But 
if the issue can be reproduced in a lab, then checking with 2.5 may be a 
useful data point as well.

Daniel

On 03/01/2017 04:12 AM, Supriti Singh wrote:
> My mistake. The package corresponds to tag 2.4.1.
>
>
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
> >>> Daniel Gryniewicz  02/28/17 7:13 PM >>>
> Okay, that's a very old tag, and lots of changes have gone in since. We
> won't be able to easily nail down what changed, but -dev27 isn't
> necessarily expected to work properly.
>
> Daniel
>
> On 02/28/2017 01:06 PM, Supriti Singh wrote:
>> This package was created from the tag 2.4-dev-27.
>>
>>
>>
>> --
>> Supriti Singh
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nürnberg)
>>
>> >>> Daniel Gryniewicz  02/28/17 6:44 PM >>>
>> Which exact version of 2.4 are they using? If it's 2.4.0.3 or earlier,
>> then attribute access was reworked in 2.4.0.4 to fix a lot of races.
>>
>> Daniel
>>
>> On 02/28/2017 12:18 PM, Supriti Singh wrote:
>>> Hello All,
>>>
>>> For one of our clients, using nfs-ganesha v2.4 and ceph v10.2.4, readdir
>>> fails with error:
>>>
>>> On nfs-client: "/*Remote I/O error*/"
>>>
>>> In nfs-ganesha server log (removed some parts for readability)
>>>
>>> */mdcache_readdir :NFS READDIR :F_DBG :About to readdir in
>>> mdcache_readdir: directory=0x1f8dc10 cookie=0 collisions 0
>>> mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce
>>> NO_FILE_TYPE
>>> Encode FAILED for attr 1, name = FATTR4_TYPE
>>> NFS READDIR :F_DBG :Returning NFS4ERR_SERVERFAULT /*
>>>
>>>
>>> But for the same directory=0x1f8dc10, readdir works sometimes.
>>> In the log, it prints the correct attr "/*nfs4_FSALattr_To_Fattr :NFS4
>>> :F_DBG :Encoded attr 1, name = FATTR4_TYPE*/"
>>>
>>> Could there be a race condition somewhere in accessing mdcache,
>>> because of which it only sometimes works for the same directory?
>>>
>>> I have not been able to reproduce it. Looking at the code, it seems that
>>> somehow mdcache and the attributes are not in sync.
>>> The client is also using cephfs cache tiering, but I think that should not
>>> have any effect on nfs-ganesha mdcache.
>>>
>>> Any hints on how to debug it further?
>>>
>>> Thanks,
>>> Supriti
>>>
>>> --
>>> Supriti Singh
>>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>>> HRB 21284 (AG Nürnberg)
>>>
>>>
>>>
>>> --
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>>>
>>>
>>>
>>> ___
>>> Nfs-ganesha-devel mailing list
>>> Nfs-ganesha-devel@lists.sourceforge.net
>>> 

Re: [Nfs-ganesha-devel] pNFS with CephFS and RGW

2017-03-02 Thread Matt Benjamin
Hi Supriti,

Neither FSAL currently supports pNFS.  The Ceph fsal has vestigial bits of a 
pNFS files layout that aren't currently supported.

Matt

- Original Message -
> From: "Supriti Singh" 
> To: nfs-ganesha-devel@lists.sourceforge.net
> Sent: Thursday, March 2, 2017 9:06:27 AM
> Subject: [Nfs-ganesha-devel] pNFS with CephFS and RGW
> 
> Hi,
> 
> Is it possible to use the pNFS protocol with the CephFS and RGW FSALs?
> If so, how is it specified in the config file?
> 
> I could not find any documentation on this.
> 
> Thanks,
> Supriti
> 
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
> 
> 

-- 
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309



[Nfs-ganesha-devel] pNFS with CephFS and RGW

2017-03-02 Thread Supriti Singh
Hi,

Is it possible to use the pNFS protocol with the CephFS and RGW FSALs?
If so, how is it specified in the config file?

I could not find any documentation on this.

Thanks,
Supriti 

--
Supriti Singh
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
 






Re: [Nfs-ganesha-devel] Readdir with nfs4err_serverfault

2017-03-02 Thread Daniel Gryniewicz
That's much better.  A few fixes have gone in since 2.4.1 relating to 
readdir, mostly relating to large directories, or concurrent access from 
multiple clients, or readdir in the presence of add/delete/rename that 
turned up during stress testing for RHGS.  It might be worth it to 
attempt to recreate on 2.4.3, and see if those fixes helped.

There's also quite a bit of work that has gone into 2.5-dev related to 
cephfs, so that may have fixed issues here.  Unfortunately, 2.5 isn't 
released yet; it will be at least a few weeks before that is ready.  But 
if the issue can be reproduced in a lab, then checking with 2.5 may be a 
useful data point as well.

Daniel

On 03/01/2017 04:12 AM, Supriti Singh wrote:
> My mistake. The package corresponds to tag 2.4.1.
>
>
> --
> Supriti Singh
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
>
> >>> Daniel Gryniewicz  02/28/17 7:13 PM >>>
> Okay, that's a very old tag, and lots of changes have gone in since. We
> won't be able to easily nail down what changed, but -dev27 isn't
> necessarily expected to work properly.
>
> Daniel
>
> On 02/28/2017 01:06 PM, Supriti Singh wrote:
>> This package was created from the tag 2.4-dev-27.
>>
>>
>>
>> --
>> Supriti Singh
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nürnberg)
>>
>> >>> Daniel Gryniewicz  02/28/17 6:44 PM >>>
>> Which exact version of 2.4 are they using? If it's 2.4.0.3 or earlier,
>> then attribute access was reworked in 2.4.0.4 to fix a lot of races.
>>
>> Daniel
>>
>> On 02/28/2017 12:18 PM, Supriti Singh wrote:
>>> Hello All,
>>>
>>> For one of our clients, using nfs-ganesha v2.4 and ceph v10.2.4, readdir
>>> fails with error:
>>>
>>> On nfs-client: "/*Remote I/O error*/"
>>>
>>> In nfs-ganesha server log (removed some parts for readability)
>>>
>>> */mdcache_readdir :NFS READDIR :F_DBG :About to readdir in
>>> mdcache_readdir: directory=0x1f8dc10 cookie=0 collisions 0
>>> mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0005dfce
>>> NO_FILE_TYPE
>>> Encode FAILED for attr 1, name = FATTR4_TYPE
>>> NFS READDIR :F_DBG :Returning NFS4ERR_SERVERFAULT /*
>>>
>>>
>>> But for the same directory=0x1f8dc10, readdir works sometimes.
>>> In the log, it prints the correct attr "/*nfs4_FSALattr_To_Fattr :NFS4
>>> :F_DBG :Encoded attr 1, name = FATTR4_TYPE*/"
>>>
>>> Could there be a race condition somewhere in accessing mdcache,
>>> because of which it only sometimes works for the same directory?
>>>
>>> I have not been able to reproduce it. Looking at the code, it seems that
>>> somehow mdcache and the attributes are not in sync.
>>> The client is also using cephfs cache tiering, but I think that should not
>>> have any effect on nfs-ganesha mdcache.
>>>
>>> Any hints on how to debug it further?
>>>
>>> Thanks,
>>> Supriti
>>>
>>> --
>>> Supriti Singh
>>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>>> HRB 21284 (AG Nürnberg)
>>>
>>>
>>>

