Re: [Nfs-ganesha-devel] nTI-RPC refcounting and locking

2017-01-16 Thread Matt Benjamin
It sounds like your change is a step in the direction of unifying CLNT and 
SVCXPRT handle structures.  As we've discussed off-list, if you take on the 
project of unifying the actual handle structures, you get the lock 
consolidation for free.  In any event, if rpc_dplx_rec contains a service 
handle expanded, it appears to need a client handle as well.

Matt

- Original Message -
> From: "William Allen Simpson" 
> To: "NFS Ganesha Developers" , "Swen 
> Schillig" 
> Sent: Monday, January 16, 2017 1:53:24 PM
> Subject: [Nfs-ganesha-devel] nTI-RPC refcounting and locking
> 
> Swen, I've been looking at your patch, and it has some good ideas.
> For some odd reason, I awoke at 1:30 am thinking about it, and
> got up and wrote some code.
> 
> I've taken another patch of mine, and added the SVCXPRT into the
> rpc_dplx_rec, eliminating the refcnt entirely (using the SVCXPRT).
> 
> After all, there's no reason to zalloc them separately.  They
> always are created at the same time.
> 
> So I'm wondering about your thoughts on the locking.  They seem
> redundant.  I'm thinking about changing REC_LOCK to use the
> SVCXPRT xp_lock, instead.
> 
> There's a spot in the existing rpc_dplx_rec creation code where
> there's a timing hole in the code after an existing one is
> found so the extra refcount is decremented.  Another process
> could also decrement and free, and there could be a pointer into
> freed memory.  Unifying the lock would be one solution (better
> and faster than the usual solution with two locks).
> 
> The SVCXPRT lock code has a lot more debugging and testing, too.
> 
> Any other related ideas?
> 
> BTW, I got rid of the , too.  Changed it to a callback
> function ;)
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
> 

-- 
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309



Re: [Nfs-ganesha-devel] Request: I'm looking for good file system test programs

2017-01-16 Thread Kevin C.
Frank asked: What sorts of tests?

My answer:

I am hoping to benefit from the experience of this group, since this group has 
spent much more time than I have working with NFS-Ganesha (and perhaps other file 
systems). I hope that this group has identified useful tests (besides throughput 
tests). For example, a file data write test would not only confirm that the 
write function did not report errors, but would also read back previously 
written data to verify that the contents remain correct. Reads immediately 
after a write are very likely to be served from cache, so I expect a good test 
program would allow read-back validation of the data at a later time (e.g. after 
a reboot). This group is probably more knowledgeable than I am regarding the many 
possible file system operations: create, delete, open, close, read, write, get 
attribute, set attribute, etc.


On 01/16/2017 02:59 PM, Frank Filz wrote:
>> I'm looking for file system operation/stability test programs.
>>
>>
>> I'm most interested in test programs that do many file operations that are
>> then verified rather than programs that concentrate on performance tests.
> What sorts of tests? There is the pjd-fstest test suite that tests POSIX
> compliance.
>
> Frank
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>




Re: [Nfs-ganesha-devel] [mdcache] unreachable directory

2017-01-16 Thread Daniel Gryniewicz
On 01/16/2017 03:08 PM, Swen Schillig wrote:
> On Mo, 2017-01-16 at 10:52 -0500, Daniel Gryniewicz wrote:
>> Hi, Swen.
>>
>> Looking through that log, the failures of unlink() are returned from
>> the
>> sub_fsal, not directly caused by MDCACHE, so it's whatever's
>> underneath
>> (GPFS, presumably?) that's returning ENOTEMPTY:
> Hi Dan
>
> sorry for dumping that issue on you but I didn't really investigate and
> my best guess was that it got something to do with mdcache.
> Anyhow, it was not in relation with GPFS, I'm using VFS for my local
> tests.
> I will see if I can try the same with GPFS.

No problem, it took me all of 2 minutes to find the line in the log, and 
where it came from, and I'm glad to help.

Daniel

>
> Cheers Swen
>>
>> 16/01/2017 13:56:16 : epoch 587cb7dc : dhcp-9-244-58-137 :
>> ganesha.nfsd-14293[work-26] mdcache_unlink :INODE :DEBUG :unlink i
>> returned The directory is not empty
>>
>> Daniel
>>
>> On 01/16/2017 08:25 AM, Swen Schillig wrote:
>>>
>>> Dan
>>>
>>> while I was performing some simple tests to validate some of my
>>> code,
>>> I stumbled over some possible mdcache issue.
>>>
>>> Here's what I'm doing. (ganesha-2.5-dev9)
>>> I create the following directory structure
>>>
>>> mkdir -p /home/swen//d/c/def/g/h/i/j/
>>>
>>> where /home/swen/ is the mount point for a [V3|V4.0] mounted
>>> VFS.
>>>
>>> While executing
>>>
>>> dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k
>>> count=10 &
>>>
>>> I'm trying to delete
>>>
>>> rm -r /home/swen//d/c/def/g
>>>
>>> which fails with directory not empty, which might be ok while still
>>> writing.
>>> But even after the write command is finished the error persists.
>>>
>>> Even though it shouldn't matter but looking at the directory I can
>>> see that it is empty
>>> [swen@localhost ~]$ ls -lia /home/swen//d/c/def/g
>>> total 8
>>> 1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .
>>> 1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 ..
>>>
>>> Removing the directory from the FS directly(not via the NFS mount-
>>> point) succeeds.
>>>
>>> I've collected the logs for CACHE_INODE with FULL_DEBUG in case
>>> that might help.
>>>
>>> # logs - trying to delete while writing with dd
>>> ##
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0680Mask = 0015dfce DIRECTORY
>>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
>>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0680Mask = 0015dfce DIRECTORY
>>> numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
>>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
>>> numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
>>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0680Mask = 0015dfce DIRECTORY
>>> numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup ..
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup
>>> parent (..)
>>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
>>> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG
>>> :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
>>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
>>> 

Re: [Nfs-ganesha-devel] [mdcache] unreachable directory

2017-01-16 Thread Swen Schillig
On Mo, 2017-01-16 at 10:52 -0500, Daniel Gryniewicz wrote:
> Hi, Swen.
> 
> Looking through that log, the failures of unlink() are returned from
> the 
> sub_fsal, not directly caused by MDCACHE, so it's whatever's
> underneath 
> (GPFS, presumably?) that's returning ENOTEMPTY:
Hi Dan

sorry for dumping that issue on you, but I didn't really investigate;
my best guess was that it had something to do with mdcache.
Anyhow, it was not related to GPFS; I'm using VFS for my local tests.
I will see if I can try the same with GPFS.

Cheers Swen
> 
> 16/01/2017 13:56:16 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-26] mdcache_unlink :INODE :DEBUG :unlink i 
> returned The directory is not empty
> 
> Daniel
> 
> On 01/16/2017 08:25 AM, Swen Schillig wrote:
> > 
> > Dan
> > 
> > while I was performing some simple tests to validate some of my
> > code,
> > I stumbled over some possible mdcache issue.
> > 
> > Here's what I'm doing. (ganesha-2.5-dev9)
> > I create the following directory structure
> > 
> > mkdir -p /home/swen//d/c/def/g/h/i/j/
> > 
> > where /home/swen/ is the mount point for a [V3|V4.0] mounted
> > VFS.
> > 
> > While executing
> > 
> > dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k
> > count=10 &
> > 
> > I'm trying to delete
> > 
> > rm -r /home/swen//d/c/def/g
> > 
> > which fails with directory not empty, which might be ok while still
> > writing.
> > But even after the write command is finished the error persists.
> > 
> > Even though it shouldn't matter but looking at the directory I can
> > see that it is empty
> > [swen@localhost ~]$ ls -lia /home/swen//d/c/def/g
> > total 8
> > 1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .
> > 1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 ..
> > 
> > Removing the directory from the FS directly(not via the NFS mount-
> > point) succeeds.
> > 
> > I've collected the logs for CACHE_INODE with FULL_DEBUG in case
> > that might help.
> > 
> > # logs - trying to delete while writing with dd
> > ##
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0680Mask = 0015dfce DIRECTORY
> > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
> > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0680Mask = 0015dfce DIRECTORY
> > numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
> > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
> > numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
> > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0680Mask = 0015dfce DIRECTORY
> > numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup ..
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup
> > parent (..)
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG
> > :attrs  obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY
> > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8
> > atime=16/01/2017 13:53:15 mtime=16/01/2017 13:54:45
> > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 :
> > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG
> > 

Re: [Nfs-ganesha-devel] nTI-RPC fds

2017-01-16 Thread Swen Schillig
On Mo, 2017-01-16 at 14:21 -0500, William Allen Simpson wrote:
> Swen, your other very short patch fixes a problem with closing the
> fd.  And that's a good thing.  But the underlying problem is that
> we have multiple copies of the fd, and do not know whether it has
> been closed.
> 
> I'm thinking that it would be best to have one copy, in the SVCXPRT,
> and have everybody use that one.  It starts at 0, and could be set
> to -1 to indicate that it has been closed -- rather than a closeit
> flag as was done.
> 
> Any thoughts?
> 
Sounds good, go ahead.

Cheers Swen.




Re: [Nfs-ganesha-devel] nTI-RPC refcounting and locking

2017-01-16 Thread Swen Schillig
On Mo, 2017-01-16 at 13:53 -0500, William Allen Simpson wrote:
> Swen, I've been looking at your patch, and it has some good ideas.
> For some odd reason, I awoke at 1:30 am thinking about it, and
> got up and wrote some code.
> 
I never intended to give you sleepless nights :-)

> I've taken another patch of mine, and added the SVCXPRT into the
> rpc_dplx_rec, eliminating the refcnt entirely (using the SVCXPRT).
> 
> After all, there's no reason to zalloc them separately.  They
> always are created at the same time.
> 
> So I'm wondering about your thoughts on the locking.  They seem
> redundant.  I'm thinking about changing REC_LOCK to use the
> SVCXPRT xp_lock, instead.
> 
> There's a spot in the existing rpc_dplx_rec creation code where
> there's a timing hole in the code after an existing one is
> found so the extra refcount is decremented.  Another process
> could also decrement and free, and there could be a pointer into
> freed memory.  Unifying the lock would be one solution (better
> and faster than the usual solution with two locks).
> 
> The SVCXPRT lock code has a lot more debugging and testing, too.
> 
> Any other related ideas?
> 
> BTW, I got rid of the , too.  Changed it to a callback
> function ;)
That sounds like a much more invasive change.

I wasn't that brave at the time and just tried to fix what was wrong;
anyhow, a good rewrite of that area is always welcome.

It would be good if you could post/include all your patches as soon as
possible, as I believe the ntirpc area does need some updates.

I hope I will find some time again soon and will try to help out as well,
if that's OK.

Cheers Swen




Re: [Nfs-ganesha-devel] Request: I'm looking for good file system test programs

2017-01-16 Thread Frank Filz
> I'm looking for file system operation/stability test programs.
> 
> 
> I'm most interested in test programs that do many file operations that are
> then verified rather than programs that concentrate on performance tests.

What sorts of tests? There is the pjd-fstest test suite that tests POSIX
compliance.

Frank






[Nfs-ganesha-devel] Request: I'm looking for good file system test programs

2017-01-16 Thread Kevin C.

I'm looking for file system operation/stability test programs.


I'm most interested in test programs that do many file operations that
are then verified rather than programs that concentrate on performance
tests.


Thanks in advance,

Kevin






[Nfs-ganesha-devel] nTI-RPC fds

2017-01-16 Thread William Allen Simpson
Swen, your other very short patch fixes a problem with closing the
fd.  And that's a good thing.  But the underlying problem is that
we have multiple copies of the fd, and do not know whether it has
been closed.

I'm thinking that it would be best to have one copy, in the SVCXPRT,
and have everybody use that one.  It starts at 0, and could be set
to -1 to indicate that it has been closed -- rather than a closeit
flag as was done.

Any thoughts?



[Nfs-ganesha-devel] nTI-RPC refcounting and locking

2017-01-16 Thread William Allen Simpson
Swen, I've been looking at your patch, and it has some good ideas.
For some odd reason, I awoke at 1:30 am thinking about it, and
got up and wrote some code.

I've taken another patch of mine, and added the SVCXPRT into the
rpc_dplx_rec, eliminating the refcnt entirely (using the SVCXPRT).

After all, there's no reason to zalloc them separately.  They
always are created at the same time.

So I'm wondering about your thoughts on the locking.  They seem
redundant.  I'm thinking about changing REC_LOCK to use the
SVCXPRT xp_lock, instead.

There's a timing hole in the existing rpc_dplx_rec creation code:
after an existing record is found, the extra refcount is decremented.
Another thread could also decrement and free it, leaving a pointer
into freed memory.  Unifying the lock would be one solution (better
and faster than the usual solution with two locks).

The SVCXPRT lock code has a lot more debugging and testing, too.

Any other related ideas?

BTW, I got rid of the , too.  Changed it to a callback
function ;)



Re: [Nfs-ganesha-devel] [mdcache] unreachable directory

2017-01-16 Thread Daniel Gryniewicz
Hi, Swen.

Looking through that log, the failures of unlink() are returned from the 
sub_fsal, not directly caused by MDCACHE, so it's whatever's underneath 
(GPFS, presumably?) that's returning ENOTEMPTY:

16/01/2017 13:56:16 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-26] mdcache_unlink :INODE :DEBUG :unlink i 
returned The directory is not empty

Daniel

On 01/16/2017 08:25 AM, Swen Schillig wrote:
> Dan
>
> while I was performing some simple tests to validate some of my code,
> I stumbled over some possible mdcache issue.
>
> Here's what I'm doing. (ganesha-2.5-dev9)
> I create the following directory structure
>
> mkdir -p /home/swen//d/c/def/g/h/i/j/
>
> where /home/swen/ is the mount point for a [V3|V4.0] mounted VFS.
>
> While executing
>
> dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k 
> count=10 &
>
> I'm trying to delete
>
> rm -r /home/swen//d/c/def/g
>
> which fails with directory not empty, which might be ok while still writing.
> But even after the write command is finished the error persists.
>
> Even though it shouldn't matter but looking at the directory I can see that 
> it is empty
> [swen@localhost ~]$ ls -lia /home/swen//d/c/def/g
> total 8
> 1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .
> 1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 ..
>
> Removing the directory from the FS directly(not via the NFS mount-point) 
> succeeds.
>
> I've collected the logs for CACHE_INODE with FULL_DEBUG in case that might 
> help.
>
> # logs - trying to delete while writing with dd ##
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup ..
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup parent (..)
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:53:15 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
> mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
> 13:54:45
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-225] mdc_lookup :INODE :F_DBG :Lookup sepp.dd
> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
> attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 

[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: [valgrind] Mem-leak: Free channel client memory.

2017-01-16 Thread GerritHub
>From Swen Schillig :

Swen Schillig has uploaded a new change for review. (
https://review.gerrithub.io/343215 )


Change subject: [valgrind] Mem-leak: Free channel client memory.
..

[valgrind] Mem-leak: Free channel client memory.

Memory is not freed due to calling the wrong CB (release vs. destroy).

==11341== 2,364,219 (288 direct, 2,363,931 indirect) bytes in 3 blocks are 
definitely lost in loss record 347 of 355
==11341==at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==11341==by 0x44B611: gsh_malloc__ (abstract_mem.h:78)
==11341==by 0x6A8C2BB: clnt_vc_ncreate2 (clnt_vc.c:237)
==11341==by 0x6AA7129: clnt_vc_ncreate_svc (svc_vc.c:1292)
==11341==by 0x43C3DB: nfs_rpc_create_chan_v41 (nfs_rpc_callback.c:656)
==11341==by 0x461338: nfs4_op_create_session (nfs4_op_create_session.c:498)
==11341==by 0x45CC5A: nfs4_Compound (nfs4_Compound.c:734)
==11341==by 0x44A9D4: nfs_rpc_execute (nfs_worker_thread.c:1281)
==11341==by 0x44B296: worker_run (nfs_worker_thread.c:1548)
==11341==by 0x4FEE0E: fridgethr_start_routine (fridgethr.c:550)
==11341==by 0x665E5C9: start_thread (in /usr/lib64/libpthread-2.23.so)
==11341==by 0x6FE50EC: clone (in /usr/lib64/libc-2.23.so)

Change-Id: I24fa3b5c31eb219304f6e723d6b30ede720dc2ef
Signed-off-by: Swen Schillig 
---
M src/MainNFSD/nfs_rpc_callback.c
1 file changed, 7 insertions(+), 44 deletions(-)



  git pull ssh://review.gerrithub.io:29419/ffilz/nfs-ganesha 
refs/changes/15/343215/1
-- 
To view, visit https://review.gerrithub.io/343215
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I24fa3b5c31eb219304f6e723d6b30ede720dc2ef
Gerrit-Change-Number: 343215
Gerrit-PatchSet: 1
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Owner: Swen Schillig 
--
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi


Re: [Nfs-ganesha-devel] create_export :FSAL :CRIT :RGW module: librgw init failed (-5)

2017-01-16 Thread Daniel Gryniewicz
The lack of verbosity is not Ganesha's fault; it only gets the single 
error code back from Ceph.

Try turning up all of your client-related logging in your ceph.conf, 
and check its logs.

Daniel

On 01/14/2017 03:02 PM, Alessandro De Salvo wrote:
> Hi Daniel,
>
> indeed, this is the root cause, but I do not understand what's wrong
> here, nor the real cause of the failure. Since I was suspicious about the
> ceph.conf setup, I've already checked that it all works, and it does
> indeed. I can issue ceph -s or rados df without any problem.
>
> Here I give you a couple of examples:
>
>
> $ ceph -s
>
>  cluster aac2c2c4-5953-44d7-b90c-9922a8ccd77a
>   health HEALTH_OK
>   monmap e4: 3 mons at
> {mon1=:6789/0,mon2=:6789/0,mon3=:6789/0}
>  election epoch 124, quorum 0,1,2 mon3,mon2,mon1
>fsmap e42: 1/1/1 up {0=mds1=up:active}, 1 up:standby
>  mgr no daemons active
>   osdmap e16477: 54 osds: 52 up, 52 in
>  flags sortbitwise,require_jewel_osds,require_kraken_osds
>pgmap v656338: 800 pgs, 16 pools, 4136 MB data, 1326 objects
>  48076 MB used, 71839 GB / 71886 GB avail
>   800 active+clean
>
>
> $ rados df
>
> POOL_NAME                  USED   OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS  RD     WR_OPS  WR
> .rgw.root                  1681   4        0       12      0                   0        0         156     121k   4       5120
> cephfs_data                0      0        0       0       0                   0        0         0       0      0       0
> cephfs_metadata            2148   20       0       60      0                   0        0         23      24576  41      7168
> default.rgw.buckets.data   4136M  1092     0       3276    0                   0        0         150     124k   9745    4136M
> default.rgw.buckets.index  0      2        0       6       0                   0        0         796     1264k  442     0
> default.rgw.buckets.non-ec 0      0        0       0       0                   0        0         161     161k   128     0
> default.rgw.control        0      8        0       24      0                   0        0         0       0      0       0
> default.rgw.data.root      1228   4        0       12      0                   0        0         27      24576  61      15360
> default.rgw.gc             0      32       0       96      0                   0        0         6620    6588k  4416    0
> default.rgw.lc             0      32       0       96      0                   0        0         894     862k   448     0
> default.rgw.log            0      128      0       384     0                   0        0         151070  147M   100778  0
> default.rgw.users.keys     11     1        0       3       0                   0        0         21      14336  1       1024
> default.rgw.users.uid      358    2        0       6       0                   0        0         4507    4503k  4385    1024
> kraken-test                0      1        0       3       0                   0        0         652378  199G   371644  507G
> rbd                        0      0        0       0       0                   0        0         1309    5196M  2518    5000M
> scbench                    0      0        0       0       0                   0        0         1154    4608M  3458    4608M
>
>
> Any other hint? Of course, having more verbosity from the rados init
> would be very helpful, but even with FULL_DEBUG I always get the same
> messages and nothing more.
> Thanks,
>
>  Alessandro
>
> Il 13/01/17 19:37, Daniel Gryniewicz ha scritto:
>> Hi, Alessandro.
>>
>> This error (-5) is caused by the failure to initialize the RADOS client
>> in librados.  Can you perform ceph operations from that same host?
>> (say, ceph -s)  It's likely to be a problem in your ceph.conf, I think,
>> such as wrong or unreachable MON addresses.
>>
>> Daniel
>>
>> On 01/13/2017 12:39 PM, Alessandro De Salvo wrote:
>>> Hi,
>>> I'm trying to use the RGW FSAL on CentOS 7 with ceph kraken v11.1.1 and
>>> ganesha 2.4.1-2. I have rebuilt the RPMS from the rawhide fedora
>>> version, who is now including the RGW FSAL. When trying to run the
>>> ganesha daemon I get the following error:
>>>
>>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main]
>>> init :FSAL :DEBUG :RGW module registering.
>>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main]
>>> init_config :FSAL :DEBUG :RGW module setup.
>>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main]
>>> create_export :FSAL :CRIT :RGW module: librgw init failed (-5)
>>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main]
>>> fsal_put :FSAL :INFO :FSAL RGW now unused
>>>
>>>
>>> The daemon is run in a privileged docker container with
>>>
>>> /usr/bin/ganesha.nfsd -F -N NIV_DEBUG -L /var/log/ganesha.log
>>> -f /etc/ganesha/ganesha.conf
>>>
>>> The ceph.conf and keyrings are properly installed on the machine and in
>>> the container, and in fact I can access the ceph cluster and the RGW
>>> instance correctly.
>>> The ganesha configuration is the following:
>>>
>>> EXPORT
>>> {
>>> Export_ID=1;
>>>
>>> Path = "/atlas";
>>>
>>> Pseudo = "/atlas";
>>>
>>> Access_Type = RW;
>>>
>>>  SecType = "sys";
>>>
>>> FSAL {
>>> Name = RGW;
>>> User_Id = "testuser";
>>> Access_Key_Id ="testkey";
>>> 

[Nfs-ganesha-devel] [mdcache] unreachable directory

2017-01-16 Thread Swen Schillig
Dan,

while performing some simple tests to validate some of my code,
I stumbled over a possible mdcache issue.

Here's what I'm doing (ganesha-2.5-dev9).
I create the following directory structure:

mkdir -p /home/swen//d/c/def/g/h/i/j/

where /home/swen/ is the mount point for a [V3|V4.0] mounted VFS.

While executing

dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k count=10 &

I'm trying to delete 

rm -r /home/swen//d/c/def/g

which fails with "directory not empty". That might be OK while the write is
still in progress, but even after the dd command has finished, the error persists.

Even though it shouldn't matter, looking at the directory I can see that it
is empty:
[swen@localhost ~]$ ls -lia /home/swen//d/c/def/g
total 8
1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .
1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 ..

Removing the directory from the FS directly (not via the NFS mount point)
succeeds.
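For contrast, local-filesystem semantics guarantee that rmdir() returns ENOTEMPTY only while an entry actually exists, and succeeds as soon as the last entry is gone. A small sketch (Python, plain local FS, so it never touches ganesha/mdcache; a persistent ENOTEMPTY after the entry is gone would therefore suggest stale cached state on the server side):

```python
import errno
import os
import tempfile

root = tempfile.mkdtemp()
d = os.path.join(root, "g")
os.mkdir(d)

# Stand-in for the in-flight dd target: one entry in the directory.
open(os.path.join(d, "sepp.dd"), "w").close()

# While the entry exists, rmdir fails with ENOTEMPTY, as expected.
try:
    os.rmdir(d)
    raise AssertionError("rmdir unexpectedly succeeded")
except OSError as e:
    assert e.errno == errno.ENOTEMPTY

# As soon as the entry is removed, rmdir succeeds; there is no
# lingering "directory not empty" on a local filesystem.
os.remove(os.path.join(d, "sepp.dd"))
os.rmdir(d)
os.rmdir(root)
print("rmdir succeeded once the directory was empty")
```

So if the backing VFS export behaves like this locally, the stale ENOTEMPTY after dd finishes would point at the mdcache dirent cache not being invalidated, which matches the "trust content yes" lines in the logs below.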

I've collected the logs for CACHE_INODE with FULL_DEBUG in case that might help.

## logs - trying to delete while writing with dd ##
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup ..
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup parent (..)
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:53:15 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-225] mdc_lookup :INODE :F_DBG :Lookup sepp.dd
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs  obj 
attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 
mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 
13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdc_try_get_cached :INODE :F_DBG :Look in cache h, 
trust content yes
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-225] mdc_try_get_cached :INODE :F_DBG :Look in cache 
sepp.dd, trust content yes
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : 
ganesha.nfsd-14293[work-147] mdcache_avl_qp_lookup_s :INODE :F_DBG :Lookup h
16/01/2017 13:54:45 : epoch 587cb7dc :