Re: [Nfs-ganesha-devel] nTI-RPC refcounting and locking
It sounds like your change is a step in the direction of unifying CLNT and SVCXPRT handle structures. As we've discussed off-list, if you take on the project of unifying the actual handle structures, you get the lock consolidation for free. In any event, if rpc_dplx_rec contains a service handle expanded, it appears to need a client handle as well.

Matt

----- Original Message -----
> From: "William Allen Simpson"
> To: "NFS Ganesha Developers", "Swen Schillig"
> Sent: Monday, January 16, 2017 1:53:24 PM
> Subject: [Nfs-ganesha-devel] nTI-RPC refcounting and locking
>
> Swen, I've been looking at your patch, and it has some good ideas.
> For some odd reason, I awoke at 1:30 am thinking about it, and
> got up and wrote some code.
>
> I've taken another patch of mine, and added the SVCXPRT into the
> rpc_dplx_rec, eliminating the refcnt entirely (using the SVCXPRT).
>
> After all, there's no reason to zalloc them separately. They
> always are created at the same time.
>
> So I'm wondering about your thoughts on the locking. They seem
> redundant. I'm thinking about changing REC_LOCK to use the
> SVCXPRT xp_lock, instead.
>
> There's a spot in the existing rpc_dplx_rec creation code where
> there's a timing hole in the code after an existing one is
> found so the extra refcount is decremented. Another process
> could also decrement and free, and there could be a pointer into
> freed memory. Unifying the lock would be one solution (better
> and faster than the usual solution with two locks).
>
> The SVCXPRT lock code has a lot more debugging and testing, too.
>
> Any other related ideas?
>
> BTW, I got rid of the , too. Changed it to a callback
> function ;)
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309
Re: [Nfs-ganesha-devel] Request: I'm looking for good file system test programs
Frank asked: What sorts of tests?

My answer: I am hoping to benefit from the experience of this group, since this group has spent much more time than I have working with NFS-Ganesha (and perhaps other file systems). I hope that this group has identified useful tests (besides throughput tests).

For example, a file data write test would not only confirm that the write function did not report errors but would also read back previously written data to verify that the contents remain correct. Reads immediately after a write are very likely to be served from cache, so I expect a good test program would allow read-back validation of data at a later time (e.g. after a reboot).

This group is probably more knowledgeable than I am regarding the many possible file system operations: create, delete, open, close, read, write, get attribute, set attribute, etc.

On 01/16/2017 02:59 PM, Frank Filz wrote:
>> I'm looking for file system operation/stability test programs.
>>
>> I'm most interested in test programs that do many file operations that are
>> then verified rather than programs that concentrate on performance tests.
> What sorts of tests? There is the pjd-fstest test suite that tests POSIX
> compliance.
>
> Frank
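To make the write-then-verify idea above concrete, here is a minimal sketch in C. It is not taken from any existing test suite: the names write_pattern(), verify_pattern(), fill_block() and BLOCK_SIZE are made up for illustration, and the pattern generator is an arbitrary choice. The point is that the expected data is regenerated from a seed, so verification can run in a separate invocation (for example after a remount or reboot) without keeping a copy of what was written.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096	/* hypothetical block size for this sketch */

/* Fill one block with bytes derived from (seed, block index), so the same
 * contents can be regenerated later instead of being stored. */
static void fill_block(uint8_t *buf, uint32_t seed, uint32_t block)
{
	uint32_t state = seed ^ (block * 2654435761u);

	for (size_t i = 0; i < BLOCK_SIZE; i++) {
		state = state * 1664525u + 1013904223u;	/* simple LCG step */
		buf[i] = (uint8_t)(state >> 24);
	}
}

/* Read up to len bytes, retrying on short reads (which NFS may return).
 * Returns the number of bytes read, or -1 on I/O error. */
static ssize_t read_full(int fd, uint8_t *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);

		if (n < 0)
			return -1;
		if (n == 0)
			break;	/* end of file */
		done += (size_t)n;
	}
	return (ssize_t)done;
}

/* Write nblocks pattern blocks to path and flush them to stable storage.
 * Returns 0 on success, -1 on any error. */
int write_pattern(const char *path, uint32_t seed, uint32_t nblocks)
{
	uint8_t buf[BLOCK_SIZE];
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	for (uint32_t b = 0; b < nblocks; b++) {
		fill_block(buf, seed, b);
		if (write(fd, buf, BLOCK_SIZE) != BLOCK_SIZE) {
			close(fd);
			return -1;
		}
	}
	if (fsync(fd) != 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}

/* Re-read path and compare it with the regenerated pattern.
 * Returns 0 if all blocks match, 1 on mismatch or short file, -1 on error. */
int verify_pattern(const char *path, uint32_t seed, uint32_t nblocks)
{
	uint8_t expect[BLOCK_SIZE], got[BLOCK_SIZE];
	int rc = 0;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	for (uint32_t b = 0; b < nblocks && rc == 0; b++) {
		ssize_t n = read_full(fd, got, BLOCK_SIZE);

		if (n < 0) {
			rc = -1;
		} else if (n != BLOCK_SIZE) {
			rc = 1;
		} else {
			fill_block(expect, seed, b);
			if (memcmp(expect, got, BLOCK_SIZE) != 0)
				rc = 1;
		}
	}
	close(fd);
	return rc;
}
```

A driver would call write_pattern() once, then call verify_pattern() with the same seed and block count at some later time. Verifying right after the write mostly exercises the client cache, which is why the two steps are kept as separate functions.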
Re: [Nfs-ganesha-devel] [mdcache] unreachable directory
On 01/16/2017 03:08 PM, Swen Schillig wrote: > On Mo, 2017-01-16 at 10:52 -0500, Daniel Gryniewicz wrote: >> Hi, Swen. >> >> Looking through that log, the failures of unlink() are returned from >> the >> sub_fsal, not directly caused by MDCACHE, so it's whatever's >> underneath >> (GPFS, presumably?) that's returning ENOTEMPTY: > Hi Dan > > sorry for dumping that issue on you but I didn't really investigate and > my best guess was that it got something to do with mdcache. > Anyhow, it was not in relation with GPFS, I'm using VFS for my local > tests. > I will see if I can try the same with GPFS. No problem, it took me all of 2 minutes to find the line in the log, and where it came from, and I'm glad to help. Daniel > > Cheers Swen >> >> 16/01/2017 13:56:16 : epoch 587cb7dc : dhcp-9-244-58-137 : >> ganesha.nfsd-14293[work-26] mdcache_unlink :INODE :DEBUG :unlink i >> returned The directory is not empty >> >> Daniel >> >> On 01/16/2017 08:25 AM, Swen Schillig wrote: >>> >>> Dan >>> >>> while I was performing some simple tests to validate some of my >>> code, >>> I stumbled over some possible mdcache issue. >>> >>> Here's what I'm doing. (ganesha-2.5-dev9) >>> I create the following directory structure >>> >>> mkdir -p /home/swen//d/c/def/g/h/i/j/ >>> >>> where /home/swen/ is the mount point for a [V3|V4.0] mounted >>> VFS. >>> >>> While executing >>> >>> dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k >>> count=10 & >>> >>> I'm trying to delete >>> >>> rm -r /home/swen//d/c/def/g >>> >>> which fails with directory not empty, which might be ok while still >>> writing. >>> But even after the write command is finished the error persists. >>> >>> Even though it shouldn't matter but looking at the directory I can >>> see that it is empty >>> [swen@localhost ~]$ ls -lia /home/swen//d/c/def/g >>> total 8 >>> 1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 . >>> 1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .. 
>>> >>> Removing the directory from the FS directly(not via the NFS mount- >>> point) succeeds. >>> >>> I've collected the logs for CACHE_INODE with FULL_DEBUG in case >>> that might help. >>> >>> # logs - trying to delete while writing with dd >>> ## >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY >>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY >>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY >>> numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY >>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY >>> numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0001dfceMask = 0015dfce 
DIRECTORY >>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY >>> numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>> atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup .. >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup >>> parent (..) >>> 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : >>> ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG >>> :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY >>> numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 >>>
Re: [Nfs-ganesha-devel] [mdcache] unreachable directory
On Mo, 2017-01-16 at 10:52 -0500, Daniel Gryniewicz wrote: > Hi, Swen. > > Looking through that log, the failures of unlink() are returned from > the > sub_fsal, not directly caused by MDCACHE, so it's whatever's > underneath > (GPFS, presumably?) that's returning ENOTEMPTY: Hi Dan sorry for dumping that issue on you but I didn't really investigate and my best guess was that it got something to do with mdcache. Anyhow, it was not in relation with GPFS, I'm using VFS for my local tests. I will see if I can try the same with GPFS. Cheers Swen > > 16/01/2017 13:56:16 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-26] mdcache_unlink :INODE :DEBUG :unlink i > returned The directory is not empty > > Daniel > > On 01/16/2017 08:25 AM, Swen Schillig wrote: > > > > Dan > > > > while I was performing some simple tests to validate some of my > > code, > > I stumbled over some possible mdcache issue. > > > > Here's what I'm doing. (ganesha-2.5-dev9) > > I create the following directory structure > > > > mkdir -p /home/swen//d/c/def/g/h/i/j/ > > > > where /home/swen/ is the mount point for a [V3|V4.0] mounted > > VFS. > > > > While executing > > > > dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k > > count=10 & > > > > I'm trying to delete > > > > rm -r /home/swen//d/c/def/g > > > > which fails with directory not empty, which might be ok while still > > writing. > > But even after the write command is finished the error persists. > > > > Even though it shouldn't matter but looking at the directory I can > > see that it is empty > > [swen@localhost ~]$ ls -lia /home/swen//d/c/def/g > > total 8 > > 1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 . > > 1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .. > > > > Removing the directory from the FS directly(not via the NFS mount- > > point) succeeds. > > > > I've collected the logs for CACHE_INODE with FULL_DEBUG in case > > that might help. 
> > > > # logs - trying to delete while writing with dd > > ## > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY > > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY > > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY > > numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY > > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY > > numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY > > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > 
ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY > > numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup .. > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup > > parent (..) > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG > > :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY > > numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 > > atime=16/01/2017 13:53:15 mtime=16/01/2017 13:54:45 > > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG > >
Re: [Nfs-ganesha-devel] nTI-RPC fds
On Mon, 2017-01-16 at 14:21 -0500, William Allen Simpson wrote:
> Swen, your other very short patch fixes a problem with closing the
> fd. And that's a good thing. But the underlying problem is that
> we have multiple copies of the fd, and do not know whether it has
> been closed.
>
> I'm thinking that it would be best to have one copy, in the SVCXPRT,
> and have everybody use that one. It starts at 0, and could be set
> to -1 to indicate that it has been closed -- rather than a closeit
> flag as was done.
>
> Any thoughts?
>
Sounds good, go ahead.

Cheers Swen.
Re: [Nfs-ganesha-devel] nTI-RPC refcounting and locking
On Mon, 2017-01-16 at 13:53 -0500, William Allen Simpson wrote:
> Swen, I've been looking at your patch, and it has some good ideas.
> For some odd reason, I awoke at 1:30 am thinking about it, and
> got up and wrote some code.
>
I never intended to give you sleepless nights :-)

> I've taken another patch of mine, and added the SVCXPRT into the
> rpc_dplx_rec, eliminating the refcnt entirely (using the SVCXPRT).
>
> After all, there's no reason to zalloc them separately. They
> always are created at the same time.
>
> So I'm wondering about your thoughts on the locking. They seem
> redundant. I'm thinking about changing REC_LOCK to use the
> SVCXPRT xp_lock, instead.
>
> There's a spot in the existing rpc_dplx_rec creation code where
> there's a timing hole in the code after an existing one is
> found so the extra refcount is decremented. Another process
> could also decrement and free, and there could be a pointer into
> freed memory. Unifying the lock would be one solution (better
> and faster than the usual solution with two locks).
>
> The SVCXPRT lock code has a lot more debugging and testing, too.
>
> Any other related ideas?
>
> BTW, I got rid of the , too. Changed it to a callback
> function ;)

That sounds like a much more invasive change. I wasn't that brave at the time and just tried to fix what was wrong. Anyhow, a good rewrite of that area is always welcome.

It would be good if you could post/include all your patches as soon as possible, as I believe the ntirpc area does need some updates. I hope I will find some time again soon to help out as well, if that's ok.

Cheers Swen
Re: [Nfs-ganesha-devel] Request: I'm looking for good file system test programs
> I'm looking for file system operation/stability test programs.
>
> I'm most interested in test programs that do many file operations that are
> then verified rather than programs that concentrate on performance tests.

What sorts of tests? There is the pjd-fstest test suite that tests POSIX compliance.

Frank
[Nfs-ganesha-devel] Request: I'm looking for good file system test programs
I'm looking for file system operation/stability test programs.

I'm most interested in test programs that do many file operations that are then verified rather than programs that concentrate on performance tests.

Thanks in advance,
Kevin
[Nfs-ganesha-devel] nTI-RPC fds
Swen, your other very short patch fixes a problem with closing the fd. And that's a good thing. But the underlying problem is that we have multiple copies of the fd, and do not know whether it has been closed.

I'm thinking that it would be best to have one copy, in the SVCXPRT, and have everybody use that one. It starts at 0, and could be set to -1 to indicate that it has been closed -- rather than a closeit flag as was done.

Any thoughts?
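For readers following along, the single-copy idea can be sketched roughly as below. This is only an illustration in C, not ntirpc code: struct xprt_sketch and the xprt_sketch_*() functions are hypothetical stand-ins, and the real SVCXPRT has many more fields. The sketch shows one owner of the descriptor, with -1 doubling as the "already closed" marker so that no separate closeit flag is needed and a second close is harmless.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the transport handle; not the real SVCXPRT. */
struct xprt_sketch {
	pthread_mutex_t lock;
	int fd;		/* the only copy of the descriptor; -1 means closed */
};

/* Record the descriptor once; everybody else reads it from here. */
void xprt_sketch_init(struct xprt_sketch *xprt, int fd)
{
	assert(xprt != NULL);
	pthread_mutex_init(&xprt->lock, NULL);
	xprt->fd = fd;
}

/* Report whether the descriptor is still open (1) or closed (0). */
int xprt_sketch_is_open(struct xprt_sketch *xprt)
{
	int is_open;

	pthread_mutex_lock(&xprt->lock);
	is_open = (xprt->fd >= 0);
	pthread_mutex_unlock(&xprt->lock);
	return is_open;
}

/* Close the descriptor at most once.
 * Returns 1 if this call closed it, 0 if it was already closed. */
int xprt_sketch_close(struct xprt_sketch *xprt)
{
	int fd;

	/* Take the descriptor and mark it closed in one locked step, so two
	 * racing callers cannot both see a valid fd. */
	pthread_mutex_lock(&xprt->lock);
	fd = xprt->fd;
	xprt->fd = -1;
	pthread_mutex_unlock(&xprt->lock);

	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}
```

The swap-under-lock in xprt_sketch_close() is the part that matters: with several copies of the fd, each holder would have to guess whether someone else had already closed it.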
[Nfs-ganesha-devel] nTI-RPC refcounting and locking
Swen, I've been looking at your patch, and it has some good ideas. For some odd reason, I awoke at 1:30 am thinking about it, and got up and wrote some code.

I've taken another patch of mine, and added the SVCXPRT into the rpc_dplx_rec, eliminating the refcnt entirely (using the SVCXPRT).

After all, there's no reason to zalloc them separately. They always are created at the same time.

So I'm wondering about your thoughts on the locking. They seem redundant. I'm thinking about changing REC_LOCK to use the SVCXPRT xp_lock, instead.

There's a spot in the existing rpc_dplx_rec creation code where there's a timing hole in the code after an existing one is found so the extra refcount is decremented. Another process could also decrement and free, and there could be a pointer into freed memory. Unifying the lock would be one solution (better and faster than the usual solution with two locks).

The SVCXPRT lock code has a lot more debugging and testing, too.

Any other related ideas?

BTW, I got rid of the , too. Changed it to a callback function ;)
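As a rough picture of the embedding described above, the following C sketch allocates the record and its transport header in one block and keeps a single lock and a single reference count in the embedded part. It is an assumption-laden illustration, not the actual patch: struct xprt_hdr, struct dplx_rec, xp_refs, fd_k and the dplx_rec_*() functions are invented names (only xp_lock echoes a name from the message), and the real structures carry far more state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for SVCXPRT: owns the one lock and the one count. */
struct xprt_hdr {
	pthread_mutex_t xp_lock;
	unsigned int xp_refs;	/* hypothetical field name */
};

/* Hypothetical stand-in for rpc_dplx_rec with the transport embedded, so a
 * single allocation covers both and they share a lifetime. */
struct dplx_rec {
	struct xprt_hdr xprt;	/* embedded, not a pointer */
	int fd_k;		/* hypothetical lookup key */
};

/* Allocate record and transport together; the caller holds one reference. */
struct dplx_rec *dplx_rec_alloc(int fd_k)
{
	struct dplx_rec *rec = calloc(1, sizeof(*rec));

	if (rec == NULL)
		return NULL;
	pthread_mutex_init(&rec->xprt.xp_lock, NULL);
	rec->xprt.xp_refs = 1;
	rec->fd_k = fd_k;
	return rec;
}

/* Take an extra reference; the caller must already hold a valid one. */
void dplx_rec_ref(struct dplx_rec *rec)
{
	pthread_mutex_lock(&rec->xprt.xp_lock);
	assert(rec->xprt.xp_refs > 0);
	rec->xprt.xp_refs++;
	pthread_mutex_unlock(&rec->xprt.xp_lock);
}

/* Drop a reference. Returns 1 if this call freed the record, else 0. */
int dplx_rec_unref(struct dplx_rec *rec)
{
	unsigned int left;

	pthread_mutex_lock(&rec->xprt.xp_lock);
	assert(rec->xprt.xp_refs > 0);
	left = --rec->xprt.xp_refs;
	pthread_mutex_unlock(&rec->xprt.xp_lock);

	if (left != 0)
		return 0;
	/* Last reference gone: nobody else may touch rec from here on. */
	pthread_mutex_destroy(&rec->xprt.xp_lock);
	free(rec);
	return 1;
}
```

The sketch deliberately leaves out the lookup table. The timing hole mentioned in the message lives there: whatever protects the lookup has to be held until the found record's reference has been taken, otherwise another thread can drop the last reference in between.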
Re: [Nfs-ganesha-devel] [mdcache] unreachable directory
Hi, Swen.

Looking through that log, the failures of unlink() are returned from the sub_fsal, not directly caused by MDCACHE, so it's whatever's underneath (GPFS, presumably?) that's returning ENOTEMPTY:

16/01/2017 13:56:16 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-26] mdcache_unlink :INODE :DEBUG :unlink i returned The directory is not empty

Daniel

On 01/16/2017 08:25 AM, Swen Schillig wrote:
> Dan
>
> while I was performing some simple tests to validate some of my code,
> I stumbled over some possible mdcache issue.
>
> Here's what I'm doing. (ganesha-2.5-dev9)
> I create the following directory structure
>
> mkdir -p /home/swen//d/c/def/g/h/i/j/
>
> where /home/swen/ is the mount point for a [V3|V4.0] mounted VFS.
>
> While executing
>
> dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k count=10 &
>
> I'm trying to delete
>
> rm -r /home/swen//d/c/def/g
>
> which fails with directory not empty, which might be ok while still writing.
> But even after the write command is finished the error persists.
>
> Even though it shouldn't matter but looking at the directory I can see that
> it is empty
> [swen@localhost ~]$ ls -lia /home/swen//d/c/def/g
> total 8
> 1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .
> 1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 ..
>
> Removing the directory from the FS directly(not via the NFS mount-point)
> succeeds.
>
> I've collected the logs for CACHE_INODE with FULL_DEBUG in case that might
> help.
> > # logs - trying to delete while writing with dd ## > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG :attrs obj 
> attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup .. > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup parent (..) > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:53:15 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 > mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 > 13:54:45 > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-225] mdc_lookup :INODE :F_DBG :Lookup sepp.dd > 16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : > ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj > attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000
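As background to the ENOTEMPTY diagnosis in this message, the small C sketch below shows the plain POSIX behaviour that a file system underneath Ganesha would be expected to report: removing a directory that still has an entry fails with ENOTEMPTY (POSIX also permits EEXIST), and succeeds once the entry is gone. The helper names try_rmdir() and demo_enotempty() are made up for this illustration, and it runs against a local directory, not through Ganesha, so it does not reproduce the reported problem.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Try to remove a directory; return 0 on success or the errno value. */
int try_rmdir(const char *path)
{
	assert(path != NULL);
	if (rmdir(path) == 0)
		return 0;
	return errno;
}

/* Create base/child, then check that base can only be removed after child.
 * Returns 0 if the observed sequence matches POSIX, -1 otherwise. */
int demo_enotempty(const char *base)
{
	char child[4096];
	int rc;

	if (mkdir(base, 0755) != 0)
		return -1;
	snprintf(child, sizeof(child), "%s/child", base);
	if (mkdir(child, 0755) != 0) {
		rmdir(base);
		return -1;
	}

	/* The parent still has an entry, so removal must be refused. */
	rc = try_rmdir(base);
	if (rc != ENOTEMPTY && rc != EEXIST) {
		rmdir(child);
		rmdir(base);
		return -1;
	}

	/* Remove the child first; the parent then goes away cleanly. */
	if (try_rmdir(child) != 0)
		return -1;
	return try_rmdir(base) == 0 ? 0 : -1;
}
```

In the report, the directory looks empty through the NFS mount while the removal is still refused, which is why the returned error alone does not tell you which layer holds the stale view.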
[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: [valgrind] Mem-leak: Free channel client memory.
From Swen Schillig:

Swen Schillig has uploaded a new change for review. ( https://review.gerrithub.io/343215 )

Change subject: [valgrind] Mem-leak: Free channel client memory.

[valgrind] Mem-leak: Free channel client memory.

Memory is not free'd due to calling the wrong CB (release vs destroy).

==11341== 2,364,219 (288 direct, 2,363,931 indirect) bytes in 3 blocks are definitely lost in loss record 347 of 355
==11341==    at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==11341==    by 0x44B611: gsh_malloc__ (abstract_mem.h:78)
==11341==    by 0x6A8C2BB: clnt_vc_ncreate2 (clnt_vc.c:237)
==11341==    by 0x6AA7129: clnt_vc_ncreate_svc (svc_vc.c:1292)
==11341==    by 0x43C3DB: nfs_rpc_create_chan_v41 (nfs_rpc_callback.c:656)
==11341==    by 0x461338: nfs4_op_create_session (nfs4_op_create_session.c:498)
==11341==    by 0x45CC5A: nfs4_Compound (nfs4_Compound.c:734)
==11341==    by 0x44A9D4: nfs_rpc_execute (nfs_worker_thread.c:1281)
==11341==    by 0x44B296: worker_run (nfs_worker_thread.c:1548)
==11341==    by 0x4FEE0E: fridgethr_start_routine (fridgethr.c:550)
==11341==    by 0x665E5C9: start_thread (in /usr/lib64/libpthread-2.23.so)
==11341==    by 0x6FE50EC: clone (in /usr/lib64/libc-2.23.so)

Change-Id: I24fa3b5c31eb219304f6e723d6b30ede720dc2ef
Signed-off-by: Swen Schillig
---
M src/MainNFSD/nfs_rpc_callback.c
1 file changed, 7 insertions(+), 44 deletions(-)

git pull ssh://review.gerrithub.io:29419/ffilz/nfs-ganesha refs/changes/15/343215/1
--
To view, visit https://review.gerrithub.io/343215
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I24fa3b5c31eb219304f6e723d6b30ede720dc2ef
Gerrit-Change-Number: 343215
Gerrit-PatchSet: 1
Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-Owner: Swen Schillig
Re: [Nfs-ganesha-devel] create_export :FSAL :CRIT :RGW module: librgw init failed (-5)
The lack of verbosity is not Ganesha's fault; it only gets the single error code back from Ceph. Try turning up all your client related logging in your ceph.conf, and check it's logging? Daniel On 01/14/2017 03:02 PM, Alessandro De Salvo wrote: > Hi Daniel, > > indeed, this is the root cause, but I do not understand what's wrong > here and the real cause of the failure. Since I was suspicious about the > ceph.conf setup I've already checked that it all works, and it does > indeed. I can issue ceph -s or rados df without any problem. > > Here I give you a couple of examples: > > > $ ceph -s > > cluster aac2c2c4-5953-44d7-b90c-9922a8ccd77a > health HEALTH_OK > monmap e4: 3 mons at > {mon1=:6789/0,mon2=:6789/0,mon3=:6789/0} > election epoch 124, quorum 0,1,2 mon3,mon2,mon1 >fsmap e42: 1/1/1 up {0=mds1=up:active}, 1 up:standby > mgr no daemons active > osdmap e16477: 54 osds: 52 up, 52 in > flags sortbitwise,require_jewel_osds,require_kraken_osds >pgmap v656338: 800 pgs, 16 pools, 4136 MB data, 1326 objects > 48076 MB used, 71839 GB / 71886 GB avail > 800 active+clean > > > $ rados df > > POOL_NAME USED OBJECTS CLONES COPIES > MISSING_ON_PRIMARY UNFOUND DEGRAED RD_OPS RDWR_OPS WR > .rgw.root 1681 4 0 12 > 0 0 0156 121k 4 5120 > cephfs_data0 0 0 0 > 0 0 0 0 0 0 0 > cephfs_metadata 2148 20 0 60 > 0 0 0 23 24576 41 7168 > default.rgw.buckets.data 4136M1092 0 3276 > 0 0 0150 124k 9745 4136M > default.rgw.buckets.index 0 2 0 6 > 0 0 0796 1264k442 0 > default.rgw.buckets.non-ec 0 0 0 0 > 0 0 0161 161k128 0 > default.rgw.control0 8 0 24 > 0 0 0 0 0 0 0 > default.rgw.data.root 1228 4 0 12 > 0 0 0 27 24576 61 15360 > default.rgw.gc 0 32 0 96 > 0 0 0 6620 6588k 4416 0 > default.rgw.lc 0 32 0 96 > 0 0 0894 862k448 0 > default.rgw.log0 128 0 384 > 0 0 0 151070 147M 100778 0 > default.rgw.users.keys11 1 0 3 > 0 0 0 21 14336 1 1024 > default.rgw.users.uid358 2 0 6 > 0 0 0 4507 4503k 4385 1024 > kraken-test0 1 0 3 > 0 0 0 652378 199G 371644 507G > rbd0 0 0 0 > 0 0 0 1309 5196M 2518 
5000M > scbench0 0 0 0 > 0 0 0 1154 4608M 3458 4608M > > > Any other hint? Of course, having more verbosity from the rados init > would be very helpful, but even with FULL_DEBUG I always get the same > messages and nothing more. > Thanks, > > Alessandro > > Il 13/01/17 19:37, Daniel Gryniewicz ha scritto: >> Hi, Alessandro. >> >> This error (-5) is caused by the failure to initialize the RADOS client >> in librados. Can you perform ceph operations from that same host? >> (say, ceph -s) It's likely to be a problem in your ceph.conf, I think, >> such as wrong or unreachable MON addresses. >> >> Daniel >> >> On 01/13/2017 12:39 PM, Alessandro De Salvo wrote: >>> Hi, >>> I'm trying to use the RGW FSAL on CentOS 7 with ceph kraken v11.1.1 and >>> ganesha 2.4.1-2. I have rebuilt the RPMS from the rawhide fedora >>> version, who is now including the RGW FSAL. When trying to run the >>> ganesha daemon I get the following error: >>> >>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main] >>> init :FSAL :DEBUG :RGW module registering. >>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main] >>> init_config :FSAL :DEBUG :RGW module setup. >>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main] >>> create_export :FSAL :CRIT :RGW module: librgw init failed (-5) >>> 13/01/2017 17:21:15 : epoch 58790c88 : node1 : ganesha.nfsd-1[main] >>> fsal_put :FSAL :INFO :FSAL RGW now unused >>> >>> >>> The daemon is run in a privileged docker container with >>> >>> /usr/bin/ganesha.nfsd -F -N NIV_DEBUG -L /var/log/ganesha.log >>> -f /etc/ganesha/ganesha.conf >>> >>> All the ceph.conf and keyrings are properly installed in the machine and >>> container, and in fact I can access the ceph cluster correctly and the >>> RGW instance. 
>>> The ganesha configuration is the following: >>> >>> EXPORT >>> { >>> Export_ID=1; >>> >>> Path = "/atlas"; >>> >>> Pseudo = "/atlas"; >>> >>> Access_Type = RW; >>> >>> SecType = "sys"; >>> >>> FSAL { >>> Name = RGW; >>> User_Id = "testuser"; >>> Access_Key_Id ="testkey"; >>>
[Nfs-ganesha-devel] [mdcache] unreachable directory
Dan

while I was performing some simple tests to validate some of my code,
I stumbled over some possible mdcache issue.

Here's what I'm doing. (ganesha-2.5-dev9)
I create the following directory structure

mkdir -p /home/swen//d/c/def/g/h/i/j/

where /home/swen/ is the mount point for a [V3|V4.0] mounted VFS.

While executing

dd if=/dev/urandom of=/home/swen//d/c/def/g/h/i/j/sepp.dd bs=4k count=10 &

I'm trying to delete

rm -r /home/swen//d/c/def/g

which fails with directory not empty, which might be ok while still writing.
But even after the write command is finished the error persists.

Even though it shouldn't matter but looking at the directory I can see that it is empty

[swen@localhost ~]$ ls -lia /home/swen//d/c/def/g
total 8
1376503 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 .
1376502 drwxrwxr-x 3 swen swen 4096 16. Jan 13:54 ..

Removing the directory from the FS directly(not via the NFS mount-point) succeeds.

I've collected the logs for CACHE_INODE with FULL_DEBUG in case that might help.
# logs - trying to delete while writing with dd

16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-240] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-173] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-192] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-225] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x2 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup ..
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdc_lookup :INODE :F_DBG :Lookup parent (..)
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0001dfceMask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:53:15 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-225] mdc_lookup :INODE :F_DBG :Lookup sepp.dd
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdcache_getattrs :INODE :F_DBG :attrs obj attributes Mask = 0680Mask = 0015dfce DIRECTORY numlinks=0x3 size=0x1000 mode=0775 owner=0x3e8 group=0x3e8 atime=16/01/2017 13:54:45 mtime=16/01/2017 13:54:45
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdc_try_get_cached :INODE :F_DBG :Look in cache h, trust content yes
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-225] mdc_try_get_cached :INODE :F_DBG :Look in cache sepp.dd, trust content yes
16/01/2017 13:54:45 : epoch 587cb7dc : dhcp-9-244-58-137 : ganesha.nfsd-14293[work-147] mdcache_avl_qp_lookup_s :INODE :F_DBG :Lookup h
16/01/2017 13:54:45 : epoch 587cb7dc :