[nfs-discuss] Lots of .nfs files being left around
On Wed, 03 Dec 2008 12:53:34 +0100, Mike Gerdts wrote:

> On Wed, Dec 3, 2008 at 4:57 AM, Frank Batschulat (Home)
> wrote:
>> On Tue, 02 Dec 2008 23:04:39 +0100, Mike Gerdts wrote:
>>
>>> Unless there is some long-latent bug in CVS, this looks to be a
>>> regression in the NFSv3 client.
>>
>> quite possibly:
>>
>> nfs3_inactive can leave .nfsXXX files behind
>> http://bugs.opensolaris.org/view_bug.do?bug_id=5029852
>
> Ahh... search term should have been .nfsXXX rather than .nfs. :)
>
> The bug says that the problem exists in 5.10 as well but I have been
> unable to reproduce on S10u4 (different network topology) or S10u6
> (same network topology) against the same NFS file system. Is there

interesting indeed.

> something that changed in Nevada that would cause this condition to be
> triggered more frequently?

nothing I'm aware of yet as far as V3 is concerned, though that does not
mean there's nothing new in a different path ;-)

what we do have already, but for V4 is:

NFSv4 clients leave too many .nfsXXX files around
http://bugs.opensolaris.org/view_bug.do?bug_id=6636160

though it specifically claims using V3 cures the problem...

so this must be something rather new or yet unknown, would it
be possible to reproduce this somehow ?

---
frankB
[nfs-discuss] Lots of .nfs files being left around
On Tue, 02 Dec 2008 23:04:39 +0100, Mike Gerdts wrote:

> Unless there is some long-latent bug in CVS, this looks to be a
> regression in the NFSv3 client.

quite possibly:

nfs3_inactive can leave .nfsXXX files behind
http://bugs.opensolaris.org/view_bug.do?bug_id=5029852

I've moved the comments to the description, but they are not yet visible,
so here they are for the time being:

While testing the fix for 4903465, I found an error path which could leave
behind .nfsXXX files on the server.

If an open file has been renamed or unlinked then r_unldvp will be set.
nfs3_inactive() checks if r_unldvp is present and if so does the remove
operation for the .nfs file on the server. Before we do the remove we
unset r_unldvp.

If the thread which is doing the remove gets signalled before entering
rfscall(), then rfscall() will fail, returning RPC_INTR as the status.
Back in nfs3_inactive() we do not do anything with unldvp if the call has
failed, and we add the rnode to the free list. So the .nfsXXX file gets
left behind.

The number of files left behind can be huge if we are entering
nfs3_inactive() through nfs_purge_caches() of a directory and the thread
gets signalled in between. Since the only place we check for the signal is
rfscall(), in the code path

    nfs_purge_caches() -> dnlc_purge_vp() -> nfs3_inactive() ->
    rfs3call() -> rfscall()

we will end up with a lot of .nfsXXX files on the server.

--
frankB
It is always possible to agglutinate multiple separate problems
into a single complex interdependent solution.
In most cases this is a bad idea.
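
To make the error path in the bug comments above easier to follow, here is a toy model in bash. It is only an illustration of the ordering problem (reference cleared first, failed remove never retried), not the kernel code: rfscall_model, inactive_model and the scratch directory are hypothetical stand-ins invented for this sketch.

```bash
#!/usr/bin/env bash
# Toy model of the error path described above. NOT kernel code: every name
# here is a hypothetical stand-in for the routine it is named after.

server_dir=$(mktemp -d)   # stands in for a directory on the NFS server

# Models rfscall(): fails (like RPC_INTR) if the thread was signalled,
# otherwise performs the remove on the "server".
rfscall_model() {
    local path=$1 signalled=$2
    if [ "$signalled" = yes ]; then
        return 1
    fi
    rm -f "$path"
}

# Models nfs3_inactive(): forget the .nfsXXX reference first, then try the
# remove, and do nothing about a failure.
inactive_model() {
    local unld=$1 signalled=$2
    local target=$unld
    unld=""                                        # r_unldvp cleared up front
    rfscall_model "$target" "$signalled" || true   # failure silently dropped
    # the rnode would go to the free list here; nothing remembers $target
}

touch "$server_dir/.nfsA" "$server_dir/.nfsB"
inactive_model "$server_dir/.nfsA" no    # normal case: file removed
inactive_model "$server_dir/.nfsB" yes   # signalled case: file left behind

leftover=$(find "$server_dir" -name '.nfs*' | wc -l | tr -d ' ')
echo "left behind: $leftover"
```

In this model the second call leaves .nfsB in place because the only record of it was dropped before the remove was attempted, which is the shape of the problem the bug comments describe.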
[nfs-discuss] Lots of .nfs files being left around
On Wed, Dec 3, 2008 at 6:59 AM, Frank Batschulat (Home)
wrote:
> On Wed, 03 Dec 2008 12:53:34 +0100, Mike Gerdts wrote:
>
>> On Wed, Dec 3, 2008 at 4:57 AM, Frank Batschulat (Home)
>> wrote:
>>> On Tue, 02 Dec 2008 23:04:39 +0100, Mike Gerdts
>>> wrote:
>>>
>>>> Unless there is some long-latent bug in CVS, this looks to be a
>>>> regression in the NFSv3 client.
>>>
>>> quite possibly:
>>>
>>> nfs3_inactive can leave .nfsXXX files behind
>>> http://bugs.opensolaris.org/view_bug.do?bug_id=5029852
>>
>> Ahh... search term should have been .nfsXXX rather than .nfs. :)
>>
>> The bug says that the problem exists in 5.10 as well but I have been
>> unable to reproduce on S10u4 (different network topology) or S10u6
>> (same network topology) against the same NFS file system. Is there
>
> interesting indeed.
>
>> something that changed in Nevada that would cause this condition to be
>> triggered more frequently?
>
> nothing I'm aware of yet as far as V3 is concerned, though that does not
> mean there's nothing new in a different path ;-)
>
> what we do have already, but for V4 is:

I was almost sure you were going to say "just use NFSv4". :)

> NFSv4 clients leave too many .nfsXXX files around
> http://bugs.opensolaris.org/view_bug.do?bug_id=6636160
>
> though it specifically claims using V3 cures the problem...
>
> so this must be something rather new or yet unknown, would it
> be possible to reproduce this somehow ?

I have been able to reproduce it with the following (bash syntax).
NFS client is SXCE snv_99.

    export CVSROOT=/tmp/repo
    mkdir $CVSROOT
    cvs init
    cd $nfsdir
    mkdir foo
    cd foo
    touch {a..z} # creates 26 files
    cvs import foo bar baz
    cd ..
    rm -rf foo
    cvs co foo
    find foo -name .nfs\*

Typically there are a few .nfs files left in foo/CVS. When I tried it
against a NetApp, I typically got about 5 - 10 .nfs files. I just
reproduced against a S10u4 + 127111-09 server and got one .nfs file.
In each case, there was a high-speed (gigabit+, ~1.6 ms latency) MAN
(metropolitan area network) between the client and server.

--
Mike Gerdts
http://mgerdts.blogspot.com/
[nfs-discuss] Lots of .nfs files being left around
On Wed, Dec 3, 2008 at 4:57 AM, Frank Batschulat (Home) wrote:
> On Tue, 02 Dec 2008 23:04:39 +0100, Mike Gerdts wrote:
>
>> Unless there is some long-latent bug in CVS, this looks to be a
>> regression in the NFSv3 client.
>
> quite possibly:
>
> nfs3_inactive can leave .nfsXXX files behind
> http://bugs.opensolaris.org/view_bug.do?bug_id=5029852

Ahh... search term should have been .nfsXXX rather than .nfs. :)

The bug says that the problem exists in 5.10 as well but I have been
unable to reproduce on S10u4 (different network topology) or S10u6
(same network topology) against the same NFS file system. Is there
something that changed in Nevada that would cause this condition to be
triggered more frequently?

Thank you for tracking this down for me.

--
Mike Gerdts
http://mgerdts.blogspot.com/
[nfs-discuss] Lots of .nfs files being left around
On Tue, Dec 2, 2008 at 12:36 PM, Ben Rockwood wrote:
> Mike Gerdts wrote:
>> Over the last couple of months I have noticed lots of .nfs files being
>> left around while using cvs.
>>
>> A typical command:
>>
>> cvs -d $cvsroot co -d dotnfs-`uname -r` -r $release jass
>>
>> On snv_99:
>>
>> $ find dotnfs-5.11 -name .nfs\* | wc -l
>> 822
>>
>> That number does not diminish over time, unless I use rm to get rid of
>> the .nfs files.
>>
>> However, Solaris 10 looks lots better:
>>
>> $ find dotnfs-5.10 -name .nfs\* | wc -l
>> 0
>>
>> I saw similar things on snv_93. I have confirmed that only NFSv3 is
>> in use on each system using "nfsstat -c".
>>
>> Any clues?
>>
>
> .nfs files are created when a file is deleted that is still open. You
> should see a cronjob in the root crontab:
>
> /usr/lib/fs/nfs/nfsfind

The NFS server is a NetApp, so there is likely a different solution
needed. It does seem as though it automatically cleans them up on the
first read. Unfortunately, if that first read is part of a software
release process, the release code may have .nfs files in it.

> That cleans up the .nfs* files each week.
>
> So far the party line has been "fix your app". I've suffered a lot of
> .nfs* pain, but I'm not aware of any related bugs atm.

$ type cvs
cvs is hashed (/usr/bin/cvs)
$ cvs --version
Concurrent Versions System (CVS) 1.12.13 (client/server)
...

Based upon the pstamp for SUNWcvs my guess is that the "go fix your
app" is directed at the SFW consolidation.

$ pkgparam SUNWcvs VERSION PSTAMP
11.11.0,REV=2008.09.17.14.32
sfwnv20080917143303

FWIW the problem exists with CSWcvs 1.11.22,REV=2006.12.11 as well.

Unless there is some long-latent bug in CVS, this looks to be a
regression in the NFSv3 client.

--
Mike Gerdts
http://mgerdts.blogspot.com/
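
Since the worry above is that a release process may pick up stray .nfs files from a checkout, one possible stopgap is a check that refuses to proceed when any are present. The following is a minimal sketch under that assumption; check_no_dotnfs and the scratch directories are hypothetical names for illustration and are not part of cvs or any Solaris tool.

```bash
#!/usr/bin/env bash
# Hypothetical pre-release check: refuse to package a tree that still
# contains .nfs* leftovers. The function name is made up for this sketch.

check_no_dotnfs() {
    local tree=$1
    local found
    found=$(find "$tree" -name '.nfs*' -print)
    if [ -n "$found" ]; then
        echo "stray .nfs files under $tree:" >&2
        echo "$found" >&2
        return 1
    fi
    return 0
}

# Small demonstration on a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/clean/CVS" "$demo/dirty/CVS"
touch "$demo/clean/CVS/Entries" "$demo/dirty/CVS/.nfs1234"

# Record the exit status of each check without aborting the script.
check_no_dotnfs "$demo/clean" && clean_rc=0 || clean_rc=$?
check_no_dotnfs "$demo/dirty" 2>/dev/null && dirty_rc=0 || dirty_rc=$?
echo "clean=$clean_rc dirty=$dirty_rc"
```

This only detects the leftovers; it does not address why the client leaves them behind.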
[nfs-discuss] Lots of .nfs files being left around
Mike Gerdts wrote:
> Over the last couple of months I have noticed lots of .nfs files being
> left around while using cvs.
>
> A typical command:
>
> cvs -d $cvsroot co -d dotnfs-`uname -r` -r $release jass
>
> On snv_99:
>
> $ find dotnfs-5.11 -name .nfs\* | wc -l
> 822
>
> That number does not diminish over time, unless I use rm to get rid of
> the .nfs files.
>
> However, Solaris 10 looks lots better:
>
> $ find dotnfs-5.10 -name .nfs\* | wc -l
> 0
>
> I saw similar things on snv_93. I have confirmed that only NFSv3 is
> in use on each system using "nfsstat -c".
>
> Any clues?
>

.nfs files are created when a file is deleted that is still open. You
should see a cronjob in the root crontab:

/usr/lib/fs/nfs/nfsfind

That cleans up the .nfs* files each week.

So far the party line has been "fix your app". I've suffered a lot of
.nfs* pain, but I'm not aware of any related bugs atm.

benr.
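
For readers who have not looked at such a cleanup job, the idea can be sketched roughly as follows. This is not the contents of /usr/lib/fs/nfs/nfsfind; the function name clean_stale_dotnfs and the 7-day default are assumptions chosen to match the "each week" description. The age threshold matters because a recent .nfs file may still belong to a file some client has open.

```bash
#!/usr/bin/env bash
# Sketch of a periodic cleanup in the spirit of the cron job mentioned above.
# This is not the real nfsfind script; the function name and the 7-day
# default are assumptions for illustration.

clean_stale_dotnfs() {
    local dir=$1 days=${2:-7}
    # Only regular files named .nfs* whose mtime is more than $days days old.
    find "$dir" -type f -name '.nfs*' -mtime +"$days" -exec rm -f {} +
}

# Demonstration on a scratch directory.
scratch=$(mktemp -d)
touch "$scratch/.nfsNEW"                   # fresh, may still be in use
touch -t 200001010000 "$scratch/.nfsOLD"   # mtime far in the past
clean_stale_dotnfs "$scratch"
remaining=$(ls -A "$scratch")
echo "remaining: $remaining"
```

In the demonstration only the old file is removed and the fresh one is kept.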
[nfs-discuss] Lots of .nfs files being left around
Over the last couple of months I have noticed lots of .nfs files being
left around while using cvs.

A typical command:

cvs -d $cvsroot co -d dotnfs-`uname -r` -r $release jass

On snv_99:

$ find dotnfs-5.11 -name .nfs\* | wc -l
822

That number does not diminish over time, unless I use rm to get rid of
the .nfs files.

However, Solaris 10 looks lots better:

$ find dotnfs-5.10 -name .nfs\* | wc -l
0

I saw similar things on snv_93. I have confirmed that only NFSv3 is
in use on each system using "nfsstat -c".

Any clues?

--
Mike Gerdts
http://mgerdts.blogspot.com/
