Bug#1017720: nfs-common: No such file or directory

2022-09-22 Thread Jason Breitman
The issue also occurs when using the lookupcache=none option along with the 
5.10.X kernel.
I had hoped this option would work so that I could then investigate its 
performance impact, but it is not a viable workaround.
I believe that I am out of options to try with the 5.10.X kernel.
Please let me know where we stand.

Bug#1017720: nfs-common: No such file or directory

2022-09-21 Thread Jason Breitman
After running stricter tests on more servers, I now know that this behavior 
also exists in Debian Buster 10.8, specifically with the 4.19.X kernel.
The 4.19.X kernel recovers on its own immediately after the No such file or 
directory error, whereas the 5.X kernels require me to clear the inode and 
dentry cache by running echo 2 > /proc/sys/vm/drop_caches.
What further information is required to resolve this issue?

Bug#1017720: nfs-common: No such file or directory

2022-09-13 Thread Jason Breitman
I downgraded the nfs-common package which required the downgrade of the 
libevent packages and am using the 4.19.X kernel.
I see the issue running the initial test, but then the issue is gone when 
running the test a subsequent time.

libevent-2.1-6:amd64           2.1.8-stable-4       amd64  Asynchronous event notification library
libevent-core-2.1-6:amd64      2.1.8-stable-4       amd64  Asynchronous event notification library (core)
libevent-pthreads-2.1-6:amd64  2.1.8-stable-4       amd64  Asynchronous event notification library (pthreads)
linux-image-4.19.0-21-amd64    4.19.249-2           amd64  Linux 4.19 for 64-bit PCs (signed)
nfs-common                     1:1.3.4-2.5+deb10u1  amd64  NFS support files common to client and server

What other packages do I need to downgrade in order to get Debian 11.4 to 
behave like Debian 10.8?
What additional questions can I answer so that we can move forward?

Bug#1017720: nfs-common: No such file or directory

2022-09-06 Thread Jason Breitman
I also see the failure with the kernels below, but the 4.19.X kernel resolves 
the issue without dropping caches.
linux-image-4.19.0-14-amd64  4.19.171-2  amd64  Linux 4.19 for 64-bit PCs (signed)
linux-image-4.19.0-21-amd64  4.19.249-2  amd64  Linux 4.19 for 64-bit PCs (signed)

I see the issue running the initial test, but then the issue is gone when 
running the test a subsequent time.
I ran several tests to verify the behavior differences between the 4.19.X and 
5.X kernels.

-- Test
ls -l /mnt/dir/someOtherDir/* | grep '?'

-- Error message - the error message is showing files that have been erased via 
rsync --delete
ls: cannot access 'filename': No such file or directory
-? ? ???? filename
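
The grep '?' filter in the test above also matches file names that contain a literal question mark. A slightly stricter filter is sketched below. It assumes the usual GNU ls -l layout for an entry that cannot be stat'ed (a type character followed by nine question marks in the mode column). filter_unstatable is a hypothetical helper name, not part of any package.

```sh
#!/bin/sh
# filter_unstatable: read `ls -l` output on stdin and print only the lines
# whose mode column is all question marks, i.e. entries that the directory
# listing returned but that could not be stat'ed.
# Assumption: GNU ls layout, one type character followed by nine '?'.
filter_unstatable() {
    grep -E '^[-dlbcps?]\?{9}'
}

# Example (the path is the placeholder used in the test above):
#   ls -l /mnt/dir/someOtherDir/* 2>/dev/null | filter_unstatable
```

The function returns a non-zero status when no such line is found, so it can also be used directly as a condition in a monitoring script.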

Bug#1017720: nfs-common: No such file or directory

2022-09-02 Thread Jason Breitman
I have tested with the following kernels and see this issue in each case.

linux-image-5.10.0-16-amd64         5.10.127-1         amd64  Linux 5.10 for 64-bit PCs (signed)
linux-image-5.15.0-0.bpo.3-amd64    5.15.15-2~bpo11+1  amd64  Linux 5.15 for 64-bit PCs (signed)
linux-image-5.18.0-0.deb11.3-amd64  5.18.14-1~bpo11+1  amd64  Linux 5.18 for 64-bit PCs (signed)

An interesting note is that when using the 5.18 kernel, I had to run
echo 3 > /proc/sys/vm/drop_caches to resolve the issue.
echo 2 > /proc/sys/vm/drop_caches did not work there, although it did on the 
5.10 and 5.15 kernels.
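
For reference, the kernel's sysctl documentation describes the drop_caches values as follows: 1 frees the page cache, 2 frees reclaimable slab objects (which include dentries and inodes), and 3 frees both. The wrapper below is only a sketch of how the two commands above could be run consistently. The drop_caches function name and the DROP_CACHES_FILE override (useful for a dry run without root) are hypothetical and not part of any package.

```sh
#!/bin/sh
# drop_caches MODE: ask the kernel to drop clean caches.
#   1 = page cache
#   2 = reclaimable slab objects (dentries and inodes)
#   3 = both
# DROP_CACHES_FILE is a hypothetical override for dry runs; by default the
# real sysctl file is written, which requires root.
drop_caches() {
    mode=$1
    target=${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}
    case $mode in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3" >&2; return 2 ;;
    esac
    sync    # flush dirty data first; only clean objects can be dropped
    printf '%s\n' "$mode" > "$target"
}
```

Running drop_caches 2 corresponds to the command that worked on 5.10 and 5.15, and drop_caches 3 to the one needed on 5.18.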

Bug#1017720: nfs-common: No such file or directory

2022-08-26 Thread Jason Breitman
I was able to identify another workaround today which may help you to identify 
the issue.
The workaround is to touch the directory where the troubled files live on the 
file server.
I believe this tells us that the cache relies on the directory's modify time 
attribute to notice changes.
It should be noted that access time updates are disabled on the file server.

I also wanted to restate that we use rsync to push out these application 
updates and also use rsync to sync data files.
Our rsync options preserve timestamps, so it is possible that the new files 
have an older timestamp than "now".
It is not the case that the new files have an older timestamp than the prior 
version that is stuck in the cache.

The rsync process that I describe has not changed and has been in use for many 
years.
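
The touch workaround described above can be sketched as a small helper to run on the file server. It sets the directory's modification time to the current time and prints the value before and after, so the change can be confirmed. bump_dir_mtime is a hypothetical name, and stat -c is assumed to be the GNU coreutils version.

```sh
#!/bin/sh
# bump_dir_mtime DIR: set DIR's modification time to now and print the
# mtime before and after as epoch seconds ("BEFORE AFTER").
# Assumption: GNU coreutils stat (for the -c %Y format).
bump_dir_mtime() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    before=$(stat -c %Y -- "$dir")
    touch -- "$dir"
    after=$(stat -c %Y -- "$dir")
    printf '%s %s\n' "$before" "$after"
}
```

This only changes the directory's own timestamp. The files inside it keep the timestamps that rsync preserved.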

Bug#1017720: nfs-common: No such file or directory

2022-08-25 Thread Jason Breitman
I have the same issue after adding actimeo=30 to /etc/fstab, rebooting and 
testing.
I also confirmed that those settings applied via /proc/mounts which shows the 
below snippet for each mountpoint.
nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,acregmin=30,acregmax=30,acdirmax=30,hard,noresvport,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=X.X.X.X,lookupcache=pos,local_lock=none,addr=Y.Y.Y.Y 0 0
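
Checking the effective options by eye in a long option string is error-prone, so a minimal sketch for extracting them is shown here. It assumes the standard /proc/mounts field order (device, mount point, type, options, dump, pass). mount_opt and nfs_mount_opts are hypothetical helper names.

```sh
#!/bin/sh
# mount_opt OPTIONS KEY: print the value of KEY from a comma-separated
# mount option string. Returns 1 if KEY is absent. For a flag without a
# value (for example "hard") an empty line is printed.
mount_opt() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F= -v key="$2" '
        $1 == key { print $2; found = 1 }
        END { exit !found }'
}

# nfs_mount_opts [FILE]: print "mountpoint options" for every nfs or nfs4
# entry in FILE (default: /proc/mounts).
nfs_mount_opts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2, $4 }' "${1:-/proc/mounts}"
}
```

For the option string shown above, mount_opt "$opts" acregmin prints 30 and mount_opt "$opts" lookupcache prints pos.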

Bug#1017720: nfs-common: No such file or directory

2022-08-23 Thread Jason Breitman
What additional information can I provide for us to move forward with this 
process?

To summarize and include further details, rsync is used to sync applications to 
a file server which behaves like a repository.
We do preserve timestamps from the build server and also use --delete.  We do 
not run the applications from the file server.  All servers use NTP.

The application has a sub-directory that contains files with version numbers.  
These are libraries.
When a new build is complete, a developer pushes their updates via rsync to the 
file server / repository.

I believe that the dentry cache thinks the "old" files exist and generates a No 
such file or directory error showing question marks for those files' attributes.
Dropping the dentry cache via echo 2 > /proc/sys/vm/drop_caches resolves the 
issue.

This behavior is not observed in Debian 10.8 with that distribution's associated 
kernel and packages.

> -----Original Message-----
> From: Jason Breitman
> Sent: Friday, August 19, 2022 9:52 PM
> To: Ben Hutchings ; 1017...@bugs.debian.org
> Subject: RE: Bug#1017720: nfs-common: No such file or directory
> 
> > -----Original Message-----
> > From: Ben Hutchings 
> > Sent: Friday, August 19, 2022 7:27 PM
> > To: Jason Breitman ;
> > 1017...@bugs.debian.org
> > Subject: Re: Bug#1017720: nfs-common: No such file or directory
> >
> > Control: tag -1 moreinfo
> >
> > On Fri, 2022-08-19 at 13:16 +, Jason Breitman wrote:
> > > Package: nfs-common
> > > Version: 1:1.3.4-6
> > > Severity: important
> > >
> > > Kernel: 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64
> > > GNU/Linux
> > >
> > > -- Description
> > > After updating and or creating new files on our file server via
> > > rsync, we see many files report the error message below from NFSv4
> > > clients since upgrading from Debian 10.8 to Debian 11.4.
> > > Clearing the dentry cache resolves the issue right away.
> > > I am not sure that nfs-common is the package to blame, but listed
> > > it based on the bug submission recommendations.
> >
> > The NFS implementation is mostly in the kernel, so probably this issue
> > belongs there.  But the kernel team is responsible for both packages.
> >
> > [...]
> > > -- Error message
> > > ls: cannot access 'filename': No such file or directory
> > > -? ? ???? filename
> > [...]
> >
> > So we know the file's there but can't stat it.  I think this means the
> > client has cached the handle of the old file of that name, which has
> > been deleted.
> >
> > - Are client and server clocks closely synchronised?  If not, that
> > needs to be fixed.
> >
> The clocks are synchronized using NTP.
> 
> > - Are clients likely to read this directory while rsync is running, or
> > shortly before?  If so, it may help to reduce the attribute caching
> > timeout on the client.  See the "Directory entry caching" section in
> > the nfs(5) manual page.
> >
> Clients are not likely to read this directory while rsync is running for the
> observed cases.  That can happen in our environment, but not in this case.
> I am using the lookupcache=pos option.  I tried noac, but the performance
> penalty was too much.  Which option are you referring to and what setting
> do you recommend testing?
> 
> > I don't know why you're only seeing this after an upgrade of the
> > clients, though.  I'm not aware that there has been any big change to
> > attribute caching.
> >
> I appreciate you responding to my report and am happy to answer any
> questions.
> We have multiple monitors and log scrapers to detect "file not found"
> exceptions that would let us know if this was happening before.
> To share more, I have 2 environments mounting from the same file server.
> Each environment has several servers.  The issue is only seen in the
> environment running Debian 11.4.
> I also should have mentioned that the files in question have a version
> number appended.  filename-.  When the file is updated via rsync, it is
> called filename-1112 and the prior file is removed.  The error is about
> filename-.
> I am not sure if this is the proper terminology, but the issue appears to be
> the negative dentry cache.
> 
> > Ben.
> >
> > --
> > Ben Hutchings
> > Beware of bugs in the above code;
> > I have only proved it correct, not tried it. - Donald Knuth
> 
> Jason Breitman
Jason Breitman


Bug#1017720: nfs-common: No such file or directory

2022-08-19 Thread Jason Breitman
> -Original Message-
> From: Ben Hutchings 
> Sent: Friday, August 19, 2022 7:27 PM
> To: Jason Breitman ;
> 1017...@bugs.debian.org
> Subject: Re: Bug#1017720: nfs-common: No such file or directory
> 
> Control: tag -1 moreinfo
> 
> On Fri, 2022-08-19 at 13:16 +, Jason Breitman wrote:
> > Package: nfs-common
> > Version: 1:1.3.4-6
> > Severity: important
> >
> > Kernel: 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64
> > GNU/Linux
> >
> > -- Description
> > After updating and/or creating new files on our file server via
> > rsync, we see many files report the error message below from NFSv4
> > clients since upgrading from Debian 10.8 to Debian 11.4.
> > Clearing the dentry cache resolves the issue right away.
> > I am not sure that nfs-common is the package to blame, but listed
> > it based on the bug submission recommendations.
> 
> The NFS implementation is mostly in the kernel, so probably this issue
> belongs there.  But the kernel team is responsible for both packages.
> 
> [...]
> > -- Error message
> > ls: cannot access 'filename': No such file or directory
> > -? ? ???? filename
> [...]
> 
> So we know the file's there but can't stat it.  I think this means the
> client has cached the handle of the old file of that name, which has
> been deleted.
> 
> - Are client and server clocks closely synchronised?  If not, that
> needs to be fixed.
> 
The clocks are synchronized using NTP.  

> - Are clients likely to read this directory while rsync is running, or
> shortly before?  If so, it may help to reduce the attribute caching
> timeout on the client.  See the "Directory entry caching" section in
> the nfs(5) manual page.
>
Clients are not likely to read this directory while rsync is running for the 
observed cases.  That can happen in our environment, but not in this case.
I am using the lookupcache=pos option.  I tried noac, but the performance 
penalty was too much.  Which option are you referring to and what setting do 
you recommend testing?
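
For reference, the nfs(5) options that bound how long a client trusts cached directory attributes are acdirmin and acdirmax (actimeo sets all the attribute cache timeouts at once). The entry below is only a sketch of an /etc/fstab line for testing a shorter directory timeout. It reuses the mount from the original report with rsize/wsize left out for brevity, and the values 1 and 5 are arbitrary assumptions, not a setting recommended anywhere in this thread.

```text
# Assumed test values: acdirmin=1, acdirmax=5 (seconds)
nfs-server.domain.com:/dir  /mnt/dir  nfs4  lookupcache=pos,acdirmin=1,acdirmax=5,noresvport,sec=krb5,hard  0  0
```

Shorter timeouts mean more GETATTR traffic to the server, so the performance cost would need to be measured, as it was for noac.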

> I don't know why you're only seeing this after an upgrade of the
> clients, though.  I'm not aware that there has been any big change to
> attribute caching.
> 
I appreciate you responding to my report and am happy to answer any questions.
We have multiple monitors and log scrapers to detect "file not found" 
exceptions that would let us know if this was happening before.
To share more, I have 2 environments mounting from the same file server.  Each 
environment has several servers.  The issue is only seen in the environment 
running Debian 11.4.
I also should have mentioned that the files in question have a version number 
appended.  filename-.  When the file is updated via rsync, it is called 
filename-1112 and the prior file is removed.  The error is about filename-.
I am not sure if this is the proper terminology, but the issue appears to be 
the negative dentry cache.

> Ben.
> 
> --
> Ben Hutchings
> Beware of bugs in the above code;
> I have only proved it correct, not tried it. - Donald Knuth

Jason Breitman


Bug#1017720: nfs-common: No such file or directory

2022-08-19 Thread Jason Breitman
Package: nfs-common
Version: 1:1.3.4-6
Severity: important

Kernel: 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64 GNU/Linux

-- Description
After updating and/or creating new files on our file server via rsync, we
see many files report the error message below from NFSv4 clients since 
upgrading from Debian 10.8 to Debian 11.4.
Clearing the dentry cache resolves the issue right away.
I am not sure that nfs-common is the package to blame, but listed it based 
on the bug submission recommendations. 

-- Test
ls -l /mnt/dir/someOtherDir/* | grep '?'

-- Error message
ls: cannot access 'filename': No such file or directory
-? ? ???? filename
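
The grep '?' in the test above also matches file names that happen to contain a question mark. A slightly stricter check, sketched here under the assumption that ls marks an entry it could not stat with "?" in the mode column, looks only at the first field. The function name filter_unstatable is hypothetical and the sketch assumes file names without whitespace.

```shell
# Hypothetical helper, not part of the original report.
# Reads `ls -l` output on stdin and prints the last field of every line
# whose first (mode) column contains "?", i.e. entries ls could not stat.
# Assumption: file names contain no whitespace.
filter_unstatable() {
    awk 'index($1, "?") > 0 { print $NF }'
}
```

Possible usage against the directory from the test: ls -l /mnt/dir/someOtherDir 2>/dev/null | filter_unstatable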

-- Workaround
/usr/bin/sync && echo 2 > /proc/sys/vm/drop_caches

-- /etc/fstab snippet --
nfs-server.domain.com:/dir  /mnt/dir  nfs4  lookupcache=pos,noresvport,sec=krb5,hard,rsize=1048576,wsize=10485760  0

Jason Breitman