Bug#855632: Bug#883217: linux: open on NFSv4 exported file on nfs server: "Resource temporarily unavailable" under reproducible conditions when client has granted read delegation on file
Hi Stephen,

On Thu, Dec 14, 2017 at 05:51:12PM -0700, Stephen Dowdy wrote:
>
>
> On 12/14/2017 12:51 PM, Salvatore Bonaccorso wrote:
> > Hi Stephen,
> >
> > On Mon, Dec 04, 2017 at 09:24:55PM +0100, Salvatore Bonaccorso wrote:
> >> Hi
> >>
> >> On Thu, Nov 30, 2017 at 03:35:40PM -0700, Stephen Dowdy wrote:
> >>> On 11/30/2017 01:39 PM, Salvatore Bonaccorso wrote:
> >>>> Is this worth trying to be fixed for the jessie kernel?
> >>>
> >>> Salvatore,
> >>>
> >>> I believe this is likely the reason for my bug report:
> >>>
> >>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855632
> >>>
> >>> as that system has thrown EAGAIN errors since i installed it in April,
> >>> 2015.
> >>> It's a 10 NIC NFS server for the department, and often throws the error
> >>> when i update files that are likely being read/open by client systems.
> >>> (it doesn't have a huge resource consumption load ever and i get that
> >>> failure)
> >>>
> >>> So, i vote yeah ;)
> >>
> >> Okay.
> >
> > Did you got a chance to test this as well for your case of #855632?
> >
> > Regards,
> > Salvatore
> >
>
> Salvatore,
>
> Sorry i didn't respond. things have been way crazy. Unfortunately, i
> probably won't be able to test because:
> - problem is not reproducible easily sometimes
> - this machine services several hundred systems w/o any upcoming scheduled
>   downtime.
>
> I haven't noticed the problem on any other machines we have, though, so don't
> have any other candidates for testing.

Many thanks for the reply; it is much appreciated to see where we stand.
Yes, I can perfectly understand the reasoning.

The change is now pending for 3.16.51-4 (or any later iteration via a
point release of jessie), so if you happen not to have updated to
stretch by then and see your issue resolved as well, we can close the
second bug.

Regards,
Salvatore
Bug#855632: Bug#883217: linux: open on NFSv4 exported file on nfs server: "Resource temporarily unavailable" under reproducible conditions when client has granted read delegation on file
On 12/14/2017 12:51 PM, Salvatore Bonaccorso wrote:
> Hi Stephen,
>
> On Mon, Dec 04, 2017 at 09:24:55PM +0100, Salvatore Bonaccorso wrote:
>> Hi
>>
>> On Thu, Nov 30, 2017 at 03:35:40PM -0700, Stephen Dowdy wrote:
>>> On 11/30/2017 01:39 PM, Salvatore Bonaccorso wrote:
>>>> Is this worth trying to be fixed for the jessie kernel?
>>>
>>> Salvatore,
>>>
>>> I believe this is likely the reason for my bug report:
>>>
>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855632
>>>
>>> as that system has thrown EAGAIN errors since i installed it in April, 2015.
>>> It's a 10 NIC NFS server for the department, and often throws the error
>>> when i update files that are likely being read/open by client systems.
>>> (it doesn't have a huge resource consumption load ever and i get that
>>> failure)
>>>
>>> So, i vote yeah ;)
>>
>> Okay.
>
> Did you got a chance to test this as well for your case of #855632?
>
> Regards,
> Salvatore
>

Salvatore,

Sorry i didn't respond. things have been way crazy. Unfortunately, i
probably won't be able to test because:

- problem is not reproducible easily sometimes
- this machine services several hundred systems w/o any upcoming scheduled
  downtime.

I haven't noticed the problem on any other machines we have, though, so don't
have any other candidates for testing.

I may just take the "upgrade to stretch" solution out of this when i have
some scheduled downtime.

thanks,
--stephen
Bug#855632: Bug#883217: linux: open on NFSv4 exported file on nfs server: "Resource temporarily unavailable" under reproducible conditions when client has granted read delegation on file
Hi Stephen,

On Mon, Dec 04, 2017 at 09:24:55PM +0100, Salvatore Bonaccorso wrote:
> Hi
>
> On Thu, Nov 30, 2017 at 03:35:40PM -0700, Stephen Dowdy wrote:
> > On 11/30/2017 01:39 PM, Salvatore Bonaccorso wrote:
> > > Is this worth trying to be fixed for the jessie kernel?
> >
> > Salvatore,
> >
> > I believe this is likely the reason for my bug report:
> >
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855632
> >
> > as that system has thrown EAGAIN errors since i installed it in April, 2015.
> > It's a 10 NIC NFS server for the department, and often throws the error
> > when i update files that are likely being read/open by client systems.
> > (it doesn't have a huge resource consumption load ever and i get that
> > failure)
> >
> > So, i vote yeah ;)
>
> Okay.

Did you get a chance to test this as well for your case of #855632?

Regards,
Salvatore
Bug#855632: Bug#883217: linux: open on NFSv4 exported file on nfs server: "Resource temporarily unavailable" under reproducible conditions when client has granted read delegation on file
Hi

On Thu, Nov 30, 2017 at 03:35:40PM -0700, Stephen Dowdy wrote:
> On 11/30/2017 01:39 PM, Salvatore Bonaccorso wrote:
> > Is this worth trying to be fixed for the jessie kernel?
>
> Salvatore,
>
> I believe this is likely the reason for my bug report:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855632
>
> as that system has thrown EAGAIN errors since i installed it in April, 2015.
> It's a 10 NIC NFS server for the department, and often throws the error when
> i update files that are likely being read/open by client systems.
> (it doesn't have a huge resource consumption load ever and i get that failure)
>
> So, i vote yeah ;)

Okay. I tried to track that further down, and attached

0001-locks-remove-i_have_this_lease-check-from-__break_le.patch
0002-locks-__break_lease-cleanup-in-preparation-of-allowi.patch

to be applied on top of the current jessie branch in git. Attached are
the two individual patches:

locks-remove-i_have_this_lease-check-from-__break_le.patch
locks-__break_lease-cleanup-in-preparation-of-allowi.patch

With these two patches applied I have not been able to reproduce the
problem for a while now, whereas previously it could be triggered
relatively quickly. Can you confirm that the issue is addressed for you
as well?

See the kernel-handbook for the simple-patching guideline:

https://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s4.2.2

Still, since the upstream patches were part of a bigger rewrite touching
fs/locks.c and fs/nfsd, this might need a proper, deeper review of
whether the backport is complete and does not break anything.

Regards,
Salvatore

From 6997c3a97579e46cb839c334b4b9b6f96c3b573b Mon Sep 17 00:00:00 2001
From: Salvatore Bonaccorso
Date: Mon, 4 Dec 2017 11:11:28 +0100
Subject: [PATCH 1/2] locks: remove i_have_this_lease check from __break_lease

---
 debian/changelog | 6 +++
 ...e-i_have_this_lease-check-from-__break_le.patch | 54 ++
 debian/patches/series | 1 +
 3 files changed, 61 insertions(+)
 create mode 100644 debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch

diff --git a/debian/changelog b/debian/changelog
index 977e1cea3..955b86f56 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+linux (3.16.51-3) UNRELEASED; urgency=medium
+
+  * locks: remove i_have_this_lease check from __break_lease
+
+ -- Salvatore Bonaccorso  Mon, 04 Dec 2017 12:17:53 +0100
+
 linux (3.16.51-2) jessie; urgency=medium
 
   * [mips*] inst: Avoid ABI change in 3.16.51
diff --git a/debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch b/debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
new file mode 100644
index 0..04a778b40
--- /dev/null
+++ b/debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
@@ -0,0 +1,54 @@
+From: Jeff Layton
+Date: Mon, 1 Sep 2014 14:27:43 -0400
+Subject: locks: remove i_have_this_lease check from __break_lease
+Origin: https://git.kernel.org/linus/843c6b2f4cef384af8e0de6b7ac7191675030e3a
+
+I think that the intent of this code was to ensure that a process won't
+deadlock if it has one fd open with a lease on it and then breaks that
+lease by opening another fd. In that case it'll treat the __break_lease
+call as if it were non-blocking.
+
+This seems wrong -- the process could (for instance) be multithreaded
+and managing different fds via different threads. I also don't see any
+mention of this limitation in the (somewhat sketchy) documentation.
+
+Remove the check and the non-blocking behavior when i_have_this_lease
+is true.
+
+Signed-off-by: Jeff Layton
+[carnil: Backport for 3.16:
+ - adjust context
+]
+---
+ fs/locks.c | 6 ++
+ 1 file changed, 2 insertions(+), 4 deletions(-)
+
+--- a/fs/locks.c
++++ b/fs/locks.c
+@@ -1326,7 +1326,6 @@ int __break_lease(struct inode *inode, u
+ 	struct file_lock *new_fl, *flock;
+ 	struct file_lock *fl;
+ 	unsigned long break_time;
+-	int i_have_this_lease = 0;
+ 	bool lease_conflict = false;
+ 	int want_write = (mode & O_ACCMODE) != O_RDONLY;
+ 
+@@ -1346,8 +1345,7 @@ int __break_lease(struct inode *inode, u
+ 	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+ 		if (leases_conflict(fl, new_fl)) {
+ 			lease_conflict = true;
+-			if (fl->fl_owner == current->files)
+-				i_have_this_lease = 1;
++			break;
+ 		}
+ 	}
+ 	if (!lease_conflict)
+@@ -1377,7 +1375,7 @@ int __break_lease(struct inode *inode, u
+ 		fl->fl_lmops->lm_break(fl);
+ 	}
+ 
+-	if (i_have_this_lease || (mode & O_NONBLOCK)) {
++	if (mode & O_NONBLOCK) {
+ 		trace_break_lease_noblock(inode, new_fl);
+ 		error = -EWOULDBLOCK;
+ 		goto out;
diff --git a/debian/patches/series b/debian/patches/series
index 4cd4a739c..4ab96adb2 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -251,6 +251,7 @@
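
For readers who want to see the effect of the last fs/locks.c hunk in isolation, here is a minimal userspace sketch of the condition it changes. This is a model under stated assumptions, not kernel code: the function names break_lease_model_old(), break_lease_model_new() and check_model(), and the caller_holds_lease parameter (standing in for the kernel's i_have_this_lease flag), are hypothetical. It assumes a conflicting lease already exists on the file and only models whether the call gives up with -EWOULDBLOCK or goes on to wait for the lease to be broken. On Linux, EWOULDBLOCK has the same value as EAGAIN, which is the "Resource temporarily unavailable" error from the bug title.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Userspace model only, NOT kernel code. All names here are hypothetical.
 * Assumption: a conflicting lease is already present on the file.
 * Return value: -EWOULDBLOCK if the call bails out without waiting,
 * 0 if it would go on to wait for the lease break.
 */

/* Before the patch: a caller that holds the lease is forced onto the
 * non-blocking path, even without O_NONBLOCK. */
static inline int break_lease_model_old(bool caller_holds_lease, int mode)
{
	if (caller_holds_lease || (mode & O_NONBLOCK))
		return -EWOULDBLOCK;
	return 0;
}

/* After the patch: only an explicit O_NONBLOCK selects that path. */
static inline int break_lease_model_new(bool caller_holds_lease, int mode)
{
	(void)caller_holds_lease; /* no longer consulted */
	if (mode & O_NONBLOCK)
		return -EWOULDBLOCK;
	return 0;
}

/* Walk the four combinations; exactly one of them changes behaviour. */
static inline void check_model(void)
{
	/* Blocking open by the lease holder: the only case that differs. */
	assert(break_lease_model_old(true, O_WRONLY) == -EWOULDBLOCK);
	assert(break_lease_model_new(true, O_WRONLY) == 0);

	/* Blocking open by anyone else: waits in both versions. */
	assert(break_lease_model_old(false, O_WRONLY) == 0);
	assert(break_lease_model_new(false, O_WRONLY) == 0);

	/* O_NONBLOCK: fails immediately in both versions. */
	assert(break_lease_model_old(true, O_WRONLY | O_NONBLOCK) == -EWOULDBLOCK);
	assert(break_lease_model_new(true, O_WRONLY | O_NONBLOCK) == -EWOULDBLOCK);
	assert(break_lease_model_old(false, O_WRONLY | O_NONBLOCK) == -EWOULDBLOCK);
	assert(break_lease_model_new(false, O_WRONLY | O_NONBLOCK) == -EWOULDBLOCK);
}
```

Calling check_model() from a small main() is enough to exercise it. The model deliberately says nothing about why the owner comparison matched in the NFS delegation case; it only shows which combination of inputs stops returning the error once the check is removed.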