Public bug reported:

We have a cluster of compute nodes that we use for highly CPU-intensive
tasks.  To keep them minimal, they are headless and diskless: they
PXE-boot and run NFS-root from a central server.  We recently upgraded
the PXE-boot kernel for these nodes to an Ubuntu 16.04 series kernel.
Specifically, uname -a on the nodes returns:

Linux fluorine 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

After the upgrade, we started noticing that whenever the NFS link was
heavily loaded and the NFS server got bogged down trying to keep up with
the requests, the diskless clients would spontaneously unmount their
/proc filesystem.  This sounds very much like the following report:

https://forums.gentoo.org/viewtopic-t-1010040-start-0.html

The fix for that issue was supposedly integrated into the mainline
kernel as of 3.19.2, but we never saw the problem until right after
upgrading to 4.4.0.  Was the patch that fixed this in 3.19 somehow
reverted in 4.4, or was a new trigger for the bug introduced?
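
To pin down exactly when /proc disappears, so the time can be compared
with load on the NFS server, something like the following could be run
on an affected client.  This is only a minimal sketch and not something
from our actual setup: the function names, the polling interval, and the
max_checks/sleep parameters are all made up for illustration.  It relies
on os.path.ismount(), which compares the device of a path with that of
its parent directory, so it should keep working after /proc itself is
gone.

```python
import os
import time


def is_mounted(path="/proc"):
    """Return True if `path` is currently a mount point.

    os.path.ismount() returns False for paths that do not exist, and for
    a plain directory on the same filesystem as its parent, which is what
    /proc becomes on the NFS root once procfs has been unmounted.
    """
    return os.path.ismount(path)


def watch_mount(path="/proc", interval=1.0, max_checks=None, sleep=time.sleep):
    """Poll `path` until it stops being a mount point.

    Returns the Unix timestamp at which the mount was first seen missing,
    or None if `max_checks` polls completed with the mount still present.
    `max_checks=None` (hypothetical default) means poll indefinitely.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if not is_mounted(path):
            return time.time()
        checks += 1
        sleep(interval)
    return None
```

Calling watch_mount() with the defaults blocks until /proc goes away and
then returns the time of the first failed check; printing that value
with time.ctime() gives a timestamp to line up against the NFS server's
logs.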

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-83-generic (not installed)
ProcVersionSignature: Ubuntu 4.4.0-97.120-generic 4.4.87
Uname: Linux 4.4.0-97-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.10
Architecture: amd64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC1:  chaz       2928 F.... pulseaudio
 /dev/snd/controlC0:  chaz       2928 F.... pulseaudio
CurrentDesktop: XFCE
Date: Fri Nov  3 14:26:29 2017
InstallationDate: Installed on 2014-07-22 (1200 days ago)
InstallationMedia: Xubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: Micro-Star International Co., Ltd. GT70 2PC
ProcFB:
 0 inteldrmfb
 1 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-97-generic root=UUID=9dda5f94-8c6c-45a4-8a4c-38b3f69a8b15 ro rootfstype=btrfs rootflags=discard,ssd,compress=no,subvol=@ nosplash
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-97-generic N/A
 linux-backports-modules-4.4.0-97-generic  N/A
 linux-firmware                            1.157.12
SourcePackage: linux
UpgradeStatus: Upgraded to xenial on 2016-08-08 (452 days ago)
dmi.bios.date: 04/17/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: E1763IMS.71D
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: MS-1763
dmi.board.vendor: Micro-Star International Co., Ltd.
dmi.board.version: REV:0.C
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 10
dmi.chassis.vendor: Micro-Star International Co., Ltd.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrE1763IMS.71D:bd04/17/2015:svnMicro-StarInternationalCo.,Ltd.:pnGT702PC:pvrREV0.C:rvnMicro-StarInternationalCo.,Ltd.:rnMS-1763:rvrREV0.C:cvnMicro-StarInternationalCo.,Ltd.:ct10:cvrToBeFilledByO.E.M.:
dmi.product.name: GT70 2PC
dmi.product.version: REV:0.C
dmi.sys.vendor: Micro-Star International Co., Ltd.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug xenial

** Attachment added: "proc_version and lspci from affected diskless client"
   https://bugs.launchpad.net/bugs/1729993/+attachment/5003214/+files/debug_traces_from_fluorine.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1729993

Title:
  proc umounts on nfsroot client when nfs server is heavily loaded

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1729993/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
