** Also affects: linux (Ubuntu)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1639299

Title:
  acpi_pad consumes 100% of resources

Status in Nvidia:
  New
Status in linux package in Ubuntu:
  New

Bug description:
  acpi_pad will take up 100% of the CPU resources and slow the system to
  a crawl.  'rmmod acpi_pad' removes the offender and brings the system
  response back.

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  20765 root      20   0       0      0      0 R 100.0  0.0   5:07.99 xhpl
  20879 root      -2   0       0      0      0 R 100.0  0.0   7:12.40 acpi_pad/5
  20887 root      -2   0       0      0      0 R 100.0  0.0   6:57.72 acpi_pad/13
  20891 root      -2   0       0      0      0 R 100.0  0.0   7:05.74 acpi_pad/17
  20874 root      -2   0       0      0      0 R 100.0  0.0   7:15.16 acpi_pad/0
  20875 root      -2   0       0      0      0 R 100.0  0.0   7:14.76 acpi_pad/1
  20876 root      -2   0       0      0      0 R 100.0  0.0   7:13.54 acpi_pad/2
  20877 root      -2   0       0      0      0 R 100.0  0.0   7:13.54 acpi_pad/3
  20880 root      -2   0       0      0      0 R 100.0  0.0   7:11.44 acpi_pad/6
  20881 root      -2   0       0      0      0 R 100.0  0.0   7:11.17 acpi_pad/7
  20882 root      -2   0       0      0      0 R 100.0  0.0   7:05.42 acpi_pad/8
  20883 root      -2   0       0      0      0 R 100.0  0.0   7:10.80 acpi_pad/9
  20884 root      -2   0       0      0      0 R 100.0  0.0   7:09.50 acpi_pad/10
  20885 root      -2   0       0      0      0 R 100.0  0.0   7:09.66 acpi_pad/11
  20888 root      -2   0       0      0      0 R 100.0  0.0   7:07.30 acpi_pad/14
  20889 root      -2   0       0      0      0 R 100.0  0.0   7:07.37 acpi_pad/15
  20890 root      -2   0       0      0      0 R 100.0  0.0   7:05.50 acpi_pad/16
  20892 root      -2   0       0      0      0 R 100.0  0.0   7:04.40 acpi_pad/18
  20893 root      -2   0       0      0      0 R 100.0  0.0   7:04.21 acpi_pad/19
  20894 root      -2   0       0      0      0 R 100.0  0.0   7:03.70 acpi_pad/20
  20895 root      -2   0       0      0      0 R 100.0  0.0   7:03.63 acpi_pad/21
  20896 root      -2   0       0      0      0 R 100.0  0.0   7:01.61 acpi_pad/22
  20897 root      -2   0       0      0      0 R 100.0  0.0   7:01.66 acpi_pad/23
  20898 root      -2   0       0      0      0 R 100.0  0.0   7:00.80 acpi_pad/24
  20899 root      -2   0       0      0      0 R 100.0  0.0   7:00.81 acpi_pad/25
  20901 root      -2   0       0      0      0 R 100.0  0.0   6:58.79 acpi_pad/26
  20902 root      -2   0       0      0      0 R 100.0  0.0   6:58.96 acpi_pad/27
  20903 root      -2   0       0      0      0 R 100.0  0.0   6:57.82 acpi_pad/28
  20904 root      -2   0       0      0      0 R 100.0  0.0   6:57.83 acpi_pad/29
  20906 root      -2   0       0      0      0 R 100.0  0.0   6:55.54 acpi_pad/31
  20886 root      -2   0       0      0      0 R  99.7  0.0   7:08.80 acpi_pad/12
  20878 root      -2   0       0      0      0 R  98.4  0.0   7:12.20 acpi_pad/4
  20905 root      -2   0       0      0      0 R  98.4  0.0   6:55.85 acpi_pad/30
   3049 newrelic  20   0  245800   8388   4724 S  22.3  0.0   0:14.74 nrsysmond
  22126 root      20   0   19592   3876   2392 R   6.0  0.0   0:00.99 top
   1441 root      39  19       0      0      0 S   3.4  0.0   3:05.47 kipmi0
  20720 root      20   0  870276  13080   6208 S   1.6  0.0   0:01.50 collectd
      8 root      20   0       0      0      0 S   0.9  0.0   0:03.19 rcu_sched
     13 root      rt   0       0      0      0 S   0.3  0.0   0:00.03 watchdog/0
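
  For anyone triaging a similar capture, here is a rough sketch of how
  the busy acpi_pad threads could be pulled out of saved top output. It
  is not part of the original report: the function name busy_threads
  and the 90% threshold are illustrative assumptions, and it assumes
  the default top column layout with one process per line.

```python
import re

# One row of default-layout `top` output (assumed column order:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND).
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<user>\S+)\s+(?P<pr>\S+)\s+(?P<ni>-?\d+)\s+"
    r"(?P<virt>\S+)\s+(?P<res>\S+)\s+(?P<shr>\S+)\s+(?P<state>\S)\s+"
    r"(?P<cpu>[\d.]+)\s+(?P<mem>[\d.]+)\s+(?P<time>\S+)\s+(?P<cmd>\S.*?)\s*$"
)


def busy_threads(top_text, prefix="acpi_pad/", min_cpu=90.0):
    """Return (pid, command, %CPU) tuples for rows whose command starts
    with `prefix` and whose %CPU is at least `min_cpu`.

    `prefix` and `min_cpu` are hypothetical defaults chosen for this
    report, not values taken from any tool.
    """
    hits = []
    for line in top_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, blank line, or a row that does not parse
        if m["cmd"].startswith(prefix) and float(m["cpu"]) >= min_cpu:
            hits.append((int(m["pid"]), m["cmd"], float(m["cpu"])))
    return hits
```

  Rows whose command name has wrapped onto the following line do not
  match the pattern and are skipped, so the capture needs to be
  unwrapped first.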

  
  This has been seen on the 4.2 and 4.4 kernels.  I believe the LINPACK
  test suite was running in all cases this was seen.  However, it occurs
  pretty infrequently, and I don't know how to reliably recreate the
  issue.  It has only been seen on the DGX-1 Server, not on the DGX
  Station.  I'm not sure if any other systems have seen it.

  Another data point which may or may not be relevant is that C-states
  and P-states are enabled.

  We can work around this issue by blacklisting the acpi_pad module, or
  by using the acpi_pad.disable=1 kernel boot argument.  What are the
  implications of disabling acpi_pad?
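
  A minimal sketch for checking whether either workaround is in effect
  on a running system, assuming the usual Linux /proc/modules and
  /proc/cmdline text formats. The helper names are hypothetical, and
  the acpi_pad.disable=1 string is taken as-is from this report rather
  than verified against the driver.

```python
def module_loaded(proc_modules_text, name="acpi_pad"):
    """True if `name` is listed as a loaded module.

    Expects the contents of /proc/modules, where the module name is the
    first whitespace-separated field of each line. Drivers built into
    the kernel are not listed there.
    """
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


def has_boot_arg(cmdline_text, arg="acpi_pad.disable=1"):
    """True if `arg` appears as a whole token on the kernel command line.

    Expects the contents of /proc/cmdline. The default `arg` is the
    boot argument quoted in this report.
    """
    return arg in cmdline_text.split()
```

  On an affected machine these would be fed the text read from
  /proc/modules and /proc/cmdline respectively. A False result from
  module_loaded after blacklisting suggests the module is no longer
  being loaded.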

  Googling "acpi_pad uses up all the resource" returns many hits, all of
  which suggest simply disabling it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nvidia/+bug/1639299/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
