Public bug reported:

I've just upgraded a Dell XPS 15" (9550, early 2016 model) with a
Samsung NVMe drive to Zesty. The machine was stable under Kubuntu 16.10
with the same drive. Since the upgrade I've seen 3 hard lockups (the
machine loses its root fs) with the following message printed:

    nvme controller is down will reset

There are also messages printed to the virtual console reporting that
the home-directory encfs failed to write to the underlying disk.

Linux tass 4.10.0-19-generic #21-Ubuntu SMP Thu Apr 6 17:04:57 UTC 2017
x86_64 x86_64 x86_64 GNU/Linux

Ubuntu 17.04 (Kubuntu)

dmesg about nvme:
[    1.748864] nvme nvme0: pci function 0000:04:00.0
[    1.864553]  nvme0n1: p1 p2 p3 p4 p5 p6
[    2.961181] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null)
[    4.172546] EXT4-fs (nvme0n1p6): re-mounted. Opts: errors=remount-ro

nvme-cli shows 57 entries in the error log, all of which appear to be
"invalid field" or "invalid namespace" errors. I'm not sure whether they
have accumulated since boot or over the life of the machine.

smartctl shows:
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.10.0-19-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       PM951 NVMe SAMSUNG 512GB
Serial Number:                      S29PNXAH142328
Firmware Version:                   BXV77D0Q
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Utilization:            365,503,283,200 [365 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Thu Apr 13 23:21:32 2017 EDT
Firmware Updates (0x06):            3 Slots
Optional Admin Commands (0x0017):   Security Format Frmw_DL *Other*
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         32 Pages

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.00W       -        -    0  0  0  0        5       5
 1 +     4.20W       -        -    1  1  1  1       30      30
 2 +     3.10W       -        -    2  2  2  2      100     100
 3 -   0.0700W       -        -    3  3  3  3      500    5000
 4 -   0.0050W       -        -    4  4  4  4     2000   22000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        35 Celsius
Available Spare:                    100%
Available Spare Threshold:          50%
Percentage Used:                    0%
Data Units Read:                    2,724,346 [1.39 TB]
Data Units Written:                 6,568,756 [3.36 TB]
Host Read Commands:                 52,921,997
Host Write Commands:                157,530,880
Controller Busy Time:               1,349
Power Cycles:                       831
Power On Hours:                     5,358
Unsafe Shutdowns:                   46
Media and Data Integrity Errors:    0
Error Information Log Entries:      57

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0         57     0  0x0004  0x4016  0x000            0     1     -
  1         56     0  0x0004  0x4016  0x000            0     1     -
  2         55     0  0x0004  0x4016  0x000            0     1     -
  3         54     0  0x0004  0x4016  0x000            0     1     -
  4         53     0  0x0004  0x4016  0x000            0     1     -
  5         52     0  0x0004  0x4016  0x000            0     1     -
  6         51     0  0x0004  0x4016  0x000            0     1     -
  7         50     0  0x0004  0x4016  0x000            0     1     -
  8         49     0  0x001f  0x4004  0x000            0     0     -
  9         48     0  0x001e  0x4004  0x000            0     0     -
 10         47     0  0x001f  0x4004  0x000            0     0     -
 11         46     0  0x001e  0x4004  0x000            0     0     -
 12         45     0  0x001f  0x4004  0x000            0     0     -
 13         44     0  0x001e  0x4004  0x000            0     0     -
 14         43     0  0x0000  0x4016  0x000            0     1     -
 15         42     0  0x0004  0x4016  0x000            0     1     -
... (41 entries not shown)
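
The Status column above can be decoded by hand to check the "invalid
field" / "invalid namespace" reading. The following is only a minimal
sketch, assuming smartctl prints the raw 16-bit status word from the
error log entry with the phase tag in bit 0, as laid out in the NVMe
base specification. The function name decode_status and the small
lookup table are illustrative and not part of any tool.

```python
# Subset of the Generic Command Status codes from the NVMe base spec.
GENERIC_STATUS = {
    0x00: "Successful Completion",
    0x01: "Invalid Command Opcode",
    0x02: "Invalid Field in Command",
    0x0B: "Invalid Namespace or Format",
}


def decode_status(raw):
    """Split a 16-bit status word (phase tag assumed in bit 0) into fields."""
    sc = (raw >> 1) & 0xFF        # Status Code, bits 8:1
    sct = (raw >> 9) & 0x7        # Status Code Type, bits 11:9
    more = bool((raw >> 14) & 1)  # More: further details in the error log
    dnr = bool((raw >> 15) & 1)   # Do Not Retry
    if sct == 0:
        name = GENERIC_STATUS.get(sc, "unknown generic status")
    else:
        name = "non-generic status type"
    return {"sct": sct, "sc": sc, "more": more, "dnr": dnr, "name": name}


# The two distinct Status values that appear in the table above.
for raw in (0x4016, 0x4004):
    print(hex(raw), decode_status(raw))
```

Under that assumption, 0x4016 maps to "Invalid Namespace or Format" and
0x4004 to "Invalid Field in Command", both with the More bit set and Do
Not Retry clear.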

** Affects: linux-signed (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed in Ubuntu.
https://bugs.launchpad.net/bugs/1682704

Title:
  nvme controller is down will reset (regression in zesty on XPS laptop)

Status in linux-signed package in Ubuntu:
  New
