Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
On Thursday, 7 January 2021 7:42:02 PM AEDT Dominique Dumont wrote:
> smartctl 7.2 works fine on my system (see below).

Thank you for confirmation.

> Dmitry, could you update smartmontools on Debian ? Given that this bug may
> freeze KDE on login, it would be unfortunate not to update it for bullseye.

Of course, of course. Uploading now. :)

--
Kind regards,
 Dmitry Smirnov
 GPG key : 4096R/52B6BBD953968D1B

---

And how long a lockdown is enough? If we open now, will lockdown recur in autumn? Next year? Whenever authoritarianism so wishes? No dictatorship could imagine a better precedent for absolute control.
    -- https://www.bmj.com/content/369/bmj.m1924.long :: BMJ 2020;369:m1924
       "Should governments continue lockdown to slow the spread of covid-19?"
Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
On Tue, 8 Dec 2020 18:44:21 +0100 Christian Franke wrote:
> Please test recent build from https://builds.smartmontools.org/

smartctl 7.2 works fine on my system (see below).

Dmitry, could you update smartmontools on Debian? Given that this bug may freeze KDE on login, it would be unfortunate not to update it for bullseye.

All the best

Dod

$ sudo /home/domi/bin/smartctl -a /dev/nvme0
smartctl 7.2 2020-12-30 r5154 [x86_64-linux-5.10.0-1-amd64] (CircleCI)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Micron 2200S NVMe 512GB
Serial Number:                      [redacted]
Firmware Version:                   22001040
PCI Vendor/Subsystem ID:            0x1344
IEEE OUI Identifier:                0x00a075
Controller ID:                      0
NVMe Version:                       1.2.1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 0126c27b15
Local Time is:                      Thu Jan 7 09:36:15 2021 CET
Firmware Updates (0x02):            1 Slot
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0017):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x02):         Cmd_Eff_Lg
Maximum Data Transfer Size:         128 Pages
Warning Comp. Temp. Threshold:      82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.25W       -        -    0  0  0  0        0       0
 1 +     2.40W       -        -    1  1  1  1        0       0
 2 +     1.90W       -        -    2  2  2  2        0       0
 3 -   0.0800W       -        -    3  3  3  312500
 4 -   0.0050W       -        -    4  4  4  45 175000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        29 Celsius
Available Spare:                    100%
Available Spare Threshold:          50%
Percentage Used:                    1%
Data Units Read:                    2,059,578 [1.05 TB]
Data Units Written:                 4,331,740 [2.21 TB]
Host Read Commands:                 34,359,590
Host Write Commands:                68,933,776
Controller Busy Time:               1,383
Power Cycles:                       201
Power On Hours:                     2,517
Unsafe Shutdowns:                   83
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning Comp. Temperature Time:     0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               29 Celsius
Temperature Sensor 2:               31 Celsius

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
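Editorial sketch, not part of the original message: the reports in this thread name smartctl 7.0 and 7.1 as showing the failure and 7.2 as working on one Micron 2200S system. Assuming only that, the banner line that smartctl prints could be checked before the error log is queried. The function names and the threshold constant below are hypothetical, and the threshold reflects this thread alone rather than any official compatibility statement.

```python
import re

# Assumption taken from this thread only: 7.2 was the first version
# reported to work with the Micron 2200S. Not an authoritative list.
FIRST_VERSION_REPORTED_WORKING = (7, 2)


def parse_smartctl_version(banner):
    """Return (major, minor) from a 'smartctl X.Y ...' banner line, or None."""
    match = re.search(r"\bsmartctl (\d+)\.(\d+)\b", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def reported_working(banner):
    """True if the banner's version is at least the one reported to work here."""
    version = parse_smartctl_version(banner)
    return version is not None and version >= FIRST_VERSION_REPORTED_WORKING


if __name__ == "__main__":
    for line in (
        "smartctl 7.2 2020-12-30 r5154 [x86_64-linux-5.10.0-1-amd64] (CircleCI)",
        "smartctl 7.1 2019-12-30 r5022",
    ):
        print(parse_smartctl_version(line), reported_working(line))
```

A check like this only compares version numbers; it says nothing about firmware or kernel differences, which the thread also mentions as possible factors.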
Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
Please test recent build from https://builds.smartmontools.org/
Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
Possibly related report for Micron_2200_MTFDHBA1T0TCK with firmware P1MU003: https://www.smartmontools.org/ticket/1404
Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
PS: output of lspci -vvvs 71:00.0

71:00.0 Non-Volatile memory controller: Micron Technology Inc Device 5410 (rev 01) (prog-if 02 [NVM Express])
        Subsystem: Micron Technology Inc Device 0100
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
        Capabilities: [108 v1] Latency Tolerance Reporting
                Max snoop latency: 3145728ns
                Max no snoop latency: 3145728ns
        Capabilities: [110 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=32us PortTPowerOnTime=20us
                L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
                           T_CommonMode=0us LTR1.2_Threshold=81920ns
                L1SubCtl2: T_PwrOn=44us
        Capabilities: [200 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 0401 000f 7107 3a74fcf2
        Capabilities: [300 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
                LaneErrStat: 0
        Kernel driver in use: nvme
        Kernel modules: nvme
Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
same here – Debian sid

smartctl --version:
smartctl 7.1 2019-12-30 r5022

uname -srvmpio:
Linux 5.6.0-2-amd64 #1 SMP Debian 5.6.14-1 (2020-05-23) x86_64 unknown unknown GNU/Linux

running smartctl -a /dev/nvme0 results in consecutive messages:
[DRHD]: handling fault status reg 3
[DMA Read]: Request device [71:00.0] PASID fault addr fd26 [fault reason 06] PTE read access is not set

hope this helps a little ...
Bug#947803: smartmontools: smartctl -l error causes Micron 2200S NVME to fail
Package: smartmontools
Version: 7.0-2
Severity: normal

I've discovered that running "smartctl -l error" against my new Dell XPS 13 laptop with a Micron 2200S NVMe causes the drive to die. This obviously causes the entire system to fail, because the filesystem is no longer readable, until the power is pulled and then I can boot normally again.

The system is a Dell XPS 13 7390 tested with EFI version 1.3.1 and 1.4.0. The NVMe is a Micron 2200S NVMe 512GB with firmware version 22001030. I am on Debian unstable/sid. The problem occurs on kernel 5.4.0-1 and 5.3.0-3. smartctl --version says it's "7.0 2018-12-30 r4883 [x86_64-linux-5.3.0-3-amd64] (local build)".

I first saw the problem when running smartctl -a against the NVMe drive. Then I narrowed it down to being caused by "smartctl -l error".

When the drive dies I get repeating errors in my syslog:

kernel: DMAR: DRHD: handling fault status reg 3
kernel: DMAR: [DMA Read] Request device [71:00.0] fault addr ffe48000 [fault reason 06] PTE Read access is not set

I tried and failed to reproduce the problem on live images ubuntu-18.04.3-desktop-amd64.iso and ubuntu-19.10-desktop-amd64.iso. If my memory is correct I also booted on the old 4.19.67-2+deb10u2 image and it worked okay there too, though that kernel lacks support for this hardware in many other respects.

I sent this report to the smartmontools mailing list and Christian Franke replied, saying he had never heard of such a thing before and had no idea.

I suspect this is some kernel problem rather than something wrong with smartmontools, but that's just a guess based on the evidence I've seen. I'm reporting against smartmontools first, but you might just want to reassign.
-- System Information:
Debian Release: bullseye/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.4.0-1-amd64 (SMP w/12 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE= (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages smartmontools depends on:
ii  debianutils  4.9.1
ii  libc6        2.29-6
ii  libcap-ng0   0.7.9-2.1+b1
ii  libgcc1      1:9.2.1-21
ii  libselinux1  3.0-1
ii  libstdc++6   9.2.1-21
ii  libsystemd0  244-3
ii  lsb-base     11.1.0

smartmontools recommends no packages.

Versions of packages smartmontools suggests:
ii  bsd-mailx [mailx]  8.1.2-0.20180807cvs-1+b1
ii  curl               7.67.0-2
ii  gpg                2.2.17-3
pn  gsmartcontrol
ii  lynx               2.9.0dev.4-1
ii  mailutils [mailx]  1:3.7-2
pn  smart-notifier
ii  wget               1.20.3-1+b2

-- no debconf information
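Editorial sketch, not part of the original report: the DMAR fault lines quoted in this thread follow a recognisable pattern (operation, PCI device, fault address, fault reason). Assuming the log lines look like the two variants quoted by the reporters, a short script could pull those fields out of saved syslog text to confirm that the faults all point at the NVMe controller. The regular expression, function names, and sample text are illustrative; kernel message formats vary between versions, so the pattern may need adjusting.

```python
import re
from collections import Counter

# Matches the DMAR fault lines quoted in this thread. The optional ':' after
# '[DMA Read]' and the lazy skips cover both quoted variants (with and
# without the 'PASID' token). This is an assumption about the format, not a
# complete grammar for kernel IOMMU messages.
DMAR_FAULT = re.compile(
    r"\[DMA (?P<op>Read|Write)\]:? Request device \[(?P<device>[0-9a-fA-F:.]+)\]"
    r".*?fault addr (?P<addr>[0-9a-fA-F]+)"
    r".*?\[fault reason (?P<reason>\d+)\]"
)

# Sample copied from the original bug report.
SAMPLE_LOG = """\
kernel: DMAR: DRHD: handling fault status reg 3
kernel: DMAR: [DMA Read] Request device [71:00.0] fault addr ffe48000 [fault reason 06] PTE Read access is not set
"""


def find_dmar_faults(log_text):
    """Return one dict (op, device, addr, reason) per DMAR fault line found."""
    return [m.groupdict() for m in DMAR_FAULT.finditer(log_text)]


def count_by_device(faults):
    """Count faults per PCI device address, e.g. '71:00.0'."""
    return Counter(fault["device"] for fault in faults)


if __name__ == "__main__":
    faults = find_dmar_faults(SAMPLE_LOG)
    for device, count in count_by_device(faults).items():
        print(device, count)
```

Comparing the device address reported here with the lspci output elsewhere in the thread (71:00.0) is what ties the faults to the Micron controller.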