Bug#1041745: smartd[…]: Device: /dev/nvme0, number of Error Log entries increased from … to …

2023-08-03 Thread Al Ma
Thanks for looking into this. The solid-state memory device in question is a 
Samsung SSD 970 EVO 1TB, S/N:…, FW:2B2QEXE7, 1.00 TB. It's no longer sold on 
samsung.com but is still sold as new on Amazon (ASIN B07CGJNLBB; the website 
says it has been sold there since April 24, 2018). So yes, given the dates in 
https://en.wikipedia.org/wiki/NVM_Express#Specifications, the drive might be 
aware of the NVMe 1.2 or 1.3 specification but, again hypothetically, not 1.4 
or even 2.0. I wouldn't know how to check this, and firmware upgrades seem 
unavailable (https://semiconductor.samsung.com/consumer-storage/support/tools/ 
mentions the same version 2B2QEXE7). Just in case this helps, the motherboard 
is an Asus WS C422 PRO/SE.
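On the question of how to check the supported spec version, here is a minimal sketch rather than a definitive method. It assumes that `nvme id-ctrl /dev/nvme0` prints a `ver` line holding the controller's version field, and that the field is laid out as major (bits 31:16), minor (bits 15:8), tertiary (bits 7:0). The helper names and the sample value `0x10300` are hypothetical and are not taken from the drive discussed here. As far as I know, a value of 0 can mean the controller predates NVMe 1.2, which is when reporting this field became mandatory.

```python
def parse_ver_line(id_ctrl_output: str):
    """Return the integer from the 'ver' line of `nvme id-ctrl` text output, or None."""
    for line in id_ctrl_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "ver":
            # nvme-cli is assumed to print this field in hexadecimal, e.g. 0x10300
            return int(value.strip(), 16)
    return None


def decode_nvme_version(ver: int) -> str:
    """Split the version field into major.minor.tertiary."""
    major = (ver >> 16) & 0xFFFF
    minor = (ver >> 8) & 0xFF
    tertiary = ver & 0xFF
    return f"{major}.{minor}.{tertiary}"


# Hypothetical sample output, NOT captured from the reporter's drive.
sample = "vid       : 0x144d\nver       : 0x10300\n"
ver = parse_ver_line(sample)
print(decode_nvme_version(ver))  # prints 1.3.0
```

`smartctl -a /dev/nvme0` should also show an "NVMe Version" line for controllers that report one, which may be the quicker check.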
Here is some debugging data:
$ sudo nvme error-log -e 2 /dev/nvme0
Error Log Entries for device:nvme0 entries:2
.
Entry[ 0]
.
error_count     : 1885
sqid            : 0
cmdid           : 0x14
status_field    : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
phase_tag       : 0
parm_err_loc    : 0x
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.
Entry[ 1]
.
error_count     : 0
sqid            : 0
cmdid           : 0
status_field    : 0(Successful Completion: The command completed without error)
phase_tag       : 0
parm_err_loc    : 0
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.
$ sudo nvme list
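Node          Generic     SN               Model                    Namespace  Usage                Format       FW Rev
------------  ----------  ---------------  -----------------------  ---------  -------------------  -----------  --------
/dev/nvme0n1  /dev/ng0n1  Anonymized S.N.  Samsung SSD 970 EVO 1TB  1          135,04 GB / 1,00 TB  512 B + 0 B  2B2QEXE7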
$ sudo nvme error-log -e 2 /dev/nvme0
Error Log Entries for device:nvme0 entries:2
.
Entry[ 0]
.
error_count     : 1885
sqid            : 0
cmdid           : 0x14
status_field    : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
phase_tag       : 0
parm_err_loc    : 0x
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.
Entry[ 1]
.
error_count     : 0
sqid            : 0
cmdid           : 0
status_field    : 0(Successful Completion: The command completed without error)
phase_tag       : 0
parm_err_loc    : 0
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.
As you see, the output is slightly different from that in 
https://github.com/linux-nvme/libnvme/issues/550, and `nvme list` does not 
increase error_count (or at least not directly). If there's anything else I can 
help with, please let me know.
Gratefully,
AlMa
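The before/after comparison above can also be done mechanically. The following is only a sketch: it assumes the text layout of `nvme error-log` shown in this message (one `error_count : N` line per entry), and the function names are made up for illustration.

```python
import re

# Matches lines such as "error_count     : 1885" in `nvme error-log` text output.
ERROR_COUNT_RE = re.compile(r"^error_count\s*:\s*(\d+)\s*$", re.MULTILINE)


def max_error_count(error_log_text: str) -> int:
    """Return the highest error_count among the listed entries (0 if none)."""
    counts = [int(c) for c in ERROR_COUNT_RE.findall(error_log_text)]
    return max(counts, default=0)


def errors_added(before: str, after: str) -> int:
    """How many errors were logged between two error-log snapshots."""
    return max_error_count(after) - max_error_count(before)


# Shortened copies of the two snapshots quoted above (taken around `nvme list`).
before = "Entry[ 0]\nerror_count     : 1885\nsqid            : 0\nEntry[ 1]\nerror_count     : 0\n"
after = "Entry[ 0]\nerror_count     : 1885\nsqid            : 0\nEntry[ 1]\nerror_count     : 0\n"
print(errors_added(before, after))  # prints 0
```

A result of 0 matches the observation that `nvme list` did not add an entry on this machine.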



2023-08-02 Thread Daniel Swarbrick
This sounds quite similar to this: 
https://github.com/linux-nvme/libnvme/issues/550


Even prior to that bug, I noticed that the SMART error log counter would 
increment by one with every reboot. This was not too concerning, but 
when nvme-cli 2.x started to result in (albeit innocent) errors being 
logged each time an "nvme list" command was executed, it became an 
annoyance. As I understand it, it was due to the SSD being fairly old, 
and the firmware only supporting a fairly outdated version of the NVMe 
spec (< 1.2).


At least the _kernel_ should have fixed this, with commit 
https://github.com/torvalds/linux/commit/d7ac8dca938cd60cf7bd9a89a229a173c6bcba87


A fix for nvme-cli (via libnvme) is still being worked on, AFAIK.






2023-07-22 Thread Al Ma
Package: linux-image-6.1.0-10-amd64
Version: 6.1.38-1
Control: affects -1 src:linux smartmontools
In the journal I see a red error message of the form “smartd[…]: Device: 
/dev/nvme0, number of Error Log entries increased from n to n+1”. I think the 
number n increases on each boot. The relevant portion of the log is attached. 
Is this a software, firmware, or hardware problem? In plain English, what 
is the actual error? What should I do to avoid whatever havoc might occur?
Gratefully,
AlMa
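To confirm whether the counter really goes up by one per boot, the smartd messages can be pulled out of the journal text. This is a rough sketch only: it assumes each message sits on a single unwrapped line in the form quoted above, the function name is invented, and the numbers in the sample are made up because the real values are elided in this report.

```python
import re

# Matches the smartd message form quoted in this report.
SMARTD_RE = re.compile(
    r"smartd\[\d+\]: Device: (\S+), "
    r"number of Error Log entries increased from (\d+) to (\d+)"
)


def error_log_increments(journal_text: str):
    """Return (device, old_count, new_count) for every matching smartd line."""
    return [
        (device, int(old), int(new))
        for device, old, new in SMARTD_RE.findall(journal_text)
    ]


# Hypothetical journal line; 41 and 42 are placeholder values.
sample = (
    "Jul 22 22:35:24 AnonymizedMachineName smartd[815]: Device: /dev/nvme0, "
    "number of Error Log entries increased from 41 to 42\n"
)
for device, old, new in error_log_increments(sample):
    print(device, new - old)
```

Feeding it the journal of several boots (for example the saved output of `journalctl -t smartd`) would show whether each boot adds exactly one entry.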
Jul 22 22:35:24 AnonymizedMachineName smartd[815]: Device: /dev/nvme0, opened
Jul 22 22:35:24 AnonymizedMachineName smartd[815]: Device: /dev/nvme0, Samsung 
SSD 970 EVO 1TB, S/N:AnonymizedSerialOne, FW:2B2QEXE7, 1.00 TB
Jul 22 22:35:24 AnonymizedMachineName smartd[815]: Device: /dev/nvme0, is SMART 
capable. Adding to "monitor" list.
Jul 22 22:35:24 AnonymizedMachineName smartd[815]: Device: /dev/nvme0, state 
read from 
/var/lib/smartmontools/smartd.Samsung_SSD_970_EVO_1TB-AnonymizedSerialOne.nvme.state
Jul 22 22:35:24 AnonymizedMachineName smartd[815]: Monitoring 1 ATA/SATA, 0 
SCSI/SAS and 1 NVMe devices
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Info: Initial device: 
auto
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Info: Initial device: 
auto
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Info: 
lircd:  Opening log, level: Info
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
driver: devinput
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Notice: 
Using systemd fd
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Warning: 
Running as root
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
output: /var/run/lirc/lircd
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
nodaemon: 1
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
plugindir: /usr/lib/x86_64-linux-gnu/lirc/plugins
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
logfile: syslog
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
immediate-init: 0
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
permission: 666
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Info: 
Using remote: devinput-64.
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
driver-options:
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
pidfile: /var/run/lirc/lircd.pid
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
listen: 0
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
connect: (null)
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
userelease: 0
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
effective_user: (null)
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
release_suffix: _EVUP
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
allow_simulate: 0
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
repeat_max: 600
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
configfile: /etc/lirc/lircd.conf
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Options: 
dynamic_codes: (null)
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Current 
driver: devinput
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Driver API 
version: 4
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Driver  
version: 0.10.0
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Driver  info: 
See file:///usr/share/doc/lirc/plugindocs/devinput.html
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Info: lircd:  Opening 
log, level: Info
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: Using systemd 
fd
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Warning: Running as 
root
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Info: Using remote: 
devinput-64.
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Notice: 
/etc/lirc/lircd.conf.d/devinput.lircd.conf: devinput-64: Multiple values for 
same code: BTN_MISC
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: 
/etc/lirc/lircd.conf.d/devinput.lircd.conf: devinput-64: Multiple values for 
same code: BTN_MISC
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Notice: 
/etc/lirc/lircd.conf.d/devinput.lircd.conf: devinput-64: Multiple values for 
same code: BTN_MOUSE
Jul 22 22:35:24 AnonymizedMachineName lircd[884]: lircd-0.10.1[884]: Notice: 
/etc/lirc/lircd.conf.d/devinput.lircd.conf: devinput-64: Multiple values for 
same code: BTN_SOUTH
Jul 22 22:35:24 AnonymizedMachineName lircd-0.10.1[884]: Notice: 
/etc/lirc/lircd.conf.d/devinput.lircd.conf: devinput-64: Multiple