> On Aug 21, 2025, at 4:07 AM, Miles Goodhew <c...@m0les.com> wrote:
>
> Hi Robert,
> I'm not an expert on the low-level details and "modern" Ceph, so I hope I
> don't lead you on any wild goose chases, but I might at least give some leads.
> It seems odd that the metrics mention NVM/e - I'm guessing that it's just a
> cross-product test and tries all tools on all devices.
Recent releases of smartctl pass through stats for NVMe devices via the nvme-cli command "nvme". Whether it invokes that for all devices, in what order, etc., I don't know.

> SMART test failure is more of an issue. It's a pity the error message is so
> nondescript. Some things I can think of from simplest to most complicated are:
> * Are smartmontools installed on the drive host?

Does it happen with other drives on the same host? If your chassis vendor offers one, look for a firmware update.

> * Does the monitoring UID have sudo access?
> * Does a manual "sudo smartctl -a /dev/sdc" give the same or similar result?
> * Is the drive managed by a hardware RAID controller or concentrator (Like
> Dell PERC or a USB adapter or something)
> * (This is a stretch) Is there an OSD for the drive that's given the "NVME"
> class?
>
> Hope that gives you something.
>
> M0les.
>
>
> On Thu, 21 Aug 2025, at 17:15, Robert Sander wrote:
>> Hi,
>>
>> On a new cluster with version 19.2.3 the device health metrics only show a
>> smartctl error:
>>
>> {
>>   "20250821-000313": {
>>     "dev": "/dev/sdc",
>>     "error": "smartctl failed",
>>     "nvme_smart_health_information_add_log_error": "nvme returned an error: sudo: exit status: 1",
>>     "nvme_smart_health_information_add_log_error_code": -22,
>>     "nvme_vendor": "ata",
>>     "smartctl_error_code": -22,
>>     "smartctl_output": "smartctl returned an error (1): stderr:\nsudo: exit status: 1\nstdout:\n"
>>   }
>> }
>>
>> The device in question (like all the others in the cluster) is a Samsung
>> MZ7L37T6 SATA SSD.
>>
>> What is happening here?
>>
>> Regards
>> --
>> Robert Sander
>> Linux Consultant
>>
>> Heinlein Consulting GmbH
>> Schwedter Str. 8/9b, 10119 Berlin
>>
>> https://www.heinlein-support.de
>>
>> Tel: +49 30 405051 - 0
>> Fax: +49 30 405051 - 19
>>
>> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
>> Geschäftsführer: Peer Heinlein - Sitz: Berlin
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
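
To compare the result across the other drives on the same host, a small script can pull out the failed scrapes from the health-metrics JSON. This is only a rough sketch: the function name `failed_scrapes` is made up for illustration, the sample is the record quoted in this thread, and treating a "sudo:" prefix in `smartctl_output` as a sign that the sudo wrapper (rather than smartctl itself) reported the failure is my assumption, not documented Ceph behaviour.

```python
import json

# Sample record copied from the report in this thread.
SAMPLE = """
{
  "20250821-000313": {
    "dev": "/dev/sdc",
    "error": "smartctl failed",
    "nvme_smart_health_information_add_log_error": "nvme returned an error: sudo: exit status: 1",
    "nvme_smart_health_information_add_log_error_code": -22,
    "nvme_vendor": "ata",
    "smartctl_error_code": -22,
    "smartctl_output": "smartctl returned an error (1): stderr:\\nsudo: exit status: 1\\nstdout:\\n"
  }
}
"""


def failed_scrapes(metrics_json):
    """Return (timestamp, device, sudo_mentioned) for each scrape that has an "error" key."""
    results = []
    for stamp, entry in sorted(json.loads(metrics_json).items()):
        if "error" not in entry:
            continue  # this scrape succeeded, nothing to report
        output = entry.get("smartctl_output", "")
        # Heuristic (assumption): "sudo:" in the captured stderr points at the
        # sudo wrapper rather than at smartctl or the drive.
        results.append((stamp, entry.get("dev"), "sudo:" in output))
    return results


for stamp, dev, sudo_mentioned in failed_scrapes(SAMPLE):
    hint = "stderr mentions sudo" if sudo_mentioned else "see smartctl_output"
    print(f"{stamp} {dev}: {hint}")
```

If every device on the host shows the same sudo-prefixed message, that would point toward the sudo or smartmontools setup on the host rather than at one particular drive.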