Package: smartmontools
Version: 6.3+svn4002-2
When booting my Raspberry Pi (running Raspbian, an unofficial Debian
Jessie port), I *sometimes* find that smartd fails to pick up an
external USB-attached drive. According to syslog, the reason appears
to be that smartd is often started during boot before the kernel has
detected the drive, so the /dev/disk/by-id/ symlink does not (yet)
exist:
Jul 20 18:59:17 backup systemd[1]: Starting Self Monitoring and
Reporting Technology (SMART) Daemon...
Jul 20 18:59:17 backup systemd[1]: Started Self Monitoring and Reporting
Technology (SMART) Daemon.
Jul 20 18:59:17 backup smartd[403]: smartd 6.4 2014-10-07 r4002
[armv7l-linux-4.4.13-v7+] (local build)
Jul 20 18:59:17 backup smartd[403]: Copyright (C) 2002-14, Bruce Allen,
Christian Franke, www.smartmontools.org
Jul 20 18:59:17 backup smartd[403]: Opened configuration file
/etc/smartd.conf
Jul 20 18:59:17 backup smartd[403]: Configuration file /etc/smartd.conf
parsed.
Jul 20 18:59:17 backup smartd[403]: Device:
/dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R [SAT], open()
failed: No such device
Jul 20 18:59:17 backup smartd[403]: Unable to monitor any SMART enabled
devices. Try debug (-d) option. Exiting...
Jul 20 18:59:17 backup systemd[1]: smartd.service: main process exited,
code=exited, status=17/n/a
Jul 20 18:59:17 backup systemd[1]: Unit smartd.service entered failed
state.
|
<unrelated logging snipped>
|
Jul 20 18:59:21 backup kernel: [ 9.092426] usb 1-1.2: New USB device
found, idVendor=174c, idProduct=1053
Jul 20 18:59:21 backup kernel: [ 9.092439] usb 1-1.2: New USB device
strings: Mfr=2, Product=3, SerialNumber=1
Jul 20 18:59:21 backup kernel: [ 9.092446] usb 1-1.2: Product: USB3.0
Device
Jul 20 18:59:21 backup kernel: [ 9.092452] usb 1-1.2: Manufacturer:
Generic
Jul 20 18:59:21 backup kernel: [ 9.092458] usb 1-1.2: SerialNumber:
AC0000000001
Jul 20 18:59:21 backup kernel: [ 9.093049] usb-storage 1-1.2:1.0: USB
Mass Storage device detected
Jul 20 18:59:21 backup kernel: [ 9.094618] scsi host0: usb-storage
1-1.2:1.0
Jul 20 18:59:22 backup kernel: [ 10.091572] scsi 0:0:0:0:
Direct-Access ASMT 2105 0 PQ: 0 ANSI: 6
Jul 20 18:59:22 backup kernel: [ 10.096115] sd 0:0:0:0: [sda] Very big
device. Trying to use READ CAPACITY(16).
Jul 20 18:59:22 backup kernel: [ 10.099979] sd 0:0:0:0: [sda]
5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
Jul 20 18:59:22 backup kernel: [ 10.099995] sd 0:0:0:0: [sda]
4096-byte physical blocks
Jul 20 18:59:22 backup kernel: [ 10.104503] sd 0:0:0:0: Attached scsi
generic sg0 type 0
Jul 20 18:59:22 backup kernel: [ 10.104857] sd 0:0:0:0: [sda] Write
Protect is off
Jul 20 18:59:22 backup kernel: [ 10.104873] sd 0:0:0:0: [sda] Mode
Sense: 43 00 00 00
Jul 20 18:59:22 backup kernel: [ 10.106730] sd 0:0:0:0: [sda] Write
cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jul 20 18:59:22 backup kernel: [ 10.110461] sd 0:0:0:0: [sda] Very big
device. Trying to use READ CAPACITY(16).
Jul 20 18:59:22 backup kernel: [ 10.157535] sda: sda1 sda2 sda3
Jul 20 18:59:22 backup kernel: [ 10.161101] sd 0:0:0:0: [sda] Very big
device. Trying to use READ CAPACITY(16).
Jul 20 18:59:22 backup kernel: [ 10.163033] sd 0:0:0:0: [sda] Attached
SCSI disk
In case it is relevant, here is my smartd.conf:
/dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R -d sat -n standby -a -o on -S on -s (S/../.././03|L/../../7/04) -r 194 -I 194 -W 5,40,45
I am using the /dev/disk/by-id/ reference as I sometimes swap disks
around and each has different smartd configuration parameters. Following
boot completion the symlink does indeed exist:
# ls -l /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R
lrwxrwxrwx 1 root root 9 Jul 20 18:59
/dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4N6FVL47R -> ../../sda
...and if I then start smartmontools manually everything is fine.
More often than not it works without intervention, i.e. the disk is
picked up before smartd starts, but as seen above the order can be
reversed by a few seconds, which ends in a non-obvious failure (i.e.
I only spot it if I check the logs). Should/could there be anything to
prevent this race condition?
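To illustrate the kind of guard meant here, below is a minimal sketch of a
wait loop that could run before smartd is started. It is only a stopgap
idea under stated assumptions: the function name wait_for_path and the
default 30-second timeout are hypothetical and not something smartmontools
or Debian provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools): block until PATH exists
# or TIMEOUT seconds have elapsed.
# Usage: wait_for_path PATH [TIMEOUT_SECONDS]
# Returns 0 if PATH appeared in time, 1 otherwise.
wait_for_path() {
    path=$1
    timeout=${2:-30}   # assumed default; tune to the slowest observed boot
    elapsed=0
    # -e follows symlinks, so a by-id link counts only once its target exists
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Calling wait_for_path with the /dev/disk/by-id/ path from smartd.conf
before smartd starts would cover the few seconds seen in the log above.
Where exactly to hook it into the boot sequence is left open.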
Perhaps it is a Raspbian-specific issue? If so, apologies for this
misdirected report.
Regards,
Mathew