Re: storage error

2020-12-15 Thread Russell Coker via luv-main
On Wednesday, 16 December 2020 1:20:38 AM AEDT Craig Sanders via luv-main 
wrote:
> But why rely on a guess when the obvious thing to do is to test it?
> 
> 1. Try the M.2 device in another machine

According to some Google searches, the X1 Carbon Gen 1 that I have uses a 
non-standard connector, so the device won't work in another machine, and 
getting another device for it will be difficult and maybe expensive.

> If you don't have another motherboard with M.2 slots free, you can get
> reasonably priced PCI-e adaptors that can take anywhere from 1 M.2 drive
> (using 4 PCI-e lanes) to 4 M.2 drives (using all 16 PCI-e lanes).
> 
> These are a useful thing to have around, so it wouldn't be a one-use waste
> of money.

I've got an M.2 to SATA adapter already.  But it wouldn't work with the 
ThinkPad device.

https://forums.lenovo.com/t5/ThinkPad-X-Series-Laptops/X1-Carbon-Model-3443-SSD-interface-mSATA-M-2-etc/m-p/2031869

Here's the information on the ThinkPad X1 Carbon Gen 1 that I have: a strange 
small SATA connector that looks like M.2 but isn't.  There are adapters, but 
fitting an adapter and a regular M.2 card in there will be difficult.

A new storage device for this laptop will probably cost $100US (or $40AU if I 
get a smaller one).

Jason King replied off-list to suggest that other people have correlated these 
error messages with cable or controller issues (which in this case means the 
motherboard).

I think I'll keep running it as it is until something dies properly.  Then 
I'll run it with a USB stick for booting and the built-in SD slot for the root 
filesystem until I can get a good deal on a replacement.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/





Re: storage error

2020-12-15 Thread Craig Sanders via luv-main
On Tue, Dec 15, 2020 at 06:48:32PM +1100, Russell Coker wrote:
> How likely is the following error (which happens periodically) to be on the
> M.2 SATA device and how likely is it to be on the motherboard?

My guess would be that it's most likely the M.2 SATA device... because, in my
experience, drives suck and die a lot - which is why I'll never use less than
RAID-1 (or equivalent, such as mirrored drives in ZFS).
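
(As an aside, not from the original mail: setting up that sort of mirror is a
one-liner with either Linux md or ZFS.  A rough sketch, with /dev/sdX and
/dev/sdY standing in for two hypothetical spare drives:)

  # Linux md RAID-1 across two partitions (hypothetical device names):
  $ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX1 /dev/sdY1
  # ZFS equivalent; "tank" is just an example pool name:
  $ zpool create tank mirror /dev/sdX /dev/sdY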

OTOH, while I've had LOTS of mechanical hard drives die on me over the years,
I've only ever had one SSD die (and even that died "gracefully" - it could
still be read, but writes failed).  SSDs are, IME, a lot more reliable than
spinning rust.


But why rely on a guess when the obvious thing to do is to test it?

1. Try the M.2 device in another machine

If you don't have another motherboard with M.2 slots free, you can get
reasonably priced PCI-e adaptors that can take anywhere from 1 M.2 drive
(using 4 PCI-e lanes) to 4 M.2 drives (using all 16 PCI-e lanes).

These are a useful thing to have around, so it wouldn't be a one-use waste of
money.


2. Try another M.2 device in the motherboard.

The cheapest M.2 drive available now is around $30 for 120GB.  e.g.

$ cplgrep -p m.2 | head -n1
32  Kingston SA400M8/120G A400 M.2 SSD 120GB

You are, IMO, better off just buying another M.2 the same size or larger (if
it turns out to be the drive that's failing, you can immediately use it as
a replacement; otherwise, you've got a spare, or a drive to use in another
machine).
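
(A non-destructive check worth doing before swapping any hardware: the drive's
own SMART data.  A minimal sketch, assuming smartmontools is installed and the
device really is /dev/sda - high CRC counts in the SATA PHY event log tend to
point at the link/controller side rather than the drive itself:)

  $ smartctl -a /dev/sda          # overall health, attributes and error log
  $ smartctl -l sataphy /dev/sda  # SATA PHY event counters (CRC/link errors)
  $ smartctl -t short /dev/sda    # queue a short self-test, re-check with -a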



BTW, if your motherboard supports it, get M.2 NVME rather than M.2 SATA -
there's very little difference in price, and the NVME will be around 4 to 6
times faster - depending on brand and model, from ~2500 MB/s up to ~3500 MB/s
for PCI-e 3.0 NVME vs ~550 MB/s for SATA.

For PCI-e 4.0 NVME, it could theoretically get up to nearly 8 GB/s (less
protocol overhead), but current models max out around 5.5 or 6 GB/s.

PCI-e 5.0 will double that again in a year or three if SSD speeds keep up with
PCI-e bus speeds.
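
(For what it's worth, the "nearly 8 GB/s" figure is just the raw PCI-e x4
ceiling.  A back-of-the-envelope check, assuming 128b/130b encoding and
ignoring NVME protocol overhead - the real-world ~3500 MB/s and ~5.5-6 GB/s
figures above sit below these ceilings because of that overhead:)

  $ python3 -c "print(round(4 *  8e9 * 128/130 / 8 / 1e9, 2))"  # PCI-e 3.0 x4, GB/s
  3.94
  $ python3 -c "print(round(4 * 16e9 * 128/130 / 8 / 1e9, 2))"  # PCI-e 4.0 x4, GB/s
  7.88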

craig

--
craig sanders 



Re: storage error

2020-12-15 Thread Russell Coker via luv-main
In the logs below there are some errors on sdb; those aren't related to this 
issue.  sdb is a USB-attached device with a known-faulty disk that I removed 
the cover from and ran open to the air for fun.

Only the sda errors are the ones that matter.

On Tuesday, 15 December 2020 6:48:32 PM AEDT Russell Coker wrote:
> How likely is the following error (which happens periodically) to be on the
> M.2 SATA device and how likely is it to be on the motherboard?  If it's on
> the SATA device I can replace that, if it's the motherboard I just need to
> put up with periodic hangs and keep good backups (a new motherboard costs
> more than the value of the laptop).
> 
> [315041.837612] ata1.00: status: { DRDY }
> [315041.837613] ata1.00: failed command: WRITE FPDMA QUEUED
> [315041.837616] ata1.00: cmd 61/20:48:28:1e:3e/00:00:00:00:00/40 tag 9 ncq
> dma 16384 out
>  res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4
> (timeout)
> [315041.837617] ata1.00: status: { DRDY }
> [315041.837618] ata1.00: failed command: READ FPDMA QUEUED
> [315041.837621] ata1.00: cmd 60/08:50:e0:26:84/00:00:00:00:00/40 tag 10 ncq
> dma 4096 in
>  res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4
> (timeout)
> [315041.837622] ata1.00: status: { DRDY }
> [315041.837625] ata1: hard resetting link
> [315042.151781] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [315042.163368] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
> succeeded
> [315042.163370] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
> LOCK) filtered out
> [315042.163372] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
> filtered out
> [315042.183332] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES)
> succeeded
> [315042.183334] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE
> LOCK) filtered out
> [315042.183336] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES)
> filtered out
> [315042.193332] ata1.00: configured for UDMA/133
> [315042.193789] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [315042.193791] sd 0:0:0:0: [sda] tag#10 Sense Key : Illegal Request
> [current] [315042.193793] sd 0:0:0:0: [sda] tag#10 Add. Sense: Unaligned
> write command [315042.193795] sd 0:0:0:0: [sda] tag#10 CDB: Read(10) 28 00
> 00 84 26 e0 00 00 08 00
> [315042.193797] print_req_error: I/O error, dev sda, sector 8660704
> [315042.193810] ata1: EH complete
> 
> 
> I'm getting the errors on a wide selection of somewhat random sectors (that
> are all divisible by 8).
> 
> Dec 14 17:40:09 liv kernel: [297451.401459] print_req_error: I/O error, dev
> sdb, sector 2073200
> Dec 14 17:40:29 liv kernel: [297471.674024] print_req_error: I/O error, dev
> sdb, sector 2931718768
> Dec 14 17:40:29 liv kernel: [297471.674295] print_req_error: I/O error, dev
> sdb, sector 1226653088
> Dec 14 17:40:29 liv kernel: [297471.674315] print_req_error: I/O error, dev
> sdb, sector 4156298656
> Dec 14 22:33:20 liv kernel: [315042.193797] print_req_error: I/O error, dev
> sda, sector 8660704
> Dec 11 17:42:23 liv kernel: [182970.726875] print_req_error: I/O error, dev
> sda, sector 42147264
> Nov 30 22:43:31 liv kernel: [399074.746393] print_req_error: I/O error, dev
> sda, sector 231758600
> Nov 26 00:55:16 liv kernel: [212647.753370] print_req_error: I/O error, dev
> sda, sector 23865952
> Nov 26 00:55:16 liv kernel: [212647.753420] print_req_error: I/O error, dev
> sda, sector 23870056
> Nov 26 00:55:16 liv kernel: [212647.753444] print_req_error: I/O error, dev
> sda, sector 5989744
> Nov 26 00:55:16 liv kernel: [212647.753463] print_req_error: I/O error, dev
> sda, sector 6127336
> Nov 26 00:55:16 liv kernel: [212647.753481] print_req_error: I/O error, dev
> sda, sector 8780056
> Nov 26 00:55:16 liv kernel: [212647.753499] print_req_error: I/O error, dev
> sda, sector 9435856
> Nov 26 00:55:16 liv kernel: [212647.753526] print_req_error: I/O error, dev
> sda, sector 9622096
> Nov 26 00:55:16 liv kernel: [212647.753533] print_req_error: I/O error, dev
> sda, sector 9697912
> Nov 26 00:55:16 liv kernel: [212647.753543] print_req_error: I/O error, dev
> sda, sector 9874752
> Nov 26 00:55:16 liv kernel: [212647.753551] print_req_error: I/O error, dev
> sda, sector 9897224
> Nov 27 01:35:55 liv kernel: [240255.929450] print_req_error: 22 callbacks
> suppressed
> Nov 27 01:35:55 liv kernel: [240255.929453] print_req_error: I/O error, dev
> sda, sector 43704312
> Nov 27 01:35:55 liv kernel: [240255.929524] print_req_error: I/O error, dev
> sda, sector 43704432
> Nov 27 01:35:55 liv kernel: [240255.929557] print_req_error: I/O error, dev
> sda, sector 3902144
> Nov 27 01:35:55 liv kernel: [240255.929589] print_req_error: I/O error, dev
> sda, sector 26654472
> Nov 27 01:35:55 liv kernel: [240255.929615] print_req_error: I/O error, dev
> sda, sector 35040632
> Nov 27 01:35:55 liv kernel: [240255.929642] print_req_error: I/O error, dev
> sda, sector 38449048
> Nov 27 01:35:55 liv kernel: 

storage error

2020-12-14 Thread Russell Coker via luv-main
How likely is the following error (which happens periodically) to be on the 
M.2 SATA device and how likely is it to be on the motherboard?  If it's on the 
SATA device I can replace that, if it's the motherboard I just need to put up 
with periodic hangs and keep good backups (a new motherboard costs more than 
the value of the laptop).

[315041.837612] ata1.00: status: { DRDY }
[315041.837613] ata1.00: failed command: WRITE FPDMA QUEUED
[315041.837616] ata1.00: cmd 61/20:48:28:1e:3e/00:00:00:00:00/40 tag 9 ncq dma 
16384 out
 res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4 
(timeout)
[315041.837617] ata1.00: status: { DRDY }
[315041.837618] ata1.00: failed command: READ FPDMA QUEUED
[315041.837621] ata1.00: cmd 60/08:50:e0:26:84/00:00:00:00:00/40 tag 10 ncq 
dma 4096 in
 res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4 
(timeout)
[315041.837622] ata1.00: status: { DRDY }
[315041.837625] ata1: hard resetting link
[315042.151781] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[315042.163368] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) 
succeeded
[315042.163370] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) 
filtered out
[315042.163372] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered 
out
[315042.183332] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) 
succeeded
[315042.183334] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) 
filtered out
[315042.183336] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered 
out
[315042.193332] ata1.00: configured for UDMA/133
[315042.193789] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
[315042.193791] sd 0:0:0:0: [sda] tag#10 Sense Key : Illegal Request [current] 
[315042.193793] sd 0:0:0:0: [sda] tag#10 Add. Sense: Unaligned write command
[315042.193795] sd 0:0:0:0: [sda] tag#10 CDB: Read(10) 28 00 00 84 26 e0 00 00 
08 00
[315042.193797] print_req_error: I/O error, dev sda, sector 8660704
[315042.193810] ata1: EH complete


I'm getting the errors on a wide selection of somewhat random sectors (that 
are all divisible by 8).
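
(A quick sanity check on the divisible-by-8 observation: with 512-byte logical
sectors, 8 sectors = 4096 bytes, so the failing requests all start on 4 KiB
boundaries, i.e. they line up with 4 KiB filesystem blocks.  Using the sda
sector from the trace above:)

  $ python3 -c "print(8660704 % 8, 8660704 // 8)"  # remainder, 4 KiB block number
  0 1082588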

Dec 14 17:40:09 liv kernel: [297451.401459] print_req_error: I/O error, dev 
sdb, sector 2073200
Dec 14 17:40:29 liv kernel: [297471.674024] print_req_error: I/O error, dev 
sdb, sector 2931718768
Dec 14 17:40:29 liv kernel: [297471.674295] print_req_error: I/O error, dev 
sdb, sector 1226653088
Dec 14 17:40:29 liv kernel: [297471.674315] print_req_error: I/O error, dev 
sdb, sector 4156298656
Dec 14 22:33:20 liv kernel: [315042.193797] print_req_error: I/O error, dev 
sda, sector 8660704
Dec 11 17:42:23 liv kernel: [182970.726875] print_req_error: I/O error, dev 
sda, sector 42147264
Nov 30 22:43:31 liv kernel: [399074.746393] print_req_error: I/O error, dev 
sda, sector 231758600
Nov 26 00:55:16 liv kernel: [212647.753370] print_req_error: I/O error, dev 
sda, sector 23865952
Nov 26 00:55:16 liv kernel: [212647.753420] print_req_error: I/O error, dev 
sda, sector 23870056
Nov 26 00:55:16 liv kernel: [212647.753444] print_req_error: I/O error, dev 
sda, sector 5989744
Nov 26 00:55:16 liv kernel: [212647.753463] print_req_error: I/O error, dev 
sda, sector 6127336
Nov 26 00:55:16 liv kernel: [212647.753481] print_req_error: I/O error, dev 
sda, sector 8780056
Nov 26 00:55:16 liv kernel: [212647.753499] print_req_error: I/O error, dev 
sda, sector 9435856
Nov 26 00:55:16 liv kernel: [212647.753526] print_req_error: I/O error, dev 
sda, sector 9622096
Nov 26 00:55:16 liv kernel: [212647.753533] print_req_error: I/O error, dev 
sda, sector 9697912
Nov 26 00:55:16 liv kernel: [212647.753543] print_req_error: I/O error, dev 
sda, sector 9874752
Nov 26 00:55:16 liv kernel: [212647.753551] print_req_error: I/O error, dev 
sda, sector 9897224
Nov 27 01:35:55 liv kernel: [240255.929450] print_req_error: 22 callbacks 
suppressed
Nov 27 01:35:55 liv kernel: [240255.929453] print_req_error: I/O error, dev 
sda, sector 43704312
Nov 27 01:35:55 liv kernel: [240255.929524] print_req_error: I/O error, dev 
sda, sector 43704432
Nov 27 01:35:55 liv kernel: [240255.929557] print_req_error: I/O error, dev 
sda, sector 3902144
Nov 27 01:35:55 liv kernel: [240255.929589] print_req_error: I/O error, dev 
sda, sector 26654472
Nov 27 01:35:55 liv kernel: [240255.929615] print_req_error: I/O error, dev 
sda, sector 35040632
Nov 27 01:35:55 liv kernel: [240255.929642] print_req_error: I/O error, dev 
sda, sector 38449048
Nov 27 01:35:55 liv kernel: [240255.929667] print_req_error: I/O error, dev 
sda, sector 44228320
Nov 27 01:35:55 liv kernel: [240255.929700] print_req_error: I/O error, dev 
sda, sector 43699720
Nov 27 01:35:55 liv kernel: [240255.929745] print_req_error: I/O error, dev 
sda, sector 43701688
Nov 27 01:35:55 liv kernel: [240255.929772] print_req_error: I/O error, dev 
sda, sector 43809896
Nov 28 14:37:09 liv kernel: [277047.828152] print_req_error: 20 callbacks