Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-17 Thread gene heskett
On 2/17/24 13:45, Roy J. Tellason, Sr. wrote: On Friday 16 February 2024 04:42:12 pm Gremlin wrote: On 2/16/24 13:56, Roy J. Tellason, Sr. wrote: On Friday 16 February 2024 04:52:22 am David Christensen wrote: I think the Raspberry Pi, etc., users on this list live with USB storage and have

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-17 Thread Roy J. Tellason, Sr.
On Friday 16 February 2024 04:42:12 pm Gremlin wrote: > On 2/16/24 13:56, Roy J. Tellason, Sr. wrote: > > On Friday 16 February 2024 04:52:22 am David Christensen wrote: > >> I think the Raspberry Pi, etc., users on this list live with USB storage > >> and have found it to be reliable enough for

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-16 Thread Gremlin
On 2/16/24 13:56, Roy J. Tellason, Sr. wrote: On Friday 16 February 2024 04:52:22 am David Christensen wrote: I think the Raspberry Pi, etc., users on this list live with USB storage and have found it to be reliable enough for personal and SOHO network use. I have one, haven't done much

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-16 Thread David Christensen
On 2/16/24 10:56, Roy J. Tellason, Sr. wrote: On Friday 16 February 2024 04:52:22 am David Christensen wrote: I think the Raspberry Pi, etc., users on this list live with USB storage and have found it to be reliable enough for personal and SOHO network use. I have one, haven't done much

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-16 Thread Roy J. Tellason, Sr.
On Friday 16 February 2024 04:52:22 am David Christensen wrote: > I think the Raspberry Pi, etc., users on this list live with USB storage > and have found it to be reliable enough for personal and SOHO network use. I have one, haven't done much with it. Are there any alternative ways to

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-16 Thread David Christensen
On 2/15/24 07:41, The Wanderer wrote: On 2024-02-15 at 03:09, David Christensen wrote: On 2/14/24 18:54, The Wanderer wrote: On 2024-01-09 at 14:22, The Wanderer wrote: On 2024-01-09 at 14:01, Michael Kjörling wrote: On 9 Jan 2024 13:25 -0500, from The Wanderer I've ordered a 22TB external

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread Michael Kjörling
On 15 Feb 2024 10:41 -0500, from wande...@fastmail.fm (The Wanderer): >> 65,000 hard links seems to be an ext4 limit: >> >> https://www.linuxquestions.org/questions/linux-kernel-70/max-hard-link-per-file-on-ext4-4175454538/#post4914624 > > That sounds right. > >> I believe ZFS can do more hard

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread The Wanderer
On 2024-02-15 at 01:18, songbird wrote: > The Wanderer wrote: > >> TL;DR: It worked! I'm back up and running, with what appears to be >> all my data safely recovered from the failing storage stack! > > i'm glad you got it back up and running and i hope all your data is > intact. :) Thank you.

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread The Wanderer
On 2024-01-11 at 15:25, Stefan Monnier wrote: >> manufacturers in different memory banks, but since it's always >> possible to power down, replace or just remove memory, and power up >> again, > > Hmm... "always"? What about long running computations like that > simulation (or LLM training)

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread The Wanderer
On 2024-02-15 at 03:09, David Christensen wrote: > On 2/14/24 18:54, The Wanderer wrote: > >> TL;DR: It worked! I'm back up and running, with what appears to be >> all my data safely recovered from the failing storage stack! > > That is good to hear. :-) > >> On 2024-01-09 at 14:22, The

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread The Wanderer
On 2024-02-15 at 07:14, debian-u...@howorth.org.uk wrote: > The Wanderer wrote: > >> It turns out that there is a hard limit of 65000 hardlinks per >> on-disk file; > > That's a filesystem dependent value. That's the value for ext4. I think I recall reading that while I was flailing over

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread debian-user
The Wanderer wrote: > It turns out that there is a hard limit of 65000 > hardlinks per on-disk file; That's a filesystem dependent value. That's the value for ext4. XFS has a much larger limit I believe. As well as some other helpful properties for large filesystems. btrfs has different

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-15 Thread David Christensen
On 2/14/24 18:54, The Wanderer wrote: TL;DR: It worked! I'm back up and running, with what appears to be all my data safely recovered from the failing storage stack! That is good to hear. :-) On 2024-01-09 at 14:22, The Wanderer wrote: On 2024-01-09 at 14:01, Michael Kjörling wrote:

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-14 Thread songbird
The Wanderer wrote: > TL;DR: It worked! I'm back up and running, with what appears to be all > my data safely recovered from the failing storage stack! ... i'm glad you got it back up and running and i hope all your data is intact. :) which SSDs did you use? songbird

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-02-14 Thread The Wanderer
TL;DR: It worked! I'm back up and running, with what appears to be all my data safely recovered from the failing storage stack! On 2024-01-09 at 14:22, The Wanderer wrote: > On 2024-01-09 at 14:01, Michael Kjörling wrote: > >> On 9 Jan 2024 13:25 -0500, from wande...@fastmail.fm (The >>

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-12 Thread Dan Ritter
Stefan Monnier wrote: > > manufacturers in different memory banks, but since it's always > > possible to power down, replace or just remove memory, and power > > up again, > > Hmm... "always"? What about long running computations like that > simulation (or LLM training) launched a month ago and

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-11 Thread Michael Stone
On Thu, Jan 11, 2024 at 03:25:51PM -0500, Stefan Monnier wrote: manufacturers in different memory banks, but since it's always possible to power down, replace or just remove memory, and power up again, Hmm... "always"? What about long running computations like that simulation (or LLM

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-11 Thread Stefan Monnier
> manufacturers in different memory banks, but since it's always > possible to power down, replace or just remove memory, and power > up again, Hmm... "always"? What about long running computations like that simulation (or LLM training) launched a month ago and that's expected to finish in

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-11 Thread Dan Ritter
David Christensen wrote: > On 1/11/24 05:50, Dan Ritter wrote: > > David Christensen wrote: > STFW the Dell PowerEdge 6850 (circa 2004) featured "hot plug" disk drives, > expansion slots, memory risers, power supplies, and system cooling fans: > >

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-11 Thread David Christensen
On 1/11/24 05:50, Dan Ritter wrote: David Christensen wrote: dual network interfaces, and dual power supplies come to mind. I am unclear about dual processors and/or dual memory banks. There are no systems that I'm aware of which allow you to use 2 or more processors of different models;

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-11 Thread Dan Ritter
David Christensen wrote: > On 1/10/24 09:07, Curt wrote: > > On 2024-01-10, David Christensen wrote: > > dual network interfaces, and dual power supplies come to mind. I am unclear > about dual processors and/or dual memory banks. Moving beyond one computer, There are no systems that I'm

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread David Christensen
On 1/10/24 09:30, Michael Kjörling wrote: My understanding is that it's even relatively common, at least for flight-critical components, to use totally different implementations (of both hardware and software), not just sourced from different vendors, resellers or batches, such that the same

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread David Christensen
On 1/10/24 09:07, Curt wrote: On 2024-01-10, David Christensen wrote: Given the OP's situation -- 8 consumer SSDs, same make and model, possibly from a defective manufacturing batch, all purchased at the same time, all deployed in the same RAID-6, all run 2.5 years 24x7, and all suddenly

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread Dan Ritter
Curt wrote: > On 2024-01-10, David Christensen wrote: > > > > > > Given the OP's situation -- 8 consumer SSDs, same make and model, > > possibly from a defective manufacturing batch, all purchased at the same > > time, all deployed in the same RAID-6, all run 2.5 years 24x7, and all > >

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread Michael Kjörling
On 10 Jan 2024 17:07 -, from cu...@free.fr (Curt): > It's curious, but I just heard something on French TV from a journalist > that's relevant to this. She said she'd covered the aeronautics field in > the past and mentioned the *principe de dissemblance* (dissimilarity > principle). Critical

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread Curt
On 2024-01-10, David Christensen wrote: > > > Given the OP's situation -- 8 consumer SSDs, same make and model, > possibly from a defective manufacturing batch, all purchased at the same > time, all deployed in the same RAID-6, all run 2.5 years 24x7, and all > suddenly showing lots of SMART

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread David Christensen
On 1/10/24 01:35, Michael Kjörling wrote: On 9 Jan 2024 14:34 -0800, from dpchr...@holgerdanske.com (David Christensen): I don't know how to interpret the "Pre-fail" notation for the other attributes. AIUI "Pre-fail" indicates the drive is going to fail soon and should be replaced. Only if

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-10 Thread Michael Kjörling
On 9 Jan 2024 14:34 -0800, from dpchr...@holgerdanske.com (David Christensen): >> I don't know how to interpret the "Pre-fail" notation for the other >> attributes. > > AIUI "Pre-fail" indicates the drive is going to fail soon and should be > replaced. Only if the attribute hits the "failure"

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread David Christensen
On 1/9/24 14:34, David Christensen wrote:
> You can always run smartctl manual ...
Correction: manually
> To get protection against two-device failure, you need 3-day mirrors ...
Correction: 3-way
> Perhaps other readers with madm, ...
Correction: mdadm
David

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread David Christensen
On 1/9/24 05:11, The Wanderer wrote: I have an eight-drive RAID-6 array of 2TB SSDs, built back in early-to-mid 2021. Within the past few weeks, I got root-mail notifications from smartd that the ATA error count on two of the drives had increased ... On Sunday (two days ago), I got

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread The Wanderer
On 2024-01-09 at 14:01, Michael Kjörling wrote: > On 9 Jan 2024 13:25 -0500, from wande...@fastmail.fm (The Wanderer): > Within the past few weeks, I got root-mail notifications from smartd that the ATA error count on two of the drives had increased - one from 0 to a fairly low

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread Michael Kjörling
On 9 Jan 2024 13:25 -0500, from wande...@fastmail.fm (The Wanderer): >>> Within the past few weeks, I got root-mail notifications from >>> smartd that the ATA error count on two of the drives had increased >>> - one from 0 to a fairly low value (I think between 10 and 20), the >>> other from 0 to

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread Michael Kjörling
On 9 Jan 2024 10:21 -0500, from wande...@fastmail.fm (The Wanderer): >>> Model Family: Samsung based SSDs >>> Device Model: Samsung SSD 870 EVO 2TB >> >> These may or may not be under warranty, > > I would be surprised if there were warranty coverage at > this point, but might look a bit

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread The Wanderer
On 2024-01-09 at 11:21, Michael Kjörling wrote: > On 9 Jan 2024 08:11 -0500, from wande...@fastmail.fm (The Wanderer): > >> Within the past few weeks, I got root-mail notifications from >> smartd that the ATA error count on two of the drives had increased >> - one from 0 to a fairly low value (I

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread The Wanderer
On 2024-01-09 at 11:12, Curt wrote: > On 2024-01-09, The Wanderer wrote: > >> My default plan is to identify an appropriate model and buy a pair >> of replacement drives, but not install them yet; buy another two >> drives every six months, until I have a full replacement set; and >> start

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread Michael Kjörling
On 9 Jan 2024 08:11 -0500, from wande...@fastmail.fm (The Wanderer): > Within the past few weeks, I got root-mail notifications from smartd > that the ATA error count on two of the drives had increased - one from 0 > to a fairly low value (I think between 10 and 20), the other from 0 to > 1. I

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread Curt
On 2024-01-09, The Wanderer wrote: > > My default plan is to identify an appropriate model and buy a pair of > replacement drives, but not install them yet; buy another two drives > every six months, until I have a full replacement set; and start failing > drives out of the RAID array and

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread The Wanderer
On 2024-01-09 at 09:38, Dan Ritter wrote: > The Wanderer wrote: > >> So... as the Subject asks, should I be worried? How do I interpret >> these results, and at what point do they start to reflect something >> to take action over? If there is no reason to be worried, what >> *do* these alerts

Re: SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread Dan Ritter
The Wanderer wrote: > So... as the Subject asks, should I be worried? How do I interpret these > results, and at what point do they start to reflect something to take > action over? If there is no reason to be worried, what *do* these > alerts indicate, and at what point *should* I start to be

SMART Uncorrectable_Error_Cnt rising - should I be worried?

2024-01-09 Thread The Wanderer
This is not directly Debian-related, except insofar as the system involved is running Debian, but we've already had a somewhat similar thread recently and this forum is as likely as any I'm aware of to have people who might have the experience to address the question(s). I would be open to