Re: [gentoo-user] smartctrl drive error @60%
On Tuesday, July 01, 2014 06:52:10 AM Mick wrote: On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote: On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote: What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty... As you already said, nobody knows with 100% certainty. In the failures I've experienced I'd expect it to start catching errors within a few days. However, on those drives the relocated sector count never increases, which suggests that the firmware never relocated those sectors when overwritten, which seems brain-dead to me. If the drive relocates the sectors, then conceivably it could go quite a long time until having errors, probably in an entirely different set of sectors. Even if it doesn't relocate, the reliability of the bad sectors could be high or low. Rich What triggers a relocation? I also have a drive which shows a sector relocation pending, but for a few days now and after some tests that showed no errors, it won't relocate it. I think a write to that sector should force a relocation. -- Joost
Re: [gentoo-user] smartctrl drive error @60%
J. Roeleveld wrote: On Tuesday, July 01, 2014 06:52:10 AM Mick wrote: On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote: On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote: What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty... As you already said, nobody knows with 100% certainty. In the failures I've experienced I'd expect it to start catching errors within a few days. However, on those drives the relocated sector count never increases, which suggests that the firmware never relocated those sectors when overwritten, which seems brain-dead to me. If the drive relocates the sectors, then conceivably it could go quite a long time until having errors, probably in an entirely different set of sectors. Even if it doesn't relocate, the reliability of the bad sectors could be high or low. Rich What triggers a relocation? I also have a drive which shows a sector relocation pending, but for a few days now and after some tests that showed no errors, it won't relocate it. I think a write to that sector should force a relocation. -- Joost I think you are right Joost. I should have tried some fixes that COULD be destructive to see if a) it fixes it and b) the data lives, other than the bad part at least. I forgot to do that and really wasn't sure how to do it either. One person posted a lot of info about it but it was a bit deep for me. It would have required some reading and because of health issues, I can't tackle that much at one time right now. What I did tho. I got the new drive, rsynced the data from old drive to new drive. Removed the LVM stuff from the old drive. I used dd to erase the whole old drive, which took a while for 3TBs. o_O After that, I ran the test. It came back fine. 
Check out this snippet:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     16499         -
# 2  Extended offline    Completed without error       00%     16498         -
# 3  Short offline       Completed without error       00%     16475         -
# 4  Extended offline    Completed without error       00%     16466         -
# 5  Extended offline    Aborted by host               90%     16461         -
# 6  Extended offline    Completed: read failure       60%     16451         2905482560
# 7  Extended offline    Completed: read failure       60%     16432         2905482560
# 8  Extended offline    Completed: read failure       60%     16427         2905482560
# 9  Extended offline    Completed: read failure       60%     16394         2905482560
#10  Extended offline    Completed: read failure       60%     16389         2905482560
#11  Short offline       Completed without error       00%     16380         -
#12  Extended offline    Completed: read failure       60%     16365         2905482560
#13  Extended offline    Completed: read failure       60%     16352         2905482560
#14  Extended offline    Completed without error       00%      8044         -
#15  Extended offline    Completed without error       00%      3121         -
#16  Extended offline    Completed without error       00%      1548         -
#17  Short offline       Completed without error       00%      1141         -
#18  Extended offline    Completed without error       00%       719         -
#19  Extended offline    Completed without error       00%       525         -
#20  Short offline       Completed without error       00%       516         -
#21  Extended offline    Completed without error       00%        18         -
7 of 7 failed self-tests are outdated by newer successful extended offline self-test # 2

Note the very last line. You can see all the failures but the last line says the drive is good to go since the drive passed after the bad ones. So, while I'm not holding my breath, that is what SMART says. It may blow smoke and make horrible noises next week but right now, it says it is OK. In the end, it seems something has to write to that specific sector and then the drive will reallocate/move/whatever so that the bad part isn't used anymore.
It seems dd did that but I bet there are other tools that could do it without losing data other than what is in the bad spot of course. That's my simple idea at least. Hope that helps. I wish I could have done the other stuff and kept notes on commands and such and then post the results. That MAY have helped someone in the future. My brain ain't what it used to be. ;-) Dale :-) :-)
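Dale's point that a write to the bad spot is what matters can be narrowed down to a single sector instead of the whole drive. The following is only a sketch under assumptions: it takes the LBA reported in the self-test log (2905482560), assumes 512-byte logical sectors, and uses /dev/sdX as a placeholder device. The helper names are made up for illustration, and the script only prints a dd command; it never writes to a disk.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; nothing here touches a disk.

# Convert an LBA to a byte offset.
# The sector size defaults to 512 bytes, which is an assumption; verify it
# for the real drive (for example with `blockdev --getss /dev/sdX`).
lba_to_bytes() {
    lba=$1
    sector_size=${2:-512}
    echo $(( lba * sector_size ))
}

# Print (do not run) a dd command that would overwrite exactly one sector.
# Whatever data lived in that sector is lost if the command is ever run.
print_rewrite_cmd() {
    dev=$1
    lba=$2
    sector_size=${3:-512}
    echo "dd if=/dev/zero of=$dev bs=$sector_size seek=$lba count=1 oflag=direct"
}

# LBA taken from the self-test log in this thread; /dev/sdX is a placeholder.
lba_to_bytes 2905482560
print_rewrite_cmd /dev/sdX 2905482560
```

On drives with 4096-byte physical sectors, a single 512-byte write may make the drive read the rest of the physical sector first, so writing the whole aligned 4096-byte block may be what is actually needed. The hdparm man page also documents a --write-sector option for the same purpose; either way, double-check the device name before running anything.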
Re: [gentoo-user] smartctrl drive error @60%
Frank Steinmetzger wrote:
On Wed, Jun 25, 2014 at 11:57:55PM -0500, Dale wrote:

I'm just going to try and buy another 3TB drive as soon as I can. I may even make it into a removable thingy. Then I can make backups and just put it in an outbuilding. By the way, my outbuilding is pretty far from

For such use, I am planning to get an external SATA dock, rather than use a “removable thingy”. You pop in the naked drive and stow that away after you’re done. This has multiple advantages. For one, you can read SMART data from the external drives, provided you use a SATA connection. You can’t do that over USB. So if your board has an eSATA connector, get an external dock such as http://www.sharkoon.com/?q=en/node/1277. If it doesn’t, you can get an internal one for a 5¼″ slot in the front of your case, like http://www.sharkoon.com/?q=en/node/1281 for one 3.5″ HDD, or http://www.sharkoon.com/?q=en/content/sata-qp-intern-multi for both big and small HDDs (plus some USB3 connectors, if your case doesn’t have them but your board provides a header).

Those docks are only slightly more expensive than an external case, but you can use them for all your drives, not just a single one, and you have no hassle with countless external power supplies (“which one did go into which case?”).

PS: I’m not saying you should get a Sharkoon, after all I haven’t bought it yet myself. But their site shows nicely what’s available and gives me ideas.

-- 
Gruß | Greetings | Qapla’
Please do not share anything from, with or about me on any social network.
Please don’t befuddle me with facts, my mind is set.

I been wanting to get me something external but hadn't got around to looking yet. I didn't know they have a SATA version. I plan to avoid USB if I can. From my understanding, eSATA can be hotplugged and I have a couple of those connections. Thanks for the info. It put me on a different path.

Dale

:-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On 07/01/2014 10:30:44 AM, Dale wrote:

I been wanting to get me something external but hadn't got around to looking yet. I didn't know they have a SATA version. I plan to avoid USB if I can. From my understanding, eSATA can be hotplugged and I have a couple of those connections.

I 'believe' that eSATA is dead - just look how few products are available. I have 3 USB-3 drives which are very fast (more than 100 MB/sec), and if your motherboard doesn't have a USB-3 adapter yet, one is very cheap.

Helmut.
Re: [gentoo-user] smartctrl drive error @60%
Helmut Jarausch wrote:
On 07/01/2014 10:30:44 AM, Dale wrote:

I been wanting to get me something external but hadn't got around to looking yet. I didn't know they have a SATA version. I plan to avoid USB if I can. From my understanding, eSATA can be hotplugged and I have a couple of those connections.

I 'believe' that eSATA is dead - just look how few products are available. I have 3 USB-3 drives which are very fast (more than 100 MB/sec), and if your motherboard doesn't have a USB-3 adapter yet, one is very cheap.

Helmut

I do have a few USB3 connectors. I just figured USB would be a good bit slower. Plus, can USB power a 3.5 hard drive nowadays?

root@fireball / # hdparm -tT /dev/sdb

/dev/sdb:
 Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
 Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec
root@fireball / #

I did find an eSATA enclosure on Newegg and the price wasn't bad. It's one option I guess.

Dale

:-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On 07/01/2014 10:58:45 AM, Dale wrote: Helmut Jarausch wrote: On 07/01/2014 10:30:44 AM, Dale wrote: I been wanting to get me something external but hadn't got around to looking yet. I didn't know they have a SATA version. I plan to avoid USB if I can. From my understanding, eSATA can be hotplugged and I have a couple of those connections. I 'believe' that eSATA is dead - just look how few products are available. I have 3 USB-3 drives which are very fast (more than 100 MB/sec) and if your motherboard doesn't have an USB-3 adapter, yet, it's very cheap. Helmut I do have a few USB3 connectors. I just figured USB would be a good bit slower. Plus, can USB power a 3.5 hard drive nowadays? Probably not. All of my external USB3 disks have a separate power supply. root@fireball / # hdparm -tT /dev/sdb /dev/sdb: Timing cached reads: 6604 MB in 2.00 seconds = 3303.39 MB/sec Timing buffered disk reads: 542 MB in 3.01 seconds = 180.33 MB/sec root@fireball / # Try a real life example like dd. I have seen the above mentioned speed on disks with a file system on it which does limit the speed anyway. I did find a eSATA enclosure on Newegg and the price wasn't bad. It's one option I guess. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Tuesday, July 01, 2014 11:06:59 AM Helmut Jarausch wrote:
On 07/01/2014 10:58:45 AM, Dale wrote:
Helmut Jarausch wrote:
On 07/01/2014 10:30:44 AM, Dale wrote:

I been wanting to get me something external but hadn't got around to looking yet. I didn't know they have a SATA version. I plan to avoid USB if I can. From my understanding, eSATA can be hotplugged and I have a couple of those connections.

I 'believe' that eSATA is dead - just look how few products are available. I have 3 USB-3 drives which are very fast (more than 100 MB/sec), and if your motherboard doesn't have a USB-3 adapter yet, one is very cheap.

Helmut

I do have a few USB3 connectors. I just figured USB would be a good bit slower. Plus, can USB power a 3.5 hard drive nowadays?

Probably not. All of my external USB3 disks have a separate power supply.

I only know of 2.5-inch USB drives that are powered via the same USB cable. I have never seen 3.5-inch ones that are USB-powered. I use 2.5-inch drives for my backups; as they are designed for laptop use, I have the feeling they are a bit more robust when it comes to accidental bumps.

root@fireball / # hdparm -tT /dev/sdb

/dev/sdb:
 Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
 Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec
root@fireball / #

Try a real life example like dd. I have seen the above mentioned speed on disks with a file system on it which does limit the speed anyway.

+1

-- 
Joost
Re: [gentoo-user] smartctrl drive error @60%
J. Roeleveld wrote: On Tuesday, July 01, 2014 11:06:59 AM Helmut Jarausch wrote: On 07/01/2014 10:58:45 AM, Dale wrote: Probably not. All of my external USB3 disks have a separate power supply. I only know of 2.5 USB-drivers that are powered via the same USB-cable. Never seen 3.5 ones that are USB-powered. I use 2.5 drives for my backups, as they are designed for laptop use, I have the feeling they are a bit more robust when it comes to accidental bumps. I thought those things looked like they had their own power. Neat. root@fireball / # hdparm -tT /dev/sdb /dev/sdb: Timing cached reads: 6604 MB in 2.00 seconds = 3303.39 MB/sec Timing buffered disk reads: 542 MB in 3.01 seconds = 180.33 MB/sec root@fireball / # Try a real life example like dd. I have seen the above mentioned speed on disks with a file system on it which does limit the speed anyway. +1 -- Joost I watched the dd process when I was erasing the old drive. I got about the same results. It started out a little over 200 and went as low as 170 or so close to the end. On average, about what hdparm shows. Close enough it seems. ;-) Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Tuesday, July 01, 2014 04:21:45 AM Dale wrote: J. Roeleveld wrote: root@fireball / # hdparm -tT /dev/sdb /dev/sdb: Timing cached reads: 6604 MB in 2.00 seconds = 3303.39 MB/sec Timing buffered disk reads: 542 MB in 3.01 seconds = 180.33 MB/sec root@fireball / # Try a real life example like dd. I have seen the above mentioned speed on disks with a file system on it which does limit the speed anyway. +1 -- Joost I watched the dd process when I was erasing the old drive. I got about the same results. It started out a little over 200 and went as low as 170 or so close to the end. On average, about what hdparm shows. Close enough it seems. ;-) Yep, but do the same after adding a filesystem to the mix? Eg. mount it somewhere, then dd to a file on that drive. -- Joost
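Joost's suggestion (mount the drive, then dd to a file on it) might look roughly like the sketch below. It is an illustration only: the function name and the file name dd-speedtest.tmp are made up, and it assumes GNU dd, whose conv=fdatasync makes dd flush the data to disk before it prints its throughput figure, so the page cache does not inflate the number. The demo writes a few megabytes into a scratch directory; a real test would point at the mount point of the drive and use a much larger size.

```shell
#!/bin/sh
# Write COUNT_MB mebibytes of zeros to a file inside DIR and let dd report
# the throughput on stderr. Assumes GNU dd (bs=1M and conv=fdatasync).
fs_write_test() {
    dir=$1
    count_mb=${2:-1024}
    target="$dir/dd-speedtest.tmp"   # hypothetical file name
    dd if=/dev/zero of="$target" bs=1M count="$count_mb" conv=fdatasync
}

# Small demo in a scratch directory so nothing important is touched.
scratch=$(mktemp -d)
fs_write_test "$scratch" 4
ls -l "$scratch/dd-speedtest.tmp"
rm -rf "$scratch"
```

For a real run, something like `fs_write_test /mnt/backup 4096` (with /mnt/backup standing in for wherever the drive is mounted) would write 4 GiB; remember to delete the test file afterwards. Zeros can be compressed by some filesystems, so the figure is a rough guide rather than a benchmark.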
Re: [gentoo-user] smartctrl drive error @60%
On Tue, Jul 1, 2014 at 2:09 AM, J. Roeleveld jo...@antarean.org wrote: On Tuesday, July 01, 2014 06:52:10 AM Mick wrote: What triggers a relocation? I also have a drive which shows a sector relocation pending, but for a few days now and after some tests that showed no errors, it won't relocate it. I think a write to that sector should force a relocation. In theory either a write to that sector or a successful read should trigger a relocation. In practice, I've never seen a drive actually do this - maybe I just manage to pick drives with braindead firmware. When I write to a pending sector, the pending sector count goes down, but the relocated sector count doesn't change, and usually in a few days I have another pending sector. The last time I had a drive fail I was running md raid, so a scrub fixed all the pending sectors automatically, but the drive firmware wasn't doing its part to relocate them. Either that, or the drive had run out of spare sectors and wasn't reporting this via SMART. Rich
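Rich's observation (the pending count drops after a write while the relocated count stays put) can be checked by comparing the relevant SMART attributes before and after the write. The sketch below is one way to pull them out of `smartctl -A` output; the function name is invented, the sample lines are made up purely to show the layout, and attribute names can differ between vendors.

```shell
#!/bin/sh
# Print name=raw_value for the three sector-health attributes found in
# `smartctl -A` output read from stdin. In that table, column 2 is the
# attribute name and column 10 is the raw value.
sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2 "=" $10 }'
}

# Made-up sample lines in smartctl's attribute-table layout (not real data).
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8'

printf '%s\n' "$sample" | sector_counts
```

On a real system the input would come from `smartctl -A /dev/sdX | sector_counts`, run once before and once after overwriting the suspect sector, with the two outputs saved for comparison.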
Re: [gentoo-user] smartctrl drive error @60%
On 01/07/2014 07:52, Mick wrote: On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote: On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote: What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty... As you already said, nobody knows with 100% certainty. In the failures I've experienced I'd expect it to start catching errors within a few days. However, on those drives the relocated sector count never increases, which suggests that the firmware never relocated those sectors when overwritten, which seems brain-dead to me. If the drive relocates the sectors, then conceivably it could go quite a long time until having errors, probably in an entirely different set of sectors. Even if it doesn't relocate, the reliability of the bad sectors could be high or low. Rich What triggers a relocation? I also have a drive which shows a sector relocation pending, but for a few days now and after some tests that showed no errors, it won't relocate it. it's triggered by a write to the sector -- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-user] smartctrl drive error @60%
J. Roeleveld wrote:
On Tuesday, July 01, 2014 04:21:45 AM Dale wrote:

I watched the dd process when I was erasing the old drive. I got about the same results. It started out a little over 200 and went as low as 170 or so close to the end. On average, about what hdparm shows. Close enough it seems. ;-)

Yep, but do the same after adding a filesystem to the mix? Eg. mount it somewhere, then dd to a file on that drive.

-- 
Joost

I've only ever used dd to blank a drive. I never used it to copy anything. While dd may be a bit faster in my use, having a file system is a more realistic use. I think a file system would slow things down a bit, maybe not much since file systems are pretty fast nowadays. Thing is, I'm fairly sure USB won't be as fast as a straight SATA connection. That is one reason I would rather use SATA connections instead. That was also the reason I posted that info. It shows that on my rig here, I can likely copy faster than USB with a SATA connection. The speed I posted is a good bit faster than what Helmut posted even tho his was a general amount. Unless Helmut has an older, slower machine then I wouldn't expect mine to be much if any faster than his. Basically, USB would be a bottleneck that I might can avoid and my mobo supports eSATA connections.

I'm not trying to benchmark, just give a general idea. What hdparm gives me is pretty close to what dd was giving and not too far off from what I get when doing a copy with cp or rsync. I been doing a good bit of copying here lately. I do have a drive that is the older SATA but most are the newer and faster SATA.

Dale

:-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote:
On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:

What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch an error after a few weeks or so of testing? I realize no one knows with 100% certainty...

As you already said, nobody knows with 100% certainty. In the failures I've experienced I'd expect it to start catching errors within a few days. However, on those drives the relocated sector count never increases, which suggests that the firmware never relocated those sectors when overwritten, which seems brain-dead to me.

If the drive relocates the sectors, then conceivably it could go quite a long time until having errors, probably in an entirely different set of sectors. Even if it doesn't relocate, the reliability of the bad sectors could be high or low.

Rich

What triggers a relocation? I also have a drive which shows a sector relocation pending, but it has been a few days now, and even after some tests that showed no errors it still won't relocate it.

-- 
Regards,
Mick
Re: [gentoo-user] smartctrl drive error @60%
On Sunday 29 Jun 2014 05:44:38 Dale wrote: Rich Freeman wrote: On Sat, Jun 28, 2014 at 11:27 PM, Dale rdalek1...@gmail.com wrote: So, thoughts? Did it mark that part as bad and all is well or is this going to be trouble down the line? Should I just fill the thing up with data and test the stuffin out of it to make sure? That is pretty typical. You wrote to every sector on the drive. You don't need to be able to read a sector to overwrite it, so doing this cleared out the drive's list of offline uncorrectable sectors. If you're fortunate it relocated those sectors in which case the drive is only using good sectors now. It can't relocate a sector unless it either gets a successful read, or it is overwritten, and you overwrote them. Either way the extended offline test passing isn't unusual. Either it relocated the sectors in which case the drive is completely good or the data written to the bad sectors was readable when the test was run, which doesn't guarantee that it will still be readable a day/week/month/year from now. Unfortunately I don't think there is any way to find out what the firmware is doing, or to predict the likelihood of another failure. The only thing we can say for sure that like all hard drives, it WILL fail sometime. Rich What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty but I would like to backup my data say every couple weeks just in case. If the drive works, fine. If it fails, well, it wouldn't be the first time and it won't be a primary drive so no big loss. I got to find me a good drive for backups tho. I'm waiting on a good sale of a brand other than Seagate tho. That should help keep two drives from failing at the same time. Well, a little anyway. I think it is called Dale's Law now. 
;-)

I'm not sure what it is called, but it seems infectious! I have a drive (in a laptop) which I recently zeroed out with dd and fsck -c for good measure, before I installed gentoo on it. Yesterday, I tried a long test, but it won't complete. It reached 10% remaining and it stayed there for a few hours. I will repeat the test to see if it gets through this time, but I am worried that it's on its way out. Oh well, I may install an SSD if it fails.

-- 
Regards,
Mick
Re: [gentoo-user] smartctrl drive error @60%
Mick wrote:
On Sunday 29 Jun 2014 05:44:38 Dale wrote:

What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch an error after a few weeks or so of testing? I realize no one knows with 100% certainty but I would like to backup my data say every couple weeks just in case. If the drive works, fine. If it fails, well, it wouldn't be the first time and it won't be a primary drive so no big loss. I got to find me a good drive for backups tho. I'm waiting on a good sale of a brand other than Seagate tho. That should help keep two drives from failing at the same time. Well, a little anyway. I think it is called Dale's Law now. ;-)

I'm not sure what it is called, but it seems infectious! I have a drive (in a laptop) which I recently zeroed out with dd and fsck -c for good measure, before I installed gentoo on it. Yesterday, I tried a long test, but it won't complete. It reached 10% remaining and it stayed there for a few hours. I will repeat the test to see if it gets through this time, but I am worried that it's on its way out. Oh well, I may install an SSD if it fails.

That seems to be normal, at least for me. Mine has certain percentages that it just seems to sit at for a good while. It eventually passes the test tho. Just leave it overnight and check it the next morning or something. I know laptops are different but got to do what you got to do. Maybe plugging it into a desktop or something would help.

Dale

:-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Sunday 29 Jun 2014 09:42:39 Dale wrote:
Mick wrote:
On Sunday 29 Jun 2014 05:44:38 Dale wrote:

What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch an error after a few weeks or so of testing? I realize no one knows with 100% certainty but I would like to backup my data say every couple weeks just in case. If the drive works, fine. If it fails, well, it wouldn't be the first time and it won't be a primary drive so no big loss. I got to find me a good drive for backups tho. I'm waiting on a good sale of a brand other than Seagate tho. That should help keep two drives from failing at the same time. Well, a little anyway. I think it is called Dale's Law now. ;-)

I'm not sure what it is called, but it seems infectious! I have a drive (in a laptop) which I recently zeroed out with dd and fsck -c for good measure, before I installed gentoo on it. Yesterday, I tried a long test, but it won't complete. It reached 10% remaining and it stayed there for a few hours. I will repeat the test to see if it gets through this time, but I am worried that it's on its way out. Oh well, I may install an SSD if it fails.

That seems to be normal, at least for me. Mine has certain percentages that it just seems to sit at for a good while. It eventually passes the test tho. Just leave it overnight and check it the next morning or something. I know laptops are different but got to do what you got to do. Maybe plugging it into a desktop or something would help.

I've restarted it and will leave it all day today to see what gives.

-- 
Regards,
Mick
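Rather than guessing whether a long test is stuck, the remaining percentage can be read from the drive's capabilities report and checked every so often. The sketch below is a hedged illustration: the function name is invented, it relies on grep -o, and it assumes the "NN% of test remaining" wording that `smartctl -c` prints while a self-test is running. The sample text is a hand-written imitation of that layout, not captured output.

```shell
#!/bin/sh
# Extract the "NN% of test remaining" figure from `smartctl -c` output on
# stdin. Prints nothing when no self-test is in progress.
selftest_remaining() {
    grep -o '[0-9][0-9]*% of test remaining' | cut -d% -f1
}

# Hand-written imitation of the relevant smartctl -c lines (not real output).
sample='Self-test execution status:      ( 241) Self-test routine in progress...
                                        10% of test remaining.'

printf '%s\n' "$sample" | selftest_remaining
```

In practice this would be fed from `smartctl -c /dev/sdX | selftest_remaining`, for instance once every half hour. If the number has not moved after many hours, that says more than a single glance at a test that merely looks stalled.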
Re: [gentoo-user] smartctrl drive error @60%
On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote: What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty... As you already said, nobody knows with 100% certainty. In the failures I've experienced I'd expect it to start catching errors within a few days. However, on those drives the relocated sector count never increases, which suggests that the firmware never relocated those sectors when overwritten, which seems brain-dead to me. If the drive relocates the sectors, then conceivably it could go quite a long time until having errors, probably in an entirely different set of sectors. Even if it doesn't relocate, the reliability of the bad sectors could be high or low. Rich
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote: What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty... As you already said, nobody knows with 100% certainty. In the failures I've experienced I'd expect it to start catching errors within a few days. However, on those drives the relocated sector count never increases, which suggests that the firmware never relocated those sectors when overwritten, which seems brain-dead to me. If the drive relocates the sectors, then conceivably it could go quite a long time until having errors, probably in an entirely different set of sectors. Even if it doesn't relocate, the reliability of the bad sectors could be high or low. Rich Yep. I guess the best thing to do is test the stuffin out of it and hope the tests don't wear it out. lol As I told my ex more than once, time tells. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 11:57:55PM -0500, Dale wrote:

I'm just going to try and buy another 3TB drive as soon as I can. I may even make it into a removable thingy. Then I can make backups and just put it in an outbuilding. By the way, my outbuilding is pretty far from

For such use, I am planning to get an external SATA dock, rather than use a “removable thingy”. You pop in the naked drive and stow that away after you’re done. This has multiple advantages. For one, you can read SMART data from the external drives, provided you use a SATA connection. You can’t do that over USB. So if your board has an eSATA connector, get an external dock such as http://www.sharkoon.com/?q=en/node/1277. If it doesn’t, you can get an internal one for a 5¼″ slot in the front of your case, like http://www.sharkoon.com/?q=en/node/1281 for one 3.5″ HDD, or http://www.sharkoon.com/?q=en/content/sata-qp-intern-multi for both big and small HDDs (plus some USB3 connectors, if your case doesn’t have them but your board provides a header).

Those docks are only slightly more expensive than an external case, but you can use them for all your drives, not just a single one, and you have no hassle with countless external power supplies (“which one did go into which case?”).

PS: I’m not saying you should get a Sharkoon, after all I haven’t bought it yet myself. But their site shows nicely what’s available and gives me ideas.

-- 
Gruß | Greetings | Qapla’
Please do not share anything from, with or about me on any social network.
Please don’t befuddle me with facts, my mind is set.
Re: [gentoo-user] smartctrl drive error @60%
On Sat, 28 Jun 2014 08:48:40 +0100, Mick wrote:

I would think that your ISP providers in the US will be blocking outgoing port 25 to stop compromised MSWindows machines spamming the rest of us. If you use my suggestion there shouldn't be a problem.

It makes no difference whether you address it directly to your ISP address or via an alias. The ISP won't block port 25 connections to its own servers from its own customers, otherwise none of them could send email at all!

In the US many big players are blocking outbound port 25 for their customers as a blanket measure to control spam from botnets, e.g.: http://www.verizon.com/Support/Residential/internet/highspeed/general+support/top+questions/questionsone/124274.htm

If Dale uses the ssmtp.conf I sent he will be using a different port + TLS encryption and should not have a problem.

Yes, that makes sense. I thought you were referring to the aliases part of the config. Using TLS or a different port if that's what the ISP needs is perfectly logical.

-- 
Neil Bothwick

Bus: (n.) a connector you plug money into, something like a slot machine.
Re: [gentoo-user] smartctrl drive error @60%
On Friday 27 Jun 2014 21:54:32 Neil Bothwick wrote:
On Fri, 27 Jun 2014 14:22:09 +0100, Mick wrote:

I would think that your ISP providers in the US will be blocking outgoing port 25 to stop compromised MSWindows machines spamming the rest of us. If you use my suggestion there shouldn't be a problem.

It makes no difference whether you address it directly to your ISP address or via an alias. The ISP won't block port 25 connections to its own servers from its own customers, otherwise none of them could send email at all!

In the US many big players are blocking outbound port 25 for their customers as a blanket measure to control spam from botnets, e.g.: http://www.verizon.com/Support/Residential/internet/highspeed/general+support/top+questions/questionsone/124274.htm

If Dale uses the ssmtp.conf I sent he will be using a different port + TLS encryption and should not have a problem.

Even if Dale's ISP does not block port 25 for connections to the ISP's *own* mail servers, it may well block it to other providers' mail servers for the same reason. This was a common practice some years back (pre-Gmail) when ISPs had started charging for mail services.

-- 
Regards,
Mick
Re: [gentoo-user] smartctrl drive error @60%
Mick wrote: On Friday 27 Jun 2014 21:54:32 Neil Bothwick wrote: On Fri, 27 Jun 2014 14:22:09 +0100, Mick wrote: I would think that your ISP providers in the US will be blocking outgoing port 25 to stop compromised MSWindows machines spamming the rest of us. If you use my suggestion there shouldn't be a problem. It makes no difference whether you address it directly to your ISP address or via an alias. The ISP won't block port 25 connections to its own servers from its own customers, otherwise none of them could send email at all! In the US many big players are blocking outbound port 25 for their customers as a blanket measure to control spam from botnets, e.g.: http://www.verizon.com/Support/Residential/internet/highspeed/general+support/top+questions/questionsone/124274.htm If Dale uses the ssmtp.conf I sent he will be using a different port + TLS encryption and should not have a problem. Even if Dale's ISP does not block port 25 for connections to the ISP's *own* mail servers, it may well block it to other providers' mail addresses for the same reason. This was a common practice some years back (pre-Gmail) when ISP had started charging for mail services. According to the settings in Seamonkey, it should be port 995 and SSL/TLS. I used your basic setup which is port 465. It works tho. :-) Dale :-) :-)
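The ssmtp.conf Mick sent is not reproduced in this part of the thread, so the following is only a guess at its general shape, with every value a placeholder. It uses option names from the ssmtp.conf documentation (mailhub, UseTLS, AuthUser, AuthPass, FromLineOverride), and the function name is made up. As for the port numbers Dale mentions: 995 is the usual port for POP3 over TLS, which is for fetching mail, while 465 is SMTP over TLS, which is for sending, so the two settings do not conflict.

```shell
#!/bin/sh
# Print a minimal ssmtp.conf for a relay that expects TLS from the first
# byte (the port 465 style). Host, port and user are placeholders supplied
# by the caller; the password line is a dummy on purpose.
ssmtp_conf() {
    host=$1
    port=$2
    user=$3
    cat <<EOF
mailhub=$host:$port
UseTLS=YES
AuthUser=$user
AuthPass=CHANGE_ME
FromLineOverride=YES
EOF
}

# Example with made-up values; smtp.example.com is not a real relay.
ssmtp_conf smtp.example.com 465 dale@example.com
```

A relay that expects STARTTLS on port 587 instead would, as far as I know, also need UseSTARTTLS=YES. The real host name and credentials come from the mail provider.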
Re: [gentoo-user] smartctrl drive error @60%
Dale wrote:

Howdy,

I run this test every once in a while. How bad is this:

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -

And better yet, is there any way to tell it to not use that part and finish the test? It seems it stopped when it got to that, or I think it did. Thoughts?

Dale :-) :-)

OK. Update. I got the new drive in, copied the files over, tested the new drive A LOT, then did a dd on the old drive and wiped the WHOLE thing. I let dd run until it ran out of space and died, which took a pretty good while. After dd finished, I ran the Smart test again. This is what I get now:

root@fireball / # smartctl -l selftest /dev/sdd
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     16466         -
# 2  Extended offline    Aborted by host               90%     16461         -
# 3  Extended offline    Completed: read failure       60%     16451         2905482560
# 4  Extended offline    Completed: read failure       60%     16432         2905482560
# 5  Extended offline    Completed: read failure       60%     16427         2905482560

Ignore the second one, I started the test on the old drive and was meaning to do it on the new drive. When dd finished, I wanted to start a fresh test so I killed the second one.
As you can see in the latest test, no errors. So, thoughts? Did it mark that part as bad and all is well or is this going to be trouble down the line? Should I just fill the thing up with data and test the stuffin out of it to make sure? Thanks. Dale :-) :-)
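For comparing runs like the two logs in this message, here is a minimal sketch that pulls the failing LBA out of a self-test log. The function name first_error_lbas is made up for illustration, and it assumes the column layout that smartctl 6.1 prints in this thread, with the LBA as the last field of each entry.

```shell
# Hypothetical helper: read `smartctl -l selftest` output on stdin and
# print each distinct LBA reported by a "read failure" entry.
first_error_lbas() {
    awk '/^# *[0-9]+/ && /read failure/ { print $NF }' | sort -un
}

# Usage (needs root and a real device):
#   smartctl -l selftest /dev/sdc | first_error_lbas
```

If the same LBA keeps coming back after a full overwrite, that would suggest the sector was not relocated. An empty result only means the log holds no read failures at the moment.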
Re: [gentoo-user] smartctrl drive error @60%
On Sat, Jun 28, 2014 at 11:27 PM, Dale rdalek1...@gmail.com wrote: So, thoughts? Did it mark that part as bad and all is well or is this going to be trouble down the line? Should I just fill the thing up with data and test the stuffin out of it to make sure? That is pretty typical. You wrote to every sector on the drive. You don't need to be able to read a sector to overwrite it, so doing this cleared out the drive's list of offline uncorrectable sectors. If you're fortunate it relocated those sectors, in which case the drive is only using good sectors now. It can't relocate a sector unless it either gets a successful read, or it is overwritten, and you overwrote them. Either way the extended offline test passing isn't unusual. Either it relocated the sectors, in which case the drive is completely good, or the data written to the bad sectors was readable when the test was run, which doesn't guarantee that it will still be readable a day/week/month/year from now. Unfortunately I don't think there is any way to find out what the firmware is doing, or to predict the likelihood of another failure. The only thing we can say for sure is that, like all hard drives, it WILL fail sometime. Rich
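One rough way to tell the two cases apart is to compare the drive's own counters before and after the overwrite: attribute 5 (Reallocated_Sector_Ct), 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable). The sketch below assumes the usual ten-column `smartctl -A` table with the raw value in column 10. The function name sector_counts is hypothetical, and the result only shows what the firmware chooses to report.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print
# name=raw_value for the three sector-health attributes (IDs 5, 197, 198).
sector_counts() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Usage (needs root and a real device):
#   smartctl -A /dev/sdd | sector_counts
```

If the pending count drops to zero while the reallocated count goes up, the sectors were most likely remapped. If the pending count drops and the reallocated count stays put, the drive probably rewrote the same physical sectors in place.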
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Sat, Jun 28, 2014 at 11:27 PM, Dale rdalek1...@gmail.com wrote: So, thoughts? Did it mark that part as bad and all is well or is this going to be trouble down the line? Should I just fill the thing up with data and test the stuffin out of it to make sure? That is pretty typical. You wrote to every sector on the drive. You don't need to be able to read a sector to overwrite it, so doing this cleared out the drive's list of offline uncorrectable sectors. If you're fortunate it relocated those sectors in which case the drive is only using good sectors now. It can't relocate a sector unless it either gets a successful read, or it is overwritten, and you overwrote them. Either way the extended offline test passing isn't unusual. Either it relocated the sectors in which case the drive is completely good or the data written to the bad sectors was readable when the test was run, which doesn't guarantee that it will still be readable a day/week/month/year from now. Unfortunately I don't think there is any way to find out what the firmware is doing, or to predict the likelihood of another failure. The only thing we can say for sure that like all hard drives, it WILL fail sometime. Rich What if I copied data to the drive until it was just about full. I'm thinking like maybe 90 or 95% or so. If I do that and run the test every few days, would it then catch a error after a few weeks or so of testing? I realize no one knows with 100% certainty but I would like to backup my data say every couple weeks just in case. If the drive works, fine. If it fails, well, it wouldn't be the first time and it won't be a primary drive so no big loss. I got to find me a good drive for backups tho. I'm waiting on a good sale of a brand other than Seagate tho. That should help keep two drives from failing at the same time. Well, a little anyway. I think it is called Dale's Law now. ;-) Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Thursday 26 Jun 2014 16:08:52 Neil Bothwick wrote: On Thu, 26 Jun 2014 09:14:39 -0500, Dale wrote: Holy sheep. It worked. I lost my jaw yesterday I think it was. I'm not sure what I am going to be missing now. :-D Neil and Allan will so impressed. LOL OK. So, what will send me a message now? Do I need to tell it to send me something, say from smart stuff, or does it just know to do it? You tell cron where to mail reports by setting MAILTO=you@wherever at the top of /etc/crontab. It will then mail you every time a cronjob produces output. Or complete the /etc/ssmtp/ssmtp.conf and revaliases with the info I have sent you in this thread and set MAILTO=root in your /etc/crontab. I would think that your ISP providers in the US will be blocking outgoing port 25 to stop compromised MSWindows machines spamming the rest of us. If you use my suggestion there shouldn't be a problem. If your internet connection is down for some reason, you should get deadletter files in /root/ with the output. -- Regards, Mick
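The ssmtp.conf Mick refers to was sent earlier and is not reproduced in this part of the thread, so the following is only a sketch of what such a setup typically looks like, not his actual file. The host name, address and password are placeholders; port 465 with TLS is taken from what Dale reports using elsewhere in the thread.

```conf
# /etc/ssmtp/ssmtp.conf (illustrative values only, not Mick's file)
root=you@example.com            # where mail for root ends up
mailhub=smtp.example.com:465    # provider's SMTP server, TLS-wrapped port
UseTLS=YES
AuthUser=you@example.com
AuthPass=your-password-here
FromLineOverride=YES

# /etc/ssmtp/revaliases (one line per local account)
# root:you@example.com:smtp.example.com:465
```

Because the file holds a password in clear text, it should not be world-readable. Mick's follow-up in this thread gives 0640 as the mode.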
Re: [gentoo-user] smartctrl drive error @60%
On Fri, 27 Jun 2014 14:22:09 +0100, Mick wrote: You tell cron where to mail reports by setting MAILTO=you@wherever at the top of /etc/crontab. It will then mail you every time a cronjob produces output. Or complete the /etc/ssmtp/ssmtp.conf and revaliases with the info I have sent you in this thread and set MAILTO=root in your /etc/crontab. I would think that your ISP providers in the US will be blocking outgoing port 25 to stop compromised MSWindows machines spamming the rest of us. If you use my suggestion there shouldn't be a problem. It makes no difference whether you address it directly to your ISP address or via an alias. The ISP won't block port 25 connections to its own servers from its own customers, otherwise none of them could send email at all! -- Neil Bothwick NOTE: In order to control energy costs the light at the end of the tunnel has been shut off until further notice...
Re: [gentoo-user] smartctrl drive error @60%
On Thursday 26 Jun 2014 06:56:26 Mick wrote: Sort out access rights to 0604 Oops! Potentially dangerous typo! Should be: 0640 of course. -- Regards, Mick
Re: [gentoo-user] smartctrl drive error @60%
Hello, On Wed, 25 Jun 2014, Dale wrote: David Haller wrote: On Wed, 25 Jun 2014, Dale wrote: Yeah. Oh, and I had a clean smart until a few days ago, luckily I alread had a WD Red (WD40EFRX) drive waiting when this attrib jumped from 0 to: 5 Reallocated_Sector_Ct 0x0033 087 087 036 Pre-fail Always 17688 Other Seagates (a few 1.5T drives) have also made me trouble, the 2T Samsung already relabeled and sold as a Seagate but with Samsung in the FW though is still ok. [..] I was wondering about how that would be updated since a lot of that stuff requires windoze. I had the fun with a couple of those 2TB Samsung drives a while ago (Jan 2012?). Samsung had some .iso files available, which you could write to a CD/DVD/USB-Shtik, and then boot from them. For me, the biggest problem was _which_ of the 4 HD204UI and 2 HD203WI I had and have needed the update... I don't remember the SW displaying serial numbers... *gah* Anyway, I've got it sorted out and updated those that needed the update. With Seagate/WD/HGST/Toshiba I've no experience. And I still am grumpy about Samsung selling off their HDD stuff to Seagate. BTW: in german, we have a saying for Seagate drives: sie geht oder sie geht nicht, basically spoken as sea gate odr sea gate nicht meaning she works or she won't... But, talk about IBM deathstars, and WD has also a record, basically, all HDDs are much alike nowadays, and have been for years. There's always a bad batch somewhere ... Seagate: Seagate drives (???), ex-Maxtor drives, ex-Samsung drives WD: bought Hitachi GST (formerly IBM), i.e. WD + ex-IBM/HGST drives Toshiba: new player, no warranty for bulk drives, unknown for the desktop I miss the days when you still had a real choice (WD, Seagate, Maxtor, Samsung, IBM, and smaller/specialized stuff (Excelstor, Toshiba for Laptop drives))... If Seagate would at least label their former Samsung drives in a recognizable manner (say, ST*DS* vs. 
ST*DM* or keep the Samsung label or whatever), I'd be a happy bunny, but as of now, that failing ST3000DM001 was the last Seagate I've bought for quite some time. Oh, and I've got a second ST3000DM001 used externally, with stuff that can get lost, but it'd be inconvenient. And yes, I'm planning on ordering a replacement ASAP (another WD40EFRX). Just in case. And the steady state of my discs is full anyway ... # dfall -t ext3 -t ext4 -h [..] 14.6T13.2T 996.4G 90% and that's just because item one: /dev/sda is taken up by a 128G Samsung 830 SSD, and item two: I just swapped in that 4TB WD for the failing Seagate 3TB. (and yes, I use ext{3,4} exclusively on disk), there's another ~10 TiB in the fileserver on 11 disks and a few naked drives (and a docking station (Sharkoon QuickDeck)) and an external drive ;) I ordered a drive. It should be here tomorrow. In the meantime, I shutdown and re-seated all the cables, power too. I got the test running again but results is a few hours off yet. It did pass the short test tho. I'm not sure that it means much. Good. Do not use dd, it WILL fail at the first error. Use gnu ddrescue or dd_rescue to grab an image. I used mc to copy via filesystem, eg. 'rsync -auxlPRAXSHD /foo/ /bar/' is fine too. Oh, and I hope you didn't buy a Seagate again ;) BTW: *GRR* for you buying a ST3000DM001 again I plan to rsync or cp the data over. Good plan. The dd part will come into play after I am sure I got everything off that I can get and am just erasing the drive completely. I plan to dd the drive then run the tests again just to see what it is doing. Heck, maybe it will reallocate that area like it should be doing already, I guess. Reallocation happens on writes ... Look for --write-sector in 'man hdparm'. I've been doing that smartctl -t .. / hdparm --write-sector ... stuff for a bunch of sectors but got tired of that game. And you having 104 pending sectors? *gah* That'll get tedious. 
Probably using 'ddrescue /dev/zero /dev/sdX sdX.log' would be easier. That way, after you got whatever data you can rescue from that drive, you can clear the drive too before sending it in for a warranty replacement (if still applicable). Time will tell. I'll be having fun tomorrow tho. ;-) Do have fun, after you got your data off that drive :) -dnh -- Is there a book about the moderate use of footnotes? If so, I am prepared to send you a copy. [Thorsten Haude to David Haller in sl-etikette]
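The smartctl / hdparm --write-sector routine David describes can be sketched as a dry run that only prints the commands for one suspect LBA instead of executing them. The function name sector_rewrite_cmds is hypothetical. The hdparm options are the ones documented in 'man hdparm', and --write-sector destroys whatever is stored in that sector, so this belongs after the data has been copied off.

```shell
# Hypothetical dry-run helper: print (do not execute) the hdparm commands
# to test-read and then overwrite a single sector.
#   $1 = device, e.g. /dev/sdX    $2 = LBA from the self-test log
sector_rewrite_cmds() {
    dev=$1
    lba=$2
    # A read that fails with an I/O error confirms the sector is bad.
    echo "hdparm --read-sector $lba $dev"
    # Overwriting gives the firmware the chance to rewrite or relocate it.
    echo "hdparm --write-sector $lba --yes-i-know-what-i-am-doing $dev"
}

# Usage: review the output before running anything by hand.
#   sector_rewrite_cmds /dev/sdX 2905482560
```

For the whole-drive wipe with ddrescue, note that GNU ddrescue normally refuses to write to an existing device unless --force is given. That is an assumption about the ddrescue version in use, so check 'man ddrescue' first.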
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 22:54:35 -0500, Dale wrote: Curious. I hope I don't start a flame war here. I have had WD, Seagate and I think there is a Samsung here somewhere, may be the one that is rolling over on its back now. The one drive that failed a few years ago was a WD drive. That said, all the other WD drives I have had just got to small to really use, and slow when SATA came out. I'm partial to WD and Seagate still since I got good long term use out of those. Based on your experience, you tend to be of the same opinion? Allan, your situation should involve a lot of hard drives. Any thoughts? Neil, you have a nice big opinion on this? Yes, mix drives from different manufacturers. Or buy them at different times. All manufacturers can have bad batches (remember the IBM Deathstar?). I bought two Seagate drives a couple of years ago, for use in a RAID. The only time I have ignored my own advice on this matter (other matters are way off topic!). After a year they both started showing SMART errors and one of them failed soon after, the other was replaced before it had a chance to fail. Yes, it's anecdotal, but it makes sense - true redundancy means using different sources. -- Neil Bothwick I don't suffer from insanity. I enjoy every minute of it.
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 21:15:54 -0500, Dale wrote: I like this part: Extremely simple MTA to get mail off the system to a Mailhub ^ That part right up there. :-D That may be a new thread, if needed. My first thought was even Dale can't have problems with that. I soon reconsidered... -- Neil Bothwick Voting Democrat or Republican is like choosing a cabin in the Titanic.
Re: [gentoo-user] smartctrl drive error @60%
On 26/06/2014 05:54, Dale wrote: Curious. I hope I don't start a flame war here. I have had WD, Seagate and I think there is a Samsung here somewhere, may be the one that is rolling over on its back now. The one drive that failed a few years ago was a WD drive. That said, all the other WD drives I have had just got to small to really use, and slow when SATA came out. I'm partial to WD and Seagate still since I got good long term use out of those. Based on your experience, you tend to be of the same opinion? Allan, your situation should involve a lot of hard drives. Any thoughts? Neil, you have a nice big opinion on this?

My experiences aren't worth much in this case, what I had to deal with was data center setups where

- the power has never gone off for 6 years
- the drives never spin down and just keep on turning year after year
- the servers were the nice big ones Dell makes with awesome cooling
- the data center feels like a fridge and the ambient temp never varies more than 1 deg
- the server power supplies are seriously high grade, the 5V and 12V out of them are solid and do not fluctuate at all

Add all this up and it's an almost perfect environment for drives to last a long time. You don't have that, not even close.

I have only 1 little bit of anecdotal data: my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it real busy when the kids don't. And those drives just keep on turning and turning and turning, I didn't do anything special. I put it down to statistics - no-one makes bad drives (or cars) these days and I haven't pulled the unlucky card yet.

I dunno, go figure

-- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-user] smartctrl drive error @60%
Neil Bothwick wrote: On Wed, 25 Jun 2014 22:54:35 -0500, Dale wrote: Curious. I hope I don't start a flame war here. I have had WD, Seagate and I think there is a Samsung here somewhere, may be the one that is rolling over on its back now. The one drive that failed a few years ago was a WD drive. That said, all the other WD drives I have had just got to small to really use, and slow when SATA came out. I'm partial to WD and Seagate still since I got good long term use out of those. Based on your experience, you tend to be of the same opinion? Allan, your situation should involve a lot of hard drives. Any thoughts? Neil, you have a nice big opinion on this? Yes, mix drives from different manufacturers. Or buy them at different times. All manufacturers can have bad batches (remember the IBM Deathstar?). I bought two Seagate drives a couple of years ago, for use in a RAID. The only time I have ignored my own advice on this matter (other matters are way off topic!). After a year they both started showing SMART errors and one of them failed soon after, the other was replaced before it had a chance to fail. Yes, it's anecdotal, but it makes sense - true redundancy means using different sources. Yep, it makes good sense. Each batch can have one oddball failure but if a batch has a firmware/hardware fault, the whole batch can die at the same time. One could certainly see the point that having say a WD and a Seagate mirroring each other would be good advice. Having two drives that are only one digit apart on the serial number could very well be a recipe for problems, unless one is really lucky and got two well made drives. Given how things are manufactured nowadays and the compact data on the media, it doesn't take much to make a dud for sure. This sort of reminds me of a old saying. A chain is only as strong as its weakest link. It doesn't take much to make a hard drive either really good or really bad. 
I don't think they aim for really good, just good enough to stay out of the really bad area. ;-) I may have to keep an eye out for a WD drive for the next one. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Neil Bothwick wrote: On Wed, 25 Jun 2014 21:15:54 -0500, Dale wrote: I like this part: Extremely simple MTA to get mail off the system to a Mailhub ^ That part right up there. :-D That may be a new thread, if needed. My first thought was even Dale can't have problems with that. I soon reconsidered... Well, remember me building a init thingy a good while back? Yep, it failed to boot a while back. It sort of started but from what was on the screen, I could tell the init thingy failed. I edited grub to remove the init thingy and booted. Then I removed all the init thingys. I was using the drop dead simple dracut to make it and it still failed. After it failed and I tried to boot without it, I was diggin for my Kubuntu DVD. It booted fine without it tho. Whew!!! I'm thinking of something tho. Btrfs. While I have a new drive with no file system on it, it's a good time to think on switching from LVM. Hmmm. I'm currently on gentoo-sources 3.14. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Alan McKinnon wrote: My experiences aren't worth much in this case, what I had to deal with was data center setups where - the power has never gone off for 6 years - the drives never spin down and just keep on turning year after year - the servers were the nice big ones Dell makes with awesome cooling - the data center feels like a fridge and the ambient temp never varies more than 1 deg - the server power supplies are seriously high grade, the 5V and 12V out of them are solid and do not fluctuate at all Add all this up and it's an almost perfect environment for drives to last a long time. You don't have that, not even close. I have only 1 little bit of anecdotal data: my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it real busy when the kids don't. And those drives just keep on turning and turning and turning, I didn't do anything special. I put it down to statistics - no-one makes bad drives (or cars) these days and I haven't pulled the unlucky card yet. I dunno, go figure Well, it does make good points tho. I keep my room here pretty cool. It's not as cool as your data center but I have a window A/C and my own heater. I don't mind it being a little cool in the winter but don't like it warm in the summer either. The cooler the better. I also have the Cooler Master HAF-932 case with those really nice large fans. The hard drives are right in front of the front intake fan. I have a power supply that is really to big for what I have running. I can't recall the brand and wattage just that it doesn't pull near as much power as I thought it would. It pulls less than half what my older and much slower puter pulled. Also, I rarely shut this thing down. I did the other night to unplug/re-plug all the cables but other than that, it is usually because I have lost power from the mains. So, keep them cool, good clean power and leave them running when ya can. Sounds like a plan. 
;-) Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Thu, Jun 26, 2014 at 7:03 AM, Dale rdalek1...@gmail.com wrote: I'm thinking of something tho. Btrfs. While I have a new drive with no file system on it, it's a good time to think on switching from LVM. Hmmm. I'm currently on gentoo-sources 3.14. I think btrfs is usable, but not without its problems. I don't think I'd run it without a daily backup onto something that doesn't run btrfs. I'm running btrfs in raid1 configuration, and I've only restored from the backup once, though if I didn't have it I probably could have still recovered (ENOSPC issue - the usual solutions weren't working). Earlier this week I was having lockup issues (I tried to re-enable snapper and deleting a few snapshots at once caused it to stop syncing, and then for several days any kind of heavy write activity would cause the issue to repeat, which I suspect was the result of delayed cleanup). Rich
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Thu, Jun 26, 2014 at 7:03 AM, Dale rdalek1...@gmail.com wrote: I'm thinking of something tho. Btrfs. While I have a new drive with no file system on it, it's a good time to think on switching from LVM. Hmmm. I'm currently on gentoo-sources 3.14. I think btrfs is usable, but not without its problems. I don't think I'd run it without a daily backup onto something that doesn't run btrfs. I'm running btrfs in raid1 configuration, and I've only restored from the backup once, though if I didn't have it I probably could have still recovered (ENOSPC issue - the usual solutions weren't working). Earlier this week I was having lockup issues (I tried to re-enable snapper and deleting a few snapshots at once caused it to stop syncing, and then for several days any kind of heavy write activity would cause the issue to repeat, which I suspect was the result of delayed cleanup). Rich Well, I really don't have enough time to read up and get to know the commands and such either. I guess I better stick with LVM, for now at least. When I get another drive, I could always switch then and maybe it will be more stable then as well. Maybe. Thanks for sharing the info. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On 26/06/2014 13:20, Dale wrote: Alan McKinnon wrote: My experiences aren't worth much in this case, what I had to deal with was data center setups where - the power has never gone off for 6 years - the drives never spin down and just keep on turning year after year - the servers were the nice big ones Dell makes with awesome cooling - the data center feels like a fridge and the ambient temp never varies more than 1 deg - the server power supplies are seriously high grade, the 5V and 12V out of them are solid and do not fluctuate at all Add all this up and it's an almost perfect environment for drives to last a long time. You don't have that, not even close. I have only 1 little bit of anecdotal data: my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it real busy when the kids don't. And those drives just keep on turning and turning and turning, I didn't do anything special. I put it down to statistics - no-one makes bad drives (or cars) these days and I haven't pulled the unlucky card yet. I dunno, go figure Well, it does make good points tho. I keep my room here pretty cool. It's not as cool as your data center but I have a window A/C and my own heater. I don't mind it being a little cool in the winter but don't like it warm in the summer either. The cooler the better. I also have the Cooler Master HAF-932 case with those really nice large fans. The hard drives are right in front of the front intake fan. I have a power supply that is really to big for what I have running. I can't recall the brand and wattage just that it doesn't pull near as much power as I thought it would. It pulls less than half what my older and much slower puter pulled. Also, I rarely shut this thing down. I did the other night to unplug/re-plug all the cables but other than that, it is usually because I have lost power from the mains. 
So, keep them cool, good clean power and leave them running when ya can. Sounds like a plan. ;-)

You got it :-) hard drives are mechanical objects, not electronic ones, and they fail for mechanical reasons. Motors fail, bearings seize, spindle arms wear out. Transforming magnetic blobs on the platter into binary bits is very reliable, as long as the head is in exactly the place it is supposed to be.

So the enemies of disks are environmental:

- temperature and humidity changes
- frequent spin ups and spin downs
- dust
- power dips/fluctuations and brown-outs
- being dropped, knocked and generally abused

etc, etc, etc

Take care of the environmental factors, and statistics fall in your favour making the odds good you'll get the life you expect

-- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-user] smartctrl drive error @60%
Alan McKinnon wrote: On 26/06/2014 13:20, Dale wrote: Alan McKinnon wrote: My experiences aren't worth much in this case, what I had to deal with was data center setups where - the power has never gone off for 6 years - the drives never spin down and just keep on turning year after year - the servers were the nice big ones Dell makes with awesome cooling - the data center feels like a fridge and the ambient temp never varies more than 1 deg - the server power supplies are seriously high grade, the 5V and 12V out of them are solid and do not fluctuate at all Add all this up and it's an almost perfect environment for drives to last a long time. You don't have that, not even close. I have only 1 little bit of anecdotal data: my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it real busy when the kids don't. And those drives just keep on turning and turning and turning, I didn't do anything special. I put it down to statistics - no-one makes bad drives (or cars) these days and I haven't pulled the unlucky card yet. I dunno, go figure Well, it does make good points tho. I keep my room here pretty cool. It's not as cool as your data center but I have a window A/C and my own heater. I don't mind it being a little cool in the winter but don't like it warm in the summer either. The cooler the better. I also have the Cooler Master HAF-932 case with those really nice large fans. The hard drives are right in front of the front intake fan. I have a power supply that is really to big for what I have running. I can't recall the brand and wattage just that it doesn't pull near as much power as I thought it would. It pulls less than half what my older and much slower puter pulled. Also, I rarely shut this thing down. I did the other night to unplug/re-plug all the cables but other than that, it is usually because I have lost power from the mains. 
So, keep them cool, good clean power and leave them running when ya can. Sounds like a plan. ;-) You got it :-) hard drives are mechanical objects, not electronic ones, and they fail for mechanical reasons. Motors fail, bearings seize, spindle arms wear out. Transforming magnetic blobs on the platter into binary bits is very reliable, as long as the head is in exactly the place it is supposed to be. So the enemies of disks are environmental; - temperature and humidity changes - frequent spin ups and spin downs - dust - power dips/fluctuations and brown-outs - being dropped, knocked and generally ubused etc, etc, etc Take care of the environmental factors, and statistics fall in your favour making the odds good you'll get the life you expect I think that is one reason I have had some pretty good luck with that. I might also add, I have actually only had one computer that failed. That includes the ones that folks just gave me which is quite a few. Most of them just get to slow to use. The ones I build, I build them like a tank. I put coolers on everything that is even a little warm. My CPU cooler on my current rig is pretty large. Case fans blowing a lot of air, quiet if possible. For this drive that I have going out now to go out, it has to have a issue not related to cooling and such. Unless it was somehow handled badly while being shipped to me, its never been dropped or anything either. This is a desktop, with wheels since it is on carpet, and it rarely goes anywhere. It doesn't get rattled around like a laptop or something. My old rig, AMD 2500+ in a old full tower case still runs good. I booted it a month or so ago. I had a Volcano 11 or 12 on the CPU which is solid copper. I replaced the northbridge cooler with a copper cooler with a fan. The mosfets close to the CPU, I added coolers to them too. It had 5 case fans. It wasn't quiet but it ran cool. The mobo temps were usually just a couple degrees above room temp. CPU never got over 100F. 
Heck, the CPU in my current rig has never seen 110F. The highest I have ever seen was 107F and that was when I was compiling and had power to blink just enough to cut off my A/C for a hour or so. Maybe I need a UPS for my A/C too. :-D It seems the best thing WE can do, good power, good cooling, don't drop it and keep backups.

I went back through the error logs and found this:

Jun 12 23:30:36 localhost smartd[2688]: Device: /dev/sdc [SAT], 104 Currently unreadable (pending) sectors
Jun 12 23:30:36 localhost smartd[2688]: Device: /dev/sdc [SAT], 104 Offline uncorrectable sectors

That's the first error I could find. It went from nothing to that in one huge jump. I also found this:

Jun 8 03:10:02 localhost sSMTP[7164]: Unable to locate mail
Jun 8 03:10:02 localhost sSMTP[7164]: Cannot open mail:25
Jun 8 03:10:03 localhost CROND[7145]: (root) MAIL (mailed 57 bytes of output but got status 0x0001)

It seems it is trying to mail something. I need
Re: [gentoo-user] smartctrl drive error @60%
Mick wrote: On Thursday 26 Jun 2014 06:56:26 Mick wrote: Sort out access rights to 0604 Oops! Potentially dangerous typo! Should be: 0640 of course.

Picking a short one to reply to. Holy sheep. It worked. I lost my jaw yesterday I think it was. I'm not sure what I am going to be missing now. :-D Neil and Allan will be so impressed. LOL

OK. So, what will send me a message now? Do I need to tell it to send me something, say from smart stuff, or does it just know to do it? I found this in messages:

Jun 8 03:10:02 localhost sSMTP[7164]: Unable to locate mail
Jun 8 03:10:02 localhost sSMTP[7164]: Cannot open mail:25
Jun 8 03:10:03 localhost CROND[7145]: (root) MAIL (mailed 57 bytes of output but got status 0x0001)

It seems that something was trying to email me. Crond maybe? It was at the top of the file right after logrotate did its thing. Could that be it?

Thanks much. Going to look for missing body parts now. Hmmm, maybe I will find a better brain. ;-)

Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Thu, 26 Jun 2014 09:14:39 -0500, Dale wrote:
Holy sheep. It worked. I lost my jaw yesterday I think it was. I'm not sure what I am going to be missing now. :-D Neil and Alan will be so impressed. LOL OK. So, what will send me a message now? Do I need to tell it to send me something, say from smart stuff, or does it just know to do it?

You tell cron where to mail reports by setting MAILTO=you@wherever at the top of /etc/crontab. It will then mail you every time a cronjob produces output.

-- Neil Bothwick
... I just forgot to increment the counter, Tom said, nonplussed.
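As a rough illustration of the MAILTO suggestion, the sketch below checks whether a crontab-style file already sets MAILTO. The helper name cron_mailto is made up for this example; only MAILTO and /etc/crontab come from the message itself.

```shell
# Hypothetical helper: print the first MAILTO value found in a crontab-style file.
# An /etc/crontab that mails root would contain a line such as:
#   MAILTO=root
cron_mailto() {
    # Keep only the text after "MAILTO=", take the first match, drop quotes.
    sed -n 's/^[[:space:]]*MAILTO=//p' "$1" | head -n 1 | tr -d "\"'"
}
```

Running `cron_mailto /etc/crontab` prints the address cron would mail to, or nothing if MAILTO is not set in that file.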
Re: [gentoo-user] smartctrl drive error @60%
Neil Bothwick wrote:
On Thu, 26 Jun 2014 09:14:39 -0500, Dale wrote:
Holy sheep. It worked. I lost my jaw yesterday I think it was. I'm not sure what I am going to be missing now. :-D Neil and Alan will be so impressed. LOL OK. So, what will send me a message now? Do I need to tell it to send me something, say from smart stuff, or does it just know to do it?

You tell cron where to mail reports by setting MAILTO=you@wherever at the top of /etc/crontab. It will then mail you every time a cronjob produces output.

Well, right now I am trying to get smartmon to send them since that is pretty important and I have an error at the moment to really test it. I think I am having some success here. I been googling and trying to sort this thing out. So far, I have this:

/dev/sda -a -d sat -o on -S on -s (S/../.././02|L/../../6/12) -m root
/dev/sdb -a -d sat -o on -S on -s (S/../.././02|L/../../6/12) -m root
/dev/sdc -a -d sat -o on -S on -s (S/../.././02|L/../../6/12) -m root

I changed it to testing during the day since I am usually sleeping then and up at night. Anyway, when I restart smartd, I get an email about the error. Yeppie

What I would like tho, is for it to just know to do this for whatever drives are added/removed without me having to change the config. I bet there is a way to do that.

Now, let me go check into that cron thing. Is there anything else I need to set up to send emails like this? Smartd is working, about to work on cron, so what else usually tries to send an email? I have got to go find those body parts I am losing.

Thanks much.

Dale :-) :-)
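On the question of not listing every drive by hand: smartd.conf has a DEVICESCAN directive that applies one set of directives to whatever devices smartd finds when it scans. The sketch below is only an illustration; the helper name to_devicescan is invented here, and it simply checks that the existing per-device lines are identical before printing a single DEVICESCAN line. As far as I understand the man page, a `-d sat` after DEVICESCAN limits the scan to that device type, so check smartd.conf(5) for your version before relying on it.

```shell
# Hypothetical helper: collapse identical per-device smartd.conf entries
# into one DEVICESCAN line. Prints the line, or fails if the entries differ.
to_devicescan() {
    # Keep lines whose first field is a /dev/ path, drop that field,
    # and reduce what is left to the unique set of directive strings.
    directives=$(awk '$1 ~ /^\/dev\// { $1 = ""; sub(/^[ \t]+/, ""); print }' "$1" | sort -u)
    # Count the non-empty directive strings that remain.
    count=$(printf '%s\n' "$directives" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -eq 1 ]; then
        printf 'DEVICESCAN %s\n' "$directives"
    else
        echo "device entries differ (or none found); keep per-device lines" >&2
        return 1
    fi
}
```

With the three lines from the message in a file, `to_devicescan` on that file prints one DEVICESCAN line carrying the same directives. It only prints; it does not edit /etc/smartd.conf.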
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote:
And do check on your warranty. You can migrate all your data to the new drive, and then replace the old one as a backup disk. Either use it with raid, or as an offline backup. If you want to do raid you can set up mdadm with a degraded raid1 so that you can copy your data over from your old drive, and then when it is replaced you just partition the new one, add it to the raid, and watch it rebuild automatically. Rich

I thought I was never going to find that thing. It is pretty well hidden. Anyway:

Serial Number   Seagate Part Number   Warranty Status
Z1F0PKT5        9YN166-302            Out of Warranty

So, I guess I need to check newegg and see if I can show an invoice that it is less than two years old. I'm not sure that it is tho.

Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On 06/26/2014 05:57 AM, Alan McKinnon wrote:
Take care of the environmental factors, and statistics fall in your favour making the odds good you'll get the life you expect

Yep, but sometimes crap just fails for no reason whatsoever. As an example, my old house had central A/C and never went above 23.5 C in the summer and was typically 19 C in winter. All machines were on their own UPS boxes. Oddly enough, most failures I've seen were from drives that were on 24x7 and, for whatever reason, I had to shut them down and they would not power up again. I still lost quite a few drives there, regardless.

Computer hardware fails. I just say shit happens and move on. ;-) Oh, and I make sure critical stuff is in more than two places. What I consider critical is likely less than 5% of my total storage.

Dan
Re: [gentoo-user] smartctrl drive error @60%
On 06/25/2014 08:54 PM, Dale wrote:
Curious. I hope I don't start a flame war here. I have had WD, Seagate and I think there is a Samsung here somewhere, may be the one that is rolling over on its back now. The one drive that failed a few years ago was a WD drive. That said, all the other WD drives I have had just got too small to really use, and slow when SATA came out. I'm partial to WD and Seagate still since I got good long term use out of those. Based on your experience, do you tend to be of the same opinion?

Flame war on a mailing list? Nah, will never happen. :-) We have over 100 workstations at work and so far the WD Blue drives fail the most, Seagate second, Samsung third. At home once I get a bad run of one manufacturer I usually get ticked off and find another to switch to. I originally used WD, then Maxtor, then Seagate, then Samsung, then WD again... where I am now.

Dan
Re: [gentoo-user] smartctrl drive error @60%
On 25 June 2014 07:05:03 CEST, Dale rdalek1...@gmail.com wrote:
J. Roeleveld wrote:
On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote:
Howdy, I run this test every once in a while. How bad is this:

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        60%            16365  2905482560
# 2  Extended offline    Completed: read failure        60%            16352  2905482560
# 3  Extended offline    Completed without error        00%             8044  -
# 4  Extended offline    Completed without error        00%             3121  -

And better yet, is there any way to tell it to not use that part and finish the test? It seems it stopped when it got to that, or I think it did. Thoughts? Dale :-) :-)

Dale, Not sure how to get it to go past. Think that is in the firmware of the disk. I would start with making a backup first. -- Joost

That's a 3TB drive. I don't have anything big enough to back it up to. Is there any way to find out if this error is really serious or just a run of the mill type error? I would think that if it was a run of the mill error the drive would handle the error itself and I wouldn't even see it. Something like marking the area as bad and just not trying to use it anymore, even for the test. Thanks. Any advice is appreciated. I need a hard drive guru. ;-) Here is additional info:

root@fireball / # hdparm -i /dev/sdc

/dev/sdc:

 Model=ST3000DM001-9YN166, FwRev=CC4C, SerialNo=Z1F0PKT5
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=5860533168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: unknown: ATA/ATAPI-4,5,6,7

 * signifies the current active mode

root@fireball / #

Dale :-) :-)

There are some options with smartctl you could try to force the drive to swap that bad sector with a spare one. A full disk read could also force that. Eg. Try ' dd if=/dev/sdc of=/dev/null '. But, I usually order a replacement when Smart tests start throwing errors. I know 3TB is a lot for you to have to backup, but it's also a lot of data to lose...

-- Joost
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
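To put the failing LBA in context, here is a small sketch of the arithmetic. The numbers come from the thread (LBA 2905482560 from the self-test log, LBAsects=5860533168 from hdparm, and the 512-byte logical / 4096-byte physical sector sizes that smartctl reports for this drive elsewhere in the thread). The function names are made up, and dd_probe_cmd only prints a command rather than running it. It assumes GNU dd for iflag=direct. A read like this only shows whether the sector is still unreadable; it is not claimed to fix anything.

```shell
# Byte offset of a logical block address (512-byte logical sectors by default).
lba_byte_offset() {   # usage: lba_byte_offset LBA [LOGICAL_SECTOR_BYTES]
    echo $(( $1 * ${2:-512} ))
}

# How far into the drive an LBA sits, as an integer percent (rounded down).
lba_percent() {       # usage: lba_percent LBA TOTAL_LBAS
    echo $(( $1 * 100 / $2 ))
}

# Print (do not run) a dd command that reads only the 4096-byte physical
# sector containing LBA. Assumes 8 logical sectors per physical sector.
dd_probe_cmd() {      # usage: dd_probe_cmd DEVICE LBA
    start=$(( $2 / 8 * 8 ))   # round down to the physical-sector boundary
    echo "dd if=$1 of=/dev/null bs=512 skip=$start count=8 iflag=direct"
}
```

For this drive, `lba_byte_offset 2905482560` gives 1487607070720 and `lba_percent 2905482560 5860533168` gives 49, so the first bad sector sits roughly halfway through the disk.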
Re: [gentoo-user] smartctrl drive error @60%
On Tue, 24 Jun 2014 18:09:03 -0500, Dale rdalek1...@gmail.com wrote:
Howdy, I run this test every once in a while. How bad is this:

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        60%            16365  2905482560
# 2  Extended offline    Completed: read failure        60%            16352  2905482560
# 3  Extended offline    Completed without error        00%             8044  -
# 4  Extended offline    Completed without error        00%             3121  -

And better yet, is there any way to tell it to not use that part and finish the test? It seems it stopped when it got to that, or I think it did. Thoughts?

I have no idea, really, but I had a similar situation that was caused by a loose SATA connection. In my case the drive stopped working first, then after checking the SATA connection, it was detected again, but didn't work correctly, including failing its SMART extended tests at a specific sector. Then, after fiddling with the connection several weeks later, it started working flawlessly again. I plan on buying SATA cables with clips, now :-/ (though I have to check if that will work with my mainboard first).

Otherwise, to reiterate what J. Roeleveld wrote: backups, backups, backups ;) .

Dale :-) :-)

HTH
-- Marc Joliet
-- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: [gentoo-user] smartctrl drive error @60%
On 06/25/2014 06:05 AM, Dale wrote:
J. Roeleveld wrote:
On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote:
Howdy, I run this test every once in a while. How bad is this:

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        60%            16365  2905482560
# 2  Extended offline    Completed: read failure        60%            16352  2905482560
# 3  Extended offline    Completed without error        00%             8044  -
# 4  Extended offline    Completed without error        00%             3121  -

this is pretty bad. enough to really go and get a replacement asap, and turn that disk off if you can. the self test stops at the first error it comes to, and in this case it is LBA#2905482560. for calculation of where the error is, check out the smartmontools [1] site, which will help you to mark the block bad, though the data that was in that block is probably lost forever.

i'd also suggest you run

# smartctl -a /dev/sdc

and paste the results here. the crucial rows are 196/197, the reallocated sector counts and pending sector counts. they show how many blocks have been reallocated, and also how many are pending. this will give you a scaling factor: at the moment you are in trouble; if these figures are very high you are in very high trouble, if they are low you are in low trouble.

[1] http://smartmontools.sourceforge.net/badblockhowto.html
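For picking those rows out of the attribute table, a minimal sketch follows. The helper name smart_raw is invented here, and it assumes the usual ten-column layout of `smartctl -A` output with RAW_VALUE in the tenth field. Note that on the drive in this thread the reallocated count is listed as attribute 5 (Reallocated_Sector_Ct); pending sectors are commonly reported as attribute 197, but attribute numbering varies by vendor, so treat the IDs as something to check against your own output.

```shell
# Hypothetical helper: print the RAW_VALUE column for one attribute ID,
# reading "smartctl -A" style output on stdin.
smart_raw() {   # usage: smartctl -A /dev/sdX | smart_raw ID
    # Match rows whose first field is the ID and whose second field looks
    # like an attribute name; the raw value starts in the tenth field.
    awk -v id="$1" '$1 == id && $2 ~ /^[A-Za-z]/ { print $10 }'
}
```

For example, `smartctl -A /dev/sdc | smart_raw 5` would print the raw reallocated sector count, and the same call with 197 would print the pending count on drives that report it under that ID.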
Re: [gentoo-user] smartctrl drive error @60%
J. Roeleveld wrote:
There are some options with smartctl you could try to force the drive to swap that bad sector with a spare one. A full disk read could also force that. Eg. Try ' dd if=/dev/sdc of=/dev/null '. But, I usually order a replacement when Smart tests start throwing errors. I know 3TB is a lot for you to have to backup, but it's also a lot of data to lose... -- Joost

I just don't have anything to put the data on. I been saying I was going to get me a backup drive but hadn't yet. Looks like I better order one unless someone pops on and says this is normal and OK, sort of doubting that will happen tho.

Dale :-) :-)
-- I am only responsible for what I said ... Not for what you understood or how you interpreted my words!
Re: [gentoo-user] smartctrl drive error @60%
thegeezer wrote:
this is pretty bad. enough to really go and get a replacement asap, and turn that disk off if you can. the self test stops at the first error it comes to, and in this case it is LBA#2905482560. for calculation of where the error is, check out the smartmontools [1] site, which will help you to mark the block bad, though the data that was in that block is probably lost forever. i'd also suggest you run # smartctl -a /dev/sdc and paste the results here. the crucial rows are 196/197, the reallocated sector counts and pending sector counts. they show how many blocks have been reallocated, and also how many are pending. this will give you a scaling factor: at the moment you are in trouble; if these figures are very high you are in very high trouble, if they are low you are in low trouble.
[1] http://smartmontools.sourceforge.net/badblockhowto.html

Here is the output:

root@fireball / # smartctl -a /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-9YN166
Serial Number:    Z1F0PKT5
LU WWN Device Id: 5 000c50 04d79e15c
Firmware Version: CC4C
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Jun 25 02:46:39 2014 CDT

==> WARNING: A firmware update for this drive is available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00)
    Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 118)
    The previous self-test completed having the read element of the test failed.
Total time to complete Offline data collection: ( 584) seconds.
Offline data collection capabilities: (0x73)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    No Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 340) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
SCT capabilities: (0x3085)
    SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       234421760
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 00:05:03 -0500, Dale wrote:
That's a 3TB drive. I don't have anything big enough to back it up to.

Then either your data is not important to you or you need to get another drive ASAP. Meanwhile, you could start backing up the most important data.

Is there any way to find out if this error is really serious or just a run of the mill type error?

Whenever I have seen this behaviour, it was soon followed by total drive failure, even to the point that the computer would not boot with that drive connected.

-- Neil Bothwick
WinErr 003: Dynamic linking error - Your mistake is now in every file
Re: [gentoo-user] smartctrl drive error @60%
Neil Bothwick wrote:
On Wed, 25 Jun 2014 00:05:03 -0500, Dale wrote:
That's a 3TB drive. I don't have anything big enough to back it up to.

Then either your data is not important to you or you need to get another drive ASAP. Meanwhile, you could start backing up the most important data.

I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844

Is there any way to find out if this error is really serious or just a run of the mill type error?

Whenever I have seen this behaviour, it was soon followed by total drive failure, even to the point that the computer would not boot with that drive connected.

Well, I did blow the dust out a month or so ago so I thought I would remove the sides and re-seat all the cables. I've got the long test running now but it passed the SHORT test. I'm hoping it will fix this issue, just hoping. That is a good deal on that new drive tho. May get it anyway.

Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On 06/25/2014 08:49 AM, Dale wrote: thegeezer wrote: this is pretty bad. Here is the output: root@fireball / # smartctl -a /dev/sdc smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build) Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Seagate Barracuda 7200.14 (AF) Device Model: ST3000DM001-9YN166 Serial Number:Z1F0PKT5 LU WWN Device Id: 5 000c50 04d79e15c Firmware Version: CC4C User Capacity:3,000,592,982,016 bytes [3.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate:7200 rpm Device is:In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is:Wed Jun 25 02:46:39 2014 CDT == WARNING: A firmware update for this drive is available, see the following Seagate web pages: http://knowledge.seagate.com/articles/en_US/FAQ/207931en http://knowledge.seagate.com/articles/en_US/FAQ/223651en interesting - not seen that before might be worth a nose SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 118) The previous self-test completed having the read element of the test failed. Total time to complete Offline data collection:( 584) seconds. Offline data collection capabilities:(0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. 
Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time:( 1) minutes. Extended self-test routine recommended polling time:( 340) minutes. Conveyance self-test routine recommended polling time:( 2) minutes. SCT capabilities: (0x3085) SCT Status supported. SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 119 099 006Pre-fail Always - 234421760 you can happily ignore this error rate, it is usual for it to be high and htere is hardware correction for it 3 Spin_Up_Time0x0003 092 092 000Pre-fail Always - 0 4 Start_Stop_Count0x0032 100 100 020Old_age Always - 33 33 power cycles seem very low but further down we see the power on time is just under two years which is also erring towards the lighter side of the mtbf 5 Reallocated_Sector_Ct 0x0033 100 100 036Pre-fail Always - 0 zero reallocated sectors suggests there is space to do reallocation 7 Seek_Error_Rate 0x000f 079 060 030Pre-fail Always - 99909120 9 Power_On_Hours 0x0032 082 082 000Old_age Always - 16379 almost two years of power on time 10 Spin_Retry_Count0x0013 100 100 097Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020Old_age Always - 34 183 Runtime_Bad_Block 0x0032 100 100 000Old_age Always - 0 184 End-to-End_Error0x0032 100 100 099Old_age Always - 0 187 Reported_Uncorrect 0x0032 100 100 000Old_age Always - 0 188 Command_Timeout 0x0032 100 100 000Old_age Always - 0 0 0 189 High_Fly_Writes 0x003a 100 100 000Old_age Always - 0 190 Airflow_Temperature_Cel 0x0022 069 063 045Old_age Always - 31 (Min/Max 26/33) 191 G-Sense_Error_Rate 0x0032 100 100
Re: [gentoo-user] smartctrl drive error @60%
On 06/25/2014 11:05 AM, Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 slightly offtopic - i notice that the drive has a 2year limited warranty has anyone managed to get anything from hard drive warranties ?
Re: [gentoo-user] smartctrl drive error @60%
On 06/25/2014 12:55:08 PM, thegeezer wrote: On 06/25/2014 11:05 AM, Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 slightly offtopic - i notice that the drive has a 2year limited warranty has anyone managed to get anything from hard drive warranties ? I always buy enterprise editions which have a warranty of 5 years. I had several drives which got replaced after 3-5 years. Furthermore, I have the feeling that enterprise editions have been tested more strictly. I know they are much more expensive but I even take these for my private machine. Helmut.
Re: [gentoo-user] smartctrl drive error @60%
Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 Drive is ordered. Be here tomorrow. Yay Newegg. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 6:55 AM, thegeezer thegee...@thegeezer.net wrote: On 06/25/2014 11:05 AM, Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 slightly offtopic - i notice that the drive has a 2year limited warranty has anyone managed to get anything from hard drive warranties ? Yes. Most manufacturers have a hard drive warranty tool online. Just give it your serial number and it will tell you if you're eligible, and how to go about it. I know Seagate wants you to run their own testing util (which just does a SMART test and spits out a validation code which you write down). I've gotten the same sorts of errors several times now on my RAID and when it happens I just go through the warranty process, select advance replacement, swap out the drive, then return the old drive in their packaging. Typically costs me $10 for HD replacement (I have to pay return shipping only). Typically drives tend to die for me about a year after I buy them - alarmingly often, actually. Anybody who doesn't run smartmon or its equivalent is insane, as is anybody who doesn't at least run RAID, though anything valuable should be backed up. Rich
Re: [gentoo-user] smartctrl drive error @60%
thegeezer wrote: On 06/25/2014 08:49 AM, Dale wrote: thegeezer wrote: this is pretty bad. Here is the output: root@fireball / # smartctl -a /dev/sdc smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build) Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Seagate Barracuda 7200.14 (AF) Device Model: ST3000DM001-9YN166 Serial Number:Z1F0PKT5 LU WWN Device Id: 5 000c50 04d79e15c Firmware Version: CC4C User Capacity:3,000,592,982,016 bytes [3.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate:7200 rpm Device is:In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is:Wed Jun 25 02:46:39 2014 CDT == WARNING: A firmware update for this drive is available, see the following Seagate web pages: http://knowledge.seagate.com/articles/en_US/FAQ/207931en http://knowledge.seagate.com/articles/en_US/FAQ/223651en interesting - not seen that before might be worth a nose I was thinking the same thing myself. How does it know there is a update was another question I had. SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 118) The previous self-test completed having the read element of the test failed. Total time to complete Offline data collection:( 584) seconds. Offline data collection capabilities:(0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. 
SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time:( 1) minutes. Extended self-test routine recommended polling time:( 340) minutes. Conveyance self-test routine recommended polling time:( 2) minutes. SCT capabilities: (0x3085) SCT Status supported. SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 119 099 006Pre-fail Always - 234421760 you can happily ignore this error rate, it is usual for it to be high and htere is hardware correction for it 3 Spin_Up_Time0x0003 092 092 000Pre-fail Always - 0 4 Start_Stop_Count0x0032 100 100 020Old_age Always - 33 33 power cycles seem very low but further down we see the power on time is just under two years which is also erring towards the lighter side of the mtbf About the only time I shutdown is when the power fails. My puter only pulls about 150 watts so I just leave it running 24/7. 5 Reallocated_Sector_Ct 0x0033 100 100 036Pre-fail Always - 0 zero reallocated sectors suggests there is space to do reallocation 7 Seek_Error_Rate 0x000f 079 060 030Pre-fail Always - 99909120 9 Power_On_Hours 0x0032 082 082 000Old_age Always - 16379 almost two years of power on time 10 Spin_Retry_Count0x0013 100 100 097Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020Old_age Always - 34 183 Runtime_Bad_Block 0x0032 100 100 000Old_age Always - 0 184 End-to-End_Error0x0032 100 100 099Old_age Always - 0 187 Reported_Uncorrect 0x0032 100 100 000Old_age Always - 0 188 Command_Timeout 0x0032 100 100 000Old_age Always - 0
Re: [gentoo-user] smartctrl drive error @60%
Helmut Jarausch jarau...@igpm.rwth-aachen.de wrote: On 06/25/2014 12:55:08 PM, thegeezer wrote: On 06/25/2014 11:05 AM, Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 slightly offtopic - i notice that the drive has a 2year limited warranty has anyone managed to get anything from hard drive warranties ? I always buy enterprise editions which have a warranty of 5 years. I had several drives which got replaced after 3-5 years. Furthermore, I have the feeling that enterprise editions have been tested more strictly. I know they are much more expensive but I even take these for my private machine. And lately the warranty is just one year. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 9:15 AM, Dale rdalek1...@gmail.com wrote:
thegeezer wrote:
On 06/25/2014 08:49 AM, Dale wrote:
thegeezer wrote:
this says there are 104 pending sectors i.e. bad blocks on the drive that have not been reallocated yet

Wonder why it hasn't? Isn't it supposed to do that sort of thing itself?

It can't relocate the sectors until it successfully reads them, or until something else writes over them. However, the last few drives I've had this happen to never really relocated things. If I scrubbed the drives mdadm would overwrite the unreadable sectors, which should trigger a relocation, but then a day or two later the errors would show up again. So, the drive firmware must be avoiding relocation or something. Either that or there is a large region of the drive that is failing (which would make sense) and I was just playing whack-a-mole with the bad sectors.

In any case, if the drive is under warranty I've yet to have a complaint returning it with a copy of the smartctl output showing the failed test/etc. With advance replacement I can keep the old drive until the new one arrives.

I usually just run the test manually but I sort of had family stuff going on for the past year, almost a year anyway. Sort of behind on things although I have been doing my normal updates.

rc-update add smartd default

I don't know that I even had to configure it - it is set to email root@localhost when there is a problem. I also run mdadm to monitor raid. I don't think anybody makes a monitor for btrfs, though my boot is mirrored across all my btrfs drives using mdadm so a drive failure should be detected in any case. I need to check up on that, though - I'd like an email if something goes wrong with btrfs storage.

I ordered a drive. It should be here tomorrow. In the meantime, I shutdown and re-seated all the cables, power too. I got the test running again but results is a few hours off yet. It did pass the short test tho. I'm not sure that it means much.

Short test generally doesn't do much - you need the long ones. I'd be shocked if it passed with offline uncorrectable sectors.

And do check on your warranty. You can migrate all your data to the new drive, and then replace the old one as a backup disk. Either use it with raid, or as an offline backup. If you want to do raid you can set up mdadm with a degraded raid1 so that you can copy your data over from your old drive, and then when it is replaced you just partition the new one, add it to the raid, and watch it rebuild automatically.

Rich
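The degraded raid1 idea can be sketched roughly as below. This is an illustration only: the function name and the device names (/dev/md0, /dev/sdd1, /dev/sdc1) are placeholders, not taken from the thread, and the function only prints the commands so nothing touches a disk. The mdadm options shown (--create with the keyword missing for the absent member, then --manage --add) are the standard ones as far as I know, but check mdadm(8) before running anything.

```shell
# Print (do not run) the steps for building a RAID1 on one disk first
# and adding the second member later. Device names are placeholders.
degraded_raid1_plan() {   # usage: degraded_raid1_plan MD_DEVICE FIRST_PART SECOND_PART
    # "missing" stands in for the member that is not there yet.
    echo "mdadm --create $1 --level=1 --raid-devices=2 $2 missing"
    echo "# make a filesystem on $1, copy the data over, swap the old drive"
    # Adding the second partition starts the rebuild.
    echo "mdadm --manage $1 --add $3"
    # /proc/mdstat shows the rebuild progress.
    echo "cat /proc/mdstat"
}
```

Calling `degraded_raid1_plan /dev/md0 /dev/sdd1 /dev/sdc1` lists the steps in order, so they can be reviewed and adjusted before any of them is typed for real.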
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 10:23 AM, Neil Bothwick n...@digimed.co.uk wrote:
On Wed, 25 Jun 2014 08:33:37 -0400, Rich Freeman wrote:
Typically drives tend to die for me about a year after I buy them - alarmingly often, actually.

Do you have a UPS? I used to get similar levels of failure, and not just drives, then I bought a UPS and things got much better. It seems the mains supply here is not as stable as it should be.

I do not, and I can't say I was terribly thrilled with the performance with the last cheap UPS I bought. The price to do it right tends to be moderately high, so it hasn't been a priority. Perhaps I should look into it again.

Rich
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 08:33:37 -0400, Rich Freeman wrote:
Typically drives tend to die for me about a year after I buy them - alarmingly often, actually.

Do you have a UPS? I used to get similar levels of failure, and not just drives, then I bought a UPS and things got much better. It seems the mains supply here is not as stable as it should be.

-- Neil Bothwick
Grow your own dope, plant a politician!
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman ri...@gentoo.org wrote: On Wed, Jun 25, 2014 at 6:55 AM, thegeezer thegee...@thegeezer.net wrote: On 06/25/2014 11:05 AM, Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 slightly offtopic - i notice that the drive has a 2year limited warranty has anyone managed to get anything from hard drive warranties ? Yes. Most manufacturers have a hard drive warranty tool online. Just give it your serial number and it will tell you if you're eligible, and how to go about it. I know Seagate wants you to run their own testing util (which just does a SMART test and spits out a validation code which you write down). I've gotten the same sorts of errors several times now on my RAID and when it happens I just go through the warranty process, select advance replacement, swap out the drive, then return the old drive in their packaging. Typically costs me $10 for HD replacement (I have to pay return shipping only). Typically drives tend to die for me about a year after I buy them - alarmingly often, actually. Anybody who doesn't run smartmon or its equivalent is insane, as is anybody who doesn't at least run RAID, though anything valuable should be backed up. Is it not true that you cannot run raid on consumer drives because of timing errors? -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Wed, Jun 25, 2014 at 10:23 AM, Neil Bothwick n...@digimed.co.uk wrote: On Wed, 25 Jun 2014 08:33:37 -0400, Rich Freeman wrote: Typically drives tend to die for me about a year after I buy them - alarmingly often, actually. Do you have a UPS? I used to get similar levels of failure, and not just drives,then I bought a UPS and things got much better. It seems the mains supply here is not as stable as it should be. I do not, and I can't say I was terribly thrilled with the performance with the last cheap UPS I bought. The price to do it right tends to be moderately high, so it hasn't been a priority. Perhaps I should look into it again. Rich . I have had two CyberPower UPS's and been happy with them. Both still work but had to put in a set of batteries in the older one. Old one runs my TV during those frequent blinks we get here and the new one runs my puter. I usually catch them on sale for a little over $100 here. I want to get two more at some point. One for my Mom's TV and one for my sis-n-law's puter. Out of all the hard drives I have ever had, only one has failed. The smart software gave me enough warning to copy the stuff over. Maybe me having a UPS has helped on that. No way to prove it either way tho. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Wed, Jun 25, 2014 at 9:15 AM, Dale rdalek1...@gmail.com wrote: thegeezer wrote: On 06/25/2014 08:49 AM, Dale wrote: thegeezer wrote: this says there are 104 pending sectors i.e. bad blocks on the drive that have not been reallocatd yet Wonder why it hasn't? Isn't it supposed to do that sort of thing itself? It can't relocate the sectors until it successfully reads them, or until something else writes over them. However, the last few drives I've had this happen to never really relocated things. If I scrubbed the drives mdadm would overwrite the unreadable sectors, which should trigger a relocation, but then a day or two later the errors would show up again. So, the drive firmware must be avoiding relocation or something. Either that or there is a large region of the drive that is failing (which would make sense) and I was just playing whack-a-mole with the bad sectors. In any case, if the drive is under warranty I've yet to have a complaint returning it with a copy of the smartctl output showing the failed test/etc. With advance replacement I can keep the old drive until the new one arrives. I'm going to bet this drive is out of warranty. I'm pretty sure it is over 2 years since I bought it. Once I replace that drive, I'll dd the thing and see what it does then. It'll either break it or give me a fresh start to play with and see how long it lasts. I usually just run the test manually but I sort of had family stuff going on for the past year, almost a year anyway. Sort of behind on things although I have been doing my normal updates. rc-update add smartd default I don't know that I even had to configure it - it is set to email root@localhost when there is a problem. I also run mdadm to monitor raid. I don't think anybody makes a monitor for btrfs, though my boot is mirrored across all my btrfs drives using mdadm so a drive failure should be detected in any case. 
> I need to check up on that, though - I'd like an email if something goes wrong with btrfs storage.

I'm using lvm here. I also don't have a mail server set up which is why I run them manually. I usually do it once a month or so but had some family issues to pop up.

>> I ordered a drive. It should be here tomorrow. In the meantime, I shutdown and re-seated all the cables, power too. I got the test running again but results is a few hours off yet. It did pass the short test tho. I'm not sure that it means much.
>
> Short test generally doesn't do much - you need the long ones. I'd be shocked if it passed with offline uncorrectable sectors. And do check on your warranty. You can migrate all your data to the new drive, and then replace the old one as a backup disk. Either use it with raid, or as an offline backup. If you want to do raid you can set up mdadm with a degraded raid1 so that you can copy your data over from your old drive, and then when it is replaced you just partition the new one, add it to the raid, and watch it rebuild automatically.
>
> Rich

I figured the short test wouldn't say much. I am backing up some of the stuff tho. I do have a 750GB drive that was empty. It won't save it all but it is a start. Test should have been done by now but I guess the copy process is slowing it down. I'm getting this so far:

# 1  Extended offline    Self-test routine in progress 70%     16387         -

dale twiddles his thumbs

Thanks much.

Dale

:-) :-)
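For anyone wanting to poll a running test instead of re-reading the log by hand, a small filter along these lines might do. It is a sketch that assumes the self-test log line has the shape Dale pasted; the function name is made up for illustration. Note that the percentage in that column is the part of the test still remaining, not the part completed.

```sh
#!/bin/sh
# Read `smartctl -l selftest /dev/sdX` output on stdin and print the
# "remaining" percentage of a running self-test. Prints nothing if no
# test is in progress.
selftest_remaining() {
    awk '/Self-test routine in progress/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^[0-9]+%$/) {
                     sub(/%/, "", $i)
                     print $i
                     exit
                 }
         }'
}

# Demonstration with the line from this message; on a live system use:
#   smartctl -l selftest /dev/sdc | selftest_remaining
selftest_remaining <<'EOF'
# 1  Extended offline    Self-test routine in progress 70%     16387         -
EOF
```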
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 10:54:51 -0500, Dale wrote:
> I'm using lvm here. I also don't have a mail server set up which is
> why I run them manually.

Install a simple forwarding MTA like ssmtp to have all mails from cron and friends sent to your ISP mailbox.

--
Neil Bothwick

Beware! The end is... aaarrgh!
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 11:30 AM, cov...@ccs.covici.com wrote: Is it not true that you cannot run raid on consumer drives because of timing errors? Yes, it is not true. :) I've never had issues running RAID on consumer drives. Sure, devices certified for RAID might spend less time trying to recover data which is a bit more optimal, but only in the situation where your drive is actually failing. If my RAID blocks on read for 30 seconds once a year when a drive is about to die I can live with that, assuming mdadm doesn't figure out it should give up sooner than that. Rich
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 11:54 AM, Dale rdalek1...@gmail.com wrote: I'm going to bet this drive is out of warranty. I'm pretty sure it is over 2 years since I bought it. Once I replace that drive, I'll dd the thing and see what it does then. It'll either break it or give me a fresh start to play with and see how long it lasts. Well, finding out for sure is a 30 second process, so up to you whether it is worth the time. smartctl will give you the serial/model number, and you punch that into a website, and it will say whether it is under warranty or not. If you plan to wipe the disk before return, print out the results of smartctl -a first, since wiping will probably clear the pending sectors. But, it is your drive, so do whatever you want with it! :) Rich
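The two values a warranty lookup asks for can be pulled out of the smartctl information section with a short filter. This is a sketch assuming the "Device Model:" and "Serial Number:" labels shown in the smartctl output elsewhere in this thread; the function name is invented for the example.

```sh
#!/bin/sh
# Read `smartctl -i /dev/sdX` output on stdin and print "model serial".
drive_identity() {
    awk -F': *' '
        /^Device Model:/  { model = $2 }
        /^Serial Number:/ { serial = $2 }
        END { print model, serial }'
}

# Before wiping, keep a full copy of the report for the return paperwork:
#   smartctl -a /dev/sdc > smart-before-wipe.txt
# Demonstration with the values posted in this thread:
drive_identity <<'EOF'
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-9YN166
Serial Number:    Z1F0PKT5
EOF
```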
Re: [gentoo-user] smartctrl drive error @60%
On Wednesday 25 Jun 2014 17:09:52 Neil Bothwick wrote:
> On Wed, 25 Jun 2014 10:54:51 -0500, Dale wrote:
>> I'm using lvm here. I also don't have a mail server set up which is
>> why I run them manually.
>
> Install a simple forwarding MTA like ssmtp to have all mails from cron
> and friends sent to your ISP mailbox.

... and when you find out please tell us:

1) What syntax is appropriate to allow the use of mail account passwds which contain not just alphanumeric characters but also symbols like [~@#$]?

2) How can you force it to NOT use the RC4 cipher when it logs into Google Mail to relay messages, but the more secure ECDHE-RSA-AES128-GCM-SHA256 that the server proposes?

--
Regards,
Mick
Re: [gentoo-user] smartctrl drive error @60%
On 25.06.2014 09:49, Dale wrote:
> thegeezer wrote:
>> this is pretty bad. enough to really go and get a replacement asap,
>> and turn that disk off if you can.
>>
>> the self test stops at the first error it comes to, and in this case
>> it is LBA#2905482560. for calculation of where the error is, check out
>> the smartmontools [1] site, which will help you to mark the block bad,
>> though the data that was in that block is probably lost forever.
>>
>> i'd also suggest you run
>> # smartctl -a /dev/sdc
>> and paste the results here. the crucial rows are 196/197, the
>> reallocated sector counts and pending sector counts. they show how
>> many blocks have been reallocated, and also how many are pending.
>> this will give you a scaling factor: at the moment you are in
>> trouble, if these figures are very high you are in very high trouble,
>> if they are low you are in low trouble.
>>
>> [1] http://smartmontools.sourceforge.net/badblockhowto.html
>
> Here is the output:
>
> root@fireball / # smartctl -a /dev/sdc
> smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>
> === START OF INFORMATION SECTION ===
> Model Family:     Seagate Barracuda 7200.14 (AF)
> Device Model:     ST3000DM001-9YN166
> Serial Number:    Z1F0PKT5
> LU WWN Device Id: 5 000c50 04d79e15c
> Firmware Version: CC4C
> User Capacity:    3,000,592,982,016 bytes [3.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    7200 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ATA8-ACS T13/1699-D revision 4
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
> Local Time is:    Wed Jun 25 02:46:39 2014 CDT
>
> ==> WARNING: A firmware update for this drive is available,
> see the following Seagate web pages:
> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
>
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
>                                         was never started.
>                                         Auto Offline Data Collection: Disabled.
> Self-test execution status:      ( 118) The previous self-test completed having
>                                         the read element of the test failed.
> Total time to complete Offline
> data collection:                 ( 584) seconds.
> Offline data collection
> capabilities:                    (0x73) SMART execute Offline immediate.
>                                         Auto Offline data collection on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         No Offline surface scan supported.
>                                         Self-test supported.
>                                         Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine
> recommended polling time:        (   1) minutes.
> Extended self-test routine
> recommended polling time:        ( 340) minutes.
> Conveyance self-test routine
> recommended polling time:        (   2) minutes.
> SCT capabilities:              (0x3085) SCT Status supported.
>
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       234421760
>   3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
>   9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
> 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
> 187 Reported_Uncorrect      0x0032
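The first step of locating the bad block is plain arithmetic on the reported LBA and the sector sizes from the information section. The sketch below covers only that step; mapping the result onto a filesystem block additionally needs the partition start and the filesystem block size (and, on this system, the LVM layout), none of which are given in the thread. The function name is invented for the example.

```sh
#!/bin/sh
# Convert a failing LBA (counted in logical sectors) into a byte offset,
# the physical sector containing it, and its position in that sector.
# Arguments: LBA, logical sector size, physical sector size (bytes).
lba_location() {
    lba=$1
    logical=$2
    physical=$3
    per_phys=$((physical / logical))   # logical sectors per physical one
    echo "byte_offset=$((lba * logical))"
    echo "physical_sector=$((lba / per_phys))"
    echo "index_in_physical_sector=$((lba % per_phys))"
}

# Values from the smartctl output: LBA 2905482560, 512-byte logical and
# 4096-byte physical sectors.
lba_location 2905482560 512 4096
# byte_offset=1487607070720
# physical_sector=363185320
# index_in_physical_sector=0
```

An index of 0 means the first failing LBA sits at the start of a 4096-byte physical sector, so the seven logical sectors after it share the same physical sector.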
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 12:55 PM, Volker Armin Hemmann volkerar...@googlemail.com wrote: so without looking that drive up - you are using a desktop part for non-stop setup? Honestly, I think it makes far more sense to build a fault-tolerant setup than to try to avoid faults by spending more on the parts. I've only run desktop hard drives on my 24x7 RAID. If they die I replace them under warranty - I've yet to have one die outside of warranty, and I'm usually upgrading for size by that timeframe anyway, and I can use the old drives for storage. By all means get better-grade components, but I wouldn't use that as an excuse for not having backups of some kind. ALL hard drives WILL fail, it is just a matter of when. ANY hard drive can fail the day after you buy it, a month after you buy it, and so on, though obviously the probability of a particular drive failing at any point in time may vary by what you pay for it. I'd buy a more expensive drive only if the TCO is actually lower. I'd engineer any system to accept the failure of at least one drive, and for any data I actually cared about I'd engineer the system to resist fire, the rm star, and so on. Rich
Re: [gentoo-user] smartctrl drive error @60%
On 25.06.2014 19:06, Rich Freeman wrote:
> On Wed, Jun 25, 2014 at 12:55 PM, Volker Armin Hemmann
> volkerar...@googlemail.com wrote:
>> so without looking that drive up - you are using a desktop part for
>> non-stop setup?
>
> Honestly, I think it makes far more sense to build a fault-tolerant
> setup than to try to avoid faults by spending more on the parts.
>
> I've only run desktop hard drives on my 24x7 RAID. If they die I
> replace them under warranty

so you are ripping off other customers?

> - I've yet to have one die outside of warranty, and I'm usually
> upgrading for size by that timeframe anyway, and I can use the old
> drives for storage.
>
> By all means get better-grade components, but I wouldn't use that as
> an excuse for not having backups of some kind.

there is no excuse for not having backups.

> ALL hard drives WILL fail, it is just a matter of when.

indeed.

> ANY hard drive can fail the day after you buy it, a month after you
> buy it, and so on, though obviously the probability of a particular
> drive failing at any point in time may vary by what you pay for it.

or if it was meant to be used the way you use it.
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote:
>> Install a simple forwarding MTA like ssmtp to have all mails from
>> cron and friends sent to your ISP mailbox.
>
> ... and when you find out please tell us:
>
> 1) What syntax is appropriate to allow the use of mail account passwds
> which contain not just alphanumeric characters but also symbols like
> [~@#$]?
>
> 2) How can you force it to NOT use the RC4 cipher when it logs into
> Google Mail to relay messages, but the more secure
> ECDHE-RSA-AES128-GCM-SHA256 that the server proposes?

It's debatable whether either of those scenarios falls within the definition of simple. If something that simple won't do what you want, and there are several to try (ssmtp, esmtp, nullmailer etc.), then you may need to use the likes of Postfix - but for Dale's situation, a lightweight forwarder is better than not being able to monitor his system.

--
Neil Bothwick

I thought the 10 commandments were multiple choice.
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 1:15 PM, Volker Armin Hemmann volkerar...@googlemail.com wrote: Am 25.06.2014 19:06, schrieb Rich Freeman: Honestly, I think it makes far more sense to build a fault-tolerant setup than to try to avoid faults by spending more on the parts. I've only run desktop hard drives on my 24x7 RAID. If they die I replace them under warranty so you are ripping of other customers? I certainly am not aware of any warranty terms I'm violating. I just spot checked a drive warranty and it makes no mention of excluding continuous use, and the drive specifications do not contain any exclusions for continuous use. The SMART data in the drives I've returned contains both the number of power cycles and power-on time, and I've yet to have a manufacturer question either. To exclude continuous operation their warranty would have to specify just how many hours per day their drives can be operated for. ANY hard drive can fail the day after you buy it, a month after you buy it, and so on, though obviously the probability of a particular drive failing at any point in time may vary by what you pay for it. or if it was meant to be used the way you use it. Like I said, I'm certainly interested in any actual data that supports that drives sold to run 24x7 last any longer than desktop drives when run 24x7. Rich
Re: [gentoo-user] smartctrl drive error @60%
On 25/06/2014 17:30, cov...@ccs.covici.com wrote: Rich Freeman ri...@gentoo.org wrote: On Wed, Jun 25, 2014 at 6:55 AM, thegeezer thegee...@thegeezer.net wrote: On 06/25/2014 11:05 AM, Dale wrote: I got a drive picked out at Newegg. http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844 slightly offtopic - i notice that the drive has a 2year limited warranty has anyone managed to get anything from hard drive warranties ? Yes. Most manufacturers have a hard drive warranty tool online. Just give it your serial number and it will tell you if you're eligible, and how to go about it. I know Seagate wants you to run their own testing util (which just does a SMART test and spits out a validation code which you write down). I've gotten the same sorts of errors several times now on my RAID and when it happens I just go through the warranty process, select advance replacement, swap out the drive, then return the old drive in their packaging. Typically costs me $10 for HD replacement (I have to pay return shipping only). Typically drives tend to die for me about a year after I buy them - alarmingly often, actually. Anybody who doesn't run smartmon or its equivalent is insane, as is anybody who doesn't at least run RAID, though anything valuable should be backed up. Is it not true that you cannot run raid on consumer drives because of timing errors? That sounds like something EMC and WD/Seagate would say. There's no reason in the world not to use consumer drives for RAID - unless you plan to add the drives to those obscenely expensive full-rack SAN jobs vendors want folk to buy. The reason consumer drives tend not to work in those arrays has nothing to do with the performance of the drive itself. The manufacturers flip a bit in the firmware and without that signature the array hardware often will not use the drive. It often really is as simple as that. -- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 7:03 AM, Rich Freeman ri...@gentoo.org wrote: I don't think anybody makes a monitor for btrfs, though my boot is mirrored across all my btrfs drives using mdadm so a drive failure should be detected in any case. I need to check up on that, though - I'd like an email if something goes wrong with btrfs storage. You're going to want to cron a 'scrub' and have it email you. There's no background daemon that I'm aware of to handle this. ZFS just introduced 'zed' and it would be nice if BTRFS would do the same -- Douglas J Hunley (doug.hun...@gmail.com) Twitter: @hunleyd Web: about.me/douglas_hunley G+: http://google.com/+DouglasHunley
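A cron job along the lines Douglas suggests could lean on the fact that cron mails whatever a job prints. The sketch below filters `btrfs device stats` output down to non-zero counters, so a healthy run stays silent. The mount point /mnt/data and the function name are placeholders, and the sample lines are only shaped like typical `btrfs device stats` output, not taken from a real system.

```sh
#!/bin/sh
# Read `btrfs device stats <mountpoint>` output on stdin, print every
# counter that is not zero, and return 1 if at least one was found.
btrfs_errors() {
    awk 'NF >= 2 && $2 + 0 != 0 { print; bad = 1 }
         END { exit (bad ? 1 : 0) }'
}

# Intended use from cron (mount point is a placeholder):
#   btrfs scrub start -B /mnt/data >/dev/null
#   btrfs device stats /mnt/data | btrfs_errors
# Demonstration with canned input:
btrfs_errors <<'EOF' || echo "non-zero error counters found"
[/dev/sda1].write_io_errs   0
[/dev/sda1].read_io_errs    3
[/dev/sda1].flush_io_errs   0
[/dev/sda1].corruption_errs 0
[/dev/sda1].generation_errs 0
EOF
```

Getting the mail off the machine still depends on a working MTA or forwarder, which is the same gap discussed elsewhere in the thread.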
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 2:44 PM, Douglas J Hunley doug.hun...@gmail.com wrote:
> You're going to want to cron a 'scrub' and have it email you. There's
> no background daemon that I'm aware of to handle this. ZFS just
> introduced 'zed' and it would be nice if BTRFS would do the same

Actually, I think that for serious failures smartd will take care of it. I was reading the btrfs list archives and apparently btrfs doesn't make as much as a whisper when a drive fails. It just keeps on going.

Now, the keeps on going part I'm fine with, but you'd think that operating in a degraded mode would trigger some kind of message. Granted, it isn't 100% done yet, either. In fact, if you replace the failed drive you have to manually force a re-balance or it will just continue to operate degraded.

Rich
Re: [gentoo-user] smartctrl drive error @60%
Volker Armin Hemmann wrote: so without looking that drive up - you are using a desktop part for non-stop setup? If I recall correctly, the last drive that died was a more expensive type of drive, intended for a server setup. So far, the cheaper drives are the ones that have lasted until I outgrew them. So far, I have yet to ever have a drive die under warranty. So far. I need to check on this one but will do that after I get things changed out. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Wednesday, June 25, 2014 01:44:23 PM Rich Freeman wrote: On Wed, Jun 25, 2014 at 1:15 PM, Volker Armin Hemmann ANY hard drive can fail the day after you buy it, a month after you buy it, and so on, though obviously the probability of a particular drive failing at any point in time may vary by what you pay for it. or if it was meant to be used the way you use it. Like I said, I'm certainly interested in any actual data that supports that drives sold to run 24x7 last any longer than desktop drives when run 24x7. Not hard data, but while still using desktop drives, I had a drive failure on average once or twice a year. Now with enterprise 24x7 drives, the failure rate has dropped to 1 in the past 3 years. That is, for both, using proper UPS equipment. Additionally, I noticed a definite speed increase after switching to enterprise disks. -- Joost
Re: [gentoo-user] smartctrl drive error @60%
Neil Bothwick wrote: On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote: Install a simple forwarding MTA like ssmtp to have all mails from cron and friends sent to your ISP mailbox. ... and when you find out please tell us: 1) What syntax is appropriate to allow the use of mail account passwds which contain not just alphanumeric characters but also symbols like [~@#$] ? 2) How can you force it to NOT use RC4 cipher when it logs into Google Mail to relay messages, but the more secure ECDHE-RSA-AES128-GCM-SHA256 that the server proposes ? It's debatable whether either of those scenarios fall within the definition of simple. If something that simple won't do what you want, and there are several to try: ssmtp, esmtp, nullmailer etc, then you may need to use the likes of Postfix - but for Dale's situation, a lightweight forwarder is better than not being able to monitor his system. I have to say, I dread setting up a mail server about as bad as I dread going to the Doctor. It's just something I really don't want to add to my system unless I have to. It's sort of like the init thingy. I don't want to add something else that will eventually break and I'll have to fix. The mail system won't keep me from booting but it is just one more thing to keep a eye on and make sure it is working. So, making sure the mail system is working will likely take up the same amount of time that checking the drive manually every month or so will take. The only good part is, and this is the point you are making so well, even tho I had other things going on, it would have been testing my drive and spit out a error to get my attention. Going back, the error has been there for a while. It would have been nice to know this before now. Hindsight again. ;-) What I really need to do, set up a RAID or some other backup method so that even if this happens again, I don't risk losing anything. Then again, that will take time as well. Also takes money. 
From df -h:

Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/home-home  2.7T  1.5T  1.3T  56% /home

Most of that is recorded TV shows, movies etc. I also have some pics I took with my camera that can't be replaced. Those I backup to DVDs pretty regular. I use kbackup to tarball them and then burn them to DVDs. It works. One set is outside the home in case of fire. The biggest thing is some of those shows would be hard to get again plus the effort to get them as well.

Let's hope it lasts until at least tomorrow. I bet it takes a while to copy all that tho. O_O

Thanks.

Dale

:-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 6:16 PM, Dale rdalek1...@gmail.com wrote: What I really need to do, set up a RAID or some other backup method so that even if this happens again, I don't risk losing anything. Then again, that will take time as well. Also takes money. Keep in mind that RAID is more about speed of recovery and protects against the failure mode of total drive failure, which is a fairly common failure mode. A hard drive failure on a RAID involves no unplanned downtime, and a need for some short planned downtime to replace the drive. Backup protects against a lot more, but typically results in a recovery that takes hours, and when the drive goes you're down without warning. Most of that is recorded TV shows, movies etc. I also have some pics I took with my camera that can't be replaced. Those I backup to DVDs pretty regular. I use kbackup to tarball them and then burn them to DVDs. It works. One set is outside the home in case of fire. The biggest thing is some of those shows would be hard to get again plus the effort to get them as well. So, stuff like photos I backup to the cloud, or to offsite media (generally I favor the cloud for active stuff, and offsite media for stuff I'm done with). Ditto for things like /etc, mysql, documents, email, and other small but important things. For stuff like MythTV recordings I used to just rely on RAID - recognizing that there was a very real possibility that I could lose them all. Now I also do a backup to a drive that is normally left unmounted, which isn't great, but since I moved to btrfs I wanted something on ext4 that had daily rsnapshots. Again, I'm willing to risk losing this stuff. Rich
Re: [gentoo-user] smartctrl drive error @60%
On Wed, 25 Jun 2014 17:16:17 -0500, Dale wrote:
>> Install a simple forwarding MTA like ssmtp to have all mails from
>> cron and friends sent to your ISP mailbox.
>
> I have to say, I dread setting up a mail server about as bad as I
> dread going to the Doctor. It's just something I really don't want to
> add to my system unless I have to.

Which is why I suggested something like ssmtp, which you can't call a server, it just forwards. Often the only configuration needed is changing one line in ssmtp.conf, to the address of your ISP's mail server. That's it, now any program can send mail using sendmail and it just goes to your ISP mailbox.

--
Neil Bothwick

God said, div D = rho, div B = 0, curl E = - @B/@t, curl H = J + @D/@t, and there was light.
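To show how little configuration that is, here is a sketch that prints a minimal ssmtp.conf. The host name mail.example.net and the address dale@example.net are placeholders, not values from the thread, and an ISP that requires authentication or TLS will need more settings than the two shown.

```sh
#!/bin/sh
# Print a minimal ssmtp.conf that relays everything to one mail server.
# Arguments: ISP mail server, address that should receive root's mail.
ssmtp_conf() {
    cat <<EOF
# Address that receives mail for root and other system accounts
root=$2
# The mail server that accepts mail for relaying
mailhub=$1
EOF
}

# Review the output, then install it as /etc/ssmtp/ssmtp.conf.
ssmtp_conf mail.example.net dale@example.net
```

The mailhub line is the "one line" referred to above; the root line decides which mailbox the reports from cron and smartd end up in.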
Re: [gentoo-user] smartctrl drive error @60%
On 26/06/14 06:16, Dale wrote:
> Neil Bothwick wrote:
>> On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote:
>>>> Install a simple forwarding MTA like ssmtp to have all mails from
>>>> cron and friends sent to your ISP mailbox.
>>> ... and when you find out please tell us:
>
> What I really need to do, set up a RAID or some other backup method so
> that even if this happens again, I don't risk losing anything. Then
> again, that will take time as well. Also takes money.

Repeat after me ... RAID IS NOT A BACKUP

There are many ways to do a backup - various raid forms, mirrors etc. can help in some (and only some) instances, but only a spatially separated copy of the data is relatively safe.

Have two computers? - cross backup between them. (Keep an old machine as a file server in the back room, start it up a couple of times a week and run a backup script - this can even be automated.)

Have a friend/relative nearby? - take your PC over, create a backup and then sync the differences across the net using rsync etc. Most normal people don't fill up today's large disks, or have large personal valuable data requirements. You don't need to back up the whole machine, just the valuable bits (configs, personal data, email archives, ...).

There are many ways to do it - if you only have one disk and no backups, the data by definition is not valuable :)

I've just been caught by an old 1TB WD green drive failing (possibly the MB's fault as the sata interface died as well - seen a few of those now!) that took out the middle drive from a striped LVM. Didn't bother to recover, just built a new machine from leftover bits, bought another drive and rebuilt it using btrfs raid 1 on the two original WD 2TB green drives and a new WD red, and restored from backups on another machine - over the years this type of event has happened a few times - you only need to get burnt once to learn!

BillK
Re: [gentoo-user] smartctrl drive error @60%
Hello,

On Wed, 25 Jun 2014, Dale wrote:
> thegeezer wrote:
>> On 06/25/2014 08:49 AM, Dale wrote:
>>> Device Model: ST3000DM001-9YN166

I have (had sort of) the same disc, with the same FW.

>>> see the following Seagate web pages:
>>> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
>>> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
>> interesting - not seen that before might be worth a nose
> I was thinking the same thing myself. How does it know there is a
> update was another question I had.

Those FW-Updates do _NOT_ apply to FW-Version 9YN166. From what I found, you'd brick the drive. The smartctl DB does not take the FW-version into account, just the model, to display above notice.

>>>   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
>>>   9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
>> almost two years of power on time

looks familiar

  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       915
  7 Seek_Error_Rate         0x000f   072   060   030    Pre-fail  Always       19309568
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       11351

[..]
>>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
>> 197 this says there are 104 pending sectors i.e. bad blocks on the
>> drive that have not been reallocated yet
> Wonder why it hasn't? Isn't it supposed to do that sort of thing
> itself?
>>> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
>> this says it was not able to reallocate. which is odd because of the
>> entry 5 being zero
> Uh oh.

Yeah. Oh, and I had a clean smart until a few days ago, luckily I already had a WD Red (WD40EFRX) drive waiting when this attrib jumped from 0 to:

  5 Reallocated_Sector_Ct   0x0033   087   087   036    Pre-fail  Always       17688

Other Seagates (a few 1.5T drives) have also given me trouble; the 2T Samsung, already relabeled and sold as a Seagate but with Samsung in the FW, is still ok though.

[..]
> I ordered a drive. It should be here tomorrow. In the meantime, I
> shutdown and re-seated all the cables, power too. I got the test
> running again but results is a few hours off yet. It did pass the
> short test tho. I'm not sure that it means much.

Good. Do not use dd, it WILL fail at the first error. Use gnu ddrescue or dd_rescue to grab an image. I used mc to copy via filesystem, e.g. 'rsync -auxlPRAXSHD /foo/ /bar/' is fine too.

Oh, and I hope you didn't buy a Seagate again ;)

-dnh

--
The sigmonster ate my sig and all I got was this stupid tagline.
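The three attributes this exchange keeps coming back to (5, 197 and 198) can be watched with a short filter over the attribute table. This is a sketch that assumes the ten-column layout of the `smartctl -A` table Dale posted, with the raw value in the tenth column; the function name is made up for the example.

```sh
#!/bin/sh
# Read `smartctl -A /dev/sdX` output on stdin and print name=raw_value
# for Reallocated_Sector_Ct (5), Current_Pending_Sector (197) and
# Offline_Uncorrectable (198).
smart_sector_counts() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Demonstration with the rows quoted in this thread; on a live system:
#   smartctl -A /dev/sdc | smart_sector_counts
smart_sector_counts <<'EOF'
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
EOF
# Reallocated_Sector_Ct=0
# Current_Pending_Sector=104
# Offline_Uncorrectable=104
```

Saving this output every day or so makes it easy to see whether the pending count is growing or whether sectors are finally being reallocated.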
Re: [gentoo-user] smartctrl drive error @60%
On 06/25/2014 10:44 AM, Rich Freeman wrote: Like I said, I'm certainly interested in any actual data that supports that drives sold to run 24x7 last any longer than desktop drives when run 24x7. Anecdotal, but... In 2008 I bought four 24x7 drives (500GB) and eight regular drives to be used in raid. Out of the eight regular drives, six failed before 4 years was up. All of the 24x7 drives are still in use (although I don't remember which machine(s) they're in now), six years later. All Seagate. I initially did do warranty replacement on the failed drives (all drives had 5 year warranty back then), and out of the six replacements, four failed a little over three months in. At that point I went and bought a real battery backed raid card (computer still has a UPS) with WD enterprise drives and no hiccups of any kind in about two years. And disk performance is way, way up. Dan
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Wed, Jun 25, 2014 at 6:16 PM, Dale rdalek1...@gmail.com wrote: What I really need to do, set up a RAID or some other backup method so that even if this happens again, I don't risk losing anything. Then again, that will take time as well. Also takes money. Keep in mind that RAID is more about speed of recovery and protects against the failure mode of total drive failure, which is a fairly common failure mode. A hard drive failure on a RAID involves no unplanned downtime, and a need for some short planned downtime to replace the drive. Backup protects against a lot more, but typically results in a recovery that takes hours, and when the drive goes you're down without warning. True. My issue with RAID is that it is yet another thing I have to maintain. I started using lvm and so far, it has been low maintenance and has made changing things MUCH easier when I do need to move things around a bit. It is a time saver to be more accurate. RAID also leaves me open to theft, house fire and such too. At the moment, I think, like you, having a external drive that I keep somewhere else is the safest method. Thing is, getting a drive big enough to do this. Buying this drive put a dent in my debt. That said, I really need to buy another drive if this old one turns out to be bad and set up some sort of backup plan. If it turns out to be OK somehow, then I may have a solution, maybe. While I don't want to lose anything, my camera pics is the most important to keep. That's why I rotate backups and keep one set outside the house. I would rather not lose my videos and could get most of them back but it won't be easy for sure. Most of that is recorded TV shows, movies etc. I also have some pics I took with my camera that can't be replaced. Those I backup to DVDs pretty regular. I use kbackup to tarball them and then burn them to DVDs. It works. One set is outside the home in case of fire. 
The biggest thing is some of those shows would be hard to get again plus the effort to get them as well. So, stuff like photos I backup to the cloud, or to offsite media (generally I favor the cloud for active stuff, and offsite media for stuff I'm done with). Ditto for things like /etc, mysql, documents, email, and other small but important things. For stuff like MythTV recordings I used to just rely on RAID - recognizing that there was a very real possibility that I could lose them all. Now I also do a backup to a drive that is normally left unmounted, which isn't great, but since I moved to btrfs I wanted something on ext4 that had daily rsnapshots. Again, I'm willing to risk losing this stuff. Rich I don't have anything on the cloud to backup too. That would likely be a good idea but I can't afford anything pricey, which is why I hadn't bought a backup drive before now either. Plus, something I'd prefer to keep under my thumb. Heck, some things here are encrypted, bank info and such. Also, while I have DSL, it ain't real speedy. Backing up that much data over my connection could take a while, like days, maybe even a week or more. I really do need a plan that I can manage to put in place tho. Murphy's law and all. :-D Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Wed, Jun 25, 2014 at 11:54 AM, Dale rdalek1...@gmail.com wrote: I'm going to bet this drive is out of warranty. I'm pretty sure it is over 2 years since I bought it. Once I replace that drive, I'll dd the thing and see what it does then. It'll either break it or give me a fresh start to play with and see how long it lasts. Well, finding out for sure is a 30 second process, so up to you whether it is worth the time. smartctl will give you the serial/model number, and you punch that into a website, and it will say whether it is under warranty or not. If you plan to wipe the disk before return, print out the results of smartctl -a first, since wiping will probably clear the pending sectors. But, it is your drive, so do whatever you want with it! :) Rich I do plan to check and see if it is under warranty. I'll do that after I get things moved over and can test a bit more. Who knows, it could be Murphy and he will just leave at some point. ;-) I'm pretty sure this drive is close to three years old tho. Heck, I can go look at the Newegg order history and find out. I would think the manufacturer goes by the date made where a invoice dated later would tend to slide that out further. Either way, I'll find out. If it is under warranty and can be swapped out, that would solve a few issues. I'll have one backup drive at least. ;-) Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
On Wed, Jun 25, 2014 at 10:07 PM, Dale rdalek1...@gmail.com wrote: Rich Freeman wrote: I don't have anything on the cloud to backup too. That would likely be a good idea but I can't afford anything pricey, which is why I hadn't bought a backup drive before now either. Plus, something I'd prefer to keep under my thumb. Heck, some things here are encrypted, bank info and such. Also, while I have DSL, it ain't real speedy. Backing up that much data over my connection could take a while, like days, maybe even a week or more. I put my backups on Amazon S3 reduced-redundancy - it is a few cents per GB per month. I think I have something like 20-30GB backed up. Oh, if you need to actually retrieve it that will cost you 10 cents per GB, but frankly if my house burned down that would be the least of my concerns. I'd only use the cloud to back up critical data. If you want to back up your mythtv and mp3 collection, then you're going to be uploading a LOT of data and paying quite a bit to store it. If you want to be storing TB of data offsite there are better ways of doing it. The advantage of something like S3 is that it is always there, which means you stick a duplicity script in your crontab and just periodically check up on it. You don't have to remember to do your backups. It just isn't practical to use it for more than a few dozen GB depending on your incremental strategy. I also have a 50Mbps outbound connection, which doesn't hurt. Your next best option is to find a friend with similar needs and give each other a place to upload your encrypted backups to. That will just cost you drive space, but if you're both planning on backing up 1TB of data it will still cost you the one-time drive purchase. If you want a quick cloud-capable backup solution, I'd look at duplicity. I just wish it had options for Google Drive (it supposedly does, but as far as I can tell it doesn't work, at least not with a two factor application password). Rich
Re: [gentoo-user] smartctrl drive error @60%
Neil Bothwick wrote: I have to say, I dread setting up a mail server about as bad as I dread going to the Doctor. It's just something I really don't want to add to my system unless I have to. Which is why I suggesting something like ssmtp, which you can't call a server, it just forwards. Often the only configuration needed is changing one line in ssmtp.conf, to the address of your ISP's mail server. That's it, now any program can send mail using sendmail and it just goes to your ISP mailbox. I like this part: Extremely simple MTA to get mail off the system to a Mailhub ^ That part right up there. :-D That may be a new thread, if needed. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Bill Kenworthy wrote: On 26/06/14 06:16, Dale wrote: Neil Bothwick wrote: On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote: Install a simple forwarding MTA like ssmtp to have all mails from cron and friends sent to your ISP mailbox. ... and when you find out please tell us: What I really need to do, set up a RAID or some other backup method so that even if this happens again, I don't risk losing anything. Then again, that will take time as well. Also takes money. Repeat after me ... RAID IS NOT A BACKUP I agree with that. Power supply goes nuts and burns out the whole puter. RAID won't help that. House catches fire, ooops. Thief steals puter, uh oh. That list could go on for a while. About the only thing it does is allow quick recovery from a failing/dead drive. Basically. It's good at that from what I have read. There are many ways to do a backup - various raid forms, mirrors etc can help in some (and only some) instances but only a spatially separated copy of the data is relatively safe. Have two computers? - cross backup between them. (keep an old machine as a file server in the back room, start it up a couple of times a week and run a backup script - can even be automated) I do have a old puter at the moment. I thought about sticking it in a outbuilding and just turning it on to do backups then shutting it back down. That puts distance between house and outbuilding too. Thing is, I plan to let a family member use it when I can get around to getting a new case for it. I guess I could use any old slow junky puter with a LARGE drive in it. Have a friend/relative nearby? - take your PC over, create a backup and then sync the differences across the net using rsync etc - most normal people do fill up todays large disks, or have large personal valuable data requirements. You dont need to backup the whole machine, just the valuable bits (configs, personal data, email archives, ...) 
There are many ways to do it - if you only have one disk and no backups, the data by definition is not valuable :) Ive just been caught by an old 1G WD green drive failing (possibly the MB's fault as the sata interface died as well - seen a few of those now!) that took out the middle drive from a striped LVM. Didnt bother to recover, just built a new machine from leftover bits, bought another drive and rebuilt it using btrfs raid 1 on the two orignal WD 2G green drives and a new WD red, and restored from backups on another machine - over the years this type of event has happened a few times - you only need to get burnt once to learn!. BillK I do backup what I know can't be replaced at all. My camera pics can't be replaced since they are not anywhere else. Some other things here that are nowhere else I can live without, just would rather not if I can help it. I never backup the OS. I just reinstall it if needed. Generally, I try to keep a copy of /etc and the world file. I'll copy /etc over and use the world file as a guide on what to install on the new install. Heck, I can install Kubuntu in a hour or less. Then I can install Gentoo from that while doing my usual puter activities. I had a WD 80GB drive to fail several years ago. That's the only drive I have ever had to fail on me tho. It spit out errors and I was able to do backups and save the data before it died for good. I can't recall the exact error but it mentioned '24 hours' and 'right now'. It didn't miss it by much either. Just imagine if we had no tools to warn us of a failure at all. That would suck. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
David Haller wrote: Hello, On Wed, 25 Jun 2014, Dale wrote: Yeah. Oh, and I had a clean smart until a few days ago, luckily I alread had a WD Red (WD40EFRX) drive waiting when this attrib jumped from 0 to: 5 Reallocated_Sector_Ct 0x0033 087 087 036 Pre-fail Always 17688 Other Seagates (a few 1.5T drives) have also made me trouble, the 2T Samsung already relabeled and sold as a Seagate but with Samsung in the FW though is still ok. [..] I was wondering about how that would be updated since a lot of that stuff requires windoze. I ordered a drive. It should be here tomorrow. In the meantime, I shutdown and re-seated all the cables, power too. I got the test running again but results is a few hours off yet. It did pass the short test tho. I'm not sure that it means much. Good. Do not use dd, it WILL fail at the first error. Use gnu ddrescue or dd_rescue to grab an image. I used mc to copy via filesystem, eg. 'rsync -auxlPRAXSHD /foo/ /bar/' is fine too. Oh, and I hope you didn't buy a Seagate again ;) -dnh I plan to rsync or cp the data over. The dd part will come into play after I am sure I got everything off that I can get and am just erasing the drive completely. I plan to dd the drive then run the tests again just to see what it is doing. Heck, maybe it will reallocate that area like it should be doing already, I guess. Time will tell. I'll be having fun tomorrow tho. ;-) Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
J. Roeleveld wrote: On Wednesday, June 25, 2014 01:44:23 PM Rich Freeman wrote: On Wed, Jun 25, 2014 at 1:15 PM, Volker Armin Hemmann ANY hard drive can fail the day after you buy it, a month after you buy it, and so on, though obviously the probability of a particular drive failing at any point in time may vary by what you pay for it. or if it was meant to be used the way you use it. Like I said, I'm certainly interested in any actual data that supports that drives sold to run 24x7 last any longer than desktop drives when run 24x7. Not hard data, but while still using desktop drives, I had a drive failure on average once or twice a year. Now with enterprise 24x7 drives, the failure rate has dropped to 1 in the past 3 years. That is, for both, using proper UPS equipment. Additionally, I noticed a definite speed increase after switching to enterprise disks. -- Joost I have one WD black which I think is a more expensive drive. I have to say, when I run hdparm -tT on it, it is faster than the other regular drives that claim the same specs, SATA etc etc. They do cost more tho. Some a good bit more unless you can catch a good sale. While I was looking for this new drive, I looked into a 4TB drive. I am still trying to get my jaw back up off the floor. Holy sheep. They are still fairly proud of some of those puppies. I did notice they have a 5 and 6TB one now. O_O Double holy sheep. I think I lost my jaw now. Good bye nose, hello China. :-( Well, one of these days we will be talking about getting 6TB drives for $50 and how much we want a 20TB drive, to put all our worthless junk on. lol Oh, we will still complain about how they die to soon too. We may even have CPUs that run at light speed with many dozens of cores, but still to dang slow. ;-) Pass the rice please. Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Daniel Frey wrote: On 06/25/2014 10:44 AM, Rich Freeman wrote: Like I said, I'm certainly interested in any actual data that supports that drives sold to run 24x7 last any longer than desktop drives when run 24x7. Anecdotal, but... In 2008 I bought four 24x7 drives (500GB) and eight regular drives to be used in raid. Out of the eight regular drives, six failed before 4 years was up. All of the 24x7 drives are still in use (although I don't remember which machine(s) they're in now), six years later. All Seagate. I initially did do warranty replacement on the failed drives (all drives had 5 year warranty back then), and out of the six replacements, four failed a little over three months in. At that point I went and bought a real battery backed raid card (computer still has a UPS) with WD enterprise drives and no hiccups of any kind in about two years. And disk performance is way, way up. Dan Curious. I hope I don't start a flame war here. I have had WD, Seagate and I think there is a Samsung here somewhere, may be the one that is rolling over on its back now. The one drive that failed a few years ago was a WD drive. That said, all the other WD drives I have had just got to small to really use, and slow when SATA came out. I'm partial to WD and Seagate still since I got good long term use out of those. Based on your experience, you tend to be of the same opinion? Allan, your situation should involve a lot of hard drives. Any thoughts? Neil, you have a nice big opinion on this? I realize that any brand of drive will break eventually. That's one reason I don't hold the one failure I have had against WD. I got a lot of use out of that drive and it did let me know it was going to die, like real soon. I'm going to duck now. :/ Dale :-) :-)
Re: [gentoo-user] smartctrl drive error @60%
Rich Freeman wrote: On Wed, Jun 25, 2014 at 10:07 PM, Dale rdalek1...@gmail.com wrote: Rich Freeman wrote: I don't have anything on the cloud to backup too. That would likely be a good idea but I can't afford anything pricey, which is why I hadn't bought a backup drive before now either. Plus, something I'd prefer to keep under my thumb. Heck, some things here are encrypted, bank info and such. Also, while I have DSL, it ain't real speedy. Backing up that much data over my connection could take a while, like days, maybe even a week or more. I put my backups on Amazon S3 reduced-redundancy - it is a few cents per GB per month. I think I have something like 20-30GB backed up. Oh, if you need to actually retrieve it that will cost you 10 cents per GB, but frankly if my house burned down that would be the least of my concerns. I'd only use the cloud to back up critical data. If you want to back up your mythtv and mp3 collection, then you're going to be uploading a LOT of data and paying quite a bit to store it. If you want to be storing TB of data offsite there are better ways of doing it. Outside my camera pics, I don't think I have anything that critical. I backed them up on 7 DVDs yesterday. I been doing that for many years. Two sets just to be sure. I also rotate the DVDs after a while too. I burn sysrescue ISOs to it or something. The advantage of something like S3 is that it is always there, which means you stick a duplicity script in your crontab and just periodically check up on it. You don't have to remember to do your backups. It just isn't practical to use it for more than a few dozen GB depending on your incremental strategy. I also have a 50Mbps outbound connection, which doesn't hurt. Downstream Rate 1536 (Kbits/Sec) Upstream Rate 384 (Kbits/Sec) While it ain't super fast, it beats dial-up and I remember those days very well. Still pretty slow to do backups over tho. 
:/ Your next best option is to find a friend with similar needs and give each other a place to upload your encrypted backups to. That will just cost you drive space, but if you're both planning on backing up 1TB of data it will still cost you the one-time drive purchase. If you want a quick cloud-capable backup solution, I'd look at duplicity. I just wish it had options for Google Drive (it supposedly does, but as far as I can tell it doesn't work, at least not with a two factor application password). Rich
I'm just going to try and buy another 3TB drive as soon as I can. I may even make it into a removable thingy. Then I can make backups and just put it in a outbuilding. By the way, my outbuilding is pretty far from the house. A house fire wouldn't hurt it any. I got so much junk in there, a thief would shake his head and leave empty handed. May even cry at the thought of it. Working up a plan and hoping to work the plan. While at it. Latest test results. It finished a bit ago.
root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16394         2905482560
# 2  Extended offline    Completed: read failure       60%     16389         2905482560
It is still rolling over. It should throw up its feet any day now. :-( Dale :-) :-)
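For a sense of scale on the upload speeds quoted in this message, here is a rough back-of-the-envelope sketch. It assumes the full upstream rate is usable with no protocol overhead and that 1 GB is 10^9 bytes. The 25 GB figure is just a pick from the 20-30GB range mentioned for S3, the 3 TB figure is the drive size, and the function name is made up for this example.

```python
def upload_days(size_bytes, upstream_kbit_per_s):
    """Days needed to push size_bytes through a link of the given speed.

    Assumes the link is fully saturated and ignores protocol overhead,
    so real transfers will take longer.
    """
    bits = size_bytes * 8
    seconds = bits / (upstream_kbit_per_s * 1000)
    return seconds / 86400

GB = 10**9
TB = 10**12

# 384 kbit/s is the DSL upstream rate quoted in the thread.
print(round(upload_days(25 * GB, 384), 1))     # critical data only
print(round(upload_days(3 * TB, 384)))         # the whole 3TB drive
# 50 Mbit/s is the other upstream rate mentioned in the thread.
print(round(upload_days(25 * GB, 50_000), 2))
```

By this estimate a few tens of GB is about a week of uploading over the DSL line, while the full drive would take roughly two years, which fits the conclusion that a second drive kept in the outbuilding is the practical option here.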
Re: [gentoo-user] smartctrl drive error @60%
On Thursday 26 Jun 2014 03:15:54 Dale wrote: Neil Bothwick wrote: I have to say, I dread setting up a mail server about as bad as I dread going to the Doctor. It's just something I really don't want to add to my system unless I have to. Which is why I suggesting something like ssmtp, which you can't call a server, it just forwards. Often the only configuration needed is changing one line in ssmtp.conf, to the address of your ISP's mail server. That's it, now any program can send mail using sendmail and it just goes to your ISP mailbox. I like this part: Extremely simple MTA to get mail off the system to a Mailhub ^ That part right up there. :-D That may be a new thread, if needed.
Try this basic setup in your /etc/ssmtp/ssmtp.conf:
root=d...@gmail.com #Change to your preferred email address
mailhub=smtp.gmail.com:465 #Could also use port 587 for STARTTLS
rewriteDomain=dales_smoker.shack #Something to denote your machine's name
FromLineOverride=YES
UseTLS=YES #Can also try UseSTARTTLS=YES as an alternative
AuthUser=d...@gmail.com
AuthPass=dalesgmails3cr3tpasswd #Special characters seem to barf with ssmtp
Sort out access rights to 0640, since it now contains your mail passwd unencrypted:
# ls -la /etc/ssmtp/ssmtp.conf
-rw-r----- 1 root ssmtp 1696 May 19 23:40 /etc/ssmtp/ssmtp.conf
Add this in your /etc/ssmtp/revaliases:
===
root:d...@gmail.com:smtp.gmail.com:465
dale:d...@gmail.com:smtp.gmail.com:465
other_user:d...@gmail.com:smtp.gmail.com:465
===
Then ping a message to yourself as a test to see that all works fine:
echo "My first test message" | mail -v -s "Test for sSMTP 1" d...@gmail.com
It should then appear in your gmail account (Sent folder). Set a label/filter to find such messages easily and you're done.
-- Regards, Mick
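A small aside on the permissions step: one quick way to see what an octal mode means is Python's `stat.filemode`, which renders mode bits the way `ls -l` does. This is only a sketch for reading modes, not part of the ssmtp setup, and the helper name `ls_style` is made up for this example.

```python
import stat

def ls_style(octal_mode):
    """Render permission bits for a regular file as ls -l would."""
    return stat.filemode(stat.S_IFREG | octal_mode)

print(ls_style(0o640))  # owner read/write, group read, others nothing
print(ls_style(0o604))  # owner read/write, group nothing, others read
```

With the password stored in plain text, mode 0640 with group ssmtp keeps other local users out of the file, whereas 0604 would let everyone except the group read it.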
[gentoo-user] smartctrl drive error @60%
Howdy, I run this test every once in a while. How bad is this:
root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -
And better yet, is there any way to tell it to not use that part and finish the test? It seems it stopped when it got to that, or I think it did. Thoughts? Dale :-) :-)
-- I am only responsible for what I said ... Not for what you understood or how you interpreted my words!
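For anyone who wants to track these results over time, the self-test log above is regular enough to parse. The following is a minimal sketch, assuming rows are laid out as in this post and only covering the two status strings seen in this thread; real smartctl output has other statuses (aborted, interrupted and so on) that this pattern would simply skip. The names `ROW`, `failed_tests` and `SAMPLE` are made up for the example.

```python
import re

# One row of `smartctl -l selftest` output as pasted in this thread.
# Only the two status strings seen here are recognised (an assumption).
ROW = re.compile(
    r"#\s*(?P<num>\d+)\s+"
    r"(?P<test>.+?)\s+"
    r"(?P<status>Completed(?: without error|: read failure))\s+"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\d+|-)\s*$"
)

def failed_tests(log_text):
    """Return (power_on_hours, lba) for every row reporting a read failure."""
    failures = []
    for line in log_text.splitlines():
        m = ROW.match(line.strip())
        if m and m.group("status").endswith("read failure"):
            failures.append((int(m.group("hours")), int(m.group("lba"))))
    return failures

SAMPLE = """\
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -
"""

print(failed_tests(SAMPLE))
```

Both failed runs stop at the same LBA, which suggests one specific bad spot rather than errors scattered across the drive.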
Re: [gentoo-user] smartctrl drive error @60%
On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote: Howdy, I run this test every once in a while. How bad is this:
root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -
And better yet, is there any way to tell it to not use that part and finish the test? It seems it stopped when it got to that, or I think it did. Thoughts? Dale :-) :-)
Dale, Not sure how to get it to go past. Think that is in the firmware of the disk. I would start with making a backup first. -- Joost
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [gentoo-user] smartctrl drive error @60%
J. Roeleveld wrote: On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote: Howdy, I run this test every once in a while. How bad is this:
root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -
And better yet, is there any way to tell it to not use that part and finish the test? It seems it stopped when it got to that, or I think it did. Thoughts? Dale :-) :-)
Dale, Not sure how to get it to go past. Think that is in the firmware of the disk. I would start with making a backup first. -- Joost
That's a 3TB drive. I don't have anything big enough to back it up to. Is there anyway to find out if this error is really serious or just a run of the mill type error? I would think that if it was a run of the mill error the drive would handle the error itself and I wouldn't even see it. Something like marking the area as bad and just not trying to use it anymore, even for the test. Thanks. Any advice is appreciated. I need a hard drive guru.
;-) Here is additional info:
root@fireball / # hdparm -i /dev/sdc
/dev/sdc:
 Model=ST3000DM001-9YN166, FwRev=CC4C, SerialNo=Z1F0PKT5
 Config={ HardSect NotMFM HdSw15uSec Fixed DTR10Mbs RotSpdTol.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=5860533168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: unknown: ATA/ATAPI-4,5,6,7
 * signifies the current active mode
root@fireball / #
Dale :-) :-)
-- I am only responsible for what I said ... Not for what you understood or how you interpreted my words!
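The numbers in this post are enough to work out roughly where on the disk the failing sector sits. The sketch below combines the LBAsects value from the hdparm output with the LBA_of_first_error from the self-test log. It assumes 512-byte logical sectors, which is not stated in the thread, and the constant and function names are made up for this example.

```python
SECTOR_BYTES = 512            # assumed logical sector size
TOTAL_SECTORS = 5860533168    # LBAsects from the hdparm output
FIRST_ERROR_LBA = 2905482560  # LBA_of_first_error from the self-test log

def lba_position(lba, total_sectors, sector_bytes=SECTOR_BYTES):
    """Return (byte_offset, fraction_of_drive) for a logical block address."""
    return lba * sector_bytes, lba / total_sectors

offset, fraction = lba_position(FIRST_ERROR_LBA, TOTAL_SECTORS)
print(offset)                        # byte offset of the failing sector
print(round(fraction * 100, 1))      # how far into the drive, in percent
print(TOTAL_SECTORS * SECTOR_BYTES)  # drive capacity in bytes
```

Under that assumption the capacity comes out at about 3.0 TB, matching the drive, and the failing sector sits roughly halfway through it, about 1.49 TB in. Knowing the offset may help work out which partition or logical volume contains the bad spot.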