Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread J. Roeleveld
On Tuesday, July 01, 2014 06:52:10 AM Mick wrote:
 On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote:
  On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:
   What if I copied data to the drive until it was just about full.  I'm
   thinking like maybe 90 or 95% or so.  If I do that and run the test
   every few days, would it then catch a error after a few weeks or so of
   testing?  I realize no one knows with 100% certainty...
  
  As you already said, nobody knows with 100% certainty.
  
  In the failures I've experienced I'd expect it to start catching
  errors within a few days.  However, on those drives the relocated
  sector count never increases, which suggests that the firmware never
  relocated those sectors when overwritten, which seems brain-dead to
  me.
  
  If the drive relocates the sectors, then conceivably it could go quite
  a long time until having errors, probably in an entirely different set
  of sectors.
  
  Even if it doesn't relocate, the reliability of the bad sectors could
  be high or low.
  
  Rich
 
 What triggers a relocation?  I also have a drive which shows a sector
 relocation pending, but for a few days now and after some tests that showed
 no errors, it won't relocate it.

I think a write to that sector should force a relocation.

--
Joost



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Dale
J. Roeleveld wrote:
 On Tuesday, July 01, 2014 06:52:10 AM Mick wrote:
 On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote:
 On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:
 What if I copied data to the drive until it was just about full.  I'm
 thinking like maybe 90 or 95% or so.  If I do that and run the test
 every few days, would it then catch a error after a few weeks or so of
 testing?  I realize no one knows with 100% certainty...
 As you already said, nobody knows with 100% certainty.

 In the failures I've experienced I'd expect it to start catching
 errors within a few days.  However, on those drives the relocated
 sector count never increases, which suggests that the firmware never
 relocated those sectors when overwritten, which seems brain-dead to
 me.

 If the drive relocates the sectors, then conceivably it could go quite
 a long time until having errors, probably in an entirely different set
 of sectors.

 Even if it doesn't relocate, the reliability of the bad sectors could
 be high or low.

 Rich
 What triggers a relocation?  I also have a drive which shows a sector
 relocation pending, but for a few days now and after some tests that showed
 no errors, it won't relocate it.
 I think a write to that sector should force a relocation.

 --
 Joost



I think you are right Joost.  I should have tried some fixes that COULD
be destructive to see if a) it fixes it and b) the data lives, other
than the bad part at least.  I forgot to do that and really wasn't sure
how to do it either.  One person posted a lot of info about it but it
was a bit deep for me.  It would have required some reading and because
of health issues, I can't tackle that much at one time right now. 

What I did tho.  I got the new drive, rsynced the data from old drive to
new drive.  Removed the LVM stuff from the old drive.  I used dd to
erase the whole old drive, which took a while for 3TBs.  o_O  After
that, I ran the test.  It came back fine.  Check out this snippet:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     16499         -
# 2  Extended offline    Completed without error       00%     16498         -
# 3  Short offline       Completed without error       00%     16475         -
# 4  Extended offline    Completed without error       00%     16466         -
# 5  Extended offline    Aborted by host               90%     16461         -
# 6  Extended offline    Completed: read failure       60%     16451         2905482560
# 7  Extended offline    Completed: read failure       60%     16432         2905482560
# 8  Extended offline    Completed: read failure       60%     16427         2905482560
# 9  Extended offline    Completed: read failure       60%     16394         2905482560
#10  Extended offline    Completed: read failure       60%     16389         2905482560
#11  Short offline       Completed without error       00%     16380         -
#12  Extended offline    Completed: read failure       60%     16365         2905482560
#13  Extended offline    Completed: read failure       60%     16352         2905482560
#14  Extended offline    Completed without error       00%      8044         -
#15  Extended offline    Completed without error       00%      3121         -
#16  Extended offline    Completed without error       00%      1548         -
#17  Short offline       Completed without error       00%      1141         -
#18  Extended offline    Completed without error       00%       719         -
#19  Extended offline    Completed without error       00%       525         -
#20  Short offline       Completed without error       00%       516         -
#21  Extended offline    Completed without error       00%        18         -
7 of 7 failed self-tests are outdated by newer successful extended offline self-test # 2
 
Note the very last line.  You can see all the failures but the last line
says the drive is good to go since the drive passed after the bad ones. 
So, while I'm not holding my breath, that is what SMART says.  It may
blow smoke and make horrible noises next week but right now, it says it
is OK. 

In the end, it seems something has to write to that specific sector and
then the drive will reallocate/move/whatever so that the bad part isn't
used anymore.  It seems dd did that but I bet there are other tools that
could do it without losing data other than what is in the bad spot of
course.  That's my simple idea at least. 
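
As a rough sketch of what a targeted write could look like (not something that was tried in this thread): the LBA from the self-test log can be turned into a byte offset, and hdparm has --read-sector and --write-sector options that act on a single sector. The 512-byte logical sector size and the /dev/sdX device name are assumptions here, and the script only prints the hdparm commands instead of running them, since the write destroys whatever is in that sector.

```shell
#!/bin/sh
# Sketch only: locate the failing sector reported in the self-test log.
# Assumption: the drive reports 512-byte logical sectors (check `smartctl -i`).
lba=2905482560          # LBA_of_first_error from the log above
sector_size=512         # assumed logical sector size in bytes

# Byte offset of the bad sector from the start of the drive.
offset=$((lba * sector_size))
echo "byte offset: $offset"

# Commands one might run by hand afterwards; printed, not executed.
# /dev/sdX is a placeholder. The write zeroes that one sector.
read_cmd="hdparm --read-sector $lba /dev/sdX"
write_cmd="hdparm --yes-i-know-what-i-am-doing --write-sector $lba /dev/sdX"
echo "$read_cmd"
echo "$write_cmd"
```

If the read command fails with an I/O error and the write succeeds, a following read should work again, and the pending/reallocated counters can be compared before and after.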

Hope that helps.  I wish I could have done the other stuff and kept
notes on commands and such and then post the results.  That MAY have
helped someone in the future.  My brain ain't what it used to be.  ;-)

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Dale
Frank Steinmetzger wrote:
 On Wed, Jun 25, 2014 at 11:57:55PM -0500, Dale wrote:

 I'm just going to try and buy another 3TB drive as soon as I can.  I may
 even make it into a removable thingy.  Then I can make backups and just
 put it in a outbuilding.  By the way, my outbuilding is pretty far from
 For such use, I am planning to get an external SATA dock, rather than use
 a “removable thingy”. You pop in the naked drive and stow that away after
 you’re done. This has multiple advantages. For one, you can read SMART data
 from the external drives, provided you use a SATA connection. You can’t do
 that over USB. So if your board has an eSATA connector, get an external dock
 such as http://www.sharkoon.com/?q=en/node/1277.
 If it doesn’t, you can get an internal one for a 5¼″ slot in the front of
 your case, like http://www.sharkoon.com/?q=en/node/1281 for one 3.5″ HDD, or
 http://www.sharkoon.com/?q=en/content/sata-qp-intern-multi for both big and
 small HDDs (plus some USB3 connectors, if your case doesn’t have them but
 your board provides a header).

 Those docks are only slightly more expensive than an external case, but you
 can use them for all your drives, not just a single one, and you have no
 hassle with countless external power supplies (“which one did go into which
 case?”).

 PS: I’m not saying you should get a Sharkoon, after all I haven’t bought it
 yet myself. But their site shows nicely what’s available and gives me ideas.
 --
 Gruß | Greetings | Qapla’
 Please do not share anything from, with or about me on any social network.

 Please don’t befuddle me with facts, my mind is set.

I been wanting to get me something external but hadn't got around to
looking yet.  I didn't know they have a SATA version.  I plan to avoid
USB if I can.  From my understanding, eSATA can be hotplugged and I have
a couple of those connections. 

Thanks for the info.  It put me on a different path. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Helmut Jarausch

On 07/01/2014 10:30:44 AM, Dale wrote:


 I been wanting to get me something external but hadn't got around to
 looking yet.  I didn't know they have a SATA version.  I plan to avoid
 USB if I can.  From my understanding, eSATA can be hotplugged and I have
 a couple of those connections.



I 'believe' that eSATA is dead - just look how few products are
available.

I have 3 USB-3 drives which are very fast (more than 100 MB/sec) and
if your motherboard doesn't have a USB-3 adapter yet, one is very cheap.

Helmut.




Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Dale
Helmut Jarausch wrote:
 On 07/01/2014 10:30:44 AM, Dale wrote:

 I been wanting to get me something external but hadn't got around to
 looking yet.  I didn't know they have a SATA version.  I plan to avoid
 USB if I can.  From my understanding, eSATA can be hotplugged and I have
 a couple of those connections.


 I 'believe' that eSATA is dead - just look how few products are
 available.
 I have 3 USB-3 drives which are very fast (more than 100 MB/sec) and
 if your motherboard doesn't have an USB-3 adapter, yet, it's very cheap.

 Helmut

I do have a few USB3 connectors.  I just figured USB would be a good bit
slower.  Plus, can USB power a 3.5 hard drive nowadays? 

root@fireball / # hdparm -tT /dev/sdb

/dev/sdb:
 Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
 Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec
root@fireball / #

I did find an eSATA enclosure on Newegg and the price wasn't bad.  It's
one option I guess. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Helmut Jarausch

On 07/01/2014 10:58:45 AM, Dale wrote:

Helmut Jarausch wrote:
 On 07/01/2014 10:30:44 AM, Dale wrote:

 I been wanting to get me something external but hadn't got around to
 looking yet.  I didn't know they have a SATA version.  I plan to avoid
 USB if I can.  From my understanding, eSATA can be hotplugged and I have
 a couple of those connections.

 I 'believe' that eSATA is dead - just look how few products are
 available.
 I have 3 USB-3 drives which are very fast (more than 100 MB/sec) and
 if your motherboard doesn't have an USB-3 adapter, yet, it's very cheap.

 Helmut

I do have a few USB3 connectors.  I just figured USB would be a good bit
slower.  Plus, can USB power a 3.5 hard drive nowadays?


Probably not. All of my external USB3 disks have a separate power supply.


root@fireball / # hdparm -tT /dev/sdb

/dev/sdb:
 Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
 Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec
root@fireball / #


Try a real life example like dd. I have seen the above mentioned speed
on disks with a file system on it which does limit the speed anyway.



I did find a eSATA enclosure on Newegg and the price wasn't bad.  It's
one option I guess.

Dale

:-)  :-)








Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread J. Roeleveld
On Tuesday, July 01, 2014 11:06:59 AM Helmut Jarausch wrote:
 On 07/01/2014 10:58:45 AM, Dale wrote:
  Helmut Jarausch wrote:
   On 07/01/2014 10:30:44 AM, Dale wrote:
   I been wanting to get me something external but hadn't got around to
   looking yet.  I didn't know they have a SATA version.  I plan to avoid
   USB if I can.  From my understanding, eSATA can be hotplugged and I have
   a couple of those connections.
   
   I 'believe' that eSATA is dead - just look how few products are
   available.
   I have 3 USB-3 drives which are very fast (more than 100 MB/sec) and
   if your motherboard doesn't have an USB-3 adapter, yet, it's very cheap.
   
   Helmut
  
  I do have a few USB3 connectors.  I just figured USB would be a good bit
  slower.  Plus, can USB power a 3.5 hard drive nowadays?
 
 Probably not. All of my external USB3 disks have a separate power
 supply.

I only know of 2.5" USB drives that are powered via the same USB cable.
I have never seen 3.5" ones that are USB-powered.

I use 2.5" drives for my backups. As they are designed for laptop use, I have
the feeling they are a bit more robust when it comes to accidental bumps.

  root@fireball / # hdparm -tT /dev/sdb
  
  /dev/sdb:
   Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
   Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec
  
  root@fireball / #
 
 Try a real life example like dd. I have seen the above mentioned speed
 on disks with a file system on it which does limit the speed anyway.

+1

--
Joost



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Dale
J. Roeleveld wrote:
 On Tuesday, July 01, 2014 11:06:59 AM Helmut Jarausch wrote:
 On 07/01/2014 10:58:45 AM, Dale wrote:
 Probably not. All of my external USB3 disks have a separate power
 supply. 
 I only know of 2.5 USB-drivers that are powered via the same USB-cable.
 Never seen 3.5 ones that are USB-powered.

 I use 2.5 drives for my backups, as they are designed for laptop use, I have 
 the feeling they are a bit more robust when it comes to accidental bumps.

I thought those things looked like they had their own power.  Neat. 



 root@fireball / # hdparm -tT /dev/sdb

 /dev/sdb:
  Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
  Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec

 root@fireball / #
 Try a real life example like dd. I have seen the above mentioned speed
 on disks with a file system on it which does limit the speed anyway.
 +1

 --
 Joost



I watched the dd process when I was erasing the old drive.  I got about
the same results.  It started out a little over 200 and went as low as
170 or so close to the end.  On average, about what hdparm shows.  Close
enough it seems.  ;-)

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread J. Roeleveld
On Tuesday, July 01, 2014 04:21:45 AM Dale wrote:
 J. Roeleveld wrote:
  root@fireball / # hdparm -tT /dev/sdb
  
  /dev/sdb:
   Timing cached reads:   6604 MB in  2.00 seconds = 3303.39 MB/sec
   Timing buffered disk reads: 542 MB in  3.01 seconds = 180.33 MB/sec
  
  root@fireball / #
  
  Try a real life example like dd. I have seen the above mentioned speed
  on disks with a file system on it which does limit the speed anyway.
  
  +1
  
  --
  Joost
 
 I watched the dd process when I was erasing the old drive.  I got about
 the same results.  It started out a little over 200 and went as low as
 170 or so close to the end.  On average, about what hdparm shows.  Close
 enough it seems.  ;-)

Yep, but do the same after adding a filesystem to the mix?
Eg. mount it somewhere, then dd to a file on that drive.
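
A minimal sketch of that kind of test, assuming GNU dd (for bs=1M and conv=fdatasync). TARGET_DIR is a made-up variable name for a directory on the mounted drive; it falls back to the temp directory so the script is harmless to try. The 16 MB size is only for illustration; a real measurement would want several GB.

```shell
#!/bin/sh
# Sketch: time a sequential write through the filesystem, not the raw device.
# TARGET_DIR (hypothetical name) should point at a directory on the drive
# under test; the default is the temp directory.
target_dir=${TARGET_DIR:-${TMPDIR:-/tmp}}
test_file="$target_dir/dd-throughput-test.$$"
size_mb=16

# conv=fdatasync makes dd flush the data to disk before it reports its
# rate, so the page cache does not inflate the figure.
dd if=/dev/zero of="$test_file" bs=1M count="$size_mb" conv=fdatasync

# Confirm how much was written, then clean up.
written=$(wc -c < "$test_file")
echo "wrote $written bytes"
rm -f "$test_file"
```

dd prints the elapsed time and MB/s on stderr when it finishes, which is the number to compare against the hdparm figure.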

--
Joost



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Rich Freeman
On Tue, Jul 1, 2014 at 2:09 AM, J. Roeleveld jo...@antarean.org wrote:
 On Tuesday, July 01, 2014 06:52:10 AM Mick wrote:

 What triggers a relocation?  I also have a drive which shows a sector
 relocation pending, but for a few days now and after some tests that showed
 no errors, it won't relocate it.

 I think a write to that sector should force a relocation.


In theory either a write to that sector or a successful read should
trigger a relocation.

In practice, I've never seen a drive actually do this - maybe I just
manage to pick drives with braindead firmware.  When I write to a
pending sector, the pending sector count goes down, but the relocated
sector count doesn't change, and usually in a few days I have another
pending sector.

The last time I had a drive fail I was running md raid, so a scrub
fixed all the pending sectors automatically, but the drive firmware
wasn't doing its part to relocate them.  Either that, or the drive had
run out of spare sectors and wasn't reporting this via SMART.
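
To watch the two counters described above, something along these lines could be run before and after writing to a pending sector. This is only a sketch: the sample table is made-up illustration data in the layout `smartctl -A` prints, and smart_raw is a hypothetical helper name. On a real system the sample would be replaced by the output of `smartctl -A /dev/sdX`.

```shell
#!/bin/sh
# Sketch: extract the raw values of two SMART attributes.
# smart_raw (hypothetical helper) reads an attribute table on stdin and
# prints the RAW_VALUE column (field 10) for the attribute named in $1.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Made-up sample data, NOT from any drive in this thread.
sample='ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8'

reallocated=$(printf '%s\n' "$sample" | smart_raw Reallocated_Sector_Ct)
pending=$(printf '%s\n' "$sample" | smart_raw Current_Pending_Sector)
echo "reallocated=$reallocated pending=$pending"
```

A pending count that drops while the reallocated count stays flat would match the behaviour described: the sector was rewritten in place rather than remapped.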

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Alan McKinnon
On 01/07/2014 07:52, Mick wrote:
 On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote:
 On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:
 What if I copied data to the drive until it was just about full.  I'm
 thinking like maybe 90 or 95% or so.  If I do that and run the test
 every few days, would it then catch a error after a few weeks or so of
 testing?  I realize no one knows with 100% certainty...

 As you already said, nobody knows with 100% certainty.

 In the failures I've experienced I'd expect it to start catching
 errors within a few days.  However, on those drives the relocated
 sector count never increases, which suggests that the firmware never
 relocated those sectors when overwritten, which seems brain-dead to
 me.

 If the drive relocates the sectors, then conceivably it could go quite
 a long time until having errors, probably in an entirely different set
 of sectors.

 Even if it doesn't relocate, the reliability of the bad sectors could
 be high or low.

 Rich
 
 What triggers a relocation?  I also have a drive which shows a sector
 relocation pending, but for a few days now and after some tests that showed
 no errors, it won't relocate it.
 


It's triggered by a write to the sector.



-- 
Alan McKinnon
alan.mckin...@gmail.com




Re: [gentoo-user] smartctrl drive error @60%

2014-07-01 Thread Dale
J. Roeleveld wrote:
 On Tuesday, July 01, 2014 04:21:45 AM Dale wrote:

 I watched the dd process when I was erasing the old drive.  I got about
 the same results.  It started out a little over 200 and went as low as
 170 or so close to the end.  On average, about what hdparm shows.  Close
 enough it seems.  ;-)
 Yep, but do the same after adding a filesystem to the mix?
 Eg. mount it somewhere, then dd to a file on that drive.

 --
 Joost



I've only ever used dd to blank a drive.  I never used it to copy
anything.  While dd may be a bit faster in my use, having a file system
is a more realistic use.  I think a file system would slow things down a
bit, maybe not much since file systems are pretty fast nowadays.  Thing
is, I'm fairly sure USB won't be as fast as a straight SATA connection. 
That is one reason I would rather use SATA connections instead.  That
was also the reason I posted that info.  It shows that on my rig here, I
can likely copy faster with a SATA connection than with USB.  The speed I
posted is a good bit faster than what Helmut posted, even tho his was a
general amount.  Unless Helmut has an older, slower machine, I
wouldn't expect mine to be much if any faster than his.  Basically, USB
would be a bottleneck that I might be able to avoid, and my mobo supports
eSATA connections. 

I'm not trying to benchmark, just give a general idea.  What hdparm
gives me is pretty close to what dd was giving and not too far off from
what I get when doing a copy with cp or rsync.  I been doing a good bit
of copying here lately.  I do have a drive that is the older SATA but
most are the newer and faster SATA. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-30 Thread Mick
On Sunday 29 Jun 2014 13:05:04 Rich Freeman wrote:
 On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:
  What if I copied data to the drive until it was just about full.  I'm
  thinking like maybe 90 or 95% or so.  If I do that and run the test
  every few days, would it then catch a error after a few weeks or so of
  testing?  I realize no one knows with 100% certainty...
 
 As you already said, nobody knows with 100% certainty.
 
 In the failures I've experienced I'd expect it to start catching
 errors within a few days.  However, on those drives the relocated
 sector count never increases, which suggests that the firmware never
 relocated those sectors when overwritten, which seems brain-dead to
 me.
 
 If the drive relocates the sectors, then conceivably it could go quite
 a long time until having errors, probably in an entirely different set
 of sectors.
 
 Even if it doesn't relocate, the reliability of the bad sectors could
 be high or low.
 
 Rich

What triggers a relocation?  I also have a drive which shows a sector 
relocation pending, but for a few days now and after some tests that showed no 
errors, it won't relocate it.

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-29 Thread Mick
On Sunday 29 Jun 2014 05:44:38 Dale wrote:
 Rich Freeman wrote:
  On Sat, Jun 28, 2014 at 11:27 PM, Dale rdalek1...@gmail.com wrote:
  So, thoughts?  Did it mark that part as bad and all is well or is this
  going to be trouble down the line?  Should I just fill the thing up with
  data and test the stuffin out of it to make sure?
  
  That is pretty typical.  You wrote to every sector on the drive.  You
  don't need to be able to read a sector to overwrite it, so doing this
  cleared out the drive's list of offline uncorrectable sectors.  If
  you're fortunate it relocated those sectors in which case the drive is
  only using good sectors now.  It can't relocate a sector unless it
  either gets a successful read, or it is overwritten, and you overwrote
  them.
  
  Either way the extended offline test passing isn't unusual.  Either it
  relocated the sectors in which case the drive is completely good or
  the data written to the bad sectors was readable when the test was
  run, which doesn't guarantee that it will still be readable a
  day/week/month/year from now.
  
  Unfortunately I don't think there is any way to find out what the
  firmware is doing, or to predict the likelihood of another failure.
  The only thing we can say for sure that like all hard drives, it WILL
  fail sometime.
  
  Rich
 
 What if I copied data to the drive until it was just about full.  I'm
 thinking like maybe 90 or 95% or so.  If I do that and run the test
 every few days, would it then catch a error after a few weeks or so of
 testing?  I realize no one knows with 100% certainty but I would like to
 backup my data say every couple weeks just in case.  If the drive works,
 fine.  If it fails, well, it wouldn't be the first time and it won't be
 a primary drive so no big loss.
 
 I got to find me a good drive for backups tho.  I'm waiting on a good
 sale of a brand other than Seagate tho.  That should help keep two
 drives from failing at the same time.  Well, a little anyway.  I think
 it is called Dale's Law now.  ;-)

I'm not sure what it is called, but it seems infectious!  I have a drive (in a 
laptop) which I recently zeroed out with dd and fsck -c for good measure, 
before I installed gentoo on it.  Yesterday, I tried a long test, but it won't 
complete.  It reached 10% remaining and it stayed there for a few hours.  I 
will repeat the test to see if it gets through this time, but I am worried 
that it's on its way out.

Oh well, I may install an SSD if it fails.

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-29 Thread Dale
Mick wrote:
 On Sunday 29 Jun 2014 05:44:38 Dale wrote:


 What if I copied data to the drive until it was just about full.  I'm
 thinking like maybe 90 or 95% or so.  If I do that and run the test
 every few days, would it then catch a error after a few weeks or so of
 testing?  I realize no one knows with 100% certainty but I would like to
 backup my data say every couple weeks just in case.  If the drive works,
 fine.  If it fails, well, it wouldn't be the first time and it won't be
 a primary drive so no big loss.

 I got to find me a good drive for backups tho.  I'm waiting on a good
 sale of a brand other than Seagate tho.  That should help keep two
 drives from failing at the same time.  Well, a little anyway.  I think
 it is called Dale's Law now.  ;-)

 I'm not sure what it is called, but it seems infectious!  I have a drive (in a
 laptop) which I recently zeroed out with dd and fsck -c for good measure,
 before I installed gentoo on it.  Yesterday, I tried a long test, but it won't
 complete.  It reached 10% remaining and it stayed there for a few hours.  I
 will repeat the test to see if it gets through this time, but I am worried
 that it's on its way out.

 Oh well, I may install an SSD if it fails.


That seems to be normal, at least for me.  Mine has certain percentages
that it just seems to sit at for a good while.  It eventually passes the
test tho.  Just leave it overnight and check it the next morning or
something.  I know laptops are different but got to do what you got to
do.  Maybe plugging it into a desktop or something would help.

Dale

:-)  :-)



Re: [gentoo-user] smartctrl drive error @60%

2014-06-29 Thread Mick
On Sunday 29 Jun 2014 09:42:39 Dale wrote:
 Mick wrote:
  On Sunday 29 Jun 2014 05:44:38 Dale wrote:
  What if I copied data to the drive until it was just about full.  I'm
  thinking like maybe 90 or 95% or so.  If I do that and run the test
  every few days, would it then catch a error after a few weeks or so of
  testing?  I realize no one knows with 100% certainty but I would like to
  backup my data say every couple weeks just in case.  If the drive works,
  fine.  If it fails, well, it wouldn't be the first time and it won't be
  a primary drive so no big loss.
  
  I got to find me a good drive for backups tho.  I'm waiting on a good
  sale of a brand other than Seagate tho.  That should help keep two
  drives from failing at the same time.  Well, a little anyway.  I think
  it is called Dale's Law now.  ;-)
  
  I'm not sure what it is called, but it seems infectious!  I have a drive (in a
  laptop) which I recently zeroed out with dd and fsck -c for good measure,
  before I installed gentoo on it.  Yesterday, I tried a long test, but it won't
  complete.  It reached 10% remaining and it stayed there for a few hours.  I
  will repeat the test to see if it gets through this time, but I am worried
  that it's on its way out.
  
  Oh well, I may install an SSD if it fails.
 
 That seems to be normal, at least for me.  Mine has certain percentages
 that it just seems to sit at for a good while.  It eventually passes the
 test tho.  Just leave it overnight and check it the next morning or
 something.  I know laptops are different but got to do what you got to
 do.  Maybe plugging it into a desktop or something would help.

I've restarted it and will leave it all day today to see what gives.

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-29 Thread Rich Freeman
On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:

 What if I copied data to the drive until it was just about full.  I'm
 thinking like maybe 90 or 95% or so.  If I do that and run the test
 every few days, would it then catch a error after a few weeks or so of
 testing?  I realize no one knows with 100% certainty...

As you already said, nobody knows with 100% certainty.

In the failures I've experienced I'd expect it to start catching
errors within a few days.  However, on those drives the relocated
sector count never increases, which suggests that the firmware never
relocated those sectors when overwritten, which seems brain-dead to
me.

If the drive relocates the sectors, then conceivably it could go quite
a long time until having errors, probably in an entirely different set
of sectors.

Even if it doesn't relocate, the reliability of the bad sectors could
be high or low.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-29 Thread Dale
Rich Freeman wrote:
 On Sun, Jun 29, 2014 at 12:44 AM, Dale rdalek1...@gmail.com wrote:
 What if I copied data to the drive until it was just about full.  I'm
 thinking like maybe 90 or 95% or so.  If I do that and run the test
 every few days, would it then catch a error after a few weeks or so of
 testing?  I realize no one knows with 100% certainty...
 As you already said, nobody knows with 100% certainty.

 In the failures I've experienced I'd expect it to start catching
 errors within a few days.  However, on those drives the relocated
 sector count never increases, which suggests that the firmware never
 relocated those sectors when overwritten, which seems brain-dead to
 me.

 If the drive relocates the sectors, then conceivably it could go quite
 a long time until having errors, probably in an entirely different set
 of sectors.

 Even if it doesn't relocate, the reliability of the bad sectors could
 be high or low.

 Rich



Yep.  I guess the best thing to do is test the stuffin out of it and
hope the tests don't wear it out.  lol 

As I told my ex more than once, time tells. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-29 Thread Frank Steinmetzger
On Wed, Jun 25, 2014 at 11:57:55PM -0500, Dale wrote:

 I'm just going to try and buy another 3TB drive as soon as I can.  I may
 even make it into a removable thingy.  Then I can make backups and just
 put it in a outbuilding.  By the way, my outbuilding is pretty far from

For such use, I am planning to get an external SATA dock, rather than use
a “removable thingy”. You pop in the naked drive and stow that away after
you’re done. This has multiple advantages. For one, you can read SMART data
from the external drives, provided you use a SATA connection. You can’t do
that over USB. So if your board has an eSATA connector, get an external dock
such as http://www.sharkoon.com/?q=en/node/1277.
If it doesn’t, you can get an internal one for a 5¼″ slot in the front of
your case, like http://www.sharkoon.com/?q=en/node/1281 for one 3.5″ HDD, or
http://www.sharkoon.com/?q=en/content/sata-qp-intern-multi for both big and
small HDDs (plus some USB3 connectors, if your case doesn’t have them but
your board provides a header).

Those docks are only slightly more expensive than an external case, but you
can use them for all your drives, not just a single one, and you have no
hassle with countless external power supplies (“which one did go into which
case?”).

PS: I’m not saying you should get a Sharkoon, after all I haven’t bought it
yet myself. But their site shows nicely what’s available and gives me ideas.
--
Gruß | Greetings | Qapla’
Please do not share anything from, with or about me on any social network.

Please don’t befuddle me with facts, my mind is set.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-28 Thread Neil Bothwick
On Sat, 28 Jun 2014 08:48:40 +0100, Mick wrote:

   I would think that your ISP providers in the US will be blocking
   outgoing port 25 to stop compromised MSWindows machines spamming the
   rest of us.  If you use my suggestion there shouldn't be a
   problem.  
  
  It makes no difference whether you address it directly to your ISP
  address or via an alias. The ISP won't block port 25 connections to
  its own servers from its own customers, otherwise none of them could
  send email at all!  
 
 In the US many big players are blocking outbound port 25 for their
 customers as a blanket measure to control spam from botnets, e.g.:
 
 http://www.verizon.com/Support/Residential/internet/highspeed/general+support/top+questions/questionsone/124274.htm

 If Dale uses the ssmtp.conf I sent he will be using a different port +
 TLS encryption and should not have a problem.

Yes, that makes sense. I thought you were referring to the aliases part
of the config. Using TLS or a different port if that's what the ISP needs
is perfectly logical.


-- 
Neil Bothwick

Bus: (n.) a connector you plug money into, something like a slot machine.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-28 Thread Mick
On Friday 27 Jun 2014 21:54:32 Neil Bothwick wrote:
 On Fri, 27 Jun 2014 14:22:09 +0100, Mick wrote:

  I would think that your ISP providers in the US will be blocking
  outgoing port 25 to stop compromised MSWindows machines spamming the
  rest of us.  If you use my suggestion there shouldn't be a problem.
 
 It makes no difference whether you address it directly to your ISP
 address or via an alias. The ISP won't block port 25 connections to its
 own servers from its own customers, otherwise none of them could send
 email at all!

In the US many big players are blocking outbound port 25 for their customers 
as a blanket measure to control spam from botnets, e.g.:

http://www.verizon.com/Support/Residential/internet/highspeed/general+support/top+questions/questionsone/124274.htm

If Dale uses the ssmtp.conf I sent he will be using a different port + TLS 
encryption and should not have a problem.

Even if Dale's ISP does not block port 25 for connections to the ISP's *own* 
mail servers, it may well block it to other providers' mail addresses for the 
same reason.  This was a common practice some years back (pre-Gmail) when ISPs 
had started charging for mail services.

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-28 Thread Dale
Mick wrote:
 On Friday 27 Jun 2014 21:54:32 Neil Bothwick wrote:
 On Fri, 27 Jun 2014 14:22:09 +0100, Mick wrote:
 I would think that your ISP providers in the US will be blocking
 outgoing port 25 to stop compromised MSWindows machines spamming the
 rest of us.  If you use my suggestion there shouldn't be a problem.
 It makes no difference whether you address it directly to your ISP
 address or via an alias. The ISP won't block port 25 connections to its
 own servers from its own customers, otherwise none of them could send
 email at all!
 In the US many big players are blocking outbound port 25 for their customers 
 as a blanket measure to control spam from botnets, e.g.:

 http://www.verizon.com/Support/Residential/internet/highspeed/general+support/top+questions/questionsone/124274.htm

 If Dale uses the ssmtp.conf I sent he will be using a different port + TLS 
 encryption and should not have a problem.

 Even if Dale's ISP does not block port 25 for connections to the ISP's *own* 
 mail servers, it may well block it to other providers' mail addresses for the 
 same reason.  This was a common practice some years back (pre-Gmail) when ISPs 
 had started charging for mail services.


According to the settings in Seamonkey, it should be port 995 and
SSL/TLS.   I used your basic setup which is port 465.  It works tho. 
:-) 

Dale

:-)  :-)



Re: [gentoo-user] smartctrl drive error @60%

2014-06-28 Thread Dale
Dale wrote:
 Howdy,

 I run this test every once in a while.  How bad is this:

 root@fireball / # smartctl -l selftest /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure        60%            16365  2905482560
 # 2  Extended offline    Completed: read failure        60%            16352  2905482560
 # 3  Extended offline    Completed without error        00%             8044  -
 # 4  Extended offline    Completed without error        00%             3121  -

 And better yet, is there any way to tell it to not use that part and
 finish the test?  It seems it stopped when it got to that, or I think it
 did. 

 Thoughts? 

 Dale

 :-)  :-) 


OK.  Update.  I got the new drive in, copied the files over, tested the
new drive A LOT, then did a dd on the old drive and wiped the WHOLE
thing.  I let dd run until it ran out of space and died, which took a
pretty good while.  After dd finished, I ran the Smart test again.  This
is what I get now:

root@fireball / # smartctl -l selftest /dev/sdd
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%            16466  -
# 2  Extended offline    Aborted by host                90%            16461  -
# 3  Extended offline    Completed: read failure        60%            16451  2905482560
# 4  Extended offline    Completed: read failure        60%            16432  2905482560
# 5  Extended offline    Completed: read failure        60%            16427  2905482560
 

Ignore the second one, I started the test on the old drive and was
meaning to do it on the new drive.  When dd finished, I wanted to start
a fresh test so I killed the second one.  As you can see in the latest
test, no errors. 
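
For anyone comparing logs like the two above, here is a rough sketch that pulls the failing LBA(s) out of the self-test log so they can be tracked between runs. It assumes the log layout shown in this message (entry lines starting with "#", LBA in the last column); the function name failed_lbas and the device /dev/sdX are made up for illustration.

```shell
# failed_lbas: read `smartctl -l selftest` output on stdin and print each
# distinct LBA_of_first_error reported by a self-test that hit a read failure.
# Entries that passed carry "-" in the last column and are skipped.
failed_lbas() {
    awk '/^[ \t]*#[ \t]*[0-9]+/ && /read failure/ && $NF ~ /^[0-9]+$/ { print $NF }' |
        sort -un
}

# Typical use (needs root, device name is a placeholder):
#   smartctl -l selftest /dev/sdX | failed_lbas
```

If the same LBA keeps showing up run after run, the problem is confined to one spot; new LBAs appearing over time would suggest the damage is spreading.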

So, thoughts?  Did it mark that part as bad and all is well or is this
going to be trouble down the line?  Should I just fill the thing up with
data and test the stuffin out of it to make sure?

Thanks.

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-28 Thread Rich Freeman
On Sat, Jun 28, 2014 at 11:27 PM, Dale rdalek1...@gmail.com wrote:

 So, thoughts?  Did it mark that part as bad and all is well or is this
 going to be trouble down the line?  Should I just fill the thing up with
 data and test the stuffin out of it to make sure?


That is pretty typical.  You wrote to every sector on the drive.  You
don't need to be able to read a sector to overwrite it, so doing this
cleared out the drive's list of offline uncorrectable sectors.  If
you're fortunate it relocated those sectors in which case the drive is
only using good sectors now.  It can't relocate a sector unless it
either gets a successful read, or it is overwritten, and you overwrote
them.

Either way the extended offline test passing isn't unusual.  Either it
relocated the sectors in which case the drive is completely good or
the data written to the bad sectors was readable when the test was
run, which doesn't guarantee that it will still be readable a
day/week/month/year from now.
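
One hedged way to tell the two cases apart is to compare the sector counters before and after the overwrite. The sketch below assumes the usual `smartctl -A` column layout (raw value in the tenth column) and the attribute names smartmontools normally prints; the function name sector_counts and /dev/sdX are placeholders, and some firmware simply does not update these counters.

```shell
# sector_counts: read `smartctl -A` output on stdin and print NAME=RAW_VALUE
# for the attributes that track remapped and suspect sectors.
# Assumes the standard layout: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED
# WHEN_FAILED RAW_VALUE, so the raw value is field 10.
sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2 "=" $10 }'
}

# Typical use (needs root, device name is a placeholder):
#   smartctl -A /dev/sdX | sector_counts
```

If Reallocated_Sector_Ct went up after the wipe, the drive most likely remapped the bad sectors. If only Current_Pending_Sector dropped to zero, the drive rewrote them in place and they may or may not hold.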

Unfortunately I don't think there is any way to find out what the
firmware is doing, or to predict the likelihood of another failure.
The only thing we can say for sure is that, like all hard drives, it WILL
fail sometime.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-28 Thread Dale
Rich Freeman wrote:
 On Sat, Jun 28, 2014 at 11:27 PM, Dale rdalek1...@gmail.com wrote:
 So, thoughts?  Did it mark that part as bad and all is well or is this
 going to be trouble down the line?  Should I just fill the thing up with
 data and test the stuffin out of it to make sure?

 That is pretty typical.  You wrote to every sector on the drive.  You
 don't need to be able to read a sector to overwrite it, so doing this
 cleared out the drive's list of offline uncorrectable sectors.  If
 you're fortunate it relocated those sectors in which case the drive is
 only using good sectors now.  It can't relocate a sector unless it
 either gets a successful read, or it is overwritten, and you overwrote
 them.

 Either way the extended offline test passing isn't unusual.  Either it
 relocated the sectors in which case the drive is completely good or
 the data written to the bad sectors was readable when the test was
 run, which doesn't guarantee that it will still be readable a
 day/week/month/year from now.

 Unfortunately I don't think there is any way to find out what the
 firmware is doing, or to predict the likelihood of another failure.
 The only thing we can say for sure is that, like all hard drives, it WILL
 fail sometime.

 Rich



What if I copied data to the drive until it was just about full?  I'm
thinking like maybe 90 or 95% or so.  If I do that and run the test
every few days, would it then catch an error after a few weeks or so of
testing?  I realize no one knows with 100% certainty but I would like to
backup my data say every couple weeks just in case.  If the drive works,
fine.  If it fails, well, it wouldn't be the first time and it won't be
a primary drive so no big loss. 

I got to find me a good drive for backups tho.  I'm waiting on a good
sale of a brand other than Seagate tho.  That should help keep two
drives from failing at the same time.  Well, a little anyway.  I think
it is called Dale's Law now.  ;-) 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-27 Thread Mick
On Thursday 26 Jun 2014 16:08:52 Neil Bothwick wrote:
 On Thu, 26 Jun 2014 09:14:39 -0500, Dale wrote:
  Holy sheep.  It worked.  I lost my jaw yesterday I think it was.  I'm
  not sure what I am going to be missing now.  :-D  Neil and Allan will be
  so impressed.  LOL
  
  OK.  So, what will send me a message now?  Do I need to tell it to send
  me something, say from smart stuff, or does it just know to do it?
 
 You tell cron where to mail reports by setting MAILTO=you@wherever at the
 top of /etc/crontab. It will then mail you every time a cronjob produces
 output.

Or complete the /etc/ssmtp/ssmtp.conf and revaliases with the info I have sent 
you in this thread and set MAILTO=root in your /etc/crontab.

I would think that your ISP providers in the US will be blocking outgoing port 
25 to stop compromised MSWindows machines spamming the rest of us.  If you use 
my suggestion there shouldn't be a problem.

If your internet connection is down for some reason, you should get deadletter 
files in /root/ with the output.
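
Since cron only mails when a job prints something, a filter that stays quiet on a clean result keeps the mailbox free of "all is well" reports. A minimal sketch, assuming the self-test log layout shown elsewhere in this thread; the function name selftest_alert, the script path and /dev/sdX are made up for illustration.

```shell
# selftest_alert: read `smartctl -l selftest` output on stdin and print the
# newest entry (# 1) only when its status is not "Completed without error".
# A clean result prints nothing, so cron has no output to mail.
selftest_alert() {
    awk '/^[ \t]*#[ \t]*1[ \t]/ && !/Completed without error/ {
        print "Latest self-test needs attention: " $0
    }'
}

# Hypothetical /etc/crontab entry, with the function saved in a small wrapper
# script that pipes `smartctl -l selftest /dev/sdX` into selftest_alert
# (mail goes to whatever MAILTO is set to):
#   0 9 * * 1  root  /usr/local/sbin/selftest-alert.sh
```

Note that a test still in progress or aborted also counts as "needs attention" here, which seemed the safer default.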

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-27 Thread Neil Bothwick
On Fri, 27 Jun 2014 14:22:09 +0100, Mick wrote:

  You tell cron where to mail reports by setting MAILTO=you@wherever at
  the top of /etc/crontab. It will then mail you every time a cronjob
  produces output.  
 
 Or complete the /etc/ssmtp/ssmtp.conf and revaliases with the info I
 have sent you in this thread and set MAILTO=root in your /etc/crontab.
 
 I would think that your ISP providers in the US will be blocking
 outgoing port 25 to stop compromised MSWindows machines spamming the
 rest of us.  If you use my suggestion there shouldn't be a problem.

It makes no difference whether you address it directly to your ISP
address or via an alias. The ISP won't block port 25 connections to its
own servers from its own customers, otherwise none of them could send
email at all!


-- 
Neil Bothwick

NOTE: In order to control energy costs the light at the end
of the tunnel has been shut off until further notice...




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Mick
On Thursday 26 Jun 2014 06:56:26 Mick wrote:
 Sort out access rights to 0604

Oops!  Potentially dangerous typo!  Should be:  0640 of course.
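
The difference matters because 0604 lets every local user read the file while locking the group out, whereas 0640 does the opposite. Here is a small sketch for checking that nothing is left open to "other"; it assumes GNU stat (as on a typical Gentoo box), and the function name mode_ok is made up for illustration.

```shell
# mode_ok: succeed (exit 0) if FILE grants no permissions to "other",
# fail with 1 if it does, and with 2 if the file cannot be examined.
# Relies on GNU stat printing the octal mode, e.g. 640 or 604.
mode_ok() {
    mode=$(stat -c '%a' "$1") || return 2
    case "$mode" in
        *0) return 0 ;;   # last octal digit 0: nothing for "other"
        *)  return 1 ;;
    esac
}

# Typical use:
#   mode_ok /etc/ssmtp/ssmtp.conf || echo "ssmtp.conf is readable by others"
```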

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread David Haller
Hello,

On Wed, 25 Jun 2014, Dale wrote:
David Haller wrote:
 On Wed, 25 Jun 2014, Dale wrote:
 Yeah. Oh, and I had a clean smart until a few days ago, luckily I
 already had a WD Red (WD40EFRX) drive waiting when this attrib jumped
 from 0 to:
   5 Reallocated_Sector_Ct 0x0033 087 087 036 Pre-fail Always 17688
 Other Seagates (a few 1.5T drives) have also given me trouble,
 the 2T Samsung already relabeled and sold as a Seagate but with
 Samsung in the FW though is still ok. [..] 

I was wondering about how that would be updated since a lot of that
stuff requires windoze. 

I had the fun with a couple of those 2TB Samsung drives a while ago
(Jan 2012?). Samsung had some .iso files available, which you could
write to a CD/DVD/USB-Shtik, and then boot from them. For me, the
biggest problem was _which_ of the 4 HD204UI and 2 HD203WI I had and
have needed the update... I don't remember the SW displaying serial
numbers... *gah* Anyway, I've got it sorted out and updated those that
needed the update. With Seagate/WD/HGST/Toshiba I've no experience. 
And I still am grumpy about Samsung selling off their HDD stuff to
Seagate.

BTW: in German, we have a saying for Seagate drives: "sie geht oder
sie geht nicht", basically spoken as "sea gate odr sea gate nicht",
meaning "she works or she won't"... But, talk about IBM deathstars,
and WD also has a record; basically, all HDDs are much alike nowadays,
and have been for years. There's always a bad batch somewhere ...

Seagate: Seagate drives (???), ex-Maxtor drives, ex-Samsung drives
WD: bought Hitachi GST (formerly IBM), i.e. WD + ex-IBM/HGST drives
Toshiba: new player, no warranty for bulk drives, unknown for the desktop

I miss the days when you still had a real choice (WD, Seagate, Maxtor,
Samsung, IBM, and smaller/specialized stuff (Excelstor, Toshiba for
Laptop drives))...

If Seagate would at least label their former Samsung drives in a
recognizable manner (say, ST*DS* vs. ST*DM* or keep the Samsung label
or whatever), I'd be a happy bunny, but as of now, that failing
ST3000DM001 was the last Seagate I've bought for quite some time.

Oh, and I've got a second ST3000DM001 used externally, with stuff that
can get lost, but it'd be inconvenient. And yes, I'm planning on
ordering a replacement ASAP (another WD40EFRX). Just in case. And the
steady state of my discs is full anyway ...

# dfall -t ext3 -t ext4 -h
[..]
14.6T    13.2T   996.4G  90%

and that's just because item one: /dev/sda is taken up by a 128G
Samsung 830 SSD, and item two: I just swapped in that 4TB WD for the
failing Seagate 3TB. (and yes, I use ext{3,4} exclusively on disk),
there's another ~10 TiB in the fileserver on 11 disks and a few
naked drives (and a docking station (Sharkoon QuickDeck)) and an
external drive ;)

 I ordered a drive.  It should be here tomorrow.  In the meantime, I
 shutdown and re-seated all the cables, power too. I got the test running
 again but results is a few hours off yet.  It did pass the short test
 tho.  I'm not sure that it means much. 

 Good. Do not use dd, it WILL fail at the first error. Use gnu ddrescue
 or dd_rescue to grab an image. I used mc to copy via filesystem, eg. 
 'rsync -auxlPRAXSHD /foo/ /bar/' is fine too. Oh, and I hope you
 didn't buy a Seagate again ;)

BTW: *GRR* for you buying a ST3000DM001 again

I plan to rsync or cp the data over.

Good plan.

The dd part will come into play after I am sure I got everything off
that I can get and am just erasing the drive completely. I plan to dd
the drive then run the tests again just to see what it is doing. 
Heck, maybe it will reallocate that area like it should be doing
already, I guess.

Reallocation happens on writes ... Look for --write-sector in 'man
hdparm'.

I've been doing that smartctl -t .. / hdparm --write-sector ... 
stuff for a bunch of sectors but got tired of that game. And you
having 104 pending sectors? *gah* That'll get tedious. Probably using
'ddrescue /dev/zero /dev/sdX sdX.log' would be easier. That way, after
you got whatever data you can rescue from that drive, you can clear
the drive too before sending it in for a warranty replacement (if
still applicable).
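
For the sector-by-sector route, a cautious sketch that only prints the hdparm commands instead of running them, so they can be checked by eye first. The function name rewrite_cmds and /dev/sdX are made up for illustration; the write command zero-fills the sector and destroys whatever was in it.

```shell
# rewrite_cmds: print (do NOT run) the hdparm commands to inspect and then
# overwrite one sector, given a device and an LBA. Running the printed
# --write-sector line is destructive: the sector is filled with zeros.
rewrite_cmds() {
    dev=$1
    lba=$2
    case "$lba" in
        ''|*[!0-9]*) echo "bad LBA: $lba" >&2; return 1 ;;
    esac
    printf 'hdparm --read-sector %s %s\n' "$lba" "$dev"
    printf 'hdparm --yes-i-know-what-i-am-doing --write-sector %s %s\n' "$lba" "$dev"
}

# Typical use, with an LBA taken from the self-test log:
#   rewrite_cmds /dev/sdX 2905482560
```

If the read command fails with an I/O error and succeeds after the write, the drive has either remapped the sector or rewritten it in place.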

Time will tell.  I'll be having fun tomorrow tho.  ;-)

Do have fun, after you got your data off that drive :)

-dnh

-- 
Is there a book on the moderate use of footnotes?
If so, I'm willing to send you a copy.
[Thorsten Haude to David Haller in sl-etikette]



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Neil Bothwick
On Wed, 25 Jun 2014 22:54:35 -0500, Dale wrote:

 Curious.  I hope I don't start a flame war here.  I have had WD, Seagate
 and I think there is a Samsung here somewhere, may be the one that is
 rolling over on its back now.  The one drive that failed a few years ago
 was a WD drive.  That said, all the other WD drives I have had just got
 too small to really use, and slow when SATA came out.  I'm partial to WD
 and Seagate still since I got good long term use out of those.  Based on
 your experience, you tend to be of the same opinion? 
 
 Allan, your situation should involve a lot of hard drives.  Any
 thoughts?  Neil, you have a nice big opinion on this? 

Yes, mix drives from different manufacturers. Or buy them at different
times. All manufacturers can have bad batches (remember the IBM
Deathstar?). I bought two Seagate drives a couple of years ago, for use
in a RAID. The only time I have ignored my own advice on this matter
(other matters are way off topic!). After a year they both started
showing SMART errors and one of them failed soon after, the other was
replaced before it had a chance to fail.

Yes, it's anecdotal, but it makes sense - true redundancy means using
different sources.


-- 
Neil Bothwick

I don't suffer from insanity. I enjoy every minute of it.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Neil Bothwick
On Wed, 25 Jun 2014 21:15:54 -0500, Dale wrote:

 I like this part:
 
 Extremely simple MTA to get mail off the system to a Mailhub
 
   ^  That part right up there.  :-D  That may be a new thread, if
 needed.

My first thought was even Dale can't have problems with that.

I soon reconsidered...


-- 
Neil Bothwick

Voting Democrat or Republican is like choosing a cabin in the Titanic.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Alan McKinnon
On 26/06/2014 05:54, Dale wrote:
 Curious.  I hope I don't start a flame war here.  I have had WD, Seagate
 and I think there is a Samsung here somewhere, may be the one that is
 rolling over on its back now.  The one drive that failed a few years ago
 was a WD drive.  That said, all the other WD drives I have had just got
 too small to really use, and slow when SATA came out.  I'm partial to WD
 and Seagate still since I got good long term use out of those.  Based on
 your experience, you tend to be of the same opinion? 
 
 Allan, your situation should involve a lot of hard drives.  Any
 thoughts?  Neil, you have a nice big opinion on this? 


My experiences aren't worth much in this case, what I had to deal with
was data center setups where

- the power has never gone off for 6 years
- the drives never spin down and just keep on turning year after year
- the servers were the nice big ones Dell makes with awesome cooling
- the data center feels like a fridge and the ambient temp never varies
more than 1 deg
- the server power supplies are seriously high grade, the 5V and 12V out
of them are solid and do not fluctuate at all

Add all this up and it's an almost perfect environment for drives to
last a long time. You don't have that, not even close.

I have only 1 little bit of anecdotal data:

my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years
now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it
real busy when the kids don't. And those drives just keep on turning and
turning and turning, I didn't do anything special. I put it down to
statistics - no-one makes bad drives (or cars) these days and I haven't
pulled the unlucky card yet. I dunno, go figure

-- 
Alan McKinnon
alan.mckin...@gmail.com




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Neil Bothwick wrote:
 On Wed, 25 Jun 2014 22:54:35 -0500, Dale wrote:

 Curious.  I hope I don't start a flame war here.  I have had WD, Seagate
 and I think there is a Samsung here somewhere, may be the one that is
 rolling over on its back now.  The one drive that failed a few years ago
 was a WD drive.  That said, all the other WD drives I have had just got
 too small to really use, and slow when SATA came out.  I'm partial to WD
 and Seagate still since I got good long term use out of those.  Based on
 your experience, you tend to be of the same opinion? 

 Allan, your situation should involve a lot of hard drives.  Any
 thoughts?  Neil, you have a nice big opinion on this? 
 Yes, mix drives from different manufacturers. Or buy them at different
 times. All manufacturers can have bad batches (remember the IBM
 Deathstar?). I bought two Seagate drives a couple of years ago, for use
 in a RAID. The only time I have ignored my own advice on this matter
 (other matters are way off topic!). After a year they both started
 showing SMART errors and one of them failed soon after, the other was
 replaced before it had a chance to fail.

 Yes, it's anecdotal, but it makes sense - true redundancy means using
 different sources.



Yep, it makes good sense.  Each batch can have one oddball failure but
if a batch has a firmware/hardware fault, the whole batch can die at the
same time.  One could certainly see the point that having say a WD and a
Seagate mirroring each other would be good advice. Having two drives
that are only one digit apart on the serial number could very well be a
recipe for problems, unless one is really lucky and got two well made
drives.

Given how things are manufactured nowadays and the compact data on the
media, it doesn't take much to make a dud for sure.  This sort of
reminds me of an old saying.  A chain is only as strong as its weakest
link.  It doesn't take much to make a hard drive either really good or
really bad.  I don't think they aim for really good, just good enough to
stay out of the really bad area.  ;-)

I may have to keep an eye out for a WD drive for the next one.

Dale

:-)  :-)



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Neil Bothwick wrote:
 On Wed, 25 Jun 2014 21:15:54 -0500, Dale wrote:

 I like this part:

 Extremely simple MTA to get mail off the system to a Mailhub

   ^  That part right up there.  :-D  That may be a new thread, if
 needed.
 My first thought was even Dale can't have problems with that.

 I soon reconsidered...



Well, remember me building a init thingy a good while back?  Yep, it
failed to boot a while back.  It sort of started but from what was on
the screen, I could tell the init thingy failed.  I edited grub to
remove the init thingy and booted.  Then I removed all the init
thingys.  I was using the drop dead simple dracut to make it and it
still failed.  After it failed and I tried to boot without it, I was
diggin for my Kubuntu DVD.  It booted fine without it tho.  Whew!!!

I'm thinking of something tho.  Btrfs.  While I have a new drive with no
file system on it, it's a good time to think on switching from LVM. 
Hmmm.   I'm currently on gentoo-sources 3.14. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Alan McKinnon wrote:
 My experiences aren't worth much in this case, what I had to deal with
 was data center setups where

 - the power has never gone off for 6 years
 - the drives never spin down and just keep on turning year after year
 - the servers were the nice big ones Dell makes with awesome cooling
 - the data center feels like a fridge and the ambient temp never varies
 more than 1 deg
 - the server power supplies are seriously high grade, the 5V and 12V out
 of them are solid and do not fluctuate at all

 Add all this up and it's an almost perfect environment for drives to
 last a long time. You don't have that, not even close.

 I have only 1 little bit of anecdotal data:

 my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years
 now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it
 real busy when the kids don't. And those drives just keep on turning and
 turning and turning, I didn't do anything special. I put it down to
 statistics - no-one makes bad drives (or cars) these days and I haven't
 pulled the unlucky card yet. I dunno, go figure

Well, it does make good points tho.  I keep my room here pretty cool. 
It's not as cool as your data center but I have a window A/C and my own
heater.  I don't mind it being a little cool in the winter but don't
like it warm in the summer either.  The cooler the better. 

I also have the Cooler Master HAF-932 case with those really nice large
fans.  The hard drives are right in front of the front intake fan.  I
have a power supply that is really too big for what I have running.  I
can't recall the brand and wattage just that it doesn't pull near as
much power as I thought it would.  It pulls less than half what my older
and much slower puter pulled.  Also, I rarely shut this thing down.  I
did the other night to unplug/re-plug all the cables but other than
that, it is usually because I have lost power from the mains. 

So, keep them cool, good clean power and leave them running when ya
can.  Sounds like a plan.  ;-) 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Rich Freeman
On Thu, Jun 26, 2014 at 7:03 AM, Dale rdalek1...@gmail.com wrote:
 I'm thinking of something tho.  Btrfs.  While I have a new drive with no
 file system on it, it's a good time to think on switching from LVM.
 Hmmm.   I'm currently on gentoo-sources 3.14.

I think btrfs is usable, but not without its problems.  I don't think
I'd run it without a daily backup onto something that doesn't run
btrfs.  I'm running btrfs in raid1 configuration, and I've only
restored from the backup once, though if I didn't have it I probably
could have still recovered (ENOSPC issue - the usual solutions weren't
working).  Earlier this week I was having lockup issues (I tried to
re-enable snapper and deleting a few snapshots at once caused it to
stop syncing, and then for several days any kind of heavy write
activity would cause the issue to repeat, which I suspect was the
result of delayed cleanup).

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Rich Freeman wrote:
 On Thu, Jun 26, 2014 at 7:03 AM, Dale rdalek1...@gmail.com wrote:
 I'm thinking of something tho.  Btrfs.  While I have a new drive with no
 file system on it, it's a good time to think on switching from LVM.
 Hmmm.   I'm currently on gentoo-sources 3.14.
 I think btrfs is usable, but not without its problems.  I don't think
 I'd run it without a daily backup onto something that doesn't run
 btrfs.  I'm running btrfs in raid1 configuration, and I've only
 restored from the backup once, though if I didn't have it I probably
 could have still recovered (ENOSPC issue - the usual solutions weren't
 working).  Earlier this week I was having lockup issues (I tried to
 re-enable snapper and deleting a few snapshots at once caused it to
 stop syncing, and then for several days any kind of heavy write
 activity would cause the issue to repeat, which I suspect was the
 result of delayed cleanup).

 Rich



Well, I really don't have enough time to read up and get to know the
commands and such either.  I guess I better stick with LVM, for now at
least.  When I get another drive, I could always switch then and maybe
it will be more stable then as well.  Maybe. 

Thanks for sharing the info. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Alan McKinnon
On 26/06/2014 13:20, Dale wrote:
 Alan McKinnon wrote:
 My experiences aren't worth much in this case, what I had to deal with
 was data center setups where

 - the power has never gone off for 6 years
 - the drives never spin down and just keep on turning year after year
 - the servers were the nice big ones Dell makes with awesome cooling
 - the data center feels like a fridge and the ambient temp never varies
 more than 1 deg
 - the server power supplies are seriously high grade, the 5V and 12V out
 of them are solid and do not fluctuate at all

 Add all this up and it's an almost perfect environment for drives to
 last a long time. You don't have that, not even close.

 I have only 1 little bit of anecdotal data:

 my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years
 now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it
 real busy when the kids don't. And those drives just keep on turning and
 turning and turning, I didn't do anything special. I put it down to
 statistics - no-one makes bad drives (or cars) these days and I haven't
 pulled the unlucky card yet. I dunno, go figure
 
 Well, it does make good points tho.  I keep my room here pretty cool. 
 It's not as cool as your data center but I have a window A/C and my own
 heater.  I don't mind it being a little cool in the winter but don't
 like it warm in the summer either.  The cooler the better. 
 
 I also have the Cooler Master HAF-932 case with those really nice large
 fans.  The hard drives are right in front of the front intake fan.  I
 have a power supply that is really too big for what I have running.  I
 can't recall the brand and wattage just that it doesn't pull near as
 much power as I thought it would.  It pulls less than half what my older
 and much slower puter pulled.  Also, I rarely shut this thing down.  I
 did the other night to unplug/re-plug all the cables but other than
 that, it is usually because I have lost power from the mains. 
 
 So, keep them cool, good clean power and leave them running when ya
 can.  Sounds like a plan.  ;-) 


You got it :-)

Hard drives are mechanical objects, not electronic ones, and they fail
for mechanical reasons. Motors fail, bearings seize, spindle arms wear
out. Transforming magnetic blobs on the platter into binary bits is very
reliable, as long as the head is in exactly the place it is supposed to
be. So the enemies of disks are environmental:

- temperature and humidity changes
- frequent spin ups and spin downs
- dust
- power dips/fluctuations and brown-outs
- being dropped, knocked and generally abused

etc, etc, etc

Take care of the environmental factors, and statistics fall in your
favour, making the odds good you'll get the life you expect.




-- 
Alan McKinnon
alan.mckin...@gmail.com




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Alan McKinnon wrote:
 On 26/06/2014 13:20, Dale wrote:
 Alan McKinnon wrote:
 My experiences aren't worth much in this case, what I had to deal with
 was data center setups where

 - the power has never gone off for 6 years
 - the drives never spin down and just keep on turning year after year
 - the servers were the nice big ones Dell makes with awesome cooling
 - the data center feels like a fridge and the ambient temp never varies
 more than 1 deg
 - the server power supplies are seriously high grade, the 5V and 12V out
 of them are solid and do not fluctuate at all

 Add all this up and it's an almost perfect environment for drives to
 last a long time. You don't have that, not even close.

 I have only 1 little bit of anecdotal data:

 my nas at home has 4 x 3T WD Green drives in it, going on almost 2 years
 now. My kids hammer the blazes out of that thing, and ZFS scrubs keep it
 real busy when the kids don't. And those drives just keep on turning and
 turning and turning, I didn't do anything special. I put it down to
 statistics - no-one makes bad drives (or cars) these days and I haven't
 pulled the unlucky card yet. I dunno, go figure
 Well, it does make good points tho.  I keep my room here pretty cool. 
 It's not as cool as your data center but I have a window A/C and my own
 heater.  I don't mind it being a little cool in the winter but don't
 like it warm in the summer either.  The cooler the better. 

 I also have the Cooler Master HAF-932 case with those really nice large
 fans.  The hard drives are right in front of the front intake fan.  I
 have a power supply that is really too big for what I have running.  I
 can't recall the brand and wattage just that it doesn't pull near as
 much power as I thought it would.  It pulls less than half what my older
 and much slower puter pulled.  Also, I rarely shut this thing down.  I
 did the other night to unplug/re-plug all the cables but other than
 that, it is usually because I have lost power from the mains. 

 So, keep them cool, good clean power and leave them running when ya
 can.  Sounds like a plan.  ;-) 

 You got it :-)

 hard drives are mechanical objects, not electronic ones, and they fail
 for mechanical reasons. Motors fail, bearings seize, spindle arms wear
 out. Transforming magnetic blobs on the platter into binary bits is very
 reliable, as long as the head is in exactly the place it is supposed to
 be. So the enemies of disks are environmental;

 - temperature and humidity changes
 - frequent spin ups and spin downs
 - dust
 - power dips/fluctuations and brown-outs
 - being dropped, knocked and generally abused

 etc, etc, etc

 Take care of the environmental factors, and statistics fall in your
 favour making the odds good you'll get the life you expect


I think that is one reason I have had some pretty good luck with that. 
I might also add, I have actually only had one computer that failed. 
That includes the ones that folks just gave me which is quite a few. 
Most of them just get too slow to use.  The ones I build, I build them
like a tank.  I put coolers on everything that is even a little warm. 
My CPU cooler on my current rig is pretty large.  Case fans blowing a
lot of air, quiet if possible.  For the drive that is failing now to be
going out, it has to have an issue not related to cooling and such. 
Unless it was somehow handled badly while being shipped to me, it's never
been dropped or anything either.  This is a desktop, with wheels since
it is on carpet, and it rarely goes anywhere.  It doesn't get rattled
around like a laptop or something. 

My old rig, AMD 2500+ in an old full tower case, still runs good.  I
booted it a month or so ago.  I had a Volcano 11 or 12 on the CPU which
is solid copper.  I replaced the northbridge cooler with a copper cooler
with a fan.  The mosfets close to the CPU, I added coolers to them too. 
It had 5 case fans.  It wasn't quiet but it ran cool.  The mobo temps
were usually just a couple degrees above room temp.  CPU never got over
100F.  Heck, the CPU in my current rig has never seen 110F.  The highest
I have ever seen was 107F and that was when I was compiling and had
power to blink just enough to cut off my A/C for an hour or so.  Maybe I
need a UPS for my A/C too.  :-D 

It seems the best thing WE can do is provide good power and good cooling,
don't drop it, and keep backups. 

I went back through the error logs and found this:

Jun 12 23:30:36 localhost smartd[2688]: Device: /dev/sdc [SAT], 104 Currently unreadable (pending) sectors
Jun 12 23:30:36 localhost smartd[2688]: Device: /dev/sdc [SAT], 104 Offline uncorrectable sectors

That's the first error I could find.  It went from nothing to that in
one huge jump.  I also found this:

Jun  8 03:10:02 localhost sSMTP[7164]: Unable to locate mail
Jun  8 03:10:02 localhost sSMTP[7164]: Cannot open mail:25
Jun  8 03:10:03 localhost CROND[7145]: (root) MAIL (mailed 57 bytes of output but got status 0x0001)

It seems it is trying to mail something.  I need 

Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Mick wrote:
 On Thursday 26 Jun 2014 06:56:26 Mick wrote:
 Sort out access rights to 0604
 Oops!  Potentially dangerous typo!  Should be:  0640 of course.


Picking a short one to reply to.

Holy sheep.  It worked.  I lost my jaw yesterday I think it was.  I'm
not sure what I am going to be missing now.  :-D  Neil and Alan will be so
impressed.  LOL 

OK.  So, what will send me a message now?  Do I need to tell it to send
me something, say from smart stuff, or does it just know to do it?

I found this in messages:

Jun  8 03:10:02 localhost sSMTP[7164]: Unable to locate mail
Jun  8 03:10:02 localhost sSMTP[7164]: Cannot open mail:25
Jun  8 03:10:03 localhost CROND[7145]: (root) MAIL (mailed 57 bytes of output but got status 0x0001)

It seems that something was trying to email me.  Crond maybe?  It was at
the top of the file right after logrotate did its thing.  Could that be it?

Thanks much.  Going to look for missing body parts now.  Hmmm, maybe I
will find a better brain.  ;-) 

Dale

:-)  :-)




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Neil Bothwick
On Thu, 26 Jun 2014 09:14:39 -0500, Dale wrote:

 Holy sheep.  It worked.  I lost my jaw yesterday I think it was.  I'm
 not sure what I am going to be missing now.  :-D  Neil and Alan will be so
 impressed.  LOL 
 
 OK.  So, what will send me a message now?  Do I need to tell it to send
 me something, say from smart stuff, or does it just know to do it?

You tell cron where to mail reports by setting MAILTO=you@wherever at the
top of /etc/crontab. It will then mail you every time a cronjob produces
output.


-- 
Neil Bothwick

... I just forgot to increment the counter, Tom said, nonplussed.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Neil Bothwick wrote:
 On Thu, 26 Jun 2014 09:14:39 -0500, Dale wrote:

 Holy sheep.  It worked.  I lost my jaw yesterday I think it was.  I'm
 not sure what I am going to be missing now.  :-D  Neil and Alan will be so
 impressed.  LOL 

 OK.  So, what will send me a message now?  Do I need to tell it to send
 me something, say from smart stuff, or does it just know to do it?
 You tell cron where to mail reports by setting MAILTO=you@wherever at the
 top of /etc/crontab. It will then mail you every time a cronjob produces
 output.



Well, right now I am trying to get smartmon to send them since that is
pretty important and I have an error at the moment to really test it.  I
think I am having some success here.  I've been googling and trying to sort
this thing out.  So far, I have this:

/dev/sda -a -d sat -o on -S on -s (S/../.././02|L/../../6/12) -m root
/dev/sdb -a -d sat -o on -S on -s (S/../.././02|L/../../6/12) -m root
/dev/sdc -a -d sat -o on -S on -s (S/../.././02|L/../../6/12) -m root

I changed it to testing during the day since I am usually sleeping then
and up at night.  Anyway, when I restart smartd, I get an email about the
error.  Yippee!  What I would like tho, is for it to just know to do this
for whatever drives are added/removed without me having to change the
config.  I bet there is a way to do that. 
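
For reference, the -s schedule in the lines above can be read as a regular expression that smartd compares against a T/MM/DD/d/HH stamp (test type, month, day of month, day of week, hour). The Python sketch below is only an illustration of that reading, not smartd's own code: due_tests is a hypothetical helper, and it assumes Monday = 1 through Sunday = 7 and whole-string matching, as the smartd.conf man page describes.

```python
import re
from datetime import datetime

# Schedule regex copied from the smartd.conf lines above.
SCHEDULE = r"(S/../.././02|L/../../6/12)"


def due_tests(when, schedule=SCHEDULE):
    """Return the self-test types (S = short, L = long) that match `when`.

    Assumption: the stamp layout is T/MM/DD/d/HH with d = 1 (Monday)
    through 7 (Sunday), and the regex must match the whole stamp.
    """
    due = []
    for test_type in ("S", "L"):
        stamp = "{}/{:02d}/{:02d}/{}/{:02d}".format(
            test_type, when.month, when.day, when.isoweekday(), when.hour)
        if re.fullmatch(schedule, stamp):
            due.append(test_type)
    return due
```

Read this way, the short test is due every day at 02:00 and the long test on Saturdays at 12:00. For the add/remove question, smartd.conf also has a DEVICESCAN directive that applies one set of directives to every drive smartd finds, which may be worth a look.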

Now, let me go check into that cron thing.  Is there anything else I
need to set up to send emails like this?  Smartd is working, about to
work on cron so what else usually tries to send an email?

I have got to go find those body parts I am losing.

Thanks much.

Dale

:-)  :-)



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Dale
Rich Freeman wrote:
 And do check on your warranty. You can migrate all your data to the
 new drive, and then replace the old one as a backup disk. Either use
 it with raid, or as an offline backup. If you want to do raid you can
 set up mdadm with a degraded raid1 so that you can copy your data over
 from your old drive, and then when it is replaced you just partition
 the new one, add it to the raid, and watch it rebuild automatically. Rich 

I thought I was never going to find that thing.  It is pretty well
hidden.  Anyway:

Serial Number   Seagate Part Number   Warranty Status
Z1F0PKT5        9YN166-302            Out of Warranty

So, I guess I need to check newegg and see if I can show an invoice that
it is less than two years old.  I'm not sure that it is tho. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Daniel Frey
On 06/26/2014 05:57 AM, Alan McKinnon wrote:
 
 Take care of the environmental factors, and statistics fall in your
 favour making the odds good you'll get the life you expect
 

Yep, but sometimes crap just fails for no reason whatsoever. As an
example, my old house had central A/C and never went above 23.5 C in the
summer and was typically 19 C in winter. All machines on their own UPS
boxes. Oddly enough most failures I've seen were from drives on 24x7 and
for whatever reason I had to shut them down and they would not power up
again. I still lost quite a few drives there, regardless.

Computer hardware fails. I just say shit happens and move on. ;-)

Oh, and I make sure critical stuff is in more than two places. What I
consider critical is likely less than 5% of my total storage.

Dan




Re: [gentoo-user] smartctrl drive error @60%

2014-06-26 Thread Daniel Frey
On 06/25/2014 08:54 PM, Dale wrote:
 Curious.  I hope I don't start a flame war here.  I have had WD, Seagate
 and I think there is a Samsung here somewhere, may be the one that is
 rolling over on its back now.  The one drive that failed a few years ago
 was a WD drive.  That said, all the other WD drives I have had just got
 too small to really use, and slow when SATA came out.  I'm partial to WD
 and Seagate still since I got good long term use out of those.  Based on
 your experience, do you tend to be of the same opinion? 
 

Flame war on a mailing list? Nah, will never happen. :-)

We have over 100 workstations at work and so far the WD Blue drives fail
the most, Seagate second, Samsung third.

At home once I get a bad run of one manufacturer I usually get ticked
off and find another to switch to.

I originally used WD, then Maxtor, then Seagate, then Samsung, then WD
again... where I am now.

Dan




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread J. Roeleveld
On 25 June 2014 07:05:03 CEST, Dale rdalek1...@gmail.com wrote:
J. Roeleveld wrote:
 On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote:
 Howdy,

 I run this test every once in a while.  How bad is this:

 root@fireball / # smartctl -l selftest /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure       60%     16365         2905482560
 # 2  Extended offline    Completed: read failure       60%     16352         2905482560
 # 3  Extended offline    Completed without error       00%      8044         -
 # 4  Extended offline    Completed without error       00%      3121         -

 And better yet, is there any way to tell it to not use that part and
 finish the test?  It seems it stopped when it got to that, or I think it
 did. 

 Thoughts? 

 Dale

 :-)  :-) 
 Dale,

 Not sure how to get it to go past. Think that is in the firmware of
 the disk.

 I would start with making a backup first.

 --
 Joost

That's a 3TB drive.  I don't have anything big enough to back it up to.

Is there any way to find out if this error is really serious or just a
run of the mill type error?  I would think that if it was a run of the
mill error the drive would handle the error itself and I wouldn't even
see it.  Something like marking the area as bad and just not trying to
use it anymore, even for the test. 

Thanks.  Any advice is appreciated.  I need a hard drive guru.  ;-)

Here is additional info:

root@fireball / # hdparm -i /dev/sdc

/dev/sdc:

 Model=ST3000DM001-9YN166, FwRev=CC4C, SerialNo=Z1F0PKT5
 Config={ HardSect NotMFM HdSw15uSec Fixed DTR10Mbs RotSpdTol.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=5860533168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: unknown:  ATA/ATAPI-4,5,6,7

 * signifies the current active mode

root@fireball / #



Dale

:-)  :-) 

There are some options with smartctl you could try to force the drive to swap 
that bad sector with a spare one.

A full disk read could also force that. Eg. Try ' dd if=/dev/sdc of=/dev/null '.

But I usually order a replacement when SMART tests start throwing errors.

I know 3TB is a lot for you to have to back up, but it's also a lot of data to 
lose...
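
Reading all 3 TB takes hours, so one narrower option is to read only the physical sector that holds the LBA from the self-test log. The Python sketch below just assembles such a dd command line and does not run it. dd_read_args is a hypothetical helper, the 512/4096 sector sizes are the ones smartctl reports for this drive elsewhere in the thread, and it assumes physical sectors start on LBA multiples of 8. Whether a read alone makes the drive remap anything depends on the firmware.

```python
def dd_read_args(lba, device="/dev/sdc", logical=512, physical=4096):
    """Build a dd command that reads the physical sector containing `lba`.

    Assumption: physical sectors are aligned to multiples of
    physical // logical logical sectors (8 for a 512e drive).
    """
    per_physical = physical // logical            # logical sectors per physical one
    first = (lba // per_physical) * per_physical  # round down to the sector start
    return [
        "dd",
        "if=" + device,
        "of=/dev/null",
        "bs={}".format(logical),
        "skip={}".format(first),
        "count={}".format(per_physical),
        "iflag=direct",  # GNU dd flag: bypass the page cache so the disk is really read
    ]


# The LBA reported by the self-test log in this thread.
print(" ".join(dd_read_args(2905482560)))
```

If the read fails with an I/O error, that at least confirms the self-test result without waiting for a full-disk pass.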

--
Joost



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Marc Joliet
Am Tue, 24 Jun 2014 18:09:03 -0500
schrieb Dale rdalek1...@gmail.com:

 Howdy,
 
 I run this test every once in a while.  How bad is this:
 
 root@fireball / # smartctl -l selftest /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
 
 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure       60%     16365         2905482560
 # 2  Extended offline    Completed: read failure       60%     16352         2905482560
 # 3  Extended offline    Completed without error       00%      8044         -
 # 4  Extended offline    Completed without error       00%      3121         -
 
 And better yet, is there any way to tell it to not use that part and
 finish the test?  It seems it stopped when it got to that, or I think it
 did. 
 
 Thoughts? 

I have no idea, really, but I had a similar situation that was caused by a
loose SATA connection.  In my case the drive stopped working first, then after
checking the SATA connection, it was detected again, but didn't work correctly,
including failing its SMART extended tests at a specific sector. Then, after
fiddling with the connection several weeks later, it started working flawlessly
again.  I plan on buying SATA cables with clips, now :-/ (though I have to check
if that will work with my mainboard first).

Otherwise, to reiterate what J. Roeleveld wrote: backups, backups, backups ;) .

 Dale
 
 :-)  :-) 

HTH
-- 
Marc Joliet
--
People who think they know everything really annoy those of us who know we
don't - Bjarne Stroustrup




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread thegeezer
On 06/25/2014 06:05 AM, Dale wrote:
 J. Roeleveld wrote:
 On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote:
 Howdy,

 I run this test every once in a while.  How bad is this:

 root@fireball / # smartctl -l selftest /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure       60%     16365         2905482560
 # 2  Extended offline    Completed: read failure       60%     16352         2905482560
 # 3  Extended offline    Completed without error       00%      8044         -
 # 4  Extended offline    Completed without error       00%      3121         -


this is pretty bad. enough to really go and get a replacement asap, and
turn that disk off if you can.
the self test stops at the first error it comes to and in this case it
is LBA#2905482560
for calculation of where the error is check out the smartmontools [1] site
which will help you to mark the block bad though the data that was in
that block is probably lost forever.
i'd also suggest you run
# smartctl -a /dev/sdc 
and paste the results here. the crucial rows are 196/197 the reallocated
sector counts and pending sector counts.
they show how many blocks have been reallocated, and also how many are
pending. this will give you a scaling factor, at the moment you are in
trouble, if these figures are very high you are in very high trouble, if
they are low you are in low trouble.

[1] http://smartmontools.sourceforge.net/badblockhowto.html
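
As a rough illustration of the calculation the HOWTO at [1] walks through, the Python sketch below applies its formula b = (int)((L - S) * 512 / B), where L is the failing LBA, S the partition's starting sector and B the filesystem block size. lba_to_fs_block is a hypothetical helper, and the start sector of 2048 used in the example is an assumption, not a value from this thread. The real numbers would come from the partition table and the filesystem.

```python
def lba_to_fs_block(lba, partition_start, fs_block_size=4096, sector_size=512):
    """Map a disk LBA to a filesystem block number inside one partition.

    Implements b = (int)((L - S) * 512 / B) from the bad block HOWTO.
    """
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - partition_start) * sector_size // fs_block_size


# LBA from the self-test log; 2048 is a hypothetical partition start sector.
print(lba_to_fs_block(2905482560, 2048))
```

With those assumed inputs the failing LBA falls in filesystem block 363185064. A different partition start or block size gives a different block number.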



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
J. Roeleveld wrote:
 There are some options with smartctl you could try to force the drive
 to swap that bad sector with a spare one. A full disk read could also
 force that. Eg. Try ' dd if=/dev/sdc of=/dev/null '. But, I usually
 order a replacement when SMART tests start throwing errors. I know
 3TB is a lot for you to have to back up, but it's also a lot of data
 to lose... -- Joost 


I just don't have anything to put the data on.  I've been saying I was
going to get me a backup drive but hadn't yet.  Looks like I better
order one unless someone pops on and says this is normal and OK, sort of
doubting that will happen tho. 

Dale

:-)  :-) 

-- 
I am only responsible for what I said ... Not for what you understood or how 
you interpreted my words!




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
thegeezer wrote:
 this is pretty bad. enough to really go and get a replacement asap,
 and turn that disk off if you can. the self test stops at the first
 error it comes to and in this case it is LBA#2905482560 for
 calculation of where the error is check out the smartmontools [1] site which
 will help you to mark the block bad though the data that was in that
 block is probably lost forever. i'd also suggest you run # smartctl -a
 /dev/sdc and paste the results here. the crucial rows are 196/197 the
 reallocated sector counts and pending sector counts. they show how
 many blocks have been reallocated, and also how many are pending. this
 will give you a scaling factor, at the moment you are in trouble, if
 these figures are very high you are in very high trouble, if they are
 low you are in low trouble. [1]
 http://smartmontools.sourceforge.net/badblockhowto.html 

Here is the output:

root@fireball / # smartctl -a /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-9YN166
Serial Number:    Z1F0PKT5
LU WWN Device Id: 5 000c50 04d79e15c
Firmware Version: CC4C
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Jun 25 02:46:39 2014 CDT

== WARNING: A firmware update for this drive is available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 118) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (  584) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 340) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       234421760
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000

Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Neil Bothwick
On Wed, 25 Jun 2014 00:05:03 -0500, Dale wrote:

 That's a 3TB drive.  I don't have anything big enough to back it up to. 

Then either your data is not important to you or you need to get another
drive ASAP. Meanwhile, you could start backing up the most important data.

 Is there any way to find out if this error is really serious or just a
 run of the mill type error?

Whenever I have seen this behaviour, it was soon followed by total drive
failure, even to the point that the computer would not boot with that
drive connected.


-- 
Neil Bothwick

WinErr 003: Dynamic linking error - Your mistake is now in every file




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Neil Bothwick wrote:
 On Wed, 25 Jun 2014 00:05:03 -0500, Dale wrote:

 That's a 3TB drive.  I don't have anything big enough to back it up to.

 Then either your data is not important to you or you need to get another
 drive ASAP. Meanwhile, you could start backing up the most important data.

I got a drive picked out at Newegg.  

http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844



 Is there any way to find out if this error is really serious or just a
 run of the mill type error?

 Whenever I have seen this behaviour, it was soon followed by total drive
 failure, even to the point that the computer would not boot with that
 drive connected.



Well, I did blow the dust out a month or so ago so I thought I would
remove the sides and re-seat all the cables.  I've got the long test
running now but it passed the SHORT test.  I'm hoping it will fix this
issue, just hoping.  That is a good deal on that new drive tho.  May get
it anyway.

Dale

:-)  :-)




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread thegeezer
On 06/25/2014 08:49 AM, Dale wrote:
 thegeezer wrote:
 this is pretty bad. 
 Here is the output:

 root@fireball / # smartctl -a /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF INFORMATION SECTION ===
 Model Family:     Seagate Barracuda 7200.14 (AF)
 Device Model:     ST3000DM001-9YN166
 Serial Number:    Z1F0PKT5
 LU WWN Device Id: 5 000c50 04d79e15c
 Firmware Version: CC4C
 User Capacity:    3,000,592,982,016 bytes [3.00 TB]
 Sector Sizes:     512 bytes logical, 4096 bytes physical
 Rotation Rate:    7200 rpm
 Device is:        In smartctl database [for details use: -P show]
 ATA Version is:   ATA8-ACS T13/1699-D revision 4
 SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
 Local Time is:    Wed Jun 25 02:46:39 2014 CDT

 == WARNING: A firmware update for this drive is available,
 see the following Seagate web pages:
 http://knowledge.seagate.com/articles/en_US/FAQ/207931en
 http://knowledge.seagate.com/articles/en_US/FAQ/223651en

interesting - not seen that before might be worth a nose

 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

 === START OF READ SMART DATA SECTION ===
 SMART overall-health self-assessment test result: PASSED

 General SMART Values:
 Offline data collection status:  (0x00) Offline data collection activity
 was never started.
 Auto Offline Data Collection:
 Disabled.
 Self-test execution status:  ( 118) The previous self-test completed
 having
 the read element of the test failed.
 Total time to complete Offline
 data collection:(  584) seconds.
 Offline data collection
 capabilities:(0x73) SMART execute Offline immediate.
 Auto Offline data collection
 on/off support.
 Suspend Offline collection upon new
 command.
 No Offline surface scan supported.
 Self-test supported.
 Conveyance Self-test supported.
 Selective Self-test supported.
 SMART capabilities:(0x0003) Saves SMART data before entering
 power-saving mode.
 Supports SMART auto save timer.
 Error logging capability:(0x01) Error logging supported.
 General Purpose Logging supported.
 Short self-test routine
 recommended polling time:(   1) minutes.
 Extended self-test routine
 recommended polling time:( 340) minutes.
 Conveyance self-test routine
 recommended polling time:(   2) minutes.
 SCT capabilities:  (0x3085) SCT Status supported.

 SMART Attributes Data Structure revision number: 10
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       234421760

you can happily ignore this error rate, it is usual for it to be high
and there is hardware correction for it

   3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33

33 power cycles seem very low but further down we see the power on time
is just under two years which is also erring towards the lighter side of
the mtbf

   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0

zero reallocated sectors suggests there is space to do reallocation

   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
   9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379

almost two years of power on time

  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
 190 Airflow_Temperature_Cel 0x0022   069   063   045    Old_age   Always       -       31 (Min/Max 26/33)
 191 G-Sense_Error_Rate      0x0032   100   100   

Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread thegeezer
On 06/25/2014 11:05 AM, Dale wrote:

 I got a drive picked out at Newegg.  

 http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844

slightly offtopic - i notice that the drive has a 2year limited warranty

has anyone managed to get anything from hard drive warranties ?


Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Helmut Jarausch

On 06/25/2014 12:55:08 PM, thegeezer wrote:

On 06/25/2014 11:05 AM, Dale wrote:

 I got a drive picked out at Newegg.  

 http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844

slightly offtopic - i notice that the drive has a 2year limited warranty


has anyone managed to get anything from hard drive warranties ?


I always buy enterprise editions which have a warranty of 5 years.
I had several drives which got replaced after 3-5 years.
Furthermore, I have the feeling that enterprise editions have been
tested more strictly.
I know they are much more expensive but I even take these for my private
machine.

Helmut.





Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Dale wrote:


 I got a drive picked out at Newegg.  

 http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844




Drive is ordered.  Be here tomorrow.  Yay Newegg. 

Dale

:-)  :-)


Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 6:55 AM, thegeezer thegee...@thegeezer.net wrote:
 On 06/25/2014 11:05 AM, Dale wrote:


 I got a drive picked out at Newegg.  

 http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844


 slightly offtopic - i notice that the drive has a 2year limited warranty

 has anyone managed to get anything from hard drive warranties ?

Yes.  Most manufacturers have a hard drive warranty tool online.  Just
give it your serial number and it will tell you if you're eligible,
and how to go about it.  I know Seagate wants you to run their own
testing util (which just does a SMART test and spits out a validation
code which you write down).

I've gotten the same sorts of errors several times now on my RAID and
when it happens I just go through the warranty process, select advance
replacement, swap out the drive, then return the old drive in their
packaging.

Typically costs me $10 for HD replacement (I have to pay return shipping only).

Typically drives tend to die for me about a year after I buy them -
alarmingly often, actually.  Anybody who doesn't run smartmon or its
equivalent is insane, as is anybody who doesn't at least run RAID,
though anything valuable should be backed up.
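
As a rough illustration of the kind of check such monitoring automates, the Python sketch below scans `smartctl -A` style rows and reports the reallocated, pending and offline-uncorrectable counts when they are nonzero. nonzero_watched is a hypothetical helper. The row for attribute 5 is taken from the output posted in this thread, while the 197 and 198 rows are made up for the example, using the count of 104 that smartd logged. Their flag and value columns are assumptions.

```python
# Attribute IDs commonly watched for failing sectors.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# Row 5 is from this thread; rows 197 and 198 are illustrative only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
"""


def nonzero_watched(attribute_rows):
    """Return {attribute id: raw value} for watched attributes above zero."""
    found = {}
    for row in attribute_rows.splitlines():
        fields = row.split()
        # A complete row has 10 columns; the raw value is the last of them.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            found[attr_id] = int(raw)
    return found


for attr_id, raw in sorted(nonzero_watched(SAMPLE).items()):
    print(WATCHED[attr_id], raw)
```

smartd does this sort of tracking on its own and mails the result, so a script like this is only useful for a one-off look at saved output.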

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
thegeezer wrote:
 On 06/25/2014 08:49 AM, Dale wrote:
 thegeezer wrote:
 this is pretty bad. 
 Here is the output:

 root@fireball / # smartctl -a /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF INFORMATION SECTION ===
 Model Family:     Seagate Barracuda 7200.14 (AF)
 Device Model:     ST3000DM001-9YN166
 Serial Number:    Z1F0PKT5
 LU WWN Device Id: 5 000c50 04d79e15c
 Firmware Version: CC4C
 User Capacity:    3,000,592,982,016 bytes [3.00 TB]
 Sector Sizes:     512 bytes logical, 4096 bytes physical
 Rotation Rate:    7200 rpm
 Device is:        In smartctl database [for details use: -P show]
 ATA Version is:   ATA8-ACS T13/1699-D revision 4
 SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
 Local Time is:    Wed Jun 25 02:46:39 2014 CDT

 == WARNING: A firmware update for this drive is available,
 see the following Seagate web pages:
 http://knowledge.seagate.com/articles/en_US/FAQ/207931en
 http://knowledge.seagate.com/articles/en_US/FAQ/223651en
 interesting - not seen that before might be worth a nose

I was thinking the same thing myself.  How does it know there is an
update was another question I had. 

 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

 === START OF READ SMART DATA SECTION ===
 SMART overall-health self-assessment test result: PASSED

 General SMART Values:
 Offline data collection status:  (0x00) Offline data collection activity
 was never started.
 Auto Offline Data Collection:
 Disabled.
 Self-test execution status:  ( 118) The previous self-test completed
 having
 the read element of the test failed.
 Total time to complete Offline
 data collection:(  584) seconds.
 Offline data collection
 capabilities:(0x73) SMART execute Offline immediate.
 Auto Offline data collection
 on/off support.
 Suspend Offline collection upon new
 command.
 No Offline surface scan supported.
 Self-test supported.
 Conveyance Self-test supported.
 Selective Self-test supported.
 SMART capabilities:(0x0003) Saves SMART data before entering
 power-saving mode.
 Supports SMART auto save timer.
 Error logging capability:(0x01) Error logging supported.
 General Purpose Logging supported.
 Short self-test routine
 recommended polling time:(   1) minutes.
 Extended self-test routine
 recommended polling time:( 340) minutes.
 Conveyance self-test routine
 recommended polling time:(   2) minutes.
 SCT capabilities:  (0x3085) SCT Status supported.

 SMART Attributes Data Structure revision number: 10
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE 
 UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate 0x000f   119   099   006Pre-fail 
 Always   -   234421760
 you can happily ignore this error rate, it is usual for it to be high
 and htere is hardware correction for it

   3 Spin_Up_Time0x0003   092   092   000Pre-fail 
 Always   -   0
   4 Start_Stop_Count0x0032   100   100   020Old_age  
 Always   -   33
 33 power cycles seem very low but further down we see the power on time
 is just under two years which is also erring towards the lighter side of
 the mtbf

About the only time I shutdown is when the power fails.  My puter only
pulls about 150 watts so I just leave it running 24/7. 



   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 zero reallocated sectors suggests there is space to do reallocation

   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
   9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
 almost two years of power on time

  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0

Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread covici
Helmut Jarausch jarau...@igpm.rwth-aachen.de wrote:

 On 06/25/2014 12:55:08 PM, thegeezer wrote:
  On 06/25/2014 11:05 AM, Dale wrote:
  
   I got a drive picked out at Newegg.  
  
   http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844
  
  slightly offtopic - i notice that the drive has a 2year limited  
  warranty
  
  has anyone managed to get anything from hard drive warranties ?
 
 I always buy enterprise editions which have a warranty of 5 years.
 I had several drives which got replaced after 3-5 years.
 Furthermore, I have the feeling that enterprise editions have been
 tested more strictly.
 I know they are much more expensive but I even take these for my private
 machine.

And lately the warranty is just one year.

-- 
Your life is like a penny.  You're going to lose it.  The question is:
How do
you spend it?

 John Covici
 cov...@ccs.covici.com



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 9:15 AM, Dale rdalek1...@gmail.com wrote:
 thegeezer wrote:
 On 06/25/2014 08:49 AM, Dale wrote:

 thegeezer wrote:

 this says there are 104 pending sectors i.e. bad blocks on the drive
 that have not been reallocated yet

 Wonder why it hasn't?  Isn't it supposed to do that sort of thing itself?


It can't relocate the sectors until it successfully reads them, or
until something else writes over them.

However, the last few drives I've had this happen to never really
relocated things.  If I scrubbed the drives mdadm would overwrite the
unreadable sectors, which should trigger a relocation, but then a day
or two later the errors would show up again.  So, the drive firmware
must be avoiding relocation or something.  Either that or there is a
large region of the drive that is failing (which would make sense) and
I was just playing whack-a-mole with the bad sectors.  In any case, if
the drive is under warranty I've yet to have a complaint returning it
with a copy of the smartctl output showing the failed test/etc.  With
advance replacement I can keep the old drive until the new one
arrives.
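
For anyone who wants to watch those counters from a script, here is a
minimal sketch.  It assumes the usual `smartctl -A` column layout, where
the raw value is the tenth field, and that the drive reports attributes
5 (Reallocated_Sector_Ct) and 197 (Current_Pending_Sector); not every
drive does.  The function name smart_raw is made up for illustration.

```shell
#!/bin/sh
# smart_raw ID: read `smartctl -A` output on stdin and print the raw value
# of the attribute with the given ID (assumes RAW_VALUE is the 10th column).
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Usage (needs root and a real device; /dev/sdc is the drive in this thread):
#   smartctl -A /dev/sdc | smart_raw 197    # pending sectors
#   smartctl -A /dev/sdc | smart_raw 5      # reallocated sectors
```

If the pending count drops after a scrub or overwrite while the
reallocated count stays at zero, that would match the behaviour described
above, though it does not say why the firmware behaves that way.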

 I usually just run the test manually but I sort of had family stuff
 going on for the past year, almost a year anyway.  Sort of behind on
 things although I have been doing my normal updates.

rc-update add smartd default

I don't know that I even had to configure it - it is set to email
root@localhost when there is a problem.  I also run mdadm to monitor
raid.
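
As an illustration only, a one-line /etc/smartd.conf along these lines
monitors every drive smartd detects, mails root, and schedules the
self-tests so they do not depend on anyone remembering to run them.  The
schedule is an arbitrary example, not a recommendation, and the mail only
arrives if some local mailer is installed.

```
# /etc/smartd.conf (sketch; the test schedule is an assumption, adjust to taste)
# -a  monitor all SMART attributes and the overall health status
# -m  address that warnings are mailed to
# -s  short self-test daily at 02:00, long self-test Saturdays at 03:00
DEVICESCAN -a -m root@localhost -s (S/../.././02|L/../../6/03)
```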

I don't think anybody makes a monitor for btrfs, though my boot is
mirrored across all my btrfs drives using mdadm so a drive failure
should be detected in any case.  I need to check up on that, though -
I'd like an email if something goes wrong with btrfs storage.


 I ordered a drive.  It should be here tomorrow.  In the meantime, I
 shutdown and re-seated all the cables, power too. I got the test running
 again but results is a few hours off yet.  It did pass the short test
 tho.  I'm not sure that it means much.

Short test generally doesn't do much - you need the long ones.  I'd be
shocked if it passed with offline uncorrectable sectors.

And do check on your warranty.  You can migrate all your data to the
new drive, and then replace the old one as a backup disk.  Either use
it with raid, or as an offline backup.  If you want to do raid you can
set up mdadm with a degraded raid1 so that you can copy your data over
from your old drive, and then when it is replaced you just partition
the new one, add it to the raid, and watch it rebuild automatically.
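
The degraded-mirror approach can be sketched roughly as below.  The device
names are placeholders, and DRY_RUN is a hypothetical switch added here so
the commands can be printed and reviewed before anything touches a disk.
Note that mdadm --create destroys whatever is on the target partition, so
it must point at the new, empty drive.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# Create a two-member RAID1 with only one real member; the keyword
# "missing" holds the slot for the disk that will be added later.
create_degraded_mirror() {    # usage: create_degraded_mirror /dev/md0 /dev/sdX1
    run mdadm --create "$1" --level=1 --raid-devices=2 "$2" missing
}

# Add the replacement disk's partition; md then resyncs onto it.
complete_mirror() {           # usage: complete_mirror /dev/md0 /dev/sdY1
    run mdadm --manage "$1" --add "$2"
}
```

With DRY_RUN=1 set, create_degraded_mirror /dev/md0 /dev/sdb1 only prints
the mdadm command.  Once the second disk has been added for real, the
rebuild progress shows up in /proc/mdstat.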

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 10:23 AM, Neil Bothwick n...@digimed.co.uk wrote:
 On Wed, 25 Jun 2014 08:33:37 -0400, Rich Freeman wrote:

 Typically drives tend to die for me about a year after I buy them -
 alarmingly often, actually.

 Do you have a UPS? I used to get similar levels of failure, and not just
 drives, then I bought a UPS and things got much better. It seems the mains
 supply here is not as stable as it should be.

I do not, and I can't say I was terribly thrilled with the performance
with the last cheap UPS I bought.

The price to do it right tends to be moderately high, so it hasn't
been a priority.  Perhaps I should look into it again.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Neil Bothwick
On Wed, 25 Jun 2014 08:33:37 -0400, Rich Freeman wrote:

 Typically drives tend to die for me about a year after I buy them -
 alarmingly often, actually. 

Do you have a UPS? I used to get similar levels of failure, and not just
drives, then I bought a UPS and things got much better. It seems the mains
supply here is not as stable as it should be.


-- 
Neil Bothwick

Grow your own dope, plant a politician!




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread covici
Rich Freeman ri...@gentoo.org wrote:

 On Wed, Jun 25, 2014 at 6:55 AM, thegeezer thegee...@thegeezer.net wrote:
  On 06/25/2014 11:05 AM, Dale wrote:
 
 
  I got a drive picked out at Newegg.  
 
  http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844
 
 
  slightly offtopic - i notice that the drive has a 2year limited warranty
 
  has anyone managed to get anything from hard drive warranties ?
 
 Yes.  Most manufacturers have a hard drive warranty tool online.  Just
 give it your serial number and it will tell you if you're eligible,
 and how to go about it.  I know Seagate wants you to run their own
 testing util (which just does a SMART test and spits out a validation
 code which you write down).
 
 I've gotten the same sorts of errors several times now on my RAID and
 when it happens I just go through the warranty process, select advance
 replacement, swap out the drive, then return the old drive in their
 packaging.
 
 Typically costs me $10 for HD replacement (I have to pay return shipping 
 only).
 
 Typically drives tend to die for me about a year after I buy them -
 alarmingly often, actually.  Anybody who doesn't run smartmon or its
 equivalent is insane, as is anybody who doesn't at least run RAID,
 though anything valuable should be backed up.

Is it not  true that you cannot run raid on consumer drives because of
timing errors?


-- 
Your life is like a penny.  You're going to lose it.  The question is:
How do
you spend it?

 John Covici
 cov...@ccs.covici.com



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 10:23 AM, Neil Bothwick n...@digimed.co.uk wrote:
 On Wed, 25 Jun 2014 08:33:37 -0400, Rich Freeman wrote:

 Typically drives tend to die for me about a year after I buy them -
 alarmingly often, actually.
 Do you have a UPS? I used to get similar levels of failure, and not just
 drives, then I bought a UPS and things got much better. It seems the mains
 supply here is not as stable as it should be.
 I do not, and I can't say I was terribly thrilled with the performance
 with the last cheap UPS I bought.

 The price to do it right tends to be moderately high, so it hasn't
 been a priority.  Perhaps I should look into it again.

 Rich




I have had two CyberPower UPS's and been happy with them.  Both still
work but had to put in a set of batteries in the older one.  Old one
runs my TV during those frequent blinks we get here and the new one runs
my puter.  I usually catch them on sale for a little over $100 here. I
want to get two more at some point.  One for my Mom's TV and one for my
sis-n-law's puter. 

Out of all the hard drives I have ever had, only one has failed.  The
smart software gave me enough warning to copy the stuff over.  Maybe me
having a UPS has helped on that.  No way to prove it either way tho. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 9:15 AM, Dale rdalek1...@gmail.com wrote:
 thegeezer wrote:
 On 06/25/2014 08:49 AM, Dale wrote:
 thegeezer wrote:
 this says there are 104 pending sectors i.e. bad blocks on the drive
 that have not been reallocated yet
 Wonder why it hasn't?  Isn't it supposed to do that sort of thing itself?

 It can't relocate the sectors until it successfully reads them, or
 until something else writes over them.

 However, the last few drives I've had this happen to never really
 relocated things.  If I scrubbed the drives mdadm would overwrite the
 unreadable sectors, which should trigger a relocation, but then a day
 or two later the errors would show up again.  So, the drive firmware
 must be avoiding relocation or something.  Either that or there is a
 large region of the drive that is failing (which would make sense) and
 I was just playing whack-a-mole with the bad sectors.  In any case, if
 the drive is under warranty I've yet to have a complaint returning it
 with a copy of the smartctl output showing the failed test/etc.  With
 advance replacement I can keep the old drive until the new one
 arrives.

I'm going to bet this drive is out of warranty.  I'm pretty sure it is
over 2 years since I bought it. 

Once I replace that drive, I'll dd the thing and see what it does then. 
It'll either break it or give me a fresh start to play with and see how
long it lasts.


 I usually just run the test manually but I sort of had family stuff
 going on for the past year, almost a year anyway.  Sort of behind on
 things although I have been doing my normal updates.
 rc-update add smartd default

 I don't know that I even had to configure it - it is set to email
 root@localhost when there is a problem.  I also run mdadm to monitor
 raid.

 I don't think anybody makes a monitor for btrfs, though my boot is
 mirrored across all my btrfs drives using mdadm so a drive failure
 should be detected in any case.  I need to check up on that, though -
 I'd like an email if something goes wrong with btrfs storage.

I'm using lvm here.  I also don't have a mail server set up which is why
I run them manually.   I usually do it once a month or so but had some
family issues to pop up. 


 I ordered a drive.  It should be here tomorrow.  In the meantime, I
 shutdown and re-seated all the cables, power too. I got the test running
 again but results is a few hours off yet.  It did pass the short test
 tho.  I'm not sure that it means much.
 Short test generally doesn't do much - you need the long ones.  I'd be
 shocked if it passed with offline uncorrectable sectors.

 And do check on your warranty.  You can migrate all your data to the
 new drive, and then replace the old one as a backup disk.  Either use
 it with raid, or as an offline backup.  If you want to do raid you can
 set up mdadm with a degraded raid1 so that you can copy your data over
 from your old drive, and then when it is replaced you just partition
 the new one, add it to the raid, and watch it rebuild automatically.

 Rich



I figured the short test wouldn't say much.  I am backing up some of the
stuff tho.  I do have a 750GB drive that was empty.  It won't save it
all but it is a start.  Test should have been done by now but I guess
the copy process is slowing it down.  I'm getting this so far:

# 1  Extended offlineSelf-test routine in progress 70%
16387 - 

 dale twiddles his thumbs 

Thanks much.

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Neil Bothwick
On Wed, 25 Jun 2014 10:54:51 -0500, Dale wrote:

 I'm using lvm here.  I also don't have a mail server set up which is why
 I run them manually. 

Install a simple forwarding MTA like ssmtp to have all mails from cron and
friends sent to your ISP mailbox.
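
A rough idea of what that involves, as a sketch only: every value below is
a placeholder, and which options your provider needs (port, STARTTLS,
authentication) is an assumption to check against their documentation.

```
# /etc/ssmtp/ssmtp.conf (sketch; all values are placeholders)
# Address that receives mail for root and other system accounts
root=you@example.com
# Your ISP's or mail provider's SMTP server and port
mailhub=smtp.example.com:587
UseSTARTTLS=YES
AuthUser=you@example.com
AuthPass=your-password
# Let the sending program set the From: line
FromLineOverride=YES
```

With something like that in place, cron and smartd mail addressed to root
should be relayed to the outside mailbox instead of being dropped.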


-- 
Neil Bothwick

Beware! The end is... aaarrgh!




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 11:30 AM,  cov...@ccs.covici.com wrote:

 Is it not  true that you cannot run raid on consumer drives because of
 timing errors?


Yes, it is not true.  :)

I've never had issues running RAID on consumer drives.

Sure, devices certified for RAID might spend less time trying to
recover data which is a bit more optimal, but only in the situation
where your drive is actually failing.  If my RAID blocks on read for
30 seconds once a year when a drive is about to die I can live with
that, assuming mdadm doesn't figure out it should give up sooner than
that.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 11:54 AM, Dale rdalek1...@gmail.com wrote:
 I'm going to bet this drive is out of warranty.  I'm pretty sure it is
 over 2 years since I bought it.

 Once I replace that drive, I'll dd the thing and see what it does then.
 It'll either break it or give me a fresh start to play with and see how
 long it lasts.

Well, finding out for sure is a 30 second process, so up to you
whether it is worth the time.

smartctl will give you the serial/model number, and you punch that
into a website, and it will say whether it is under warranty or not.

If you plan to wipe the disk before return, print out the results of
smartctl -a first, since wiping will probably clear the pending
sectors.
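
A small sketch of both steps, saving the report and pulling out the fields
the warranty page asks for.  It assumes the "Label:   value" layout that
smartctl prints in its information section; the function name drive_field
and the report file name are made up for illustration.

```shell
#!/bin/sh
# drive_field LABEL: read `smartctl -i` (or -a) output on stdin and print
# the value that follows "LABEL:", e.g. "Serial Number" or "Device Model".
drive_field() {
    awk -F': +' -v label="$1" '$1 == label { print $2; exit }'
}

# Usage (needs root; /dev/sdc is the drive in this thread):
#   smartctl -a /dev/sdc > "smart-sdc-$(date +%Y-%m-%d).txt"   # keep before wiping
#   drive_field "Serial Number" < smart-sdc-*.txt
#   drive_field "Device Model"  < smart-sdc-*.txt
```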

But, it is your drive, so do whatever you want with it!  :)

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Mick
On Wednesday 25 Jun 2014 17:09:52 Neil Bothwick wrote:
 On Wed, 25 Jun 2014 10:54:51 -0500, Dale wrote:
  I'm using lvm here.  I also don't have a mail server set up which is why
  I run them manually.
 
 Install a simple forwarding MTA like ssmtp to have all mails from cron and
 friends sent to your ISP mailbox.

... and when you find out please tell us:

1) What syntax is appropriate to allow the use of mail account passwds which 
contain not just alphanumeric characters but also symbols like [~@#$] ?

2) How can you force it to NOT use RC4 cipher when it logs into Google Mail to 
relay messages, but the more secure ECDHE-RSA-AES128-GCM-SHA256 that the 
server proposes ?

-- 
Regards,
Mick




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Volker Armin Hemmann
Am 25.06.2014 09:49, schrieb Dale:
 thegeezer wrote:
 this is pretty bad. enough to really go and get a replacement asap,
 and turn that disk off if you can.

 the self test stops at the first error it comes to, and in this case
 it is LBA#2905482560. for calculation of where the error is, check out
 the smartctl [1] site, which will help you to mark the block bad,
 though the data that was in that block is probably lost forever.

 i'd also suggest you run
 # smartctl -a /dev/sdc
 and paste the results here.

 the crucial rows are 196/197, the reallocated sector counts and pending
 sector counts. they show how many blocks have been reallocated, and also
 how many are pending. this will give you a scaling factor: at the moment
 you are in trouble; if these figures are very high you are in very high
 trouble, if they are low you are in low trouble.

 [1] http://smartmontools.sourceforge.net/badblockhowto.html
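
For a plain partition, the calculation in that howto comes down to one
line of integer arithmetic: b = (LBA - S) * 512 / B, where S is the
partition's start sector and B the filesystem block size.  The sketch
below uses assumed values for S and B (they are not from Dale's system),
and with LVM in between the offset is more involved than this.

```shell
#!/bin/sh
# lba_to_fs_block LBA PART_START_SECTOR FS_BLOCK_SIZE
# b = (LBA - S) * 512 / B, with 512-byte logical sectors and integer division
lba_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Example with the LBA from the failed self-test; the partition start (2048)
# and block size (4096) are assumed values for illustration only:
#   lba_to_fs_block 2905482560 2048 4096
```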
 Here is the output:

 root@fireball / # smartctl -a /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF INFORMATION SECTION ===
 Model Family:     Seagate Barracuda 7200.14 (AF)
 Device Model:     ST3000DM001-9YN166
 Serial Number:    Z1F0PKT5
 LU WWN Device Id: 5 000c50 04d79e15c
 Firmware Version: CC4C
 User Capacity:    3,000,592,982,016 bytes [3.00 TB]
 Sector Sizes:     512 bytes logical, 4096 bytes physical
 Rotation Rate:    7200 rpm
 Device is:        In smartctl database [for details use: -P show]
 ATA Version is:   ATA8-ACS T13/1699-D revision 4
 SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
 Local Time is:    Wed Jun 25 02:46:39 2014 CDT

 == WARNING: A firmware update for this drive is available,
 see the following Seagate web pages:
 http://knowledge.seagate.com/articles/en_US/FAQ/207931en
 http://knowledge.seagate.com/articles/en_US/FAQ/223651en

 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

 === START OF READ SMART DATA SECTION ===
 SMART overall-health self-assessment test result: PASSED

 General SMART Values:
 Offline data collection status:  (0x00) Offline data collection activity
                                         was never started.
                                         Auto Offline Data Collection: Disabled.
 Self-test execution status:      ( 118) The previous self-test completed having
                                         the read element of the test failed.
 Total time to complete Offline
 data collection:                 (  584) seconds.
 Offline data collection
 capabilities:                    (0x73) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         No Offline surface scan supported.
                                         Self-test supported.
                                         Conveyance Self-test supported.
                                         Selective Self-test supported.
 SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
 Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
 Short self-test routine
 recommended polling time:        (   1) minutes.
 Extended self-test routine
 recommended polling time:        ( 340) minutes.
 Conveyance self-test routine
 recommended polling time:        (   2) minutes.
 SCT capabilities:              (0x3085) SCT Status supported.

 SMART Attributes Data Structure revision number: 10
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       234421760
   3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33
   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
   9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
 187 Reported_Uncorrect  0x0032 

Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 12:55 PM, Volker Armin Hemmann
volkerar...@googlemail.com wrote:

 so without looking that drive up - you are using a desktop part for
 non-stop setup?

Honestly, I think it makes far more sense to build a fault-tolerant
setup than to try to avoid faults by spending more on the parts.  I've
only run desktop hard drives on my 24x7 RAID.  If they die I replace
them under warranty - I've yet to have one die outside of warranty,
and I'm usually upgrading for size by that timeframe anyway, and I can
use the old drives for storage.

By all means get better-grade components, but I wouldn't use that as
an excuse for not having backups of some kind.  ALL hard drives WILL
fail, it is just a matter of when.  ANY hard drive can fail the day
after you buy it, a month after you buy it, and so on, though
obviously the probability of a particular drive failing at any point
in time may vary by what you pay for it.

I'd buy a more expensive drive only if the TCO is actually lower.  I'd
engineer any system to accept the failure of at least one drive, and
for any data I actually cared about I'd engineer the system to resist
fire, the rm star, and so on.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Volker Armin Hemmann
Am 25.06.2014 19:06, schrieb Rich Freeman:
 On Wed, Jun 25, 2014 at 12:55 PM, Volker Armin Hemmann
 volkerar...@googlemail.com wrote:
 so without looking that drive up - you are using a desktop part for
 non-stop setup?
 Honestly, I think it makes far more sense to build a fault-tolerant
 setup than to try to avoid faults by spending more on the parts.  I've
 only run desktop hard drives on my 24x7 RAID.  If they die I replace
 them under warranty
so you are ripping off other customers?

  - I've yet to have one die outside of warranty,
 and I'm usually upgrading for size by that timeframe anyway, and I can
 use the old drives for storage.

 By all means get better-grade components, but I wouldn't use that as
 an excuse for not having backups of some kind.

there is no excuse for not having backups.


   ALL hard drives WILL
 fail, it is just a matter of when. 
indeed.

  ANY hard drive can fail the day
 after you buy it, a month after you buy it, and so on, though
 obviously the probability of a particular drive failing at any point
 in time may vary by what you pay for it.

or if it was meant to be used the way you use it.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Neil Bothwick
On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote:

  Install a simple forwarding MTA like ssmtp to have all mails from cron
  and friends sent to your ISP mailbox.  
 
 ... and when you find out please tell us:
 
 1) What syntax is appropriate to allow the use of mail account passwds
 which contain not just alphanumeric characters but also symbols like
 [~@#$] ?
 
 2) How can you force it to NOT use RC4 cipher when it logs into Google
 Mail to relay messages, but the more secure ECDHE-RSA-AES128-GCM-SHA256
 that the server proposes ?

It's debatable whether either of those scenarios fall within the
definition of simple. If something that simple won't do what you want,
and there are several to try: ssmtp, esmtp, nullmailer etc, then you may
need to use the likes of Postfix - but for Dale's situation, a lightweight
forwarder is better than not being able to monitor his system.


-- 
Neil Bothwick

I thought the 10 commandments were multiple choice.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 1:15 PM, Volker Armin Hemmann
volkerar...@googlemail.com wrote:
 Am 25.06.2014 19:06, schrieb Rich Freeman:
 Honestly, I think it makes far more sense to build a fault-tolerant
 setup than to try to avoid faults by spending more on the parts.  I've
 only run desktop hard drives on my 24x7 RAID.  If they die I replace
 them under warranty
 so you are ripping off other customers?


I certainly am not aware of any warranty terms I'm violating.  I just
spot checked a drive warranty and it makes no mention of excluding
continuous use, and the drive specifications do not contain any
exclusions for continuous use.

The SMART data in the drives I've returned contains both the number of
power cycles and power-on time, and I've yet to have a manufacturer
question either.

To exclude continuous operation their warranty would have to specify
just how many hours per day their drives can be operated for.


  ANY hard drive can fail the day
 after you buy it, a month after you buy it, and so on, though
 obviously the probability of a particular drive failing at any point
 in time may vary by what you pay for it.

 or if it was meant to be used the way you use it.

Like I said, I'm certainly interested in any actual data that supports
that drives sold to run 24x7 last any longer than desktop drives when
run 24x7.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Alan McKinnon
On 25/06/2014 17:30, cov...@ccs.covici.com wrote:
 Rich Freeman ri...@gentoo.org wrote:
 
 On Wed, Jun 25, 2014 at 6:55 AM, thegeezer thegee...@thegeezer.net wrote:
 On 06/25/2014 11:05 AM, Dale wrote:


 I got a drive picked out at Newegg.  

 http://www.newegg.com/Product/Product.aspx?Item=N82E16822148844


 slightly offtopic - i notice that the drive has a 2year limited warranty

 has anyone managed to get anything from hard drive warranties ?

 Yes.  Most manufacturers have a hard drive warranty tool online.  Just
 give it your serial number and it will tell you if you're eligible,
 and how to go about it.  I know Seagate wants you to run their own
 testing util (which just does a SMART test and spits out a validation
 code which you write down).

 I've gotten the same sorts of errors several times now on my RAID and
 when it happens I just go through the warranty process, select advance
 replacement, swap out the drive, then return the old drive in their
 packaging.

 Typically costs me $10 for HD replacement (I have to pay return shipping 
 only).

 Typically drives tend to die for me about a year after I buy them -
 alarmingly often, actually.  Anybody who doesn't run smartmon or its
 equivalent is insane, as is anybody who doesn't at least run RAID,
 though anything valuable should be backed up.
 
 Is it not  true that you cannot run raid on consumer drives because of
 timing errors?
 
 


That sounds like something EMC and WD/Seagate would say.

There's no reason in the world not to use consumer drives for RAID -
unless you plan to add the drives to those obscenely expensive full-rack
SAN jobs vendors want folk to buy.

The reason consumer drives tend not to work in those arrays has nothing
to do with the performance of the drive itself. The manufacturers flip a
bit in the firmware and without that signature the array hardware often
will not use the drive. It often really is as simple as that.





-- 
Alan McKinnon
alan.mckin...@gmail.com




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Douglas J Hunley
On Wed, Jun 25, 2014 at 7:03 AM, Rich Freeman ri...@gentoo.org wrote:

 I don't think anybody makes a monitor for btrfs, though my boot is
 mirrored across all my btrfs drives using mdadm so a drive failure
 should be detected in any case.  I need to check up on that, though -
 I'd like an email if something goes wrong with btrfs storage.


You're going to want to cron a 'scrub' and have it email you. There's no
background daemon that I'm aware of to handle this. ZFS just introduced
'zed' and it would be nice if BTRFS would do the same
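
A minimal sketch of such a cron job follows.  It relies on cron mailing
whatever a job prints, so it stays silent on success; that in turn assumes
a working local mailer.  The mount point and script path are placeholders,
and BTRFS_BIN is a hypothetical override that exists only so the wrapper
can be tried without a btrfs filesystem.

```shell
#!/bin/sh
# scrub_report MOUNTPOINT: run a foreground scrub.  Prints nothing on
# success so cron stays quiet; on failure prints the scrub output, which
# cron then mails.
scrub_report() {
    status=0
    # BTRFS_BIN is a hypothetical override for dry runs; defaults to btrfs.
    out=$("${BTRFS_BIN:-btrfs}" scrub start -B "$1" 2>&1) || status=$?
    if [ "$status" -ne 0 ]; then
        echo "btrfs scrub of $1 failed (exit $status)"
        echo "$out"
    fi
    return "$status"
}

# Example crontab entry (weekly, Sunday 03:00), assuming the function is
# saved in a script that ends with:  scrub_report /mnt/data
#   0 3 * * 0  /usr/local/sbin/scrub-report.sh
```

The -B option keeps the scrub in the foreground so the exit status is
available when it finishes.  Whether that status covers every kind of
problem worth hearing about is something to verify against the
btrfs-scrub man page for your version.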


-- 
Douglas J Hunley (doug.hun...@gmail.com)
Twitter: @hunleyd   Web:
about.me/douglas_hunley
G+: http://google.com/+DouglasHunley


Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 2:44 PM, Douglas J Hunley doug.hun...@gmail.com wrote:

 You're going to want to cron a 'scrub' and have it email you. There's no
 background daemon that I'm aware of to handle this. ZFS just introduced
 'zed' and it would be nice if BTRFS would do the same

Actually, I think that for serious failures smartd will take care of it.

I was reading the btrfs list archives and apparently btrfs doesn't
make as much as a whisper when a drive fails.  It just keeps on going.

Now, the keeps on going part I'm fine with, but you'd think that
operating in a degraded mode would trigger some kind of message.

Granted, it isn't 100% done yet, either.  In fact, if you replace the
failed drive you have to manually force a re-balance or it will just
continue to operate degraded.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Volker Armin Hemmann wrote:
 so without looking that drive up - you are using a desktop part for
 non-stop setup? 

If I recall correctly, the last drive that died was a more expensive
type of drive, intended for a server setup.  So far, the cheaper
drives are the ones that have lasted until I outgrew them.  So far, I
have yet to ever have a drive die under warranty.  So far.  I need to
check on this one but will do that after I get things changed out. 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread J. Roeleveld
On Wednesday, June 25, 2014 01:44:23 PM Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 1:15 PM, Volker Armin Hemmann
   ANY hard drive can fail the day
  
  after you buy it, a month after you buy it, and so on, though
  obviously the probability of a particular drive failing at any point
  in time may vary by what you pay for it.
  
  or if it was meant to be used the way you use it.
 
 Like I said, I'm certainly interested in any actual data that supports
 that drives sold to run 24x7 last any longer than desktop drives when
 run 24x7.

Not hard data, but while still using desktop drives, I had a drive failure on 
average once or twice a year. Now with enterprise 24x7 drives, the failure 
rate has dropped to 1 in the past 3 years.

That is, for both, using proper UPS equipment.
Additionally, I noticed a definite speed increase after switching to 
enterprise disks.

--
Joost



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Neil Bothwick wrote:
 On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote:

 Install a simple forwarding MTA like ssmtp to have all mails from cron
 and friends sent to your ISP mailbox.  
 ... and when you find out please tell us:

 1) What syntax is appropriate to allow the use of mail account passwds
 which contain not just alphanumeric characters but also symbols like
 [~@#$] ?

 2) How can you force it to NOT use RC4 cipher when it logs into Google
 Mail to relay messages, but the more secure ECDHE-RSA-AES128-GCM-SHA256
 that the server proposes ?
 It's debatable whether either of those scenarios fall within the
 definition of simple. If something that simple won't do what you want,
 and there are several to try: ssmtp, esmtp, nullmailer etc, then you may
 need to use the likes of Postfix - but for Dale's situation, a lightweight
 forwarder is better than not being able to monitor his system.




I have to say, I dread setting up a mail server about as bad as I dread
going to the Doctor.  It's just something I really don't want to add to
my system unless I have to.  It's sort of like the init thingy.  I don't
want to add something else that will eventually break and I'll have to
fix.  The mail system won't keep me from booting but it is just one more
thing to keep an eye on and make sure it is working.  So, making sure the
mail system is working will likely take up the same amount of time that
checking the drive manually every month or so will take.  The only good
part is, and this is the point you are making so well, even tho I had
other things going on, it would have been testing my drive and spit out
an error to get my attention.  Going back, the error has been there for a
while.  It would have been nice to know this before now.   Hindsight
again.  ;-)

What I really need to do, set up a RAID or some other backup method so
that even if this happens again, I don't risk losing anything.  Then
again, that will take time as well.  Also takes money.

From df -h:

Filesystem Size  Used Avail Use% Mounted on
/dev/mapper/home-home  2.7T  1.5T  1.3T  56% /home

Most of that is recorded TV shows, movies etc.  I also have some pics I
took with my camera that can't be replaced.  Those I backup to DVDs
pretty regular.  I use kbackup to tarball them and then burn them to
DVDs.  It works.  One set is outside the home in case of fire.  The
biggest thing is some of those shows would be hard to get again plus the
effort to get them as well.

Let's hope it lasts until at least tomorrow.  I bet it takes a while to
copy all that tho.  O_O

Thanks.

Dale

:-)  :-)




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 6:16 PM, Dale rdalek1...@gmail.com wrote:
 What I really need to do, set up a RAID or some other backup method so
 that even if this happens again, I don't risk losing anything.  Then
 again, that will take time as well.  Also takes money.

Keep in mind that RAID is more about speed of recovery and protects
against the failure mode of total drive failure, which is a fairly
common failure mode.  A hard drive failure on a RAID involves no
unplanned downtime, and a need for some short planned downtime to
replace the drive.

Backup protects against a lot more, but typically results in a
recovery that takes hours, and when the drive goes you're down without
warning.


 Most of that is recorded TV shows, movies etc.  I also have some pics I
 took with my camera that can't be replaced.  Those I backup to DVDs
 pretty regular.  I use kbackup to tarball them and then burn them to
 DVDs.  It works.  One set is outside the home in case of fire.  The
 biggest thing is some of those shows would be hard to get again plus the
 effort to get them as well.

So, stuff like photos I backup to the cloud, or to offsite media
(generally I favor the cloud for active stuff, and offsite media for
stuff I'm done with).  Ditto for things like /etc, mysql, documents,
email, and other small but important things.

For stuff like MythTV recordings I used to just rely on RAID -
recognizing that there was a very real possibility that I could lose
them all.  Now I also do a backup to a drive that is normally left
unmounted, which isn't great, but since I moved to btrfs I wanted
something on ext4 that had daily rsnapshots.  Again, I'm willing to
risk losing this stuff.

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Neil Bothwick
On Wed, 25 Jun 2014 17:16:17 -0500, Dale wrote:

  Install a simple forwarding MTA like ssmtp to have all mails from
  cron and friends sent to your ISP mailbox.  

 I have to say, I dread setting up a mail server about as bad as I dread
 going to the Doctor.  It's just something I really don't want to add to
 my system unless I have to.

Which is why I'm suggesting something like ssmtp, which you can't call a
server; it just forwards. Often the only configuration needed is changing
one line in ssmtp.conf, to the address of your ISP's mail server.

That's it, now any program can send mail using sendmail and it just goes
to your ISP mailbox.


-- 
Neil Bothwick

God said, div D = rho, div B = 0, curl E = - @B/@t, curl H = J + @D/@t,
and there was light.




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Bill Kenworthy
On 26/06/14 06:16, Dale wrote:
 Neil Bothwick wrote:
 On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote:

 Install a simple forwarding MTA like ssmtp to have all mails from cron
 and friends sent to your ISP mailbox.  
 ... and when you find out please tell us:


 
 What I really need to do, set up a RAID or some other backup method so
 that even if this happens again, I don't risk losing anything.  Then
 again, that will take time as well.  Also takes money.
 

Repeat after me ... RAID IS NOT A BACKUP

There are many ways to do a backup - various raid forms, mirrors etc can
help in some (and only some) instances but only a spatially separated
copy of the data is relatively safe.

Have two computers? - cross backup between them. (keep an old machine as
a file server in the back room, start it up a couple of times a week and
run a backup script - can even be automated)

Have a friend/relative nearby? - take your PC over, create a backup and
then sync the differences across the net using rsync etc - most normal
people don't fill up today's large disks, or have large personal valuable
data requirements.

You don't need to back up the whole machine, just the valuable bits
(configs, personal data, email archives, ...)

There are many ways to do it - if you only have one disk and no backups,
the data by definition is not valuable :)

I've just been caught by an old 1G WD green drive failing (possibly the
MB's fault as the sata interface died as well - seen a few of those
now!) that took out the middle drive from a striped LVM.  Didn't bother
to recover, just built a new machine from leftover bits, bought another
drive and rebuilt it using btrfs raid 1 on the two original WD 2G green
drives and a new WD red, and restored from backups on another machine -
over the years this type of event has happened a few times - you only
need to get burnt once to learn!

BillK




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread David Haller
Hello,

On Wed, 25 Jun 2014, Dale wrote:
thegeezer wrote:
 On 06/25/2014 08:49 AM, Dale wrote:
 Device Model: ST3000DM001-9YN166

I have (had sort of) the same disc, with the same FW.

 see the following Seagate web pages:
 http://knowledge.seagate.com/articles/en_US/FAQ/207931en
 http://knowledge.seagate.com/articles/en_US/FAQ/223651en
 interesting - not seen that before might be worth a nose

I was thinking the same thing myself.  How does it know there is a
update was another question I had. 

Those FW-Updates do _NOT_ apply to FW-Version 9YN166. From what I
found, you'd brick the drive. The smartctl DB does not take the
FW-version into account, just the model, when displaying the above notice.

   7 Seek_Error_Rate 0x000f   079   060   030Pre-fail 
 Always   -   99909120
   9 Power_On_Hours  0x0032   082   082   000Old_age  
 Always   -   16379
 almost two years of power on time

looks familiar

  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   915
  7 Seek_Error_Rate         0x000f   072   060   030    Pre-fail  Always   19309568
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always   11351

[..]
 197 Current_Pending_Sector  0x0012   100   100   000Old_age  
 Always   -   104
 197
 this says there are 104 pending sectors i.e. bad blocks on the drive
 that have not been reallocatd yet

Wonder why it hasn't?  Isn't it supposed to do that sort of thing itself? 

 198 Offline_Uncorrectable   0x0010   100   100   000Old_age  
 Offline  -   104
 this says it was not able to reallocate. which is odd because of the
 entry 5 being zero

Uh oh. 

Yeah. Oh, and I had a clean SMART report until a few days ago. Luckily I
already had a WD Red (WD40EFRX) drive waiting when this attribute jumped
from 0 to:

  5 Reallocated_Sector_Ct   0x0033   087   087   036    Pre-fail  Always   17688

Other Seagates (a few 1.5T drives) have also given me trouble. The 2T
Samsung, already relabeled and sold as a Seagate but with Samsung in
the FW, is still ok though.
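
For anyone who wants to watch just the three attributes discussed in this
thread (5, 197 and 198) instead of reading the whole table, here is a minimal
sketch. It assumes the usual 10-column `smartctl -A` layout with the raw value
in the last column; the function name check_sector_attrs is made up for
illustration and is not part of smartmontools.

```shell
#!/bin/sh
# Read `smartctl -A`-style text on stdin and print the raw values of
# attributes 5 (reallocated), 197 (pending) and 198 (offline uncorrectable).
# Assumption: standard layout, RAW_VALUE is column 10.
# Exit status is 1 if any of the three raw values is above zero.
check_sector_attrs() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 {
        printf "%s %s raw=%s\n", $1, $2, $10
        if ($10 + 0 > 0) bad = 1
    }
    END { exit bad + 0 }'
}

# Typical use (needs root and a real device, so it is left as a comment):
#   smartctl -A /dev/sdc | check_sector_attrs || echo "sector trouble"
```

A non-zero exit status makes it easy to hang this off a cron job, which only
sends mail when something is printed or fails.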

[..]
I ordered a drive.  It should be here tomorrow.  In the meantime, I
shutdown and re-seated all the cables, power too. I got the test running
again but results is a few hours off yet.  It did pass the short test
tho.  I'm not sure that it means much. 

Good. Do not use plain dd, it WILL fail at the first read error. Use GNU
ddrescue or dd_rescue to grab an image. I used mc to copy via the
filesystem; something like 'rsync -auxlPRAXSHD /foo/ /bar/' is fine too.
Oh, and I hope you didn't buy a Seagate again ;)
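
A rough sketch of what the GNU ddrescue route could look like, in two passes.
The device name and the paths are placeholders, not taken from this thread,
and the helper rescue_plan is hypothetical: it only prints the commands so
they can be checked before anything touches the failing disk.

```shell
#!/bin/sh
# rescue_plan SRC IMG MAP: print a two-pass GNU ddrescue plan for review.
# SRC is the failing device, IMG the image file on the new drive,
# MAP the mapfile ddrescue uses to remember what it has already read.
rescue_plan() {
    src=$1 img=$2 map=$3
    # Pass 1: copy everything that reads cleanly, skip the slow scraping phase.
    echo "ddrescue -n $src $img $map"
    # Pass 2: return to the bad areas with direct disc access, 3 retry passes.
    echo "ddrescue -d -r3 $src $img $map"
}

# Placeholder names only; substitute the real device and mount point.
rescue_plan /dev/sdX /mnt/new/sdX.img /mnt/new/sdX.map
```

Keeping the same mapfile for both passes is what lets the second run pick up
where the first one left off.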

-dnh

-- 
The sigmonster ate my sig and all I got was this stupid tagline.



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Daniel Frey
On 06/25/2014 10:44 AM, Rich Freeman wrote:
 
 Like I said, I'm certainly interested in any actual data that supports
 that drives sold to run 24x7 last any longer than desktop drives when
 run 24x7.
 

Anecdotal, but...

In 2008 I bought four 24x7 drives (500GB) and eight regular drives to be
used in raid. Out of the eight regular drives, six failed before 4 years
was up.

All of the 24x7 drives are still in use (although I don't remember which
machine(s) they're in now), six years later.

All Seagate.

I initially did do warranty replacement on the failed drives (all drives
had 5 year warranty back then), and out of the six replacements, four
failed a little over three months in.

At that point I went and bought a real battery backed raid card
(computer still has a UPS) with WD enterprise drives and no hiccups of
any kind in about two years. And disk performance is way, way up.

Dan



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 6:16 PM, Dale rdalek1...@gmail.com wrote:
 What I really need to do, set up a RAID or some other backup method so
 that even if this happens again, I don't risk losing anything.  Then
 again, that will take time as well.  Also takes money.
 Keep in mind that RAID is more about speed of recovery and protects
 against the failure mode of total drive failure, which is a fairly
 common failure mode.  A hard drive failure on a RAID involves no
 unplanned downtime, and a need for some short planned downtime to
 replace the drive.

 Backup protects against a lot more, but typically results in a
 recovery that takes hours, and when the drive goes you're down without
 warning.

True.  My issue with RAID is that it is yet another thing I have to
maintain.  I started using lvm and so far, it has been low maintenance
and has made changing things MUCH easier when I do need to move things
around a bit.  It is a time saver to be more accurate.  RAID also leaves
me open to theft, house fire and such too.  At the moment, I think, like
you, having an external drive that I keep somewhere else is the safest
method.  The hard part is getting a drive big enough to do this. Buying this
drive put a dent in my debt.  That said, I really need to buy another
drive if this old one turns out to be bad and set up some sort of backup
plan.  If it turns out to be OK somehow, then I may have a solution,
maybe. 

While I don't want to lose anything, my camera pics are the most
important to keep.  That's why I rotate backups and keep one set outside
the house.  I would rather not lose my videos and could get most of them
back but it won't be easy for sure. 


 Most of that is recorded TV shows, movies etc.  I also have some pics I
 took with my camera that can't be replaced.  Those I backup to DVDs
 pretty regular.  I use kbackup to tarball them and then burn them to
 DVDs.  It works.  One set is outside the home in case of fire.  The
 biggest thing is some of those shows would be hard to get again plus the
 effort to get them as well.
 So, stuff like photos I backup to the cloud, or to offsite media
 (generally I favor the cloud for active stuff, and offsite media for
 stuff I'm done with).  Ditto for things like /etc, mysql, documents,
 email, and other small but important things.

 For stuff like MythTV recordings I used to just rely on RAID -
 recognizing that there was a very real possibility that I could lose
 them all.  Now I also do a backup to a drive that is normally left
 unmounted, which isn't great, but since I moved to btrfs I wanted
 something on ext4 that had daily rsnapshots.  Again, I'm willing to
 risk losing this stuff.

 Rich



I don't have anything on the cloud to back up to.  That would likely be
a good idea but I can't afford anything pricey, which is why I hadn't
bought a backup drive before now either.  Plus, something I'd prefer to
keep under my thumb.  Heck, some things here are encrypted, bank info
and such.  Also, while I have DSL, it ain't real speedy.  Backing up
that much data over my connection could take a while, like days, maybe
even a week or more. 

I really do need a plan that I can manage to put in place tho.  Murphy's
law and all.  :-D 

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 11:54 AM, Dale rdalek1...@gmail.com wrote:
 I'm going to bet this drive is out of warranty.  I'm pretty sure it is
 over 2 years since I bought it.

 Once I replace that drive, I'll dd the thing and see what it does then.
 It'll either break it or give me a fresh start to play with and see how
 long it lasts.
 Well, finding out for sure is a 30 second process, so up to you
 whether it is worth the time.

 smartctl will give you the serial/model number, and you punch that
 into a website, and it will say whether it is under warranty or not.

 If you plan to wipe the disk before return, print out the results of
 smartctl -a first, since wiping will probably clear the pending
 sectors.

 But, it is your drive, so do whatever you want with it!  :)

 Rich



I do plan to check and see if it is under warranty.  I'll do that after
I get things moved over and can test a bit more.  Who knows, it could be
Murphy and he will just leave at some point.  ;-)   I'm pretty sure this
drive is close to three years old tho.  Heck, I can go look at the
Newegg order history and find out.  I would think the manufacturer goes
by the date made where an invoice dated later would tend to slide that
out further. 

Either way, I'll find out.  If it is under warranty and can be swapped
out, that would solve a few issues.  I'll have one backup drive at
least.  ;-)

Dale

:-)  :-)



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Rich Freeman
On Wed, Jun 25, 2014 at 10:07 PM, Dale rdalek1...@gmail.com wrote:
 Rich Freeman wrote:
 I don't have anything on the cloud to backup too.  That would likely be
 a good idea but I can't afford anything pricey, which is why I hadn't
 bought a backup drive before now either.  Plus, something I'd prefer to
 keep under my thumb.  Heck, some things here are encrypted, bank info
 and such.  Also, while I have DSL, it ain't real speedy.  Backing up
 that much data over my connection could take a while, like days, maybe
 even a week or more.


I put my backups on Amazon S3 reduced-redundancy - it is a few cents
per GB per month.  I think I have something like 20-30GB backed up.
Oh, if you need to actually retrieve it that will cost you 10 cents
per GB, but frankly if my house burned down that would be the least of
my concerns.

I'd only use the cloud to back up critical data.  If you want to back
up your mythtv and mp3 collection, then you're going to be uploading a
LOT of data and paying quite a bit to store it.  If you want to be
storing TB of data offsite there are better ways of doing it.

The advantage of something like S3 is that it is always there, which
means you stick a duplicity script in your crontab and just
periodically check up on it.  You don't have to remember to do your
backups.  It just isn't practical to use it for more than a few dozen
GB depending on your incremental strategy.

I also have a 50Mbps outbound connection, which doesn't hurt.

Your next best option is to find a friend with similar needs and give
each other a place to upload your encrypted backups to.  That will
just cost you drive space, but if you're both planning on backing up
1TB of data it will still cost you the one-time drive purchase.

If you want a quick cloud-capable backup solution, I'd look at
duplicity.  I just wish it had options for Google Drive (it supposedly
does, but as far as I can tell it doesn't work, at least not with a
two factor application password).

Rich



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Neil Bothwick wrote:

 I have to say, I dread setting up a mail server about as bad as I dread
 going to the Doctor.  It's just something I really don't want to add to
 my system unless I have to.
 Which is why I suggesting something like ssmtp, which you can't call  a
 server, it just forwards. Often the only configuration needed is changing
 one line in ssmtp.conf, to the address of your ISP's mail server.

 That's it, now any program can send mail using sendmail and it just goes
 to your ISP mailbox.



I like this part:

Extremely simple MTA to get mail off the system to a Mailhub

  ^  That part right up there.  :-D  That may be a new thread, if
needed.

Dale

:-)  :-)




Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Bill Kenworthy wrote:
 On 26/06/14 06:16, Dale wrote:
 Neil Bothwick wrote:
 On Wed, 25 Jun 2014 17:44:48 +0100, Mick wrote:

 Install a simple forwarding MTA like ssmtp to have all mails from cron
 and friends sent to your ISP mailbox.  
 ... and when you find out please tell us:

 What I really need to do, set up a RAID or some other backup method so
 that even if this happens again, I don't risk losing anything.  Then
 again, that will take time as well.  Also takes money.

 Repeat after me ... RAID IS NOT A BACKUP

I agree with that.  Power supply goes nuts and burns out the whole
puter.  RAID won't help that.  House catches fire, ooops.  Thief steals
puter,  uh oh.  That list could go on for a while.  About the only thing
it does is allow quick recovery from a failing/dead drive.  Basically. 
It's good at that from what I have read. 


 There are many ways to do a backup - various raid forms, mirrors etc can
 help in some (and only some) instances but only a spatially separated
 copy of the data is relatively safe.

 Have two computers? - cross backup between them. (keep an old machine as
 a file server in the back room, start it up a couple of times a week and
 run a backup script - can even be automated)

I do have an old puter at the moment.  I thought about sticking it in an
outbuilding and just turning it on to do backups then shutting it back
down.  That puts distance between house and outbuilding too.  Thing is,
I plan to let a family member use it when I can get around to getting a
new case for it.  I guess I could use any old slow junky puter with a
LARGE drive in it. 



 Have a friend/relative nearby? - take your PC over, create a backup and
 then sync the differences across the net using rsync etc - most normal
 people do fill up todays large disks, or have large personal valuable
 data requirements.

 You dont need to backup the whole machine, just the valuable bits
 (configs, personal data, email archives, ...)

 There are many ways to do it - if you only have one disk and no backups,
 the data by definition is not valuable :)

 Ive just been caught by an old 1G WD green drive failing (possibly the
 MB's fault as the sata interface died as well - seen a few of those
 now!) that took out the middle drive from a striped LVM.  Didnt bother
 to recover, just built a new machine from leftover bits, bought another
 drive and rebuilt it using btrfs raid 1 on the two orignal WD 2G green
 drives and a new WD red, and restored from backups on another machine -
 over the years this type of event has happened a few times - you only
 need to get burnt once to learn!.

 BillK


I do backup what I know can't be replaced at all.  My camera pics can't
be replaced since they are not anywhere else.  Some other things here
that are nowhere else I can live without, just would rather not if I can
help it. 

I never backup the OS.  I just reinstall it if needed.  Generally, I try
to keep a copy of /etc and the world file.  I'll copy /etc over and use
the world file as a guide on what to install on the new install.  Heck,
I can install Kubuntu in an hour or less.  Then I can install Gentoo from
that while doing my usual puter activities. 

I had a WD 80GB drive to fail several years ago.  That's the only drive
I have ever had to fail on me tho.  It spit out errors and I was able to
do backups and save the data before it died for good.  I can't recall
the exact error but it mentioned '24 hours' and 'right now'.  It didn't
miss it by much either. 

Just imagine if we had no tools to warn us of a failure at all.  That
would suck.

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
David Haller wrote:
 Hello,

 On Wed, 25 Jun 2014, Dale wrote:
 Yeah. Oh, and I had a clean smart until a few days ago, luckily I
 alread had a WD Red (WD40EFRX) drive waiting when this attrib jumped
 from 0 to:

   5 Reallocated_Sector_Ct   0x0033   087   087   036    Pre-fail  Always   17688

 Other Seagates (a few 1.5T drives) have also made me trouble, the 2T
 Samsung already relabeled and sold as a Seagate but with Samsung in
 the FW though is still ok.
 [..]

I was wondering about how that would be updated since a lot of that
stuff requires windoze. 


 I ordered a drive.  It should be here tomorrow.  In the meantime, I
 shutdown and re-seated all the cables, power too. I got the test running
 again but results is a few hours off yet.  It did pass the short test
 tho.  I'm not sure that it means much. 
 Good. Do not use dd, it WILL fail at the first error. Use gnu ddrescue
 or dd_rescue to grab an image. I used mc to copy via filesystem, eg. 
 'rsync -auxlPRAXSHD /foo/ /bar/' is fine too. Oh, and I hope you
 didn't buy a Seagate again ;)

 -dnh


I plan to rsync or cp the data over.  The dd part will come into play
after I am sure I got everything off that I can get and am just erasing
the drive completely.  I plan to dd the drive then run the tests again
just to see what it is doing.  Heck, maybe it will reallocate that area
like it should be doing already, I guess. 

Time will tell.  I'll be having fun tomorrow tho.  ;-)

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
J. Roeleveld wrote:
 On Wednesday, June 25, 2014 01:44:23 PM Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 1:15 PM, Volker Armin Hemmann
  ANY hard drive can fail the day

 after you buy it, a month after you buy it, and so on, though
 obviously the probability of a particular drive failing at any point
 in time may vary by what you pay for it.
 or if it was meant to be used the way you use it.
 Like I said, I'm certainly interested in any actual data that supports
 that drives sold to run 24x7 last any longer than desktop drives when
 run 24x7.
 Not hard data, but while still using desktop drives, I had a drive failure on 
 average once or twice a year. Now with enterprise 24x7 drives, the failure 
 rate has dropped to 1 in the past 3 years.

 That is, for both, using proper UPS equipment.
 Additionally, I noticed a definite speed increase after switching to 
 enterprise disks.

 --
 Joost



I have one WD black which I think is a more expensive drive.  I have to
say, when I run hdparm -tT on it, it is faster than the other regular
drives that claim the same specs, SATA etc etc.  They do cost more tho. 
Some a good bit more unless you can catch a good sale.  

While I was looking for this new drive, I looked into a 4TB drive.  I am
still trying to get my jaw back up off the floor.  Holy sheep.  They are
still fairly proud of some of those puppies.  I did notice they have a 5
and 6TB one now.  O_O  Double holy sheep.  I think I lost my jaw now. 
Good bye nose, hello China.  :-( 

Well, one of these days we will be talking about getting 6TB drives for
$50 and how much we want a 20TB drive, to put all our worthless junk
on.  lol   Oh, we will still complain about how they die too soon too.
We may even have CPUs that run at light speed with many dozens of cores,
but still too dang slow.  ;-)  Pass the rice please.

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Daniel Frey wrote:
 On 06/25/2014 10:44 AM, Rich Freeman wrote:
 Like I said, I'm certainly interested in any actual data that supports
 that drives sold to run 24x7 last any longer than desktop drives when
 run 24x7.

 Anecdotal, but...

 In 2008 I bought four 24x7 drives (500GB) and eight regular drives to be
 used in raid. Out of the eight regular drives, six failed before 4 years
 was up.

 All of the 24x7 drives are still in use (although I don't remember which
 machine(s) they're in now), six years later.

 All Seagate.

 I initially did do warranty replacement on the failed drives (all drives
 had 5 year warranty back then), and out of the six replacements, four
 failed a little over three months in.

 At that point I went and bought a real battery backed raid card
 (computer still has a UPS) with WD enterprise drives and no hiccups of
 any kind in about two years. And disk performance is way, way up.

 Dan



Curious.  I hope I don't start a flame war here.  I have had WD, Seagate
and I think there is a Samsung here somewhere, may be the one that is
rolling over on its back now.  The one drive that failed a few years ago
was a WD drive.  That said, all the other WD drives I have had just got
too small to really use, and slow when SATA came out.  I'm partial to WD
and Seagate still since I got good long term use out of those.  Based on
your experience, you tend to be of the same opinion? 

Allan, your situation should involve a lot of hard drives.  Any
thoughts?  Neil, you have a nice big opinion on this? 

I realize that any brand of drive will break eventually.  That's one
reason I don't hold the one failure I have had against WD.  I got a lot
of use out of that drive and it did let me know it was going to die,
like real soon.  I'm going to duck now.  :/

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Dale
Rich Freeman wrote:
 On Wed, Jun 25, 2014 at 10:07 PM, Dale rdalek1...@gmail.com wrote:
 Rich Freeman wrote:
 I don't have anything on the cloud to backup too.  That would likely be
 a good idea but I can't afford anything pricey, which is why I hadn't
 bought a backup drive before now either.  Plus, something I'd prefer to
 keep under my thumb.  Heck, some things here are encrypted, bank info
 and such.  Also, while I have DSL, it ain't real speedy.  Backing up
 that much data over my connection could take a while, like days, maybe
 even a week or more.

 I put my backups on Amazon S3 reduced-redundancy - it is a few cents
 per GB per month.  I think I have something like 20-30GB backed up.
 Oh, if you need to actually retrieve it that will cost you 10 cents
 per GB, but frankly if my house burned down that would be the least of
 my concerns.

 I'd only use the cloud to back up critical data.  If you want to back
 up your mythtv and mp3 collection, then you're going to be uploading a
 LOT of data and paying quite a bit to store it.  If you want to be
 storing TB of data offsite there are better ways of doing it.

Outside my camera pics, I don't think I have anything that critical.  I
backed them up on 7 DVDs yesterday.  I been doing that for many years. 
Two sets just to be sure.  I also rotate the DVDs after a while too.  I
burn sysrescue ISOs to it or something. 



 The advantage of something like S3 is that it is always there, which
 means you stick a duplicity script in your crontab and just
 periodically check up on it.  You don't have to remember to do your
 backups.  It just isn't practical to use it for more than a few dozen
 GB depending on your incremental strategy.

 I also have a 50Mbps outbound connection, which doesn't hurt.

Downstream Rate  1536 (Kbits/Sec)
Upstream Rate  384 (Kbits/Sec)


While it ain't super fast, it beats dial-up and I remember those days
very well.  Still pretty slow to do backups over tho.  :/
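
Rough arithmetic for that upstream rate, as a sketch. The 384 Kbits/sec figure
is the one quoted above, 30 GB is the upper end of Rich's estimate for his
critical data, and 1500 GB stands in for the 1.5T that df reports as used on
/home. It assumes decimal units and no protocol overhead, so real uploads
would be slower. The function name upload_days is made up.

```shell
#!/bin/sh
# upload_days SIZE_GB RATE_KBIT: days needed to upload SIZE_GB gigabytes
# at RATE_KBIT kilobits per second, ignoring all protocol overhead.
upload_days() {
    awk -v gb="$1" -v kbit="$2" 'BEGIN {
        secs = gb * 1e9 * 8 / (kbit * 1000)   # bits to send / bits per second
        printf "%.1f\n", secs / 86400         # 86400 seconds in a day
    }'
}

upload_days 30 384      # critical data only
upload_days 1500 384    # everything on /home
```

That prints 7.2 and 361.7, so a small set of critical files is about a week
of uploading on this line, while the whole of /home would take most of a year.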



 Your next best option is to find a friend with similar needs and give
 each other a place to upload your encrypted backups to.  That will
 just cost you drive space, but if you're both planning on backing up
 1TB of data it will still cost you the one-time drive purchase.

 If you want a quick cloud-capable backup solution, I'd look at
 duplicity.  I just wish it had options for Google Drive (it supposedly
 does, but as far as I can tell it doesn't work, at least not with a
 two factor application password).

 Rich



I'm just going to try and buy another 3TB drive as soon as I can.  I may
even make it into a removable thingy.  Then I can make backups and just
put it in an outbuilding.  By the way, my outbuilding is pretty far from
the house.  A house fire wouldn't hurt it any.  I got so much junk in
there, a thief would shake his head and leave empty handed.  May even
cry at the thought of it. 

Working up a plan and hoping to work the plan. 

While at it.  Latest test results.  It finished a bit ago. 

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16394         2905482560
# 2  Extended offline    Completed: read failure       60%     16389         2905482560

It is still rolling over.  It should throw up its feet any day now.  :-( 
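
To see where on the disk that first error sits, the LBA from the log can be
turned into a byte offset. This is only a sketch: it assumes 512-byte logical
sectors (the "Sector Sizes" line of smartctl -i shows the real value) and one
log entry per line, as smartctl prints it. Both function names are made up.

```shell
#!/bin/sh
# first_error_lbas: read `smartctl -l selftest` text on stdin and print
# each distinct LBA_of_first_error (the last column, when it is a number).
first_error_lbas() {
    awk '/^# *[0-9]+/ && $NF ~ /^[0-9]+$/ { print $NF }' | sort -un
}

# lba_to_bytes LBA [SECTOR_SIZE]: byte offset of an LBA.
# SECTOR_SIZE defaults to 512, which is an assumption about the drive.
lba_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

lba_to_bytes 2905482560
```

The last line prints 1487607070720, so the failing sector is roughly 1.49 TB
into the drive, and every failed run in the log stops at that same LBA.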

Dale

:-)  :-) 



Re: [gentoo-user] smartctrl drive error @60%

2014-06-25 Thread Mick
On Thursday 26 Jun 2014 03:15:54 Dale wrote:
 Neil Bothwick wrote:
  I have to say, I dread setting up a mail server about as bad as I dread
  going to the Doctor.  It's just something I really don't want to add to
  my system unless I have to.
  
  Which is why I suggesting something like ssmtp, which you can't call  a
  server, it just forwards. Often the only configuration needed is changing
  one line in ssmtp.conf, to the address of your ISP's mail server.
  
  That's it, now any program can send mail using sendmail and it just goes
  to your ISP mailbox.
 
 I like this part:
 
 Extremely simple MTA to get mail off the system to a Mailhub
 
   ^  That part right up there.  :-D  That may be a new thread, if
 needed.

Try this basic setup in your /etc/ssmtp/ssmtp.conf:


root=d...@gmail.com  #Change to your preferred email address

mailhub=smtp.gmail.com:465  #Could also use port 587 for STARTTLS

rewriteDomain=dales_smoker.shack  #Something to denote your machine's name

FromLineOverride=YES

UseTLS=YES  #Can also try UseSTARTTLS=YES as an alternative

AuthUser=d...@gmail.com
AuthPass=dalesgmails3cr3tpasswd #Special characters seem to barf with ssmtp


Set the access rights to 0640, since the file now contains your mail passwd
unencrypted:

# ls -la /etc/ssmtp/ssmtp.conf
-rw-r----- 1 root ssmtp 1696 May 19 23:40 /etc/ssmtp/ssmtp.conf
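
Since the file holds the password in clear text, it is worth checking that
ordinary users cannot read it. A minimal sketch, assuming GNU stat (for the
-c option) and a made-up function name conf_is_private:

```shell
#!/bin/sh
# conf_is_private FILE: succeed only if "other" has no access bits at all,
# i.e. the octal mode ends in 0 (as in 640 or 600).
# Returns 2 if the file cannot be examined.
conf_is_private() {
    mode=$(stat -c '%a' "$1") || return 2
    case "$mode" in
        *0) return 0 ;;
        *)  return 1 ;;
    esac
}

# Typical use:
#   conf_is_private /etc/ssmtp/ssmtp.conf || echo "ssmtp.conf is readable by others"
```

With owner root, group ssmtp and mode 640, root can edit the file, the ssmtp
binary can read it through its group, and nobody else can.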

Add this in your /etc/ssmtp/revaliases:
===
root:d...@gmail.com:smtp.gmail.com:465
dale:d...@gmail.com:smtp.gmail.com:465
other_user:d...@gmail.com:smtp.gmail.com:465
===

Then ping a message to yourself as a test to see that all works fine:

echo "My first test message" | mail -v -s "Test for sSMTP 1" d...@gmail.com

It should then appear in your gmail account (Sent folder).  Set a label/filter 
to find such messages easily and you're done.

-- 
Regards,
Mick




[gentoo-user] smartctrl drive error @60%

2014-06-24 Thread Dale
Howdy,

I run this test every once in a while.  How bad is this:

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -

And better yet, is there any way to tell it to not use that part and
finish the test?  It seems it stopped when it got to that, or I think it
did. 

Thoughts? 

Dale

:-)  :-) 

-- 
I am only responsible for what I said ... Not for what you understood or how 
you interpreted my words!




Re: [gentoo-user] smartctrl drive error @60%

2014-06-24 Thread J. Roeleveld
On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote:
Howdy,

I run this test every once in a while.  How bad is this:

root@fireball / # smartctl -l selftest /dev/sdc
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%     16365         2905482560
# 2  Extended offline    Completed: read failure       60%     16352         2905482560
# 3  Extended offline    Completed without error       00%      8044         -
# 4  Extended offline    Completed without error       00%      3121         -

And better yet, is there any way to tell it to not use that part and
finish the test?  It seems it stopped when it got to that, or I think it
did. 

Thoughts? 

Dale

:-)  :-) 

Dale,

Not sure how to get it to go past. Think that is in the firmware of the disk.

I would start with making a backup first.

--
Joost



Re: [gentoo-user] smartctrl drive error @60%

2014-06-24 Thread Dale
J. Roeleveld wrote:
 On 25 June 2014 01:09:03 CEST, Dale rdalek1...@gmail.com wrote:
 Howdy,

 I run this test every once in a while.  How bad is this:

 root@fireball / # smartctl -l selftest /dev/sdc
 smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
 Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure       60%     16365         2905482560
 # 2  Extended offline    Completed: read failure       60%     16352         2905482560
 # 3  Extended offline    Completed without error       00%      8044         -
 # 4  Extended offline    Completed without error       00%      3121         -

 And better yet, is there any way to tell it to not use that part and
 finish the test?  It seems it stopped when it got to that, or I think it
 did. 

 Thoughts? 

 Dale

 :-)  :-) 
 Dale,

 Not sure how to get it to go past. Think that is in the firmware of the disk.

 I would start with making a backup first.

 --
 Joost

That's a 3TB drive.  I don't have anything big enough to back it up to. 
Is there any way to find out if this error is really serious or just a
run-of-the-mill type error?  I would think that if it was a run-of-the-mill
error the drive would handle the error itself and I wouldn't even
see it.  Something like marking the area as bad and just not trying to
use it anymore, even for the test. 

Thanks.  Any advice is appreciated.  I need a hard drive guru.  ;-)

Here is additional info:

root@fireball / # hdparm -i /dev/sdc

/dev/sdc:

 Model=ST3000DM001-9YN166, FwRev=CC4C, SerialNo=Z1F0PKT5
 Config={ HardSect NotMFM HdSw15uSec Fixed DTR10Mbs RotSpdTol.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=5860533168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=yes: unknown setting WriteCache=enabled
 Drive conforms to: unknown:  ATA/ATAPI-4,5,6,7

 * signifies the current active mode

root@fireball / #
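
For orientation, the failing LBA from the self-test log can be placed on the disk using the LBAsects figure above. This is a rough sketch, assuming 512-byte logical sectors (an assumption, though 5860533168 x 512 bytes does come out at about 3 TB); the function names are made up for illustration:

```shell
#!/bin/bash
bad_lba=2905482560      # LBA_of_first_error from the self-test log
total_lba=5860533168    # LBAsects from hdparm -i
sector_bytes=512        # assumption: 512-byte logical sectors

# Byte offset of a sector from the start of the device
lba_offset_bytes() { echo $(( $1 * $2 )); }

# Position of a sector on the disk, in tenths of a percent (integer math)
lba_permille() { echo $(( $1 * 1000 / $2 )); }

lba_offset_bytes "$bad_lba" "$sector_bytes"   # prints 1487607070720
lba_permille "$bad_lba" "$total_lba"          # prints 495
```

So the bad sector sits roughly 1.49 TB in, a little short of the halfway point of the drive.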



Dale

:-)  :-) 
