Re: Sata Hard drive testing

2021-10-24 Thread Thomas Anderson

Thank you everyone that helped me with this issue.

I learned a great deal.

Namely, for my particular use case, SMR drives are sub-optimal.

It is VERY difficult to even find out whether a drive is SMR or CMR,
because apparently the manufacturers just try to push everything out the
door. As mentioned by Mr. Ritter, the recording technology is not readily
disclosed, and in some cases the marketing borders on deceptive.

I have little faith in this drive, so I will replace it in its original
role as an important backup. I will just use it as a redundant backup or
as a short-term/temporary storage drive - and monitor it.

On 10/21/21 6:08 PM, Kenneth Parker wrote:

> On Thu, Oct 21, 2021, 9:02 AM David wrote:
>
> > On Thu, 21 Oct 2021 at 23:53, Gene Heskett wrote:
> >
> > > And what does this SMR acronym mean, Dan?
> >
> > Questions like this can be answered with an
> > internet search engine. Search for "Seagate SMR".
> > For fun you can add search terms like
> > "controversy guilty dreaded bad press" etc
>
> What?  Seagate Magnetic Recording caught Shingles?  (Sorry!)
>
> Kenneth Parker


Re: Sata Hard drive testing

2021-10-21 Thread Kenneth Parker
On Thu, Oct 21, 2021, 9:02 AM David  wrote:

> On Thu, 21 Oct 2021 at 23:53, Gene Heskett  wrote:
>
> > And what does this SMR acronym mean, Dan?
>
> Questions like this can be answered with an
> internet search engine. Search for "Seagate SMR".
> For fun you can add search terms like
> "controversy guilty dreaded bad press" etc
>

What?  Seagate Magnetic Recording caught Shingles?  (Sorry!)

Kenneth Parker



Re: Sata Hard drive testing

2021-10-21 Thread piorunz

On 21/10/2021 03:34, David Christensen wrote:

Very detailed analysis, better than mine! Much appreciated.

--
With kindest regards, Piotr.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄


Re: Sata Hard drive testing

2021-10-21 Thread Gene Heskett
On Thursday 21 October 2021 09:02:01 David wrote:

> On Thu, 21 Oct 2021 at 23:53, Gene Heskett wrote:
> > And what does this SMR acronym mean, Dan?
>
> Questions like this can be answered with an
> internet search engine. Search for "Seagate SMR".
> For fun you can add search terms like
> "controversy guilty dreaded bad press" etc.

Very Interesting and most informative Dan, and thank you.

Carefully reading between the lines of several hits also explains why I 
stopped having major failures of amanda, the backup software, when I 
replaced a 1 TB Seagate with a 240 GB Samsung 860 series SSD. It is used 
as a holding disk while amanda collects its data to be written to tape, 
so the tape is not shoeshined, and at the same time it cut the time to 
do a 20 GB backup from 3 or 4 hours to 20 minutes.

This post is also being sent to the amanda-users list to alert the users 
who are currently being affected by this and not finding a definitive 
answer.

Bottom line for amanda users: do not use today's spinning rust drives 
as holding disks. They are simply not trustworthy for short-term buffer 
storage.

Symptoms are reported to the user as CRC errors in the holding disk. That 
particular DiskList Entry is then repeated until a good CRC is obtained. 
But that also leaves the bad-CRC data on the holding disk, eventually 
using up its capacity unless someone manually cleans up the mess. A 
genuine PITA for us, and it leaves the drive makers in an unfavorable, 
bad dog, no biscuit light.

Cheers, Gene Heskett.
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page 



Re: Sata Hard drive testing

2021-10-21 Thread Andrei POPESCU
On Thu, 21 Oct 21, 16:02:43, Andrei POPESCU wrote:
> On Thu, 21 Oct 21, 08:53:10, Gene Heskett wrote:
> > On Thursday 21 October 2021 08:13:26 Dan Ritter wrote:
> > 
> > > > === START OF INFORMATION SECTION ===
> > > > Device Model: ST8000DM004-2CX188
> > >
> > > You should be able to return this drive without proving that it
> > > is defective; this is one of Seagates' SMR drives that they did
> > > not disclose were SMR.
> > >
> > > -dsr-
> > 
> > And what does this SMR acronym mean, Dan?
> 
> Wikipedia is quite helpful for such queries.
> 
> Searching for 'SMR' brings one to a fairly long disambiguation page, but 
> there is only one entry mentioning hard drives ;)
> 
> The first two paragraphs of the article is already quite informative on 
> why one might want to avoid SMR drives.

And let's not forget about
https://www.truenas.com/community/resources/list-of-known-smr-drives.141/

Kind regards,
Andrei
-- 
http://wiki.debian.org/FAQsFromDebianUser




Re: Sata Hard drive testing

2021-10-21 Thread Andrei POPESCU
On Thu, 21 Oct 21, 08:53:10, Gene Heskett wrote:
> On Thursday 21 October 2021 08:13:26 Dan Ritter wrote:
> 
> > > === START OF INFORMATION SECTION ===
> > > Device Model: ST8000DM004-2CX188
> >
> > You should be able to return this drive without proving that it
> > is defective; this is one of Seagates' SMR drives that they did
> > not disclose were SMR.
> >
> > -dsr-
> 
> And what does this SMR acronym mean, Dan?

Wikipedia is quite helpful for such queries.

Searching for 'SMR' brings one to a fairly long disambiguation page, but 
there is only one entry mentioning hard drives ;)

The first two paragraphs of the article are already quite informative on 
why one might want to avoid SMR drives.

Kind regards,
Andrei
-- 
http://wiki.debian.org/FAQsFromDebianUser




Re: Sata Hard drive testing

2021-10-21 Thread David
On Thu, 21 Oct 2021 at 23:53, Gene Heskett  wrote:

> And what does this SMR acronym mean, Dan?

Questions like this can be answered with an
internet search engine. Search for "Seagate SMR".
For fun you can add search terms like
"controversy guilty dreaded bad press" etc.



Re: Sata Hard drive testing

2021-10-21 Thread Gene Heskett
On Thursday 21 October 2021 08:13:26 Dan Ritter wrote:

> > === START OF INFORMATION SECTION ===
> > Device Model: ST8000DM004-2CX188
>
> You should be able to return this drive without proving that it
> is defective; this is one of Seagates' SMR drives that they did
> not disclose were SMR.
>
> -dsr-

And what does this SMR acronym mean, Dan?

Cheers, Gene Heskett.
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page 



Re: Sata Hard drive testing

2021-10-21 Thread Dan Ritter
Thomas Anderson wrote: 
> Here are the results, of my smartctl test:
> 
> I am trying to parse them myself, to see if I can learn anything. But,
> immediate glance
> 
> and queries did not reveal anything that could help me determine if the
> drive is good or not.
> 
> 
> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-18-amd64] (local build)
> Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Device Model: ST8000DM004-2CX188


You should be able to return this drive without proving that it
is defective; this is one of Seagate's SMR drives that they did
not disclose as SMR.

-dsr-



Re: Sata Hard drive testing

2021-10-21 Thread Andrei POPESCU
On Thu, 21 Oct 21, 02:12:25, piorunz wrote:
> 
> Lastly, you should use it in RAID1 or similar mode, to ensure there is
> always backup of data this drive keeps.

RAID is meant only for uptime, backup is something else, see: 
http://taobackup.com/.

> You can try Btrfs raid1 or mdadm raid1.

Mostly agreed with btrfs[1] (though ZFS is probably better, if you are 
fine with out-of-tree kernel modules[2]).

On the other hand, mdadm can't detect silent corruption. If a flaky 
drive (like the one discussed here) returns corrupted data while 
claiming it is good, mdadm RAID can't detect this, and even if a 
mismatch between the mirrors is noticed, it has no way to determine 
which copy is good.

Worst case it might even overwrite the good data with the bad one 
(because the bad data appears to be newer).
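
A minimal Python sketch of that difference. It is not how mdadm, btrfs
or ZFS are implemented; the function names and the choice of SHA-256 are
assumptions made only for illustration. With two mirrored copies alone a
mismatch can be seen but not resolved; with a checksum recorded at write
time the good copy can be identified.

```python
import hashlib
from typing import Optional


def checksum(block: bytes) -> str:
    """Checksum recorded at write time (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(block).hexdigest()


def pick_without_checksum(copy_a: bytes, copy_b: bytes) -> Optional[bytes]:
    """Plain mirror: a mismatch is visible, but neither copy can be trusted."""
    if copy_a == copy_b:
        return copy_a
    return None  # no basis for choosing one copy over the other


def pick_with_checksum(copy_a: bytes, copy_b: bytes, stored: str) -> Optional[bytes]:
    """Checksumming mirror: return the first copy matching the stored checksum."""
    for candidate in (copy_a, copy_b):
        if checksum(candidate) == stored:
            return candidate
    return None  # both copies are damaged


good = b"important backup data"
bad = b"important backup dat\x00"   # hypothetical silently corrupted copy
stored = checksum(good)             # recorded when the data was written

print(pick_without_checksum(bad, good))       # None: cannot decide
print(pick_with_checksum(bad, good, stored))  # b'important backup data'
```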

> Whatever you choose, try not to use this drive without any
> backup, as you are not sure yet if damage are not progressing.

Yep.

[1] 
https://arstechnica.com/gadgets/2021/09/examining-btrfs-linuxs-perpetually-half-finished-filesystem/
[2] DKMS makes this mostly painless

Kind regards,
Andrei
-- 
http://wiki.debian.org/FAQsFromDebianUser




Re: Sata Hard drive testing

2021-10-20 Thread Reco
Hi.

On Thu, Oct 21, 2021 at 01:45:52AM +0200, Thomas Anderson wrote:
> I am trying to parse them myself, to see if I can learn anything. But,
> immediate glance and queries did not reveal anything that could help
> me determine if the drive is good or not.

It's not. You have a Seagate, after all, and those are good only as far
as the trash can is concerned :)


But anyway, it's not new.

>   9 Power_On_Hours  0x0032   084   084   000    Old_age  Always   -   14558 (99 41 0)

It has bad sectors, a small amount compared to the drive size.

> 183 Runtime_Bad_Block   0x0032   095   095   000    Old_age  Always   -   5
> 187 Reported_Uncorrect  0x0032   001   001   000    Old_age  Always   -   1334
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always   -   8
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline  -   8


And you had some problems with the drive in the past, which could be a
bad SATA cable, but could be the drive itself:

> Error 1334 occurred at disk power-on lifetime: 10525 hours (438 days + 13 hours)
>   When the command that caused the error occurred, the device was active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   60 00 08 ff ff ff 4f 00  15d+00:06:46.256  READ FPDMA QUEUED
>   ef 10 02 00 00 00 a0 00  15d+00:06:46.247  SET FEATURES [Enable SATA feature]
>   27 00 00 00 00 00 e0 00  15d+00:06:46.220  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
>   ec 00 00 00 00 00 a0 00  15d+00:06:46.217  IDENTIFY DEVICE
>   ef 03 46 00 00 00 a0 00  15d+00:06:46.205  SET FEATURES [Set transfer mode]
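
The decimal LBA in that error entry can be reproduced from the register
bytes. A minimal Python sketch, assuming the classic 28-bit register
layout (LBA bits 0-7 in SN, 8-15 in CL, 16-23 in CH, 24-27 in the low
nibble of DH); the function name is made up for illustration.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine a 28-bit LBA from the SN, CL, CH and DH register bytes."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register bytes from the "After command completion" line: SN CL CH DH
lba = lba28_from_registers(0xFF, 0xFF, 0xFF, 0x0F)
print(lba)  # 268435455
```

268435455 is also the largest value 28 bits can hold, so on an 8 TB
drive it may well not be the real location of the unreadable sector.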


Assuming you make backups, I'd call this drive serviceable. I'd replace
it sooner or later, because it has bad sectors, but it won't be the
first priority.

Reco



Re: Sata Hard drive testing

2021-10-20 Thread David Christensen

> On 10/18/21 6:52 PM, Reco wrote:
>>Hi.
>>
>> On Mon, Oct 18, 2021 at 06:25:19PM +0200, Thomas Anderson wrote:
>>> I have been having problems with a drive (non-SSD) for a while now,
>>> but I would like to "identify" the problem specifically, so that I may
>>> perhaps be able to get the drive replaced.
>> Assuming it's SATA/IDE drive, all you need to do is:
>>
>> apt install smartmontools
>> smartctl -t long 
>> # wait for the test to finish
>> smartctl -a 
>>
>> Please post the output of the last command.
>>
>> Reco


On 10/20/21 4:45 PM, Thomas Anderson wrote:

Here are the results, of my smartctl test:



Please use inline posting style, not top posting.



I am trying to parse them myself, to see if I can learn anything. But,
immediate glance

and queries did not reveal anything that could help me determine if the
drive is good or not.


smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-18-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: ST8000DM004-2CX188



So, an 8 TB desktop drive.


Here is the datasheet:

https://www.seagate.com/www-content/datasheets/pdfs/3-5-barracudaDS1900-11-1806US-en_US.pdf


Here is the SMART attributes specification:

http://t1.daumcdn.net/brunch/service/user/axm/file/zRYOdwPu3OMoKYmBOby1fEEQEbU.pdf



Serial Number:    ZCT1706V
LU WWN Device Id: 5 000c50 0c2b1cd83
Firmware Version: 0001
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:  3.5 inches
Device is:    Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Oct 20 14:36:49 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED



"PASSED" is a good thing.



General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                     was never started.
                     Auto Offline Data Collection: Disabled.
Self-test execution status:  (   0)    The previous self-test
routine completed
                     without error or no self-test has ever
                     been run.



No self-test errors is a good thing.



Total time to complete Offline
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x73) SMART execute Offline immediate.
                     Auto Offline data collection on/off support.
                     Suspend Offline collection upon new
                     command.
                     No Offline surface scan supported.
                     Self-test supported.
                     Conveyance Self-test supported.
                     Selective Self-test supported.
SMART capabilities:    (0x0003)    Saves SMART data before entering
                     power-saving mode.
                     Supports SMART auto save timer.
Error logging capability:    (0x01)    Error logging supported.
                     General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 987) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:        (0x30a5)    SCT Status supported.
                     SCT Data Table supported.

SMART Attributes Data Structure revision number: 10



Decoding SMART attributes can be a challenge.  In addition to the 
Seagate spec above, see:


https://wiki.unraid.net/Understanding_SMART_Reports#SMART_attributes_section
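
As a rough aid, one attribute row can be split into its columns with a
few lines of Python. This is only a sketch: it assumes the row has been
unwrapped onto a single line, and the class and function names are
invented here. smartctl reports an attribute as failed when its
normalized VALUE is less than or equal to THRESH, which is why a large
raw value does not by itself change the overall "PASSED" result.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int    # normalized value, higher is generally better
    worst: int
    thresh: int
    raw: str      # vendor-specific raw value, kept as text


def parse_attribute(row: str) -> SmartAttribute:
    """Split one unwrapped attribute row into fields."""
    # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
    parts = row.strip().split(None, 9)  # the raw value may contain spaces
    return SmartAttribute(
        attr_id=int(parts[0]),
        name=parts[1],
        value=int(parts[3]),
        worst=int(parts[4]),
        thresh=int(parts[5]),
        raw=parts[9],
    )


def has_failed(attr: SmartAttribute) -> bool:
    """True when the normalized value is at or below the threshold."""
    return attr.value <= attr.thresh


row = "187 Reported_Uncorrect  0x0032   001   001   000    Old_age  Always   -   1334"
attr = parse_attribute(row)
print(attr.name, attr.value, attr.thresh, attr.raw, has_failed(attr))
```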



Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE UPDATED
WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate 0x000f   064   051   006    Pre-fail
Always   -   7226480
   3 Spin_Up_Time    0x0003   093   091   000    Pre-fail
Always   -   0
   4 Start_Stop_Count    0x0032   100   100   020    Old_age
Always   -   361
   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail
Always   -   0
   7 Seek_Error_Rate 0x000f   080   060   045    Pre-fail
Always   -   104115990
   9 Power_On_Hours  0x0032   084   084   000    Old_age
Always   -   14558 (99 41 0)



AIUI your drive had been powered up for a total of 14,558 hours over its 
lifetime when the report was generated.



The datasheet spec is 2,400 power-on hours per year (a full year is 8,760 hours).
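
To put those figures side by side, a small Python calculation (the
variable names are illustrative; the numbers are the ones quoted in this
message):

```python
power_on_hours = 14558        # raw value of attribute 9 in the report
rated_hours_per_year = 2400   # datasheet figure
hours_per_year = 8760         # 365 days * 24 hours

years_at_rated_use = power_on_hours / rated_hours_per_year
years_if_always_on = power_on_hours / hours_per_year

print(f"{years_at_rated_use:.1f} years of use at the rated hours per year")
print(f"{years_if_always_on:.1f} years if powered on 24/7")
```

So the drive has logged roughly six years' worth of its rated annual
usage, or a little under two years of continuous operation.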



  10 Spin_Retry_Count    0x0013   100   100   097    Pre-fail
Always   -   0
  12 Power_Cycle_Count   0x0032   100   100   020    Old_age
Always   -   77
183 Runtime_Bad_Block   0x0032   095   095   000    Old_age
Always   -   5



AIUI 

Re: Sata Hard drive testing

2021-10-20 Thread piorunz

On 21/10/2021 00:45, Thomas Anderson wrote:

Here are the results, of my smartctl test:


Five metrics from smartctl require attention:

Device Model  ST8000DM004-2CX188
9   Power_On_Hours  14558

183 Runtime_Bad_Block   5
187 Reported_Uncorrect  1334
195 Hardware_ECC_Recovered  7226480
197 Current_Pending_Sector  8
198 Offline_Uncorrectable   8

I have similar Seagate drives (4), and they look like this:

Device Model  ST2000DM006-2DM164
9   Power_On_Hours  20124,20125,23527,23511

183 Runtime_Bad_Block   2,3,1,1 yours 5
187 Reported_Uncorrect  0,0,0,0 yours 1334
195 Hardware_ECC_Recovered my HDDs don't have that field, yours 7226480
197 Current_Pending_Sector  0,0,0,0 yours 8
198 Offline_Uncorrectable   0,0,0,0 yours 8

Clearly you have a problem with surface errors: 1334 reported
uncorrectable errors, 8 current pending sectors and so on. However, the
last full surface read test (the SMART "long" test you performed at
14551 hrs, 7 hours before fetching this SMART data) has passed, meaning
the drive is still able to read all sectors. That means it still has
spares and/or it doesn't have any more unknown bad sectors.

Another piece of good news is that the SMART log recorded the last error
at 10525 power-on hours. You are at 14558 hrs. That's 4033 hrs, or 168
days, ago if you run it 24/7. That means the drive is not discovering
new bad sectors, either because you don't read or write to the surface
spots where undiscovered bad sectors are, or simply because the
remainder of the drive is in good condition.
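
The arithmetic behind those figures, as a short Python check (variable
names are illustrative only):

```python
last_error_hours = 10525   # power-on hours at error 1334 in the SMART error log
current_hours = 14558      # attribute 9, Power_On_Hours

hours_since_error = current_hours - last_error_hours
days_if_always_on = hours_since_error / 24

print(hours_since_error)          # 4033
print(round(days_if_always_on))   # 168
```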

If your drive is still under warranty, you should return it; most likely
it will be accepted based solely on the smartctl results. If not, you can
do the following:

Change the data cable first thing, as Gene also suggested.

Watch these five metrics very closely:
183 Runtime_Bad_Block   5
187 Reported_Uncorrect  1334
195 Hardware_ECC_Recovered  7226480
197 Current_Pending_Sector  8
198 Offline_Uncorrectable   8

If any of these rise rapidly over time (maybe except ECC errors, which
are usually kinda normal nowadays), you may consider retiring it.

Also run a SMART "long" test every month or so. Check the SMART results
after each test.
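
One way to keep an eye on these counters is to note the raw values after
each test and compare them with the previous reading. A minimal Python
sketch; the function name is invented, and the "later" reading is
hypothetical data, not something measured on this drive.

```python
from typing import Dict

WATCHED = {
    183: "Runtime_Bad_Block",
    187: "Reported_Uncorrect",
    195: "Hardware_ECC_Recovered",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def changed_attributes(previous: Dict[int, int], current: Dict[int, int]) -> Dict[str, int]:
    """Return the increase in raw value for every watched attribute that grew."""
    growth = {}
    for attr_id, name in WATCHED.items():
        delta = current.get(attr_id, 0) - previous.get(attr_id, 0)
        if delta > 0:
            growth[name] = delta
    return growth


baseline = {183: 5, 187: 1334, 195: 7226480, 197: 8, 198: 8}   # from this report
later = {183: 5, 187: 1340, 195: 7300000, 197: 12, 198: 8}     # hypothetical reading

print(changed_attributes(baseline, later))
```

An empty result means none of the watched counters has grown since the
previous reading.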

Lastly, you should use it in RAID1 or a similar mode, to ensure there is
always a backup of the data this drive keeps. You can try Btrfs raid1 or
mdadm raid1. Whatever you choose, try not to use this drive without any
backup, as you are not sure yet whether the damage is progressing.

--
With kindest regards, Piotr.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄



Re: Sata Hard drive testing

2021-10-20 Thread Gene Heskett
On Wednesday 20 October 2021 19:45:52 Thomas Anderson wrote:

> Here are the results, of my smartctl test:
>
This would be much easier to decode if you turned off wordwrap, and 
repasted from that terminal screen.

> I am trying to parse them myself, to see if I can learn anything. But,
> immediate glance
>
> and queries did not reveal anything that could help me determine if
> the drive is good or not.
>
>
> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-18-amd64] (local
> build) Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> www.smartmontools.org
>
> === START OF INFORMATION SECTION ===
> Device Model: ST8000DM004-2CX188
> Serial Number:    ZCT1706V
> LU WWN Device Id: 5 000c50 0c2b1cd83
> Firmware Version: 0001
> User Capacity:    8,001,563,222,016 bytes [8.00 TB]
> Sector Sizes: 512 bytes logical, 4096 bytes physical
> Rotation Rate:    5425 rpm
> Form Factor:  3.5 inches
> Device is:    Not in smartctl database [for details use: -P
> showall] ATA Version is:   ACS-3 T13/2161-D revision 5
> SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
> Local Time is:    Wed Oct 20 14:36:49 2021 CEST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x00)    Offline data collection
> activity was never started.
>                     Auto Offline Data Collection: Disabled.
> Self-test execution status:  (   0)    The previous self-test
> routine completed
>                     without error or no self-test has ever
>                     been run.
> Total time to complete Offline
> data collection:         (    0) seconds.
> Offline data collection
> capabilities:              (0x73) SMART execute Offline immediate.
>                     Auto Offline data collection on/off support.
>                     Suspend Offline collection upon new
>                     command.
>                     No Offline surface scan supported.
>                     Self-test supported.
>                     Conveyance Self-test supported.
>                     Selective Self-test supported.
> SMART capabilities:    (0x0003)    Saves SMART data before
> entering power-saving mode.
>                     Supports SMART auto save timer.
> Error logging capability:    (0x01)    Error logging supported.
>                     General Purpose Logging supported.
> Short self-test routine
> recommended polling time:      (   1) minutes.
> Extended self-test routine
> recommended polling time:      ( 987) minutes.
> Conveyance self-test routine
> recommended polling time:      (   2) minutes.
> SCT capabilities:        (0x30a5)    SCT Status supported.
>                     SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE UPDATED 
> WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate 0x000f   064   051   006    Pre-fail
> Always   -   7226480
>   3 Spin_Up_Time    0x0003   093   091   000    Pre-fail
> Always   -   0
>   4 Start_Stop_Count    0x0032   100   100   020    Old_age
> Always   -   361
>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail
> Always   -   0
none is good
>   7 Seek_Error_Rate 0x000f   080   060   045    Pre-fail
> Always   -   104115990
>   9 Power_On_Hours  0x0032   084   084   000    Old_age
> Always   -   14558 (99 41 0)
relatively young
>  10 Spin_Retry_Count    0x0013   100   100   097    Pre-fail
> Always   -   0
>  12 Power_Cycle_Count   0x0032   100   100   020    Old_age
> Always   -   77
this is the single most dangerous time for a spinning rust drive

> 183 Runtime_Bad_Block   0x0032   095   095   000    Old_age
> Always   -   5
> 184 End-to-End_Error    0x0032   100   100   099    Old_age
> Always   -   0
> 187 Reported_Uncorrect  0x0032   001   001   000    Old_age
> Always   -   1334
> 188 Command_Timeout 0x0032   100   100   000    Old_age
> Always   -   0
> 189 High_Fly_Writes 0x003a   100   100   000    Old_age
> Always   -   0
> 190 Airflow_Temperature_Cel 0x0022   070   052   040    Old_age
> Always   -   30 (Min/Max 28/33)
operating temps good
> 191 G-Sense_Error_Rate  0x0032   100   100   000    Old_age
> Always   -   0
> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age
> Always   -   25
should normally match power cycle?
> 193 Load_Cycle_Count    0x0032   099   099   000    Old_age
> Always   -   3362
> 194 Temperature_Celsius 0x0022   030   048   000    Old_age
> Always   -   30 (0 16 0 0 0)
> 195 Hardware_ECC_Recovered  0x001a   069   064   000    

Re: Sata Hard drive testing

2021-10-20 Thread Thomas Anderson
Here are the results, of my smartctl test:

I am trying to parse them myself, to see if I can learn anything. But,
immediate glance

and queries did not reveal anything that could help me determine if the
drive is good or not.


smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-18-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: ST8000DM004-2CX188
Serial Number:    ZCT1706V
LU WWN Device Id: 5 000c50 0c2b1cd83
Firmware Version: 0001
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:  3.5 inches
Device is:    Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Oct 20 14:36:49 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:  (   0)    The previous self-test
routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:    (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:    (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 987) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:        (0x30a5)    SCT Status supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE UPDATED 
WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate 0x000f   064   051   006    Pre-fail
Always   -   7226480
  3 Spin_Up_Time    0x0003   093   091   000    Pre-fail
Always   -   0
  4 Start_Stop_Count    0x0032   100   100   020    Old_age
Always   -   361
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail
Always   -   0
  7 Seek_Error_Rate 0x000f   080   060   045    Pre-fail
Always   -   104115990
  9 Power_On_Hours  0x0032   084   084   000    Old_age
Always   -   14558 (99 41 0)
 10 Spin_Retry_Count    0x0013   100   100   097    Pre-fail
Always   -   0
 12 Power_Cycle_Count   0x0032   100   100   020    Old_age
Always   -   77
183 Runtime_Bad_Block   0x0032   095   095   000    Old_age
Always   -   5
184 End-to-End_Error    0x0032   100   100   099    Old_age
Always   -   0
187 Reported_Uncorrect  0x0032   001   001   000    Old_age
Always   -   1334
188 Command_Timeout 0x0032   100   100   000    Old_age
Always   -   0
189 High_Fly_Writes 0x003a   100   100   000    Old_age
Always   -   0
190 Airflow_Temperature_Cel 0x0022   070   052   040    Old_age
Always   -   30 (Min/Max 28/33)
191 G-Sense_Error_Rate  0x0032   100   100   000    Old_age
Always   -   0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age
Always   -   25
193 Load_Cycle_Count    0x0032   099   099   000    Old_age
Always   -   3362
194 Temperature_Celsius 0x0022   030   048   000    Old_age
Always   -   30 (0 16 0 0 0)
195 Hardware_ECC_Recovered  0x001a   069   064   000    Old_age
Always   -   7226480
197 Current_Pending_Sector  0x0012   100   100   000    Old_age
Always   -   8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age
Offline  -   8
199 UDMA_CRC_Error_Count    0x003e   200   198   000    Old_age
Always   -   15
240 Head_Flying_Hours   0x   100   253   000    Old_age
Offline  -   2368 (67 0 0)
241 Total_LBAs_Written  0x   100   253   000    Old_age
Offline  -   

Re: Sata Hard drive testing

2021-10-18 Thread Stefan Monnier
> Essentially, I have been experienced data loss, where nodes become
> unreadable, when I try to "fix it" with fdisk, it says it moves unreadable
> to the trash, basically deleting data.

"it says"?  Can you clarify what is this "it"?

Drive-level errors of "unreadable data" normally lead to errors that
appear in the kernel log, say scary things such as "read sense", and
include hex numbers showing the actual bytes of the SATA command sent.

I don't expect such error messages to say anything about a "trash".
So maybe your errors occur elsewhere.


Stefan



Re: Sata Hard drive testing

2021-10-18 Thread David Christensen

On 10/18/21 9:25 AM, Thomas Anderson wrote:

Hello Gurus,

I have been having problems with a drive (non-SSD) for a while now, but 
I would like to "identify" the problem specifically, so that I may 
perhaps be able to get the drive replaced.


Essentially, I have been experienced data loss, where nodes become 
unreadable, when I try to "fix it" with fdisk, it says it moves 
unreadable to the trash, basically deleting data.


I have tried reformatting the drive, writing zeros to it, and then 
temporarily the drive looks fine; but it's not! it's as if the drive is 
some how decaying: where it looks good, but then when you go to use it, 
the data later becomes unreadable.


Is there any kind of test I can execute on this drive to show it needs 
to be replaced??


Thanks!



The canonical testing tool is the one made by the drive manufacturer. 
This is the definitive way to obtain a warranty RMA.  For example, Seagate:


https://www.seagate.com/support/downloads/seatools/


As suggested by other readers, smartctl(8) can be used to obtain 
information from the drive controller, direct the controller to perform 
short and/or long self-tests, etc.  If your drive is out of warranty, 
you want a second opinion, you want to do periodic monitoring/logging, 
etc., this tool works well.



But before you blame the drive, you also have to eliminate the SATA 
interface (typically on the motherboard), the SATA cable, the power 
supply, and all connections.  This means you need at least two 
interfaces, two cables, and two drives; it is best to have a second 
computer with two interfaces to confirm.  Test all of the permutations, 
work up a truth table of results, and find the bad component(s) by a 
process of elimination.  Beware that you may have more than one bad 
component, and that the failures may be intermittent.
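
The truth table can be worked through mechanically. A minimal Python
sketch; the component names and the test outcomes are hypothetical, and
it assumes each combination gives a repeatable pass/fail result, which
intermittent faults will not.

```python
from itertools import product

interfaces = ["port_A", "port_B"]
cables = ["cable_1", "cable_2"]
drives = ["drive_X", "drive_Y"]


def suspects(results):
    """Components seen in failing combinations but in no passing one."""
    in_pass = set()
    in_fail = set()
    for combo, ok in results.items():
        (in_pass if ok else in_fail).update(combo)
    return in_fail - in_pass


# Hypothetical outcomes: every combination that uses cable_2 fails.
results = {
    combo: "cable_2" not in combo
    for combo in product(interfaces, cables, drives)
}

for combo, ok in results.items():
    print(combo, "ok" if ok else "FAIL")
print("suspect:", suspects(results))
```

With more than one bad component the result can contain several
suspects, so the full table is still worth reading by eye.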



David



Re: Sata Hard drive testing

2021-10-18 Thread basti
Have a look at "smartctl". It will print some info about the drive and
can also "test" your drive (known as a S.M.A.R.T. test).

First of all I would have a look at what "smartctl -a /dev/sdX" says.

On 18.10.21 at 18:25, Thomas Anderson wrote:
> Hello Gurus,
> 
> I have been having problems with a drive (non-SSD) for a while now, but
> I would like to "identify" the problem specifically, so that I may
> perhaps be able to get the drive replaced.
> 
> Essentially, I have been experienced data loss, where nodes become
> unreadable, when I try to "fix it" with fdisk, it says it moves
> unreadable to the trash, basically deleting data.
> 
> I have tried reformatting the drive, writing zeros to it, and then
> temporarily the drive looks fine; but it's not! it's as if the drive is
> some how decaying: where it looks good, but then when you go to use it,
> the data later becomes unreadable.
> 
> Is there any kind of test I can execute on this drive to show it needs
> to be replaced??
> 
> Thanks!
> 



Re: Sata Hard drive testing

2021-10-18 Thread Thomas Anderson

Cool, thanks Reco!

Will post again tomorrow =) Takes 987 minutes apparently.

On 10/18/21 6:52 PM, Reco wrote:

> Hi.
>
> On Mon, Oct 18, 2021 at 06:25:19PM +0200, Thomas Anderson wrote:
> > I have been having problems with a drive (non-SSD) for a while now,
> > but I would like to "identify" the problem specifically, so that I may
> > perhaps be able to get the drive replaced.
>
> Assuming it's SATA/IDE drive, all you need to do is:
>
> apt install smartmontools
> smartctl -t long 
> # wait for the test to finish
> smartctl -a 
>
> Please post the output of the last command.
>
> Reco





Re: Sata Hard drive testing

2021-10-18 Thread Reco
Hi.

On Mon, Oct 18, 2021 at 06:25:19PM +0200, Thomas Anderson wrote:
> I have been having problems with a drive (non-SSD) for a while now,
> but I would like to "identify" the problem specifically, so that I may
> perhaps be able to get the drive replaced.

Assuming it's SATA/IDE drive, all you need to do is:

apt install smartmontools
smartctl -t long 
# wait for the test to finish
smartctl -a 

Please post the output of the last command.

Reco



Re: Sata Hard drive testing

2021-10-18 Thread Felix Miata
Thomas Anderson composed on 2021-10-18 18:25 (UTC+0200):

> I have been having problems with a drive (non-SSD) for a while now, but 
> I would like to "identify" the problem specifically, so that I may 
> perhaps be able to get the drive replaced.

> Essentially, I have been experienced data loss, where nodes become 
> unreadable, when I try to "fix it" with fdisk, it says it moves 
> unreadable to the trash, basically deleting data.

> I have tried reformatting the drive, writing zeros to it, and then 
> temporarily the drive looks fine; but it's not! it's as if the drive is 
> some how decaying: where it looks good, but then when you go to use it, 
> the data later becomes unreadable.

> Is there any kind of test I can execute on this drive to show it needs 
> to be replaced??  
>   
>   
man smartctl
-- 
Evolution as taught in public schools is, like religion,
based on faith, not based on science.

 Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!

Felix Miata



Sata Hard drive testing

2021-10-18 Thread Thomas Anderson

Hello Gurus,

I have been having problems with a drive (non-SSD) for a while now, but 
I would like to "identify" the problem specifically, so that I may 
perhaps be able to get the drive replaced.


Essentially, I have been experiencing data loss, where nodes become 
unreadable; when I try to "fix it" with fdisk, it says it moves the 
unreadable nodes to the trash, basically deleting data.


I have tried reformatting the drive, writing zeros to it, and then 
temporarily the drive looks fine; but it's not! It's as if the drive is 
somehow decaying: it looks good, but then when you go to use it, 
the data later becomes unreadable.


Is there any kind of test I can execute on this drive to show it needs 
to be replaced??


Thanks!