On 5/22/26 12:43, nwe wrote:
On 5/22/26 1:32 PM, Van Snyder wrote:

I remarked to a local computer repair shop that I have a four TB backup drive. He said "replace it. Four TB isn't ready yet."

How so? I thought 4TB is showing its age...


+1  I am also curious why 4 TB HDD's are not "ready yet".


I'm running 12x 4TB drives. Used SAS drives. Accumulated power-on time ranges from 40,166 to 73,439 hours.

Smartctl informs me device /dev/sdf is worsening with increased read errors over time. That one shows 73408 hours powered up, 72360.67 GB read, 119545.193 GB written, 195 power cycles (13 since July 13 2024). Defect list increased from 3 to 6872.

Two other drives have defect lists of 23 and 14, respectively; all the others are at 0. Given that and the age of the pool, I should probably prioritize replacing at least sdf soon, so that I do not lose redundancy during a resilver.
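To put those power-on figures in perspective, here is a minimal sketch that converts hours to years. It assumes continuous operation and an average year of 365.25 days; the helper name is only illustrative.

```python
# Convert accumulated power-on hours to years of continuous operation.
HOURS_PER_YEAR = 24 * 365.25  # 8766 hours, averaging in leap years


def hours_to_years(hours: float) -> float:
    """Return the number of years represented by a power-on hour count."""
    return hours / HOURS_PER_YEAR


# The youngest and oldest drives quoted above.
for hours in (40_166, 73_439):
    print(f"{hours:>6} h = {hours_to_years(hours):.1f} years")
```

By this estimate the drives have been spinning for roughly 4.6 to 8.4 years.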


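I am still trying to understand the smartctl(8) "SMART Attributes Data Structure". The RAW_VALUE format is vendor-specific; for several attributes it seems to be a packed bit field (?) and is useless without manufacturer engineering data. The VALUE column is a normalized value rather than a true percentage: it typically starts at 100 (some vendors start at 200 or 253) and goes down as the disk wears out, and smartctl reports an attribute as failed once VALUE drops to or below THRESH: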

* Raw_Read_Error_Rate, Seek_Error_Rate, and Hardware_ECC_Recovered can have low VALUE numbers, but the disk seems to keep working.

* Low VALUE numbers for Reallocated_Sector_Ct, Current_Pending_Sector, and/or Offline_Uncorrectable seem to be reliable indicators of a failing disk.

* I have not seen a VALUE number other than 100% for End-to-End_Error.
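The rule of thumb in the list above can be sketched as a small parser. This is only an illustration: the sample rows are made up (they merely follow the column layout that "smartctl -A" prints for ATA disks), and the warn_below cutoff of 100 is my assumption, not a smartmontools setting.

```python
# Attributes that the list above treats as reliable failure indicators.
CRITICAL = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}

# Hypothetical sample in the "smartctl -A" column layout; not from a real disk.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   075   063   044    Pre-fail  Always       -       38562817
  5 Reallocated_Sector_Ct   0x0033   091   091   010    Pre-fail  Always       -       1208
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""


def parse_attributes(text):
    """Parse attribute rows into dicts, skipping the header and blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "name": fields[1],
            "value": int(fields[3]),
            "thresh": int(fields[5]),
            "raw": " ".join(fields[9:]),
        })
    return rows


def worrying(rows, warn_below=100):
    """Names of critical attributes that failed (VALUE <= THRESH) or dropped below warn_below."""
    return [
        r["name"] for r in rows
        if r["name"] in CRITICAL
        and (r["value"] <= r["thresh"] or r["value"] < warn_below)
    ]


print(worrying(parse_attributes(SAMPLE)))
```

With the sample data only Reallocated_Sector_Ct is reported; the low Raw_Read_Error_Rate value is deliberately ignored, matching the first bullet.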


nwe@srv01:~$ zpool status -c vendor,model,size
pool: POOL1
state: ONLINE
scan: scrub repaired 0B in 04:10:05 with 0 errors on Sun May 10 04:34:06 2026
config:

NAME        STATE     READ WRITE CKSUM   vendor         model  size
POOL1       ONLINE       0     0     0
  raidz3-0  ONLINE       0     0     0
    sdb     ONLINE       0     0     0  SEAGATE  ST4000NM0023  3.6T
    sdc     ONLINE       0     0     0  TOSHIBA   MG04SCA40EN  3.6T
    sdd     ONLINE       0     0     0  TOSHIBA   MG04SCA40EN  3.6T
    sde     ONLINE       0     0     0  SEAGATE  ST4000NM0023  3.6T
    sdf     ONLINE       0     0     0  SEAGATE  ST4000NM0023  3.6T
    sdh     ONLINE       0     0     0       HP   MB4000FCWDK  3.6T
    sdg     ONLINE       0     0     0       HP   MB4000FCWDK  3.6T
    sdi     ONLINE       0     0     0  SEAGATE  ST4000NM0023  3.6T
    sdj     ONLINE       0     0     0  SEAGATE  ST4000NM0023  3.6T
    sdk     ONLINE       0     0     0  SEAGATE  ST4000NM0023  3.6T
    sdl     ONLINE       0     0     0       HP   MB4000FCWDK  3.6T
    sdm     ONLINE       0     0     0       HP   MB4000FCWDK  3.6T

errors: No known data errors
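In case the "3.6T" in the size column looks odd for 4 TB drives: I assume the column is reported in binary units (TiB), while the drives are sold in decimal terabytes. A quick check of that assumption:

```python
def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


print(f"4 TB = {tb_to_tib(4):.2f} TiB")
```

That gives about 3.64 TiB, which is consistent with the 3.6T shown for each disk.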


Twelve disks give you many choices for how to lay out the pool and trade off redundancy vs. capacity vs. performance. Is the data balanced across disks? Does the machine have enough memory? Is the ARC working well?
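As a rough illustration of that trade-off, here is a sketch comparing a few ways to arrange twelve 4 TB disks. The layouts are examples I picked, and the capacity figures are naive (data disks times disk size); they ignore raidz padding, metadata, and reserved space, so real usable space will be lower.

```python
from dataclasses import dataclass

DISK_TB = 4       # per-disk capacity from the thread
TOTAL_DISKS = 12  # number of disks in the pool


@dataclass(frozen=True)
class Layout:
    name: str
    vdevs: int        # number of top-level vdevs
    width: int        # disks per vdev
    redundancy: int   # parity disks (raidz) or extra copies (mirror) per vdev

    def usable_tb(self) -> int:
        """Naive capacity: data disks times disk size, no overhead."""
        return self.vdevs * (self.width - self.redundancy) * DISK_TB

    def guaranteed_failures(self) -> int:
        """Failures survived in the worst case, when all hit the same vdev."""
        return self.redundancy


LAYOUTS = [
    Layout("1 x 12-wide raidz3", 1, 12, 3),
    Layout("2 x 6-wide raidz2", 2, 6, 2),
    Layout("4 x 3-way mirror", 4, 3, 2),
    Layout("6 x 2-way mirror", 6, 2, 1),
]

for layout in LAYOUTS:
    assert layout.vdevs * layout.width == TOTAL_DISKS
    print(f"{layout.name:<20} {layout.usable_tb():>3} TB, "
          f"survives any {layout.guaranteed_failures()} failure(s)")
```

More vdevs generally means more random I/O performance, which is why the mirror layouts trade so much capacity away.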


On two of my earlier pools, I added a 60 GB SSD as a cache vdev after the pools had data. I did not notice any improvement.


One of my earlier pools, a single mirror of two 3 TB HDD's, was nearly full when I added a second mirror of two 3 TB HDD's. I did not notice any improvement.


I rebuilt the storage pool with two mirrors of two 3 TB HDD's each and a special vdev mirror of two 180 GB SSD's. I also set special_small_blocks=16K. I then restored the data via replication. The data is now balanced across disks, latency has dropped, throughput has increased, and overall performance is noticeably better:

2026-05-22 15:12:45 toor@f5 ~
# freebsd-version
13.5-RELEASE-p12

2026-05-22 15:19:47 toor@f5 ~
# zpool iostat -v p5
                    capacity     operations     bandwidth
pool              alloc   free   read  write   read  write
----------------  -----  -----  -----  -----  -----  -----
p5                3.76T  1.82T      6      1  3.68M  32.2K
  mirror-0        1.87T   871G      2      0  1.82M  4.48K
    gpt/hdd0.eli      -      -      1      0   931K  2.24K
    gpt/hdd1.eli      -      -      1      0   931K  2.24K
  mirror-1        1.86T   876G      2      0  1.81M  4.35K
    gpt/hdd2.eli      -      -      1      0   928K  2.18K
    gpt/hdd3.eli      -      -      1      0   928K  2.18K
special               -      -      -      -      -      -
  mirror-2        31.1G   118G      1      1  51.2K  23.3K
    gpt/ssd0.eli      -      -      0      0  25.6K  11.7K
    gpt/ssd1.eli      -      -      0      0  25.6K  11.7K
----------------  -----  -----  -----  -----  -----  -----

2026-05-22 15:32:42 toor@f5 ~
# top -d 1 | head -n 7
last pid: 57622; load averages: 0.24, 0.21, 0.17 up 24+22:47:05 15:32:45
27 processes:  1 running, 26 sleeping
CPU:  0.0% user,  0.0% nice,  0.6% system,  0.0% interrupt, 99.4% idle
Mem: 4848K Active, 330M Inact, 856K Laundry, 14G Wired, 920M Buf, 694M Free
ARC: 12G Total, 10G MFU, 485M MRU, 3328K Anon, 200M Header, 899M Other
     9921M Compressed, 33G Uncompressed, 3.36:1 Ratio
Swap: 764M Total, 764M Free

2026-05-22 15:33:12 toor@f5 ~
# arc_summary | grep -A 5 "ARC total accesses"
ARC total accesses (hits + misses):                               512.7M
        Cache hit ratio:                               99.8 %     511.8M
        Cache miss ratio:                               0.2 %     886.5k
        Actual hit ratio (MFU + MRU hits):             99.3 %     509.2M
        Data demand efficiency:                        99.5 %       4.8M
        Data prefetch efficiency:                      19.2 %      96.9k
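The hit and miss percentages can be cross-checked from the counts in the same output. A small sketch, assuming the k/M/G suffixes are decimal multipliers (my assumption about how arc_summary abbreviates counts):

```python
SUFFIXES = {"k": 1e3, "M": 1e6, "G": 1e9}  # assumed decimal multipliers


def parse_count(text: str) -> float:
    """Turn an abbreviated count such as '886.5k' into a number."""
    if text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


total = parse_count("512.7M")   # ARC total accesses
hits = parse_count("511.8M")    # cache hits
misses = parse_count("886.5k")  # cache misses

print(f"hit ratio:  {100 * hits / total:.1f} %")
print(f"miss ratio: {100 * misses / total:.1f} %")
```

The results round to 99.8 % and 0.2 %, matching what arc_summary printed.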


In hindsight:

1. I gathered file system statistics prior to rebuilding the pool and setting special_small_blocks=16K, but it now appears I could have used a larger value.

2. If I get worried about HDD's failing, I can add disks to the pool as spares and/or add disks to the data mirrors. The latter should increase read performance even more.

3. My ~10 year old HDD's can already saturate Gigabit with sequential I/O. RAID 10 with SSD acceleration is even more overkill. I want 10 GbE.
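Regarding item 1, here is a minimal sketch of the kind of statistics I mean: it totals the bytes held in files at or below several candidate special_small_blocks values. It is only an approximation, since it looks at logical file sizes and ignores compression, metadata, and the dataset's recordsize. The function name and the thresholds are illustrative.

```python
import os
import stat


def small_file_totals(root, thresholds):
    """Return {threshold: (file_count, total_bytes)} for regular files of size <= threshold."""
    totals = {t: [0, 0] for t in thresholds}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices, etc.
            for t in thresholds:
                if st.st_size <= t:
                    totals[t][0] += 1
                    totals[t][1] += st.st_size
    return {t: tuple(v) for t, v in totals.items()}


def report(root, thresholds=(16 * 1024, 32 * 1024, 64 * 1024, 128 * 1024)):
    """Print one line per candidate threshold for the tree under root."""
    for t, (count, size) in small_file_totals(root, thresholds).items():
        print(f"<= {t // 1024:>3}K: {count:>8} files, {size / 2**30:.2f} GiB")
```

Calling report() on a dataset's mountpoint and comparing the byte totals against the free space on the special vdev gives a rough idea of how large a value would still fit.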


David
