On 5/22/26 12:43, nwe wrote:
On 5/22/26 1:32 PM, Van Snyder wrote:
I remarked to someone at a local computer repair shop that I have a four TB
backup drive. He said, "Replace it. Four TB isn't ready yet."
How so? I thought 4TB is showing its age...
+1. I am also curious why 4 TB HDDs are not "ready yet".
I'm running 12x 4TB drives. Used SAS drives. Accumulated power on time
ranges from 40,166 to 73,439 hours.
Smartctl informs me device /dev/sdf is worsening with increased read
errors over time. That one shows 73408 hours powered up, 72360.67 GB
read, 119545.193 GB written, 195 power cycles (13 since July 13 2024).
Defect list increased from 3 to 6872.
I see two other drives have defect lists of 23 and 14, respectively. All
others are at 0. Given that and the age of the pool, I should probably
prioritize replacing at least sdf soon, to avoid losing redundancy
during a resilver.
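A rough sketch of how the defect counts could be compared across drives, assuming each drive's `smartctl -a` report is available as text and contains the "Elements in grown defect list" line that smartctl prints for SCSI/SAS disks. The sample strings, the placeholder names sdX and sdY, and the function names are illustrative assumptions, not output captured from this pool:

```python
import re

# Illustrative excerpts only. sdX and sdY are placeholder names because the
# post does not say which two drives report 23 and 14 defects.
SAMPLE_REPORTS = {
    "sdf": "Elements in grown defect list: 6872\n",
    "sdX": "Elements in grown defect list: 23\n",
    "sdY": "Elements in grown defect list: 14\n",
    "sdb": "Elements in grown defect list: 0\n",
}

DEFECT_RE = re.compile(r"^Elements in grown defect list:\s*(\d+)", re.MULTILINE)


def grown_defects(report):
    """Return the grown defect count from one report, or None if absent."""
    match = DEFECT_RE.search(report)
    return int(match.group(1)) if match else None


def replacement_order(reports):
    """Return (device, defects) pairs with nonzero counts, worst first."""
    counts = {dev: grown_defects(text) for dev, text in reports.items()}
    flagged = [(dev, n) for dev, n in counts.items() if n]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for dev, n in replacement_order(SAMPLE_REPORTS):
        print(f"{dev}: {n} grown defects")
```

Running the same comparison periodically and keeping the results would show whether the counts on the other two drives are stable or growing the way sdf did.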
I am still trying to understand the smartctl(8) "SMART Attributes Data
Structure". The RAW_VALUE seems to be a binary bit field (?) for
several attributes and is useless without manufacturer engineering data.
The VALUE column is a normalized score rather than a true percentage. On
most drives it starts at 100 (some vendors start at 200 or 253) and falls
toward the THRESH column as the disk wears out:
* Raw_Read_Error_Rate, Seek_Error_Rate, and Hardware_ECC_Recovered can
have low VALUE numbers, but the disk seems to keep working.
* Low VALUE numbers for Reallocated_Sector_Ct, Current_Pending_Sector,
and/or Offline_Uncorrectable seem to be reliable indicators of a failing
disk.
* I have not seen a VALUE number other than 100% for End-to-End_Error.
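The rule of thumb in the bullets above could be sketched as follows. This assumes the ATA attribute table layout printed by `smartctl -A`, and it assumes that the raw value of the three sector-related attributes is a plain count, which is common but vendor-dependent. The sample rows are made up for illustration and the function names are hypothetical:

```python
# Attributes treated above as reliable indicators of a failing disk.
CRITICAL = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

# Made-up sample rows in the column layout of `smartctl -A`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   075   063   044    Pre-fail  Always       -       35265528
  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       2480
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""


def parse_attributes(text):
    """Map attribute name to its normalized value, threshold, and raw field."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows[fields[1]] = {
            "value": int(fields[3]),
            "thresh": int(fields[5]),
            "raw": fields[9],
        }
    return rows


def find_warnings(rows):
    """Flag attributes at or below THRESH, or critical ones with a raw count."""
    found = []
    for name, attr in rows.items():
        if attr["value"] <= attr["thresh"]:
            found.append(f"{name}: VALUE {attr['value']} at or below THRESH {attr['thresh']}")
        elif name in CRITICAL and attr["raw"].isdigit() and int(attr["raw"]) > 0:
            found.append(f"{name}: raw count {attr['raw']}")
    return found


if __name__ == "__main__":
    for warning in find_warnings(parse_attributes(SAMPLE)):
        print(warning)
```

In this sample the low Raw_Read_Error_Rate value is ignored because it is still above its threshold, while the nonzero reallocated sector count is reported.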
nwe@srv01:~$ zpool status -c vendor,model,size
pool: POOL1
state: ONLINE
scan: scrub repaired 0B in 04:10:05 with 0 errors on Sun May 10 04:34:06 2026
config:
NAME          STATE   READ  WRITE  CKSUM  vendor   model         size
POOL1         ONLINE     0      0      0
  raidz3-0    ONLINE     0      0      0
    sdb       ONLINE     0      0      0  SEAGATE  ST4000NM0023  3.6T
    sdc       ONLINE     0      0      0  TOSHIBA  MG04SCA40EN   3.6T
    sdd       ONLINE     0      0      0  TOSHIBA  MG04SCA40EN   3.6T
    sde       ONLINE     0      0      0  SEAGATE  ST4000NM0023  3.6T
    sdf       ONLINE     0      0      0  SEAGATE  ST4000NM0023  3.6T
    sdh       ONLINE     0      0      0  HP       MB4000FCWDK   3.6T
    sdg       ONLINE     0      0      0  HP       MB4000FCWDK   3.6T
    sdi       ONLINE     0      0      0  SEAGATE  ST4000NM0023  3.6T
    sdj       ONLINE     0      0      0  SEAGATE  ST4000NM0023  3.6T
    sdk       ONLINE     0      0      0  SEAGATE  ST4000NM0023  3.6T
    sdl       ONLINE     0      0      0  HP       MB4000FCWDK   3.6T
    sdm       ONLINE     0      0      0  HP       MB4000FCWDK   3.6T
errors: No known data errors
Twelve disks give you many choices for how to lay out the pool and
trade off redundancy vs. capacity vs. performance. Is the data balanced
across disks? Does the machine have enough memory? Is the ARC working
well?
On two of my earlier pools, I added a 60 GB SSD as a cache vdev after
the pools already held data. I did not notice any improvement.
On one of my earlier pools, a single mirror of two 3 TB HDDs that was
nearly full, I added another mirror of two 3 TB HDDs. I did not notice
any improvement.
I rebuilt the storage pool with two mirrors of two 3 TB HDDs each and a
special vdev mirror of two 180 GB SSDs. I also set
special_small_blocks=16K. I then restored the data via replication.
The data is now balanced across disks, latency has dropped, throughput
has increased, and overall performance is noticeably better:
2026-05-22 15:12:45 toor@f5 ~
# freebsd-version
13.5-RELEASE-p12
2026-05-22 15:19:47 toor@f5 ~
# zpool iostat -v p5
                    capacity     operations    bandwidth
pool              alloc   free   read  write   read  write
----------------  -----  -----  -----  -----  -----  -----
p5                3.76T  1.82T      6      1  3.68M  32.2K
  mirror-0        1.87T   871G      2      0  1.82M  4.48K
    gpt/hdd0.eli      -      -      1      0   931K  2.24K
    gpt/hdd1.eli      -      -      1      0   931K  2.24K
  mirror-1        1.86T   876G      2      0  1.81M  4.35K
    gpt/hdd2.eli      -      -      1      0   928K  2.18K
    gpt/hdd3.eli      -      -      1      0   928K  2.18K
special               -      -      -      -      -      -
  mirror-2        31.1G   118G      1      1  51.2K  23.3K
    gpt/ssd0.eli      -      -      0      0  25.6K  11.7K
    gpt/ssd1.eli      -      -      0      0  25.6K  11.7K
----------------  -----  -----  -----  -----  -----  -----
2026-05-22 15:32:42 toor@f5 ~
# top -d 1 | head -n 7
last pid: 57622; load averages: 0.24, 0.21, 0.17 up 24+22:47:05 15:32:45
27 processes: 1 running, 26 sleeping
CPU: 0.0% user, 0.0% nice, 0.6% system, 0.0% interrupt, 99.4% idle
Mem: 4848K Active, 330M Inact, 856K Laundry, 14G Wired, 920M Buf, 694M Free
ARC: 12G Total, 10G MFU, 485M MRU, 3328K Anon, 200M Header, 899M Other
9921M Compressed, 33G Uncompressed, 3.36:1 Ratio
Swap: 764M Total, 764M Free
2026-05-22 15:33:12 toor@f5 ~
# arc_summary | grep -A 5 "ARC total accesses"
ARC total accesses (hits + misses): 512.7M
Cache hit ratio: 99.8 % 511.8M
Cache miss ratio: 0.2 % 886.5k
Actual hit ratio (MFU + MRU hits): 99.3 % 509.2M
Data demand efficiency: 99.5 % 4.8M
Data prefetch efficiency: 19.2 % 96.9k
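As a quick cross-check, the hit and miss percentages above follow directly from the counts on the same lines. A minimal sketch, assuming the k/M suffixes are decimal multipliers (the helper name `parse_count` is made up for this example):

```python
def parse_count(text):
    """Convert a count such as '512.7M' or '886.5k' to a float."""
    # Assumption: suffixes are powers of 1000, not 1024.
    suffixes = {"k": 1e3, "M": 1e6, "G": 1e9}
    if text[-1] in suffixes:
        return float(text[:-1]) * suffixes[text[-1]]
    return float(text)


# Figures copied from the arc_summary output above.
total = parse_count("512.7M")
hits = parse_count("511.8M")
misses = parse_count("886.5k")

hit_pct = 100 * hits / total
miss_pct = 100 * misses / total

if __name__ == "__main__":
    print(f"hit ratio:  {hit_pct:.1f} %")
    print(f"miss ratio: {miss_pct:.1f} %")
```

Hits plus misses add up to the total within the rounding of the displayed figures, so the 99.8 % and 0.2 % lines are two views of the same counters.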
In hindsight:
1. I gathered file system statistics prior to rebuilding the pool and
setting special_small_blocks=16K, but it now appears I could have used a
larger value.
2. If I get worried about HDDs failing, I can add disks to the pool as
spares and/or add disks to the data mirrors. The latter should increase
read performance even more.
3. My ~10 year old HDDs can already saturate Gigabit Ethernet with
sequential I/O. RAID 10 with SSD acceleration is even more overkill. I
want 10 GbE.
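On item 1: the iostat output above shows the special mirror with 31.1G allocated and 118G free, roughly 21% full, which is why a larger special_small_blocks looks feasible. A minimal sketch of how candidate thresholds could be compared from a list of file sizes follows. It assumes a file no larger than the threshold is stored as a single block that qualifies for the special vdev, and it ignores compression and the metadata that also lands there. The sample sizes are hypothetical, not the statistics gathered for p5:

```python
KIB = 1024

# Hypothetical file sizes in bytes, for illustration only.
SAMPLE_SIZES = [2 * KIB, 8 * KIB, 12 * KIB, 24 * KIB, 48 * KIB, 200 * KIB]


def special_bytes(file_sizes, threshold):
    """Sum the sizes of files small enough to land on the special vdev."""
    return sum(size for size in file_sizes if size <= threshold)


def compare_thresholds(file_sizes, thresholds):
    """Map each candidate threshold to the bytes it would send to special."""
    return {t: special_bytes(file_sizes, t) for t in thresholds}


if __name__ == "__main__":
    candidates = [16 * KIB, 32 * KIB, 64 * KIB]
    for threshold, total in compare_thresholds(SAMPLE_SIZES, candidates).items():
        print(f"special_small_blocks={threshold // KIB}K -> {total // KIB} KiB of file data")
```

Fed with real file sizes (for example collected by walking the datasets), the totals could be compared against the free space on the special mirror before choosing a value.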
David