While testing some new hardware on a recent RELENG_14 image (from Nov
10th), I noticed some of my ssh sessions would get killed off with the
errors below (twice in 24 hours):
pid 1697 (sshd), jid 0, uid 1001, was killed: failed to reclaim memory
pid 6274 (sshd), jid 0, uid 1001, was killed: failed to reclaim memory
Nothing fancy benchmark-wise; I am just testing a whole mess of HDDs
off a backplane by generating some synthetic traffic on a big pool of
disks. 65G of RAM. ARC is not limited and seems to try to take the
maximum possible. Any ideas what might be going on? Output from top:
CPU: 1.1% user, 0.0% nice, 20.1% system, 0.0% interrupt, 78.7% idle
Mem: 124K Active, 16M Inact, 6156K Laundry, 59G Wired, 3061M Free
ARC: 53G Total, 1418M MFU, 50G MRU, 374M Anon, 389M Header, 396M Other
50G Compressed, 211G Uncompressed, 4.22:1 Ratio
Swap: 4096M Total, 22M Used, 4074M Free
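As a quick sanity check on the figures above, here is a minimal sh/awk sketch that converts top(1)-style sizes to MiB and works out how much of the wired memory the ARC accounts for. The helper names to_mib and pct are made up for illustration (they are not part of any FreeBSD tool), and the values are simply copied by hand from the output above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Convert a top(1)-style size (e.g. 124K, 3061M, 59G) to whole MiB.
# Anything without a K, G or T suffix is assumed to already be in MiB.
to_mib() {
    echo "$1" | awk '{
        n = $0 + 0                       # leading number, suffix ignored
        u = substr($0, length($0), 1)    # last character is the unit
        if (u == "K")      n = n / 1024
        else if (u == "G") n = n * 1024
        else if (u == "T") n = n * 1024 * 1024
        printf "%d\n", n                 # truncate to an integer
    }'
}

# Integer percentage of part ($1) relative to whole ($2), both in MiB.
pct() {
    echo $(( $1 * 100 / $2 ))
}

wired=$(to_mib 59G)     # "59G Wired" from the Mem: line
arc=$(to_mib 53G)       # "53G Total" from the ARC: line
free=$(to_mib 3061M)    # "3061M Free" from the Mem: line

echo "ARC is $(pct "$arc" "$wired")% of wired memory, ${free} MiB free"
```

With the numbers above this puts the ARC at roughly 89% of wired memory, with about 3G still reported as free.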
The script is:
#!/bin/sh
while true
do
    bonnie -s 190000 -d /hddpool/test/
    md5 /hddpool/junk*
    bonnie++ -u root -d /hddpool/test
    date
    sleep 10
done
The pool looks like:
# zpool status
  pool: hddpool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:39 with 0 errors on Mon Nov 13 08:33:53 2023
config:

        NAME          STATE     READ WRITE CKSUM
        hddpool       ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            da0p1     ONLINE       0     0     0
            da7p1     ONLINE       0     0     0
            da12p1    ONLINE       0     0     0
            da13p1    ONLINE       0     0     0
          raidz1-1    ONLINE       0     0     0
            da1p1     ONLINE       0     0     0
            da6p1     ONLINE       0     0     0
            da9p1     ONLINE       0     0     0
            da11p1    ONLINE       0     0     0
          raidz1-2    ONLINE       0     0     0
            da4p1     ONLINE       0     0     0
            da5p1     ONLINE       0     0     0
            da8p1     ONLINE       0     0     0
            da10p1    ONLINE       0     0     0

errors: No known data errors
# pstat -T
94/2090092 files
22M/4096M swap space
# cat /etc/fstab
# Device        Mountpoint      FStype  Options Dump    Pass#
/dev/ada0p2     none            swap    sw      0       0
/dev/ada1p2     none            swap    sw      0       0
#