I'm seeing a condition on FreeBSD 9.1 (built October 24th) where I/O seems to
hang on any local zpools after several hours of hosting a large-ish Postgres
database. The database occupies about 14TB of a 38TB zpool with a single SSD
ZIL. The OS is on a ZFS boot disk. The system also has 24GB of physical memory.
smartmontools reports no errors on any disk attached to the system, and IPMI
reports that all temperatures, CPU voltages, and fan speeds are normal.

The database has been gradually increasing in size since it was first deployed
on FreeBSD 9.1 this fall. There were no problems until last night, when the
database became unresponsive. Attempts to interact with the shell would block
(specifically, any interaction with the disk), and no error messages were
logged to the console. I restarted the system at that time, and brought the
database back up. Everything seemed normal until this morning, when the
database became unresponsive again. Fortunately, I was able to grab some
system statistics before the shell and console went AWOL.

The only finding that I thought was suspicious was the kmem_map numbers:

vm.kmem_map_free: 655360
vm.kmem_map_size: 17141383168

It's something like 0.004% free. I haven't been able to find much documentation
on what to expect here, but I don't see anything like that for other databases
that I've monitored. Is it possible for kmem_map to become exhausted without
generating a kernel panic? Could it be indicative of severe memory
fragmentation?
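
For anyone who wants to check the arithmetic, here is a rough Python sketch of
how the free percentage falls out of the two sysctl values above. It assumes
that vm.kmem_map_size is the space currently in use, so that the total is
size + free; if it is actually the total map size, the result is nearly the
same. The helper names (parse_sysctl, kmem_free_percent) are made up for
illustration.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines (sysctl output style) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            values[name.strip()] = int(value)
    return values


def kmem_free_percent(values):
    """Return free kmem_map space as a percentage of (in use + free).

    Assumption: vm.kmem_map_size is the space in use, not the map total.
    """
    free = values["vm.kmem_map_free"]
    used = values["vm.kmem_map_size"]
    return 100.0 * free / (used + free)


# The two values captured before the machine stopped responding.
SAMPLE = """\
vm.kmem_map_free: 655360
vm.kmem_map_size: 17141383168
"""

if __name__ == "__main__":
    pct = kmem_free_percent(parse_sysctl(SAMPLE))
    print(f"kmem_map free: {pct:.4f}%")
```

Feeding it periodic sysctl output would show whether the free space drains
gradually over several hours or drops suddenly just before the hang.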
- .Dustin