Hello,

Not sure if it's worth troubleshooting this too much before upgrading, but we recently had an 8.1R/amd64 box hang in a way that suggested everything was waiting on disk access. The box is remote, and we had to resort to a power-cycle to bring it back (we have a serial console, but the login hung after accepting the root password).

We run hourly/daily/weekly/monthly snapshots on about half a dozen filesystems using RSE's snapshot script (see http://people.freebsd.org/~rse/snapshot/ - we use only the ZFS snapshotting, not the amd portion). We log some basic stats on all our boxes every 5 minutes, and I saw a pile of cron jobs stuck in disk I/O wait. I suspect these were the snapshots. Shortly after that, all disk I/O appears to have hung.

Some additional info about what the main tasks are on this box:

-qmail deliveries (lots)
-postgres (light use)
-nfs export of qmail log dirs to another box that does log analysis

All services are spread among a handful of jails. Each jail has its own ZFS filesystem.

Does this sound familiar to anyone running ZFS with snapshots? Anything I should log to get more data if this happens again? I have output from arc_summary.pl running every 5 minutes as part of our general status logging.

Any pointers to known issues in ZFS (both 8.1 and 8.2) would be helpful.

Also, is there anywhere to look for the general state of ZFS besides this page?

http://wiki.freebsd.org/ZFS

Thanks,

Charles
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
