Hello developers,

A few days ago, I posted an issue with a corrupted ZFS volume after we performed an upgrade to the freebsd-fs list:
http://lists.freebsd.org/pipermail/freebsd-fs/2014-June/019537.html
After a few days of running on v5000, one of the two servers we upgraded panicked, rebooted, and then panicked again while mounting one of the zpool's volumes.
Steve Hartland provided a patch to sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c, which, once compiled and installed, reported this on mount:
Solaris: WARNING: dva_get_dsize_sync(): bad DVA 131241:2147483648
Solaris: WARNING: dva_get_dsize_sync(): bad DVA 131241:2147483648
Solaris: WARNING: dva_get_dsize_sync(): bad DVA 131241:2147483648
Interestingly, his patch allowed me to access the data (and I've been able to
recover some data). If I try to remove that filesystem, I get another kernel
panic...
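As an aside, assuming the two colon-separated numbers in that warning are the DVA's vdev index and offset (an assumption based on the usual vdev:offset notation, not on anything in the patch itself), a quick sketch shows how implausible those values are for a pool with a single raidz1 vdev:

```python
# Hypothetical sketch: parse the "bad DVA" warning above, assuming the
# two colon-separated fields are the DVA's vdev index and offset.
# This is an illustration only, not code from Steve Hartland's patch.
import re


def parse_bad_dva(line):
    """Extract (vdev, offset) from a dva_get_dsize_sync() warning line."""
    m = re.search(r"bad DVA (\d+):(\d+)", line)
    if m is None:
        raise ValueError("not a bad-DVA warning: %r" % line)
    return int(m.group(1)), int(m.group(2))


line = "Solaris: WARNING: dva_get_dsize_sync(): bad DVA 131241:2147483648"
vdev, offset = parse_bad_dva(line)

# A pool with one top-level raidz1 vdev has vdev indexes at or near 0,
# so an index of 131241 would point at a corrupted on-disk block pointer
# rather than a real device.  The offset is exactly 2^31, which also
# looks more like stray bits than a meaningful byte offset.
print(vdev, offset, offset == 2**31)
```

If that reading is right, the warning is just the patch refusing to follow a garbage block pointer, which would fit the mount-time panics described above.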
A few days later, the second server, which we had upgraded that very same
morning, hit the exact same issue: it unexpectedly rebooted and would
panic whenever one of its 8 ZFS volumes was mounted.
I think the email thread on freebsd-fs contains all of the relevant information, but to give a clear picture of the two systems:
* Both running FreeBSD 9.1-RELEASE-p13
* Both were upgraded to 10.0-RELEASE on the same day using freebsd-update
o freebsd-update IDS reported that all the checksums matched...
* Both servers have a single zpool in a raidz1 configuration
* Both servers have ECC memory, and zpool scrubs are performed once a
month (no errors reported)
* One server boots off of zfs, the other boots off of a standard UFS2 disk
* One server has a SSD L2ARC, the other does not
* One server is a Dell, the other is an iXsystems/Supermicro board
o The Dell uses the mfi driver (H710 PERC controller), the other
uses the mps driver (LSI controller)
* Both servers had similar sysctl settings, and had the FreeBSD aio
kernel module loaded (we run Samba on these servers)
These were both multi-terabyte storage nodes. One node had regular
snapshots, so we were able to re-create a new ZFS volume based off of a
snapshot. The other holds only "hot" working data that does not last
long, so there were no snapshots.
I'll be rebuilding these servers, and at this point I'm just curious about the bad DVA message, and whether it indicates another issue I should be aware of.
Thanks!
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
