> On Apr 7, 2017, at 8:26 PM, John Barfield <[email protected]> wrote:
>
> Greetings,
>
> I want to report that after a clean install of r151020 I found a bug:
> importing an older zpool from r151012 and running zpool upgrade causes an
> SSD cache device's size to be reported incorrectly (only 1 out of 4 devices
> in this instance).
>
> The cache device size is 93 GB, but arcstat reported it as 680 GB.
>
> I confirmed by monitoring zpool iostat -v and saw the same size being
> reported.
>
> We've had a lot of weird I/O lockups that bring all of our NFS mounts to a
> screeching halt (which is how I found the issue; we didn't notice it until a
> month after), and this was the only thing I could find that was out of the
> ordinary on the system.
>
> CPU averaged about 1%, 20% of RAM was free, and no runaway processes were
> waiting on I/O. The problem was completely invisible, at least in my testing
> with several dtrace scripts from the net.
>
> I can only assume that the incorrect size reporting caused the zpool to fill
> the cache drive up beyond its physical capacity during periods of heavy load.
>
> I removed all cache devices and then added them back to the zpool, after
> which all disks reported correctly again. format/diskinfo always reported the
> correct size, so the problem was specific to ZFS.
>
> We're monitoring the NAS closely to see if the issues occur again.
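
For anyone hitting the same thing, the remove-and-re-add workaround described above can be sketched roughly as follows. The pool name "tank" and the c2t*d0 device names are placeholders, not taken from the report, and the helper readd_cache is hypothetical; it only prints the zpool commands so they can be reviewed before being run by hand.

```shell
# Dry-run sketch: prints the commands instead of executing them.
# "tank" and the c2t*d0 device names are placeholders, not from the report.
readd_cache() {
    pool="$1"
    shift
    # Cache (L2ARC) devices can be removed from a live pool.
    for dev in "$@"; do
        echo "zpool remove $pool $dev"
    done
    # Add the same devices back as cache vdevs in one step.
    echo "zpool add $pool cache $*"
    # Check the per-device sizes afterwards.
    echo "zpool iostat -v $pool"
}

readd_cache tank c2t0d0 c2t1d0
```

Once the printed commands look right for the actual pool, they can be run one at a time, comparing the sizes shown by zpool iostat -v against what format/diskinfo reports.
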
The only thing I could find that might address the symptoms you see is this:
https://illumos.org/issues/7504
It didn't make it upstream in time to hit r151020.
You should forward this on to the illumos ZFS developers' list:
[email protected].
Dan
_______________________________________________
OmniOS-discuss mailing list
[email protected]
http://lists.omniti.com/mailman/listinfo/omnios-discuss