On Mon, Mar 3, 2014 at 1:17 AM, Andriy Gapon <[email protected]> wrote:
> > I noticed that on some of our systems we were getting a clearly abnormal > number > of l2arc checksum errors accounted in l2_cksum_bad. The hardware appeared > to be > in good health. Using DTrace I noticed that the data seemed to be > overwritten > with other data. After more DTrace analysis I observed that sometimes > l2arc_write_buffers() would advance l2ad_hand by more than target_sz. > This meant that l2arc_write_buffers() would write beyond a region cleared > by > l2arc_evict() and thus overwrite data belonging to non-evicted buffers. > Havoc > ensues. > > The cache devices in question are all SSDs with logical sector size of 4KB. > I am not sure about other ZFS platforms, but on FreeBSD this fact is > detected > and ashift of 12 is used for the cache vdevs. > > Looking at l2arc_write_buffers() code you can see that it properly > accounts for > a vdev ashift when actually writing buffers and advancing l2ad_hand: > /* > * Keep the clock hand suitably device-aligned. > */ > buf_p_sz = vdev_psize_to_asize(dev->l2ad_vdev, > buf_sz); > write_psize += buf_p_sz; > dev->l2ad_hand += buf_p_sz; > > But the same is not done when selecting buffers to be written and checking > that > target_sz is not exceeded. > So, if ARC contains a lot of buffers smaller than 4K that means that > on-disk > size of the L2ARC buffers could be quite larger than their logical size. > > I propose the following patch which has been tested and seems to fix the > problem > without introducing any side effects: > > https://github.com/avg-I/freebsd/compare/review;l2arc-write-target-size.diff > https://github.com/avg-I/freebsd/compare/review;l2arc-write-target-size Your analysis and fix look correct to me. I think you have misnamed "buf_p_sz". It is the asize (Allocation Size), not the psize. Thus a more appropriate name would be "buf_asize" or "buf_a_sz". Likewise, you should create a new variable "write_asize" to use instead of "write_psize" here. (looks like you've copied the existing incorrect name from line 5133; that should be renamed too) > > > I think that the following FreeBSD changes might also be of interest to > OpenZFS > community: > https://github.com/freebsd/freebsd/commit/bd47afb2893#diff-23 > https://github.com/freebsd/freebsd/commit/1c55b38ae Those also look reasonable to me. --matt > > -- > Andriy Gapon > > _______________________________________________ > developer mailing list > [email protected] > http://lists.open-zfs.org/mailman/listinfo/developer > >
_______________________________________________ developer mailing list [email protected] http://lists.open-zfs.org/mailman/listinfo/developer
