Re: ZFS l2arc broken in 10.3
Pete French wrote:
> OK, that's a bit worrying if true - but I can confirm that l2arc works
> fine under 10.3 on amd64, so what you say about cross-compiling might
> be true. I am taking an interest in this as I have just deployed a lot
> of machines which are going to be relying on l2arc working to get
> reasonable performance.

Sure, on my amd64 it also works fine. AFAIK such things are tolerated when compiling in 64-bit.

But in the meantime I was pointed to another detail: my source is from the STABLE branch; in 10.3-RELEASE the code is different. Obviously there were recent changes, and that explains why the problem had not been detected yet.

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: ZFS l2arc broken in 10.3
On 12/10/2016 23:18, Peter wrote:
> Details:
> After upgrading 2 machines from 9.3 to 10.3-STABLE, on one of them the
> l2arc stays empty (capacity alloc = 0), although it is online and gets
> accessed. It did work well on 9.3.
>
> I did the following tests:
>  * Create a zpool on a stick, with two volumes: one filesystem and one
>    cache. The cache stays with alloc=0.
>    Export it and move it into the other machine. The cache immediately
>    fills.
>    Move it back, the cache stays with alloc=0.
>    -> this rules out all zpool/zfs get/set options, as they should
>       travel with the pool.
>  * Boot the GENERIC kernel. l2arc stays with alloc=0.
>    -> this rules out all my nonstandard kernel options.
>  * Boot in single user mode. l2arc stays with alloc=0.
>    -> this rules out all /etc/* config files.
>  * Delete the zpool.cache and reimport pools. l2arc stays with alloc=0.
>  * Copy the /boot/loader.conf settings to the other machine. The l2arc
>    still works there.
>
> I could not think of any remaining place where this could come from,
> except the kernel code itself.
> From there, I found these counters nicely incrementing each second:
> kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 50758
> kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 27121
> kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 40589375488
> But also this counter incrementing:
> kstat.zfs.misc.arcstats.l2_write_full: 14604
>
> Then with some printf in the code I saw these values provided:
>     buf_sz = hdr->b_size;
>     align = (size_t)1 << dev->l2ad_vdev->vdev_ashift;
>     buf_a_sz = P2ROUNDUP(buf_sz, align);
>     if ((write_asize + buf_a_sz) > target_sz) {
>         full = B_TRUE;
>         mutex_exit(hash_lock);
>         ARCSTAT_BUMP(arcstat_l2_write_full);
>         break;
>     }
>
> buf_sz      = 1536
> align       = 512
> buf_a_sz    = 18446744069414585856
> write_asize = 0
> target_sz   = 16777216
>
> where buf_a_sz is obviously off by (2^64 - 2^32).
>
> Maybe this is an effect of cross-compiling i386 on amd64.

Yes, the problem is specific to 32-bit platforms where size_t is 32-bit.

> But anyway, as long as
> i386 is still supported, it should not happen.

Certainly.

> Now, my real concern is: if this really obvious ... made it undetected until
> 10.3, how many other missing typecasts are still in the code??

No need to be dramatic here. That particular piece of code is very new. I committed it to head in April (r297848) and MFC-ed it even later. Apparently no one else who uses 32-bit systems and has L2ARC configured had a chance to run into the bug.

Thank you very much for discovering and analyzing the bug and providing a fix for it!

-- 
Andriy Gapon
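The arithmetic behind the reported value can be reproduced outside the kernel. The following is a minimal sketch, not the kernel code itself. It assumes P2ROUNDUP is defined as in illumos <sys/sysmacros.h>, that buf_sz is a 64-bit value, and that align has the width of size_t. The type size32_t and the function names are hypothetical stand-ins so that the i386 behaviour also shows up on a 64-bit host.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed definition, as in illumos <sys/sysmacros.h>. */
#define	P2ROUNDUP(x, align)	(-(-(x) & -(align)))

/* Hypothetical stand-in for size_t on i386 (32 bits wide). */
typedef uint32_t size32_t;

/*
 * Mirrors the failing statement: a 64-bit size rounded up with a
 * 32-bit alignment.  -(align) is computed in 32 bits and is then
 * zero-extended, so the upper 32 bits of the mask are lost.
 */
static uint64_t
roundup_mixed(uint64_t buf_sz, size32_t align)
{
	return (P2ROUNDUP(buf_sz, align));
}

/* Same macro with both operands 64 bits wide, as on amd64. */
static uint64_t
roundup_same_width(uint64_t buf_sz, uint64_t align)
{
	return (P2ROUNDUP(buf_sz, align));
}

#ifdef DEMO_MAIN
int
main(void)
{
	/* Same inputs as in the report: buf_sz = 1536, align = 512. */
	printf("mixed widths: %" PRIu64 "\n", roundup_mixed(1536, 512));
	printf("same width:   %" PRIu64 "\n", roundup_same_width(1536, 512));
	assert(roundup_same_width(1536, 512) == 1536);
	return (0);
}
#endif
```

Compiled with -DDEMO_MAIN, the mixed-width call should yield the reported 18446744069414585856, which is 1536 + 2^64 - 2^32, while the same-width call yields 1536. That would also explain why the same source behaves correctly on amd64.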
Re: ZFS l2arc broken in 10.3
OK, that's a bit worrying if true - but I can confirm that l2arc works fine under 10.3 on amd64, so what you say about cross-compiling might be true. I am taking an interest in this as I have just deployed a lot of machines which are going to be relying on l2arc working to get reasonable performance.

-pete.

On 10/12/16 21:18, Peter wrote:
> Details:
> After upgrading 2 machines from 9.3 to 10.3-STABLE, on one of them the
> l2arc stays empty (capacity alloc = 0), although it is online and gets
> accessed. It did work well on 9.3.
>
> I did the following tests:
>  * Create a zpool on a stick, with two volumes: one filesystem and one
>    cache. The cache stays with alloc=0.
>    Export it and move it into the other machine. The cache immediately
>    fills.
>    Move it back, the cache stays with alloc=0.
>    -> this rules out all zpool/zfs get/set options, as they should
>       travel with the pool.
>  * Boot the GENERIC kernel. l2arc stays with alloc=0.
>    -> this rules out all my nonstandard kernel options.
>  * Boot in single user mode. l2arc stays with alloc=0.
>    -> this rules out all /etc/* config files.
>  * Delete the zpool.cache and reimport pools. l2arc stays with alloc=0.
>  * Copy the /boot/loader.conf settings to the other machine. The l2arc
>    still works there.
>
> I could not think of any remaining place where this could come from,
> except the kernel code itself.
> From there, I found these counters nicely incrementing each second:
> kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 50758
> kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 27121
> kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 40589375488
> But also this counter incrementing:
> kstat.zfs.misc.arcstats.l2_write_full: 14604
>
> Then with some printf in the code I saw these values provided:
>     buf_sz = hdr->b_size;
>     align = (size_t)1 << dev->l2ad_vdev->vdev_ashift;
>     buf_a_sz = P2ROUNDUP(buf_sz, align);
>     if ((write_asize + buf_a_sz) > target_sz) {
>         full = B_TRUE;
>         mutex_exit(hash_lock);
>         ARCSTAT_BUMP(arcstat_l2_write_full);
>         break;
>     }
>
> buf_sz      = 1536
> align       = 512
> buf_a_sz    = 18446744069414585856
> write_asize = 0
> target_sz   = 16777216
>
> where buf_a_sz is obviously off by (2^64 - 2^32).
>
> Maybe this is an effect of cross-compiling i386 on amd64. But anyway, as
> long as i386 is still supported, it should not happen.
>
>
> Now, my real concern is: if this really obvious ... made it undetected
> until 10.3, how many other missing typecasts are still in the code??
[fixed] ZFS l2arc broken in 10.3
sendbug seems not to work anymore; I end up on websites with marketing-babble and finally get asked to provide some login and passwd. :(
But the former mail looks like it came back to me, so it seems I'm still allowed to post here...

*** sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c.orig	Wed Oct 12 21:07:25 2016
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Wed Oct 12 21:46:16 2016
***************
*** 6508,6514 ****
  		 */
  		buf_sz = hdr->b_size;
  		align = (size_t)1 << dev->l2ad_vdev->vdev_ashift;
! 		buf_a_sz = P2ROUNDUP(buf_sz, align);
  
  		if ((write_asize + buf_a_sz) > target_sz) {
  			full = B_TRUE;
--- 6508,6514 ----
  		 */
  		buf_sz = hdr->b_size;
  		align = (size_t)1 << dev->l2ad_vdev->vdev_ashift;
! 		buf_a_sz = P2ROUNDUP_TYPED(buf_sz, align, uint64_t);
  
  		if ((write_asize + buf_a_sz) > target_sz) {
  			full = B_TRUE;
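The effect of the one-line change can be checked in user space. The sketch below is an illustration, not the kernel code. It assumes P2ROUNDUP_TYPED is defined as in illumos <sys/sysmacros.h>. The type size32_t and both function names are hypothetical, with size32_t again standing in for a 32-bit size_t.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed definition, as in illumos <sys/sysmacros.h>. */
#define	P2ROUNDUP_TYPED(x, align, type)	\
	(-(-(type)(x) & -(type)(align)))

/* Hypothetical stand-in for size_t on i386 (32 bits wide). */
typedef uint32_t size32_t;

/*
 * Mirrors the patched statement: both operands are cast to uint64_t
 * before negation, so the alignment mask keeps its upper 32 bits.
 */
static uint64_t
roundup_typed(uint64_t buf_sz, size32_t align)
{
	return (P2ROUNDUP_TYPED(buf_sz, align, uint64_t));
}

/* Mirrors the "device full" comparison from the quoted code. */
static int
l2_write_would_be_full(uint64_t write_asize, uint64_t buf_a_sz,
    uint64_t target_sz)
{
	return ((write_asize + buf_a_sz) > target_sz);
}

#ifdef DEMO_MAIN
int
main(void)
{
	/* Same inputs as in the report: buf_sz = 1536, align = 512. */
	uint64_t buf_a_sz = roundup_typed(1536, 512);

	printf("buf_a_sz = %" PRIu64 "\n", buf_a_sz);
	assert(!l2_write_would_be_full(0, buf_a_sz, 16777216));
	return (0);
}
#endif
```

With the typed macro the rounded size stays at 1536, so the comparison against target_sz = 16777216 no longer trips and the write loop is not cut short by l2_write_full on every pass.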
Re: ZFS l2arc broken in 10.3
Details:
After upgrading 2 machines from 9.3 to 10.3-STABLE, on one of them the
l2arc stays empty (capacity alloc = 0), although it is online and gets
accessed. It did work well on 9.3.

I did the following tests:
 * Create a zpool on a stick, with two volumes: one filesystem and one
   cache. The cache stays with alloc=0.
   Export it and move it into the other machine. The cache immediately
   fills.
   Move it back, the cache stays with alloc=0.
   -> this rules out all zpool/zfs get/set options, as they should
      travel with the pool.
 * Boot the GENERIC kernel. l2arc stays with alloc=0.
   -> this rules out all my nonstandard kernel options.
 * Boot in single user mode. l2arc stays with alloc=0.
   -> this rules out all /etc/* config files.
 * Delete the zpool.cache and reimport pools. l2arc stays with alloc=0.
 * Copy the /boot/loader.conf settings to the other machine. The l2arc
   still works there.

I could not think of any remaining place where this could come from,
except the kernel code itself.
From there, I found these counters nicely incrementing each second:
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 50758
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 27121
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 40589375488
But also this counter incrementing:
kstat.zfs.misc.arcstats.l2_write_full: 14604

Then with some printf in the code I saw these values provided:
    buf_sz = hdr->b_size;
    align = (size_t)1 << dev->l2ad_vdev->vdev_ashift;
    buf_a_sz = P2ROUNDUP(buf_sz, align);
    if ((write_asize + buf_a_sz) > target_sz) {
        full = B_TRUE;
        mutex_exit(hash_lock);
        ARCSTAT_BUMP(arcstat_l2_write_full);
        break;
    }

buf_sz      = 1536
align       = 512
buf_a_sz    = 18446744069414585856
write_asize = 0
target_sz   = 16777216

where buf_a_sz is obviously off by (2^64 - 2^32).

Maybe this is an effect of cross-compiling i386 on amd64. But anyway, as
long as i386 is still supported, it should not happen.

Now, my real concern is: if this really obvious ... made it undetected
until 10.3, how many other missing typecasts are still in the code??
ZFS l2arc broken in 10.3
details to follow