-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.csiden.org/r/132/
-----------------------------------------------------------
Review request for OpenZFS Developer Mailing List, Brian Behlendorf and Steven
Hartland.
Bugs: 5347
https://www.illumos.org/projects/illumos-gate//issues/5347
Repository: illumos-gate
Description
-------
5347 idle pool may run itself out of space
Reviewed by: George Wilson
Reviewed by: Alex Reece
After receiving an incremental send stream, an idle pool will slowly fill up.
If allowed to become completely full, the pool is unusable and can only be
imported readonly. Any write activity on the pool will clean up the extra
space.
Steps to reproduce:
zpool create test c2t1d0
zfs create test/fs
zfs snapshot test/fs@a
zfs snapshot test/fs@b
zfs send test/fs@a | zfs recv test/recvd
zfs send -i @a test/fs@b | zfs recv test/recvd
Observing the system afterwards (ALLOC keeps growing while the pool is idle):
zpool list test 5
NAME   SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
test  7.94G  2.32M  7.94G         -    0%   0%  1.00x  ONLINE  -
test  7.94G  3.09M  7.93G         -    0%   0%  1.00x  ONLINE  -
test  7.94G  3.45M  7.93G         -    0%   0%  1.00x  ONLINE  -
test  7.94G  4.20M  7.93G         -    0%   0%  1.00x  ONLINE  -
test  7.94G  4.56M  7.93G         -    0%   0%  1.00x  ONLINE  -
test  7.94G  5.20M  7.93G         -    0%   0%  1.00x  ONLINE  -
Analysis:
The extra space is consumed by deferred frees (zdb -bb will tell you this).
If there is an entry in the dp_bptree_obj (and therefore
SPA_FEATURE_ASYNC_DESTROY is active), but the tree is effectively empty
(there's nothing to traverse in it), then we could incorrectly set
scn_async_stalled, leading to this behavior.
spa_sync() will see that dsl_scan_active() is FALSE, and thus conclude that no
changes will be synced this txg, so we don't call spa_sync_deferred_frees().
However, dsl_scan_sync() will see that scn_async_stalled is set, and therefore
try again. Though we don't actually try to process the bptree again, because
SPA_FEATURE_ASYNC_DESTROY is no longer active, we do bpobj_iterate() to process
potential background snapshot destroys. This always dirties the bpobj's bonus
buffer, even if nothing is actually changed. The dirty buffer then needs to be
written out.
Touching a file in the pool caused spa_sync_deferred_frees() to be called, thus
resetting the free space back to where it should be (but the free space
started shrinking again after that).
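The interaction described above can be modelled in a few lines of C. This is
a toy sketch, not ZFS code: every toy_* name, the struct layout and the
4096-byte cost are invented for illustration, and it assumes that each rewrite
of the dirtied bonus buffer leaves one more deferred free behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cost of rewriting the dirtied bonus buffer once. */
#define TOY_BONUS_REWRITE_BYTES 4096

/* Toy stand-in for pool state; this is not a real ZFS structure. */
typedef struct toy_pool {
	bool     scn_async_stalled; /* incorrectly left set on an empty bptree */
	bool     user_writes;       /* something else dirtied data this txg */
	uint64_t deferred_bytes;    /* space held by deferred frees */
} toy_pool_t;

/* Stand-in for dsl_scan_active(): reports FALSE in the buggy state. */
static bool
toy_scan_active(const toy_pool_t *p)
{
	(void) p;
	return (false);
}

/* Stand-in for dsl_scan_sync(): retries whenever the stalled flag is set. */
static void
toy_scan_sync(toy_pool_t *p)
{
	if (!p->scn_async_stalled)
		return;
	/* The retry dirties a buffer even though nothing changed (assumed cost). */
	p->deferred_bytes += TOY_BONUS_REWRITE_BYTES;
}

/* Stand-in for spa_sync(): predicts a no-op txg before the scan code runs. */
static void
toy_spa_sync(toy_pool_t *p)
{
	bool predicted_noop = !toy_scan_active(p) && !p->user_writes;

	if (!predicted_noop)
		p->deferred_bytes = 0;	/* models spa_sync_deferred_frees() */
	toy_scan_sync(p);
}

/* Run the toy sync for a number of txgs. */
static void
toy_run(toy_pool_t *p, int txgs)
{
	assert(p != NULL);
	for (int i = 0; i < txgs; i++)
		toy_spa_sync(p);
}
```

In this model, calling toy_run() on an idle pool with scn_async_stalled set
grows deferred_bytes on every txg. A single txg with user_writes set resets
it, after which the growth resumes.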
In addition to fixing the scn_async_stalled issue, we should make the code in
spa_sync() that checks if this is a no-op TXG less fragile.
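As a rough illustration of what "less fragile" could mean, the sketch below
decides whether a txg was a no-op from what was actually dirtied, rather than
from a prediction made beforehand. It is one possible shape under stated
assumptions, not a description of the change in this review: the toy_* names,
the dirtied counter and the 4096-byte cost are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy per-txg state; not a real ZFS structure. */
typedef struct toy_txg {
	uint64_t dirtied;        /* count of things actually dirtied this txg */
	uint64_t deferred_bytes; /* space held by deferred frees */
} toy_txg_t;

/* Every path that dirties state records it here (hypothetical helper). */
static void
toy_note_dirty(toy_txg_t *t)
{
	assert(t != NULL);
	t->dirtied++;
}

/* Stand-in for a background step that dirties a buffer when stalled. */
static void
toy_background_step(toy_txg_t *t, bool stalled)
{
	if (stalled) {
		toy_note_dirty(t);
		t->deferred_bytes += 4096;	/* assumed cost of the rewrite */
	}
}

/*
 * Toy sync: run the background step first, then decide from the observed
 * dirty count whether this txg was a no-op. Returns true for a no-op txg.
 */
static bool
toy_sync(toy_txg_t *t, bool stalled)
{
	bool noop;

	t->dirtied = 0;
	toy_background_step(t, stalled);
	noop = (t->dirtied == 0);
	if (!noop)
		t->deferred_bytes = 0;	/* models pushing the deferred frees */
	return (noop);
}
```

With this ordering the toy model cannot accumulate deferred space silently:
any txg that dirtied something also pushes the deferred frees, and a txg that
dirtied nothing leaves them alone.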
Original author: Matthew Ahrens
Diffs
-----
usr/src/uts/common/fs/zfs/uberblock.c a07dc00ae19a84ee787445176e7c5e38ac13a452
usr/src/uts/common/fs/zfs/sys/uberblock.h b5bb91573145273eeb6c22cf2896690aa119b0f8
usr/src/uts/common/fs/zfs/spa.c 634967c46f1b148912b7c53e2737d36076c8686f
usr/src/uts/common/fs/zfs/dsl_scan.c 2392b7f336952a2cc9dbf4983f8f83c5fc53d9a8
Diff: https://reviews.csiden.org/r/132/diff/
Testing
-------
ztest
zfs test suite
manual testing as described
(internal link: http://jenkins/job/zfs-precommit/1136/)
Thanks,
Matthew Ahrens