----- Original Message ----- From: "Matthew Ahrens" <[email protected]>
snip..
So my questions are:
1. Is this actual intention of the code?


Yes.  If there is an entry in the dp_bptree_obj (and therefore
SPA_FEATURE_ASYNC_DESTROY is active), but the tree is effectively empty
(there's nothing to traverse in it), then we could incorrectly set
scn_async_stalled, leading to this behavior.
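A minimal sketch of that failure mode, using hypothetical simplified names
(this is a toy model, not the actual illumos dsl_scan code): if the stalled
flag is derived from the mere existence of the bptree object, an
effectively-empty tree stalls the scan forever, whereas guarding on whether
any traversable work remains avoids it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified model -- field names are illustrative only. */
struct scan_state {
	bool bptree_exists;     /* dp_bptree_obj != 0 */
	int  traversable;       /* entries that would not early-return */
	bool scn_async_stalled;
};

/* Buggy behavior: flag set purely because the object exists. */
static void scan_buggy(struct scan_state *s)
{
	if (s->bptree_exists)
		s->scn_async_stalled = true;  /* stalls on an empty tree too */
}

/* Guarded behavior: only stall when there was real work left to do. */
static void scan_fixed(struct scan_state *s)
{
	if (s->bptree_exists && s->traversable > 0)
		s->scn_async_stalled = true;
}
```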

In this case there are entries in it, but they are all for items which
result in an early return.

Do you think for the time being the patch I listed is good enough?

spa_sync() will see that dsl_scan_active() is FALSE, and thus conclude that
no changes will be synced this txg, so we don't spa_sync_deferred_frees().
However, dsl_scan_sync() will see that scn_async_stalled is set, and
therefore try again.  Though we don't actually try to process the bptree
again, because SPA_FEATURE_ASYNC_DESTROY is no longer active, we do
bpobj_iterate() to process potential background snapshot destroys.  This
always dirties the bpobj's bonus buffer, even if nothing is actually
changed.  The dirty buffer then needs to be written out.
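The per-txg cycle described above can be modeled as a tiny simulation (purely
illustrative -- the names and numbers are invented, and real space accounting
is far more involved): while scn_async_stalled persists, each sync dirties and
rewrites metadata whose old copy lands on the deferred-free list, and because
dsl_scan_active() reports no work, spa_sync() never flushes that list unless
some other change dirties the txg.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-txg model of the leak (invented numbers, not real accounting). */
struct pool {
	long free_space;
	long deferred;           /* space tied up in deferred frees */
	bool scn_async_stalled;
	bool user_changes;       /* did anything else dirty this txg? */
};

static void sync_txg(struct pool *p)
{
	bool scan_active = false;  /* dsl_scan_active() sees no real work */

	if (scan_active || p->user_changes) {
		/* spa_sync_deferred_frees(): reclaim the deferred space. */
		p->free_space += p->deferred;
		p->deferred = 0;
	}
	if (p->scn_async_stalled) {
		/*
		 * dsl_scan_sync() retries; bpobj_iterate() dirties the
		 * bonus buffer, which must be written out.  The old copy
		 * goes onto the deferred-free list, never to be flushed.
		 */
		p->free_space -= 1;
		p->deferred += 1;
	}
	p->user_changes = false;
}
```

Running this shows the behavior reported below: free space shrinks each txg,
a user write restores it once, and then the shrinking resumes.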

I was able to reproduce the bug on illumos using your send stream.
However, touching a file in the pool caused spa_sync_deferred_frees() to be
called, thus resetting the free space back to where it should be (but the
free space continued decreasing again after that).

Ah, when you asked about that I was only looking at the incremental, and as
you say it continues to decrease, so yes, this does reset.

If spa_sync_deferred_frees() is responsible for cleaning up that free space,
I'm curious why the import of that test pool image stalls later on, given
that it's called early in the import process?

In addition to the patch you have above, I think we need to do some work to
make the logic around deferred frees more robust, to prevent similar bugs
from creeping in in the future.

2. Is there a way to cleanup the leaked deferred frees on an existing pool?


If it hasn't totally run out of space, making any changes in the pool will
clean up the deferred frees (e.g. create a new file), but the free space
will start decreasing again.  Rebooting (or export/import the pool) will
clear scn_async_stalled so the problem should go away entirely.
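For reference, the recovery steps mentioned above as commands (the pool name
tank is just an example; a reboot has the same effect as the export/import
cycle, since scn_async_stalled is in-memory state):

```shell
# Export and re-import the pool to clear the stalled-scan state.
zpool export tank
zpool import tank

# Alternatively, any write reclaims the accumulated deferred frees,
# though the free space will start decreasing again afterwards:
touch /tank/poke
```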

Confirmed.

3. Is there a way to import a pool "writable" which has so many frees it's
now full and failing to complete the import with ENOSPC?


Not that I know of, unfortunately.

We really need to have some hard reserved pool space so we can always
recover from situations like this.

4. Should these deferred frees ever have multiplied like this, or should we
be preventing the duplicate storage of deferred frees?


I'm not sure what you mean.  There can normally be more than one block
pointer in the deferred free list.  Are you suggesting that the same
blkptr_t has been added to the deferred free list multiple times?

Well, since no new async destroys are being done, yet the deferred frees
keep growing in size, don't they all contain the same information -- in
this case nothing?

   Regards
   Steve
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
