On Wed, Oct 15, 2014 at 5:11 PM, Steven Hartland <[email protected]>
wrote:

> ----- Original Message ----- From: "Matthew Ahrens" <[email protected]>
> snip..
>
>> So my questions are:
>>> 1. Is this the actual intention of the code?
>>>
>> Yes.  If there is an entry in the dp_bptree_obj (and therefore
>> SPA_FEATURE_ASYNC_DESTROY is active), but the tree is effectively empty
>> (there's nothing to traverse in it), then we could incorrectly set
>> scn_async_stalled, leading to this behavior.
>>
>
> In this case there are entries in it, but they are all for items which
> result in an early return.
>
> Do you think for the time being the patch I listed is good enough?


Yes.  I'll also work on making the spa_sync() code less fragile.

--matt


>
>
>  spa_sync() will see that dsl_scan_active() is FALSE, and thus conclude
>> that
>> no changes will be synced this txg, so we don't spa_sync_deferred_frees().
>> However, dsl_scan_sync() will see that scn_async_stalled is set, and
>> therefore try again.  Though we don't actually try to process the bptree
>> again, because SPA_FEATURE_ASYNC_DESTROY is no longer active, we do
>> bpobj_iterate() to process potential background snapshot destroys.  This
>> always dirties the bpobj's bonus buffer, even if nothing is actually
>> changed.  The dirty buffer then needs to be written out.
>>
>> I was able to reproduce the bug on illumos using your send stream.
>> However, touching a file in the pool caused spa_sync_deferred_frees() to
>> be called, thus resetting the free space back to where it should be (but
>> the free space continued decreasing again after that).
>>
>
> Ah, when you asked about that I was only looking at the incremental; as
> you say, it continues to decrease, so yes, this does reset.
>
> If spa_sync_deferred_frees() is responsible for cleaning up that free
> space, I'm curious why, given that it's called early in the import process,
> the import still stalls later on for that test pool image?
>
>  In addition to the patch you have above, I think we need to do some work
>> to
>> make the logic around deferred frees more robust, to prevent similar bugs
>> from creeping in in the future.
>>
>>> 2. Is there a way to clean up the leaked deferred frees on an existing
>>> pool?
>>>
>> If it hasn't totally run out of space, making any changes in the pool will
>> clean up the deferred frees (e.g. create a new file), but the free space
>> will start decreasing again.  Rebooting (or exporting and importing the
>> pool) will clear scn_async_stalled, so the problem should go away entirely.
>>
>
> Confirmed.
>
>>> 3. Is there a way to import a pool "writable" which has so many frees it's
>>> now full and failing to complete the import with ENOSPC?
>>>
>> Not that I know of, unfortunately.
>>
>
> We really need to have some hard-reserved pool space so we can always
> recover from things like this.
>
>>> 4. Should these deferred frees have ever multiplied like this, or should
>>> we be preventing the multiple storage of deferred frees?
>>>
>> I'm not sure what you mean.  There can normally be more than one block
>> pointer in the deferred free list.  Are you suggesting that the same
>> blkptr_t has been added to the deferred free list multiple times?
>>
>
> Well, since no new async destroys are being done yet the deferred frees
> keep increasing in size, don't they all contain the same information, in
> this case nothing?
>
>    Regards
>    Steve
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
