Keith Mitchell wrote:
> 
> 
> Evan Layton wrote:
>> Keith Mitchell wrote:
>>>
>>>
>>> Evan Layton wrote:
>>>> Ethan Quach wrote:
>>>>>
>>>>>
>>>>> Evan Layton wrote:
>>>>>> Ethan Quach wrote:
>>>>>>>
>>>>>>>
>>>>>>> Evan Layton wrote:
>>>>>>>> jeanm wrote:
>>>>>>>>> Looks OK to me. Maybe some comments explaining the B_FALSE and 
>>>>>>>>> what it does?
>>>>>>>>> I looked in the zfs code but that's kind of awkward.
>>>>>>>>
>>>>>>>> Probably a good idea to add that since it's a bit convoluted...
>>>>>>>>
>>>>>>>> I've added:
>>>>>>>> /*
>>>>>>>>  * The boolean set to B_FALSE and passed to zfs_destroy_snaps()
>>>>>>>>  * tells zfs to process and destroy the snapshots now. Otherwise
>>>>>>>>  * ZFS will wait to destroy these snapshots until a zfs release
>>>>>>>>  * is called and the references for all the snapshots have been
>>>>>>>>  * released.
>>>>>>>>  */
>>>>>>>
>>>>>>> nit - for the second sentence, "Otherwise the call will potentially
>>>>>>> return while the snapshot isn't actually destroyed yet, and ZFS is
>>>>>>> waiting until all the references to the snapshot have been released
>>>>>>> before actually destroying the snapshot."
>>>>>>>
>>>>>>
>>>>>> Much better! I'll make that change.
>>>>>
>>>>> Hmm, I may have made an incorrect assumption about the behavior,
>>>>> though. Will the call actually return while ZFS waits, or wait
>>>>> before returning?
>>>>
>>>> If I understand the comments and the code correctly, ZFS returns
>>>> without actually destroying the snapshots. It waits to do that until
>>>> all the references are released and zfs_release has been called.
>>> Devil's advocate: What's the problem with using B_TRUE if this is how 
>>> it works? If someone has truly set a hold on a BE snapshot, it seems 
>>> more appropriate to let ZFS clean it up later when whoever put the 
>>> hold on it is done with it. As long as it gets cleaned up eventually 
>>> & automatically... And if we don't want people putting holds on it, 
>>> would it be more appropriate to set the ZFS property such that normal 
>>> users can't put a hold on it?
>>
>> There are a couple of problems with this.
>>
>> This approach has us returning from the zfs_destroy_snap call without 
>> knowing if or when the snapshot will be destroyed. We think we're done 
>> but may not be...
>>
>> Also, this approach adds the need to place a zfs release in the 
>> appropriate spots so that the last reference is released. With the 
>> changes done for 6803121, when a snapshot is taken a reference is 
>> automatically added. This is used to keep the snapshot around until 
>> zfs release is called. If we don't call it, or call it in the wrong 
>> place, we end up returning from zfs_destroy_snap believing that the 
>> snapshot will be destroyed, but since we either didn't call release or 
>> it wasn't in the right spot, we have no way to know that the destroy 
>> didn't happen. The use of B_FALSE here forces the destroy to happen 
>> now and not at some later point when zfs release is called. What if 
>> we have a bunch of snapshots we're destroying and all of them are 
>> stacked up waiting for the zfs release? When we call zfs release it 
>> starts destroying them, but if one of them fails for some reason, I'm 
>> not sure how straightforward it would be to map that failure back to 
>> the snapshot destroy call that should return the error. It may not be 
>> as big a deal as I'm thinking right now, but why add another call 
>> (zfs release) to do the actual destroy later if we really don't need 
>> to?
> I think I am either misunderstanding the ZFS changes, or 
> misunderstanding this specific instance of using a BE snapshot. From 
> what I understand:
> 
> zfs_create will create a snapshot, and set the user reference count to 1
> zfs_release will decrement the user reference count, and destroy the 
> snapshot if it brings the reference count to 0
> zfs_hold will increment the reference count
> zfs_destroy will succeed if reference count is 1, and fail otherwise.

We're using zfs_destroy_snap here.
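To make the distinction concrete, here's a minimal sketch of the
immediate-destroy path (the signature follows the current libzfs headers,
where the third argument is the defer flag; it may differ slightly from
the exact bits in this putback):

#include <libzfs.h>

/*
 * Sketch only: destroy a single snapshot immediately. Passing
 * B_FALSE for the defer flag asks ZFS to destroy the snapshot now
 * and report any failure synchronously; B_TRUE would queue the
 * destroy until all user holds on the snapshot are released.
 */
static int
destroy_snap_now(libzfs_handle_t *g_zfs, const char *fs, char *snapname)
{
	zfs_handle_t *zhp;
	int ret;

	if ((zhp = zfs_open(g_zfs, fs, ZFS_TYPE_FILESYSTEM)) == NULL)
		return (-1);

	/* B_FALSE == do not defer: destroy now or fail now */
	ret = zfs_destroy_snaps(zhp, snapname, B_FALSE);
	zfs_close(zhp);
	return (ret);
}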

> 
> If the snapshot being destroyed was created during this invocation of 
> beadm, I see no problem with immediately destroying it. However, I think 
> the appropriate call would be zfs_release, as this fits the new model 
> more appropriately. Unless I misunderstand the new functionality, or the 
> proposed fix in the CR is not what was implemented, zfs_release should 
> delete the snapshot without a separate call to zfs_destroy.
> 

As I understand things, with the new changes both the zfs_destroy_snap() 
call and the zfs_release call are needed.
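A sketch of what I mean by needing both calls, assuming the model
described in this thread where the snapshot still carries a hold that has
to be dropped first (signatures per the current libzfs headers; the
"libbe-hold" tag name is made up for illustration):

#include <libzfs.h>

/*
 * Sketch: tearing down a held snapshot takes both calls under the
 * new model -- zfs_release() to drop our hold, then
 * zfs_destroy_snaps() to actually destroy it. "libbe-hold" is a
 * hypothetical tag name, not one libbe actually uses.
 */
static int
teardown_snap(libzfs_handle_t *g_zfs, const char *fs, char *snapname)
{
	zfs_handle_t *zhp;
	int ret;

	if ((zhp = zfs_open(g_zfs, fs, ZFS_TYPE_FILESYSTEM)) == NULL)
		return (-1);

	/* drop our hold first so the destroy can proceed */
	ret = zfs_release(zhp, snapname, "libbe-hold", B_FALSE);
	if (ret == 0)
		ret = zfs_destroy_snaps(zhp, snapname, B_FALSE);
	zfs_close(zhp);
	return (ret);
}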

> If the snapshot being destroyed was NOT created during this invocation 
> of beadm, then it is not safe to assume that no user has placed a hold 
> on it, and thus, immediate destruction is the incorrect choice here. 
> Again, zfs_release should be used here.

We are always dealing with either snapshots created during this invocation 
of beadm, snapshots related directly to the clone we are specifically 
destroying, or a snapshot that was specifically requested to be destroyed.

> 
> In either case, by calling zfs_release, we can then immediately check 
> the user reference count or return value, and immediately know whether 
> (a) the snapshot was destroyed or (b) the snapshot is still alive due to 
> a hold.

We still need the zfs_destroy_snap call. I really don't want to have to add 
an additional call to zfs_release, which is what would be required if we 
set the boolean to true. Plus, it wouldn't buy us anything to have zfs wait 
for the reference count to drop, since we've already gone through the 
process of making sure that this snapshot doesn't have any dependencies 
before we get to this point. Additionally, if there are still references, 
we don't want to destroy this snapshot. If we use the B_TRUE and 
zfs_release route here, it wouldn't destroy the snapshot right then. If we 
then exit and another call comes into the library that releases that 
reference, the snapshot would automatically be destroyed out from 
underneath us, which may not be exactly what we wanted.
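In other words, with B_FALSE the failure surfaces right at the call site.
A sketch (the EZFS_BUSY check is my assumption about how a lingering hold
would report itself):

#include <stdio.h>
#include <libzfs.h>

/*
 * Sketch: with the defer flag set to B_FALSE, a snapshot that still
 * has references makes zfs_destroy_snaps() fail right here, so we
 * catch the bug at the call site instead of the snapshot quietly
 * lingering or being destroyed later out from underneath us.
 */
static int
destroy_or_complain(libzfs_handle_t *g_zfs, zfs_handle_t *zhp,
    char *snapname)
{
	if (zfs_destroy_snaps(zhp, snapname, B_FALSE) != 0) {
		/* assumption: a lingering hold surfaces as EZFS_BUSY */
		if (libzfs_errno(g_zfs) == EZFS_BUSY)
			(void) fprintf(stderr, "snapshot %s still held; "
			    "our cleanup logic failed somewhere\n",
			    snapname);
		return (-1);
	}
	return (0);
}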

> 
> Related question: Are there any situations where failure to delete a 
> snapshot would preclude us from finishing a call to beadm?

Yes. If we fail to destroy the snapshot a BE's datasets were cloned from, 
then we fail the destroy and end up with a snapshot hanging around that 
should have been removed.


The short answer to all of this is that the use of B_FALSE here defaults us 
to the original behavior, which is what we want. If there's still a 
reference to this snapshot, then something in our code has failed, and we 
should catch it.

>>
>> -evan

