Hi Ethan, Jean and Keith,

Thanks for the comments! The automated testing using the libbe test
suite has now completed successfully. The only things I ran into
were known issues.

I plan to push after indiana-build is updated to build 121 so these
changes don't cause builds to fail.

Thanks!
-evan



Keith Mitchell wrote:
> Hi Evan,
> 
> As discussed over the phone, I now understand the changes and agree with 
> them. Thanks for being patient with me!
> 
> - Keith
> 
> Evan Layton wrote:
>> Keith Mitchell wrote:
>>>
>>>
>>> Evan Layton wrote:
>>>> Keith Mitchell wrote:
>>>>>
>>>>>
>>>>> Evan Layton wrote:
>>>>>> Ethan Quach wrote:
>>>>>>>
>>>>>>>
>>>>>>> Evan Layton wrote:
>>>>>>>> Ethan Quach wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Evan Layton wrote:
>>>>>>>>>> jeanm wrote:
>>>>>>>>>>> Looks OK to me. Maybe some comments explaining the B_FALSE 
>>>>>>>>>>> and what it does?
>>>>>>>>>>> I looked in the zfs code but that's kind of awkward.
>>>>>>>>>>
>>>>>>>>>> Probably a good idea to add that since it's a bit convoluted...
>>>>>>>>>>
>>>>>>>>>> I've added:
>>>>>>>>>> /*
>>>>>>>>>>  * The boolean set to B_FALSE and passed to zfs_destroy_snaps()
>>>>>>>>>>  * tells zfs to process and destroy the snapshots now. Otherwise
>>>>>>>>>>  * ZFS will wait to destroy these snapshots until a zfs release
>>>>>>>>>>  * is called and the references for all the snapshots have been
>>>>>>>>>>  * released.
>>>>>>>>>>  */
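
For anyone following along, here's a rough sketch of the call that
comment documents (simplified and hypothetical, not the actual libbe
code; it assumes the libzfs interface from the 6803121 changes, where
zfs_destroy_snaps() takes a deferred-destroy flag):

    #include <libzfs.h>

    static int
    destroy_be_snaps(zfs_handle_t *zhp, char *snapname)
    {
            /*
             * B_FALSE == don't defer: ZFS destroys the snapshots
             * before this call returns.  B_TRUE would let the call
             * return with the snapshots still alive until the last
             * hold is released via zfs_release().
             */
            return (zfs_destroy_snaps(zhp, snapname, B_FALSE));
    }
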
>>>>>>>>>
>>>>>>>>> nit - for the second sentence: "Otherwise the call will
>>>>>>>>> potentially return where the snapshot isn't actually destroyed
>>>>>>>>> yet, and ZFS is waiting until all the references to the snapshot
>>>>>>>>> have been released before actually destroying the snapshot."
>>>>>>>>>
>>>>>>>>
>>>>>>>> Much better! I'll make that change.
>>>>>>>
>>>>>>> Hmm, I may have made an incorrect assumption about the behavior,
>>>>>>> though. Will the call actually return while ZFS waits, or will it
>>>>>>> wait before returning?
>>>>>>
>>>>>> If I understand the comments and the code correctly, ZFS returns
>>>>>> without actually destroying the snapshots. It waits to do that
>>>>>> until all the references are released and zfs_release has been
>>>>>> called.
>>>>> Devil's advocate: What's the problem with using B_TRUE if this is 
>>>>> how it works? If someone has truly set a hold on a BE snapshot, it 
>>>>> seems more appropriate to let ZFS clean it up later when whoever 
>>>>> put the hold on it is done with it. As long as it gets cleaned up 
>>>>> eventually & automatically... And if we don't want people putting 
>>>>> holds on it, would it be more appropriate to set the ZFS property 
>>>>> such that normal users can't put a hold on it?
>>>>
>>>> There are a couple of problems with this.
>>>>
>>>> This approach has us returning from the zfs_destroy_snap call
>>>> without knowing if or when the snapshot will be destroyed. We think
>>>> we're done but may not be...
>>>>
>>>> Also, this approach adds the need to place a zfs release in the
>>>> appropriate spots so that the last reference is released. With the
>>>> changes done for 6803121, a reference is automatically added when a
>>>> snapshot is taken. This is used to keep the snapshot around until
>>>> zfs release is called. If we don't call it, or call it in the wrong
>>>> place, we end up returning from zfs_destroy_snap believing that the
>>>> snapshot will be destroyed, with no way to know that the destroy
>>>> didn't actually happen. The use of B_FALSE here forces the destroy
>>>> to happen now and not at some later point when zfs release is
>>>> called. What if we have a bunch of snapshots we're destroying, all
>>>> of them stacked up waiting for the zfs release? When we call the
>>>> zfs release it starts destroying them, but if one of them fails for
>>>> some reason, I'm not sure how straightforward it would be to map
>>>> that failure back to the snapshot destroy call that needs to return
>>>> the error. It may not be as big a deal as I'm thinking right now,
>>>> but why add another call (zfs release) to do the actual destroy
>>>> later if we really don't need to?
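
To put the mapping problem in code (a hypothetical loop with made-up
names, not our actual code):

    int i;

    /* with defer == B_TRUE every call "succeeds" immediately... */
    for (i = 0; i < nsnaps; i++)
            (void) zfs_destroy_snaps(zhp, snapnames[i], B_TRUE);

    /*
     * ...so if one of the destroys later fails when its holds are
     * released, there's no obvious way to map that failure back to
     * the zfs_destroy_snaps() call that queued it.
     */
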
>>> I think I am either misunderstanding the ZFS changes, or 
>>> misunderstanding this specific instance of using a BE snapshot. From 
>>> what I understand:
>>>
>>> - zfs_create will create a snapshot, and set the user reference
>>>   count to 1
>>> - zfs_release will decrement the user reference count, and destroy
>>>   the snapshot if it brings the reference count to 0
>>> - zfs_hold will increment the reference count
>>> - zfs_destroy will succeed if the reference count is 1, and fail
>>>   otherwise
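
To put that model in code, here's a hypothetical sequence (the snapshot
and tag names are made up, and this is a sketch of one reading of the
new interfaces, not the actual libbe code):

    #include <libzfs.h>

    /* Hypothetical illustration of the deferred-destroy model above. */
    static void
    deferred_destroy_example(zfs_handle_t *zhp)
    {
            /* request a deferred destroy: call returns, snapshot alive */
            (void) zfs_destroy_snaps(zhp, "mysnap", B_TRUE);

            /* dropping the last hold triggers the actual destroy */
            (void) zfs_release(zhp, "mysnap", "my-tag", B_FALSE);
    }
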
>>
>> We're using zfs_destroy_snap here.
>>
>>>
>>> If the snapshot being destroyed was created during this invocation of 
>>> beadm, I see no problem with immediately destroying it. However, I 
>>> think the appropriate call would be zfs_release, as this fits the new 
>>> model more appropriately. Unless I misunderstand the new 
>>> functionality, or the proposed fix in the CR is not what was 
>>> implemented, zfs_release should delete the snapshot without a 
>>> separate call to zfs_destroy.
>>>
>>
>> As I understand things, with the new changes both the
>> zfs_destroy_snap() call and the zfs_release call are needed.
>>
>>> If the snapshot being destroyed was NOT created during this 
>>> invocation of beadm, then it is not safe to assume that no user has 
>>> placed a hold on it, and thus, immediate destruction is the incorrect 
>>> choice here. Again, zfs_release should be used here.
>>
>> We are always dealing with either snapshots created during this
>> invocation of beadm, snapshots related directly to the clone and
>> snapshots we are specifically destroying, or a snapshot that was
>> specifically requested to be destroyed.
>>
>>>
>>> In either case, by calling zfs_release, we can then immediately check 
>>> the user reference count or return value, and immediately know 
>>> whether (a) the snapshot was destroyed or (b) the snapshot is still 
>>> alive due to a hold.
>>
>> We still need the zfs_destroy_snap call. I really don't want to have
>> to add an additional call to zfs_release, which is what would be
>> required if we set the boolean to true. Plus it wouldn't buy us
>> anything to have zfs wait for the reference count to drop, since
>> we've already gone through the process of making sure that this
>> snapshot doesn't have any dependencies before we get to this point.
>> Additionally, if there are still references we don't want to destroy
>> this snapshot. If we use the B_TRUE and zfs_release route here, it
>> wouldn't destroy the snapshot right then. If we then exit, and
>> another call comes into the library that releases that reference,
>> the snapshot would automatically be destroyed out from underneath
>> us, which may not be exactly what we wanted.
>>
>>>
>>> Related question: Are there any situations where failure to delete a 
>>> snapshot would preclude us from finishing a call to beadm?
>>
>> Yes. If we fail to destroy the snapshot a BE's datasets were cloned
>> from, then we fail the destroy and end up with a snapshot hanging
>> around that should have been removed.
>>
>>
>> The short answer to all of this is that the use of B_FALSE here
>> defaults us to the original behavior, which is what we want. If
>> there's still a reference to this snapshot, then something in our
>> code has failed and we should catch it.
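
In code terms, that means any failure surfaces at the call site where
we can catch it, roughly like this (a sketch only; zhp, snapname, and
the error handling are placeholders, not the actual libbe code):

    /*
     * With defer == B_FALSE an outstanding reference makes the
     * destroy fail right here instead of silently deferring.
     */
    if (zfs_destroy_snaps(zhp, snapname, B_FALSE) != 0) {
            /* a lingering hold means something in our code failed */
            return (-1);
    }
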
>>
>>>>
>>>> -evan
>>

