On Mon, Jan 28, 2013 at 5:25 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
> On Mon, Jan 28, 2013 at 5:21 PM, Chiradeep Vittal
> <chiradeep.vit...@citrix.com> wrote:
>>
>> On 1/28/13 12:38 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>
>>> On Mon, Jan 28, 2013 at 12:44 PM, Marcus Sorensen <shadow...@gmail.com>
>>> wrote:
>>>> On Mon, Jan 28, 2013 at 12:31 PM, Chiradeep Vittal
>>>> <chiradeep.vit...@citrix.com> wrote:
>>>>> Inline
>>>>>
>>>>> On 1/25/13 4:03 PM, "Brian Angus" <blan...@betterservers.com> wrote:
>>>>>
>>>>>> My answers inline...
>>>>>>
>>>>>> On 01/24/2013 06:49 PM, Marcus Sorensen wrote:
>>>>>>> Brian is probably the right guy to answer some of these, but I'll
>>>>>>> chime in on a few.
>>>>>>>
>>>>>>> On Thu, Jan 24, 2013 at 5:58 PM, Chiradeep Vittal <
>>>>>>> chiradeep.vit...@citrix.com> wrote:
>>>>>>>
>>>>>>>> Thanks for this.
>>>>>>>> A few comments:
>>>>>>>> 1. These mutating operations have to be async and hence the API
>>>>>>>> commands have to inherit from BaseAsyncCmd instead of BaseCmd
>>>>>>>
>>>>>>> We can do that
>>>>>>
>>>>>> This change has been pushed to the branch
>>>>>
>>>>> I saw this. We have 2 choices: extend from BaseAsyncCreateCmd or
>>>>> BaseAsyncCmd. I see that you have chosen the latter. If the former: you
>>>>> synchronously create the nic object first and obtain an id (uuid), and
>>>>> then asynchronously attach the nic. In the current implementation both
>>>>> the db entity creation and the attach are in the asynchronous workflow.
>>>>> If the attach fails in the backend, then there will be a nic object in
>>>>> the database theoretically attached to the vm, but not really attached.
>>>>> Choices:
>>>>> A) change the data model to reflect whether the nic is attached or not
>>>>> (and use the BaseAsyncCreateCmd approach)
>>>>> B) ensure that the nic is destroyed when the attach fails.
>>>>>
>>>>> Note that with (B) you have a potential race condition where an API
>>>>> client can retrieve the user vm before the attach has succeeded/failed
>>>>> and then potentially add a static nat rule.
>>>>
>>>> This is an interesting issue. If it fails to attach, we don't
>>>> necessarily want to back it out. In theory we only bother with the
>>>> attach because the VM is running; if it weren't, it would be
>>>> sufficient to just create the database entry. The existing code (e.g.
>>>> deployVirtualMachine) doesn't delete your nics if they fail to apply
>>>> when the VM starts up, correct?
>>
>> No, the VM will fail to start if any of the nics fail to attach/create.
>> So you will never have a VM that claims to have 3 nics attached but in
>> reality has only 2 attached.
>>
>>> We appreciate your feedback, Chiradeep, and your helping us make sure we
>>> account for all scenarios. Do you think we're far enough along on this
>>> to get it into master? We are just finishing up a few tests that will
>>> be added into the branch, but I hope we're to the point that any
>>> lingering outliers can be bugfixes post code freeze.
>>
>> If you fix the failing nic case, sure.
>>
> Ok, thanks.
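To make the distinction concrete, the BaseAsyncCreateCmd approach would split the command roughly like this (a rough sketch only; createNicEntity and plugNic are placeholder names, not what's actually in the branch):

    @Override
    public void create() throws ResourceAllocationException {
        // phase 1, synchronous: persist the nic row and record its id so
        // the caller gets a handle back before any agent work happens
        NicVO nic = _networkMgr.createNicEntity(getVmId(), getNetworkId());
        setEntityId(nic.getId());
    }

    @Override
    public void execute() {
        // phase 2, asynchronous: plug the nic into the running VM; if this
        // fails we either model "allocated but not attached" (choice A) or
        // destroy the entity created above (choice B)
        _networkMgr.plugNic(getEntityId());
    }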
I've been reviewing this, and I'm still not sure that backing out the nic's
db entry when the plug fails is the right way to go. We can certainly catch
the failure and remove the entry from the database, but consider these
scenarios:

1. A user runs deployVirtualMachine with startvm=false. We don't later
destroy their chosen VM config if a nic (or anything else) fails to
provision when they launch the VM.

2. A user runs addNicToVirtualMachine while the VM is off. This essentially
edits the VM's config to match what it would have been had the VM been
deployed with the extra network in the first place. Again, if the plug fails
when they power the VM on, we don't destroy their config.

All three of these paths (deployVirtualMachine, addNicToVirtualMachine while
the VM is running, and addNicToVirtualMachine while it is off) rely on
_networkMgr.createNicForVm to validate and allocate the nic. If we trust
that call to produce a valid config for our VMs, then we shouldn't go back
on it over a transient problem at plug time.

The whole goal of this feature was to be able to edit the VM's network
config post-deploy; doing it while the VM is running is a bonus, but not
strictly necessary. Would it be better to go back to only allowing the
operation while the VM is off (essentially the same thing: we won't know
whether the plug works until they attempt to start the VM, at which point
we wouldn't remove the nic)? Or should we simply scrap this branch and move
on to making nics their own entity?
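For reference, if we did go with choice (B), the back-out would look
something like this in the async path (again just a sketch; the helper
names are illustrative, not the actual branch code):

    NicVO nic = _networkMgr.createNicEntity(vm, network);
    try {
        if (vm.getState() == VirtualMachine.State.Running) {
            _networkMgr.plugNic(vm, nic); // may fail on the hypervisor
        }
    } catch (Exception e) {
        // back out the db row so the API never reports a nic that was
        // never actually attached to the running VM
        _networkMgr.removeNic(vm, nic);
        throw new CloudRuntimeException("Failed to plug nic", e);
    }

My concern is exactly that catch block: for the other two paths above we
keep the row and let the next start attempt sort it out, and I'd rather
stay consistent with that.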