On Wed, Jan 30, 2013 at 12:17 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
> On Wed, Jan 30, 2013 at 10:58 AM, Chiradeep Vittal
> <chiradeep.vit...@citrix.com> wrote:
>>
>> On 1/29/13 3:36 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>>
>>> I've been reviewing this, and I'm still not sure that backing out the
>>> add nic db entry if the plug nic fails is the right way to go. We can
>>> catch the failure and remove it from the database, sure. But consider
>>> these scenarios:
>>>
>>> User runs deployVirtualMachine with parameter startvm=false. We don't
>>> destroy their chosen VM config later when they launch the VM and a
>>> nic or something else fails to provision.
>>
>> No, but we don't leave a VM running in an inconsistent state either.
>> Either all the nics are created or the VM fails to start.
>
> But we do leave virtual routers in an inconsistent state, since the
> code in question is used by them. I have seen virtual routers fail to
> plug nics and simply not work (usually when there's a problem with a
> bridge, for example in a development environment). This is not a
> defense for doing it in user VMs, but as mentioned, the issue is
> raised by existing code we were leveraging, not new code we're adding.
>
>>> User runs addNicToVirtualMachine when the VM is off. This essentially
>>> edits the virtual machine's config to the same state as if it had
>>> been deployed with the extra network. Again, if the plug fails when
>>> they turn the VM on, we don't destroy their config.
>>
>> This should leave the VM not 'running'.
>>
>>> All three of these, deployVirtualMachine, addNicToVirtualMachine when
>>> the VM is on, and addNicToVirtualMachine when the VM is off, rely on
>>> _networkMgr.createNicForVm to validate and apply for a nic. If we
>>> trust that to create a valid config for our VMs, then we shouldn't go
>>> back on that if there's a temporary problem.
>>
>> The question is not one of config, but of telling the end-user that
>> addVmToNetwork failed while leaving the nic entry in the db. An
>> automated orchestration tool using the API would list the VM, see
>> that it is running and that there are N nics attached, without any
>> idea that some of those nics are in fact not attached.
>
> Yeah, that's a valid concern, but from what I've seen in testing I
> tend to think the plug will succeed even if other things are broken.
> I think the plug may only concern itself with whether it can change
> the running VM's config, not necessarily with whether the nic is
> actually working as intended. If the plug fails, you're likely in
> bigger trouble because other VMs and routers are failing to start,
> but afterwards you have no way to know about lingering issues with
> nics that were plugged while the system was having problems.
>
>>> The whole goal of this feature was to be able to edit the VM's
>>> network config post-deploy. Doing it while the VM is running is a
>>> bonus, but not strictly necessary. Would it be better to go back to
>>> only allowing it while the VM is off (essentially the same thing,
>>> since we won't know if the plug works until they attempt to start
>>> the VM, at which point we wouldn't remove it)? Or simply scrap this
>>> branch and move on to making nics their own entity?
>>
>> I wouldn't scrap it. I am comfortable leaving it as is, or removing
>> the ability to hot plug.
>
> I would like to leave it as-is and get it merged, and then I think
> these concerns can be addressed in the testing/bugfix phase. We've
> done a lot of testing on it and it works well, though I do understand
> your points about potential bugs. Worst case, we can simply patch it
> to only work on stopped VMs.
Change is merged. It looks like Brian updated the spec with expected failures (trying to remove default nic, etc).
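
For reference, the rollback Chiradeep described (removing the nic db entry when the hot plug fails) would look roughly like the sketch below. This is only an illustration of the idea, not the code that was merged; apart from _networkMgr.createNicForVm, which is mentioned above, the names here (plugNicToRunningVm, _nicDao, the createNicForVm signature) are placeholders rather than the actual CloudStack classes and methods:

    // Illustrative sketch only: if the hypervisor-side hot plug fails, remove the
    // nic row that was just persisted so the db and the API response agree on
    // what is actually attached to the running VM. Names are placeholders.
    private NicProfile addNicToRunningVm(VirtualMachineProfile vm, Network network, NicProfile requested) {
        NicProfile nic = _networkMgr.createNicForVm(network, requested, vm); // persist the nic entry
        boolean plugged = false;
        try {
            plugged = plugNicToRunningVm(vm, nic);   // ask the hypervisor to hot-plug the nic
        } finally {
            if (!plugged) {
                _nicDao.remove(nic.getId());         // undo the db entry on failure
            }
        }
        if (!plugged) {
            throw new CloudRuntimeException("Failed to plug nic into VM " + vm.getId());
        }
        return nic;
    }

It wouldn't address the virtual-router case Marcus raised, since that goes through the same plumbing, but it would keep user-facing addNicToVirtualMachine calls from leaving orphaned nic entries behind.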