On Wed, Jan 30, 2013 at 12:17 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
> On Wed, Jan 30, 2013 at 10:58 AM, Chiradeep Vittal
> <chiradeep.vit...@citrix.com> wrote:
>>
>>
>> On 1/29/13 3:36 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>>
>>>
>>>I've been reviewing this, and I'm still not sure that backing out the
>>>nic DB entry when the nic plug fails is the right way to go. We can
>>>catch the failure and remove it from the database, sure. But consider
>>>these scenarios:
>>>
>>>User runs deployVirtualMachine with parameter startvm=false. We don't
>>>destroy their chosen VM config later if a nic or something else fails
>>>to provision when they launch the VM.
>>
>> No, but we don't leave a VM running in an inconsistent state either.
>> Either all the nics are created or the VM fails to start.
>
> But we do leave virtual routers in an inconsistent state, since the
> code in question is used by them. I have seen virtual routers fail to
> plug nics and simply not work (usually when there's a problem with a
> bridge, for example in a development environment). This is not a
> defense for doing it in user VMs, but as mentioned, the issue is
> raised by existing code we were leveraging, not new code we're adding.
>
>>
>>>
>>>User runs addNicToVirtualMachine when the VM is off. This essentially
>>>edits the virtual machine's config to match what it would have been
>>>had the VM been deployed with the extra network. Again, if the plug
>>>fails when they turn the VM on, we don't destroy their config.
>>
>> This should leave the VM not 'running'.
>>
>>>
>>>All three of these (deployVirtualMachine, addNicToVirtualMachine when
>>>the VM is on, and addNicToVirtualMachine when the VM is off) rely on
>>>_networkMgr.createNicForVm to validate and allocate a nic. If we
>>>trust that to create a valid config for our VMs, then we shouldn't go
>>>back on it when there's a temporary problem.
>>
>> The question is not one of config, but of telling the end user that
>> addVmToNetwork failed while leaving the nic entry in the DB. An
>> automated orchestration tool using the API would list the VM, see
>> that it is running with N nics attached, and have no idea that some
>> of those nics are in fact not attached.
>
> Yeah, that's a valid concern, but from what I've seen in testing I
> tend to think the plug will succeed even if some things are broken. I
> think the plug might only concern itself with whether it can change
> the running VM's config, not necessarily with whether the nic is
> actually working as intended. If the plug fails, you're likely in
> bigger trouble because other VMs and routers are failing to start,
> but then, after the incident, you have no way to know about lingering
> problems with nics that were plugged while the system was struggling.
>
>>
>>
>>>
>>>The whole goal of this feature was to be able to edit the VM's network
>>>config post-deploy. Doing it while the VM is running is a bonus, but
>>>not strictly necessary. Would it be better to go back to only allowing
>>>it while the VM is off (essentially the same thing: we won't know if
>>>the plug works until they attempt to start the VM, at which point we
>>>wouldn't remove it)? Or simply scrap this branch and move on to making
>>>nics their own entity?
>>
>> I wouldn't scrap it. I am comfortable leaving it as is, or removing the
>> ability to hot plug.
>>
>
> I would like to leave it as-is and get it merged; these concerns can
> then be addressed in the testing/bugfix phase. We've done a lot of
> testing on it and it works well, though I do understand your points
> about potential bugs. Worst case, we can simply patch it to only work
> on stopped VMs.

Change is merged. It looks like Brian updated the spec with expected
failures (trying to remove the default nic, etc.).
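For reference, a minimal sketch of the back-out-on-failure pattern
debated above: persist the nic row, attempt the hot-plug, and remove
the row if the plug throws, so the API never reports a phantom nic.
The names here (NicDao, HypervisorResource, plugNic) are illustrative
placeholders, not the actual CloudStack interfaces:

    // Sketch only, with hypothetical names; not the real CloudStack code.
    public class AddNicSketch {
        interface NicDao {                                // hypothetical DAO
            long persist(long vmId, long networkId);      // returns nic id
            void remove(long nicId);
        }
        interface HypervisorResource {                    // hypothetical agent call
            void plugNic(long vmId, long nicId) throws Exception;
        }

        static long addNicToRunningVm(NicDao dao, HypervisorResource hv,
                                      long vmId, long networkId) throws Exception {
            long nicId = dao.persist(vmId, networkId);    // DB entry first
            try {
                hv.plugNic(vmId, nicId);                  // then the hot-plug
            } catch (Exception e) {
                dao.remove(nicId);                        // back out the DB entry
                throw e;
            }
            return nicId;
        }
    }

Whether the catch block should remove the row, or leave the config
intact as in the stopped-VM case, is exactly the tradeoff discussed in
this thread.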
