----- Original Message -----
> From: "Alon Bar-Lev" <[email protected]>
> To: "Mike Kolesnik" <[email protected]>
> Cc: "arch" <[email protected]>
> Sent: Thursday, May 9, 2013 4:42:09 PM
> Subject: Re: feature suggestion: initial generation of management network
>
>
>
> ----- Original Message -----
> > From: "Mike Kolesnik" <[email protected]>
> > To: "Moti Asayag" <[email protected]>
> > Cc: "Alon Bar-Lev" <[email protected]>, "arch" <[email protected]>
> > Sent: Tuesday, May 7, 2013 2:33:15 PM
> > Subject: Re: feature suggestion: initial generation of management network
> >
> > ----- Original Message -----
> > > I stumbled upon a few issues with the current design while
> > > implementing it:
> > >
> > > There seems to be a requirement to reboot the host after the
> > > installation is completed in order to assure the host is recoverable.
> > >
> > > Therefore, the building blocks of the installation process of 3.3 are:
> > > 1. host deploy which installs the host except for configuring its
> > > management network.
> > > 2. SetupNetwork (and CommitNetworkChanges) - for creating the
> > > management network on the host and persisting the network
> > > configuration.
> > > 3. Reboot the host - This is a missing piece. (engine has FenceVds
> > > command, but it requires the power management to be configured prior
> > > to the installation and might be irrelevant for hosts without PM.)
> > >
> > > So, there are a couple of issues here:
> > > 1. How to reboot the host?
> > > 1.1. By exposing new RebootNode verb in VDSM and invoking it from the
> > > engine
> >
> > This sounds like a solid and good API to me.
> >
> > > 1.2. By opening ssh dialog to the host in order to execute the reboot
> >
> > How would you do this?
> >
> > >
> > > 2. When to perform the reboot?
> > > 2.1. After host deploy, by utilizing the host deploy to perform the
> > > reboot.
> > > It requires the monitor to configure the network when the host is
> > > detected by the engine, detached from the installation flow. However
> > > it is a step toward the non-persistent network feature yet to be
> > > defined.
> >
> > I am not sure this statement has merit: if the feature is yet to be
> > defined, how can we know if this is a step towards it or not?
> >
> > Anyway, I'm not sure that this is a good design - should we set up the
> > network when the host returns from non-responsive status?
>
> Exactly.
> Imagine that after a reboot only a single interface is configured to allow
> communication to engine.
> Once the engine connects to the host, it re-configures anything it needs on
> that host.
> A completely stateless host.

Not completely stateless; we'd have to keep the configuration for this
privileged interface in case of reboots. Even so, the less state the
better, of course.

>
> > > 2.2. After setupNetwork is done and network was configured and
> > > persisted on the host.
> > > There is no special advantage from the recoverability aspect, as
> > > setupNetwork is constantly used to persist the network configuration
> > > (by the complementary CommitNetworkChanges command).
> > > In case the network configuration fails, VDSM will revert to the last
> > > well known configuration - so connectivity with engine should be
> > > restored. Design wise, it fits to configure the management network as
> > > part of the installation sequence.
> > > If the network configuration fails in this context, the host status
> > > will be set to "InstallFailed" rather than "NonOperational",
> > > as might occur as a result of a failed setupNetwork command.
> >
> > This sounds like the good solution to me, design wise. The host is
> > installed and with that the communication with the management network
> > is configured.
> > If this communication is not possible, the host failed to install (also
> > meaning it's not operational). I see no problem with this approach.
> >
> > >
> > >
> > > Your inputs are welcome.
> > >
> > > Thanks,
> > > Moti
> > > ----- Original Message -----
> > > > From: "Dan Kenigsberg" <[email protected]>
> > > > To: "Simon Grinberg" <[email protected]>, "Moti Asayag" <[email protected]>
> > > > Cc: "arch" <[email protected]>
> > > > Sent: Tuesday, January 1, 2013 2:47:57 PM
> > > > Subject: Re: feature suggestion: initial generation of management network
> > > >
> > > > On Thu, Dec 27, 2012 at 07:36:40AM -0500, Simon Grinberg wrote:
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Dan Kenigsberg" <[email protected]>
> > > > > > To: "Simon Grinberg" <[email protected]>
> > > > > > Cc: "arch" <[email protected]>
> > > > > > Sent: Thursday, December 27, 2012 2:14:06 PM
> > > > > > Subject: Re: feature suggestion: initial generation of management network
> > > > > >
> > > > > > On Tue, Dec 25, 2012 at 09:29:26AM -0500, Simon Grinberg wrote:
> > > > > > >
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > > From: "Dan Kenigsberg" <[email protected]>
> > > > > > > > To: "arch" <[email protected]>
> > > > > > > > Sent: Tuesday, December 25, 2012 2:27:22 PM
> > > > > > > > Subject: feature suggestion: initial generation of management network
> > > > > > > >
> > > > > > > > Current condition:
> > > > > > > > ==================
> > > > > > > > The management network, named ovirtmgmt, is created during host
> > > > > > > > bootstrap. It consists of a bridge device, connected to the
> > > > > > > > network device that was used to communicate with Engine (nic,
> > > > > > > > bonding or vlan).
> > > > > > > > It inherits its ip settings from the latter device.
> > > > > > > >
> > > > > > > > Why Is the Management Network Needed?
> > > > > > > > =====================================
> > > > > > > > Understandably, some may ask why we need to have a management
> > > > > > > > network - why having a host with IPv4 configured on it is not
> > > > > > > > enough.
> > > > > > > > The answer is twofold:
> > > > > > > > 1. In oVirt, a network is an abstraction of the resources
> > > > > > > > required for connectivity of a host for a specific usage. This
> > > > > > > > is true for the management network just as it is for VM network
> > > > > > > > or a display network.
> > > > > > > > The network entity is the key for adding/changing nics and IP
> > > > > > > > address.
> > > > > > > > 2. In many occasions (such as small setups) the management
> > > > > > > > network is used as a VM/display network as well.
> > > > > > > >
> > > > > > > > Problems in current connectivity:
> > > > > > > > =================================
> > > > > > > > According to alonbl of ovirt-host-deploy fame, and with no
> > > > > > > > conflict to my own experience, creating the management network
> > > > > > > > is the most fragile, error-prone step of bootstrap.
> > > > > > >
> > > > > > > +1,
> > > > > > > I've raised that repeatedly in the past: bootstrap should not
> > > > > > > create the management network but pick up the existing
> > > > > > > configuration and let the engine override later with its own
> > > > > > > configuration if it differs. I'm glad that we finally get to that.
> > > > > > >
> > > > > > > >
> > > > > > > > Currently it always creates a bridged network (even if the DC
> > > > > > > > requires a non-bridged ovirtmgmt), it knows nothing about the
> > > > > > > > defined MTU for ovirtmgmt, it uses ping to guess on top of which
> > > > > > > > device to build (and thus requires Vdsm-to-Engine reverse
> > > > > > > > connectivity), and is the sole remaining user of the
> > > > > > > > addNetwork/vdsm-store-net-conf scripts.
> > > > > > > >
> > > > > > > > Suggested feature:
> > > > > > > > ==================
> > > > > > > > Bootstrap would avoid creating a management network. Instead,
> > > > > > > > after bootstrapping a host, Engine would send a getVdsCaps probe
> > > > > > > > to the installed host, receiving a complete picture of the
> > > > > > > > network configuration on the host. Among this picture is the
> > > > > > > > device that holds the host's management IP address.
> > > > > > > >
> > > > > > > > Engine would send setupNetwork command to generate ovirtmgmt
> > > > > > > > with details devised from this picture, and according to the DC
> > > > > > > > definition of ovirtmgmt. For example, if Vdsm reports:
> > > > > > > >
> > > > > > > > - vlan bond4.3000 has the host's IP, configured to use dhcp.
> > > > > > > > - bond4 comprises eth2 and eth3
> > > > > > > > - ovirtmgmt is defined as a VM network with MTU 9000
> > > > > > > >
> > > > > > > > then Engine sends the likes of:
> > > > > > > > setupNetworks(ovirtmgmt: {bridged=True, vlan=3000, iface=bond4,
> > > > > > > > bonding=bond4: {eth2,eth3}, MTU=9000})
> > > > > > >
> > > > > > > Just one comment here,
> > > > > > > In order to save time and confusion - if the ovirtmgmt is defined
> > > > > > > with default values meaning the user did not bother to touch it,
> > > > > > > let it pick up the VLAN configuration from the first host added in
> > > > > > > the Data Center.
> > > > > > >
> > > > > > > Otherwise, you may override the host VLAN and lose connectivity.
> > > > > > >
> > > > > > > This will also solve the situation many users encounter today.
> > > > > > > 1. The engine is on a host that actually has VLAN defined
> > > > > > > 2. The ovirtmgmt network was not updated in the DC
> > > > > > > 3. A host, with VLAN already defined is added - everything works
> > > > > > > fine
> > > > > > > 4. Any number of hosts are now added, again everything seems to
> > > > > > > work fine.
> > > > > > >
> > > > > > > But, now try to use setupNetworks, and you'll find out that you
> > > > > > > can't do much on the interface that contains the ovirtmgmt since
> > > > > > > the definition does not match. You can't sync (since this will
> > > > > > > remove the VLAN and cause connectivity loss), you can't add more
> > > > > > > networks on top since it already has non-VLAN network on top
> > > > > > > according to the DC definition, etc.
> > > > > > >
> > > > > > > On the other hand you can't update the ovirtmgmt definition on the
> > > > > > > DC since there are clusters in the DC that use the network.
> > > > > > >
> > > > > > > The only workaround not involving DB hack to change the VLAN on
> > > > > > > the network is to:
> > > > > > > 1. Create new DC
> > > > > > > 2. Do not use the wizard that pops up to create your cluster.
> > > > > > > 3. Modify the ovirtmgmt network to have VLANs
> > > > > > > 4. Now create a cluster and add your hosts.
> > > > > > >
> > > > > > > If you insist on using the default DC and cluster then before
> > > > > > > adding the first host, create an additional DC and move the
> > > > > > > Default cluster over there. You may then change the network on the
> > > > > > > Default cluster and then move the Default cluster back
> > > > > > >
> > > > > > > Both are ugly. And should be solved by the proposal above.
> > > > > > >
> > > > > > > We do something similar for the Default cluster CPU level, where
> > > > > > > we set the initial level based on the first host added to the
> > > > > > > cluster.
> > > > > >
> > > > > > I'm not sure what Engine has for Default cluster CPU level. But I
> > > > > > have reservations about the hysteresis in your proposal - after a
> > > > > > host is added, the DC cannot forget ovirtmgmt's vlan.
> > > > > >
> > > > > > How about letting the admin edit ovirtmgmt's vlan in the DC level,
> > > > > > thus rendering all hosts out-of-sync. Then the admin could manually,
> > > > > > or through a script, or in the future through a distributed
> > > > > > operation, sync all the hosts to the definition?
> > > > >
> > > > > Usually if you do that you will lose connectivity to the hosts.
> > > >
> > > > Yes, changing the management vlan id (or ip address) is never fun, and
> > > > requires out-of-band intervention.
> > > >
> > > > > I'm not insisting on the automatic adjustment of the ovirtmgmt
> > > > > network to match the hosts' (that is just a nice touch) we can take
> > > > > the allow edit approach.
> > > > >
> > > > > But allowing the VLAN on the ovirtmgmt network to be changed will
> > > > > indeed solve the issue I'm trying to solve while creating another
> > > > > issue of user expecting that we'll be able to re-tag the host from
> > > > > the engine side, which is challenging to do.
> > > > >
> > > > > On the other hand, if we allow the VLAN to be changed as long as the
> > > > > change matches the hosts' configuration, it will solve the issue
> > > > > while not misleading the user into thinking that we really can solve
> > > > > the chicken and egg issue of re-tagging the entire system.
> > > > >
> > > > > Now with the above ability you do get a flow to do the re-tag.
> > > > > 1. Place all the hosts in maintenance
> > > > > 2. Re-tag the ovirtmgmt on all the hosts
> > > > > 3. Re-tag the host on which the engine runs
> > > > > 4. Activate the hosts - this should work well now since connectivity
> > > > > exists
> > > > > 5. Change the tag on ovirtmgmt on the engine to match the hosts'
> > > > >
> > > > > Simple and clear process.
> > > > >
> > > > > When the workaround of creating another DC was not possible since the
> > > > > system was already long in use and the need was a re-tag of the
> > > > > network, the above is what I've recommended, except that steps 4-5
> > > > > were done as:
> > > > > 4. Stop the engine
> > > > > 5. Change the tag in the DB
> > > > > 6. Start the engine
> > > > > 7. Activate the hosts
> > > >
> > > > Sounds reasonable to me - but as far as I am aware this is not tightly
> > > > related to the $Subject, which is the post-boot ovirtmgmt definition.
> > > >
> > > > I've added a few details to
> > > > http://www.ovirt.org/Features/Normalized_ovirtmgmt_Initialization#Engine
> > > > and I would appreciate a review from someone with intimate Engine
> > > > know-how.
> > > >
> > > > Dan.
> > > >
> > > _______________________________________________
> > > Arch mailing list
> > > [email protected]
> > > http://lists.ovirt.org/mailman/listinfo/arch
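
For reference, the setupNetworks example quoted in the original proposal
(bond4.3000 holds the host's IP and uses dhcp, bond4 comprises eth2 and
eth3, ovirtmgmt is a VM network with MTU 9000) could be derived roughly as
in the Python sketch below. This is only an illustration under loud
assumptions: the layout of the `caps` and `dc_network` dictionaries, the
helper names, and the `bootproto` key are made up for this sketch and are
not the real getVdsCaps schema or Engine code; 192.0.2.10 is a
documentation placeholder address.

```python
def find_mgmt_device(caps, mgmt_ip):
    """Return the name of the reported device that holds the management IP."""
    for name, dev in caps["devices"].items():
        if dev.get("addr") == mgmt_ip:
            return name
    raise LookupError("no reported device holds %s" % mgmt_ip)


def build_ovirtmgmt_request(caps, dc_network, mgmt_ip):
    """Combine the host's reported topology with the DC definition."""
    name = find_mgmt_device(caps, mgmt_ip)
    dev = caps["devices"][name]
    request = {
        # Taken from the DC definition of ovirtmgmt (hypothetical keys).
        "bridged": dc_network["vm_network"],
        "mtu": dc_network["mtu"],
        # Taken from the device that currently holds the IP.
        "bootproto": dev.get("bootproto", "none"),
    }
    if dev["type"] == "vlan":
        # Keep the host's existing tag, then continue with the base device.
        request["vlan"] = dev["vlan_id"]
        name = dev["base"]
        dev = caps["devices"][name]
    if dev["type"] == "bond":
        request["bonding"] = {name: sorted(dev["slaves"])}
    request["iface"] = name
    return {"ovirtmgmt": request}


# Hypothetical report matching the example in the proposal.
caps = {
    "devices": {
        "eth2": {"type": "nic"},
        "eth3": {"type": "nic"},
        "bond4": {"type": "bond", "slaves": ["eth2", "eth3"]},
        "bond4.3000": {"type": "vlan", "base": "bond4", "vlan_id": 3000,
                       "addr": "192.0.2.10", "bootproto": "dhcp"},
    }
}
dc_network = {"vm_network": True, "mtu": 9000}

request = build_ovirtmgmt_request(caps, dc_network, "192.0.2.10")
print(request)
```

The sketch takes the VLAN tag and bonding from what the host reports and
the bridged/MTU settings from the DC definition, which is the split the
proposal describes; how a mismatch between the two should be handled is the
open question discussed in the thread and is not addressed here.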
