On 05/12/2013 11:15 AM, Barak Azulay wrote:
>
>
> ----- Original Message -----
>> From: "Livnat Peer" <[email protected]>
>> To: "Moti Asayag" <[email protected]>
>> Cc: "arch" <[email protected]>, "Alon Bar-Lev" <[email protected]>,
>> "Barak Azulay" <[email protected]>, "Simon Grinberg" <[email protected]>
>> Sent: Sunday, May 12, 2013 9:59:07 AM
>> Subject: Re: feature suggestion: initial generation of management network
>>
>> Thread Summary -
>>
>> 1. We all agree the automatic reboot after host installation is not
>> needed anymore and can be removed.
>>
>> 2. There is a vast agreement that we need to add a new VDSM verb for reboot.
>
> I disagree with the above
>
> In addition to the fact that it will not work when VDSM is not responsive
> (when this action will be needed the most)
>
You can fence the node if VDSM is non-responsive; that's the mechanism we
use today to deal with such cases.

>
>>
>> 3. There was a suggestion to add a checkbox when adding a host to reboot
>> the host after installation, default would be not to reboot. (leaving
>> the option to reboot to the administrator).
>>
>>
>> If there is no objection we'll go with the above.
>>
>> Thanks, Livnat
>>
>>
>> On 05/07/2013 02:22 PM, Moti Asayag wrote:
>>> I stumbled upon a few issues with the current design while implementing it:
>>>
>>> There seems to be a requirement to reboot the host after the installation
>>> is completed in order to assure the host is recoverable.
>>>
>>> Therefore, the building blocks of the installation process of 3.3 are:
>>> 1. host deploy, which installs the host except for configuring its
>>> management network.
>>> 2. SetupNetwork (and CommitNetworkChanges) - for creating the management
>>> network on the host and persisting the network configuration.
>>> 3. Reboot the host - This is a missing piece. (engine has FenceVds command,
>>> but it requires the power management to be configured prior to the
>>> installation and might be irrelevant for hosts without PM.)
>>>
>>> So, there are a couple of issues here:
>>> 1. How to reboot the host?
>>> 1.1. By exposing new RebootNode verb in VDSM and invoking it from the
>>> engine
>>> 1.2. By opening ssh dialog to the host in order to execute the reboot
>>>
>>> 2. When to perform the reboot?
>>> 2.1. After host deploy, by utilizing the host deploy to perform the reboot.
>>> It requires the monitor to configure the network when the host is
>>> detected by the engine, detached from the installation flow. However,
>>> it is a step toward the non-persistent network feature yet to be defined.
>>> 2.2. After setupNetwork is done and network was configured and persisted on
>>> the host.
>>> There is no special advantage from the recoverability aspect, as
>>> setupNetwork is constantly used to persist the network configuration
>>> (by the complementary CommitNetworkChanges command).
>>> In case network configuration fails, VDSM will revert to the last
>>> well known configuration - so connectivity with engine should be
>>> restored. Design wise, it fits to configure the management network
>>> as part of the installation sequence.
>>> If the network configuration fails in this context, the host status will be
>>> set to "InstallFailed" rather than "NonOperational",
>>> as might occur as a result of a failed setupNetwork command.
>>>
>>>
>>> Your inputs are welcome.
>>>
>>> Thanks,
>>> Moti
>>> ----- Original Message -----
>>>> From: "Dan Kenigsberg" <[email protected]>
>>>> To: "Simon Grinberg" <[email protected]>, "Moti Asayag" <[email protected]>
>>>> Cc: "arch" <[email protected]>
>>>> Sent: Tuesday, January 1, 2013 2:47:57 PM
>>>> Subject: Re: feature suggestion: initial generation of management network
>>>>
>>>> On Thu, Dec 27, 2012 at 07:36:40AM -0500, Simon Grinberg wrote:
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Dan Kenigsberg" <[email protected]>
>>>>>> To: "Simon Grinberg" <[email protected]>
>>>>>> Cc: "arch" <[email protected]>
>>>>>> Sent: Thursday, December 27, 2012 2:14:06 PM
>>>>>> Subject: Re: feature suggestion: initial generation of management network
>>>>>>
>>>>>> On Tue, Dec 25, 2012 at 09:29:26AM -0500, Simon Grinberg wrote:
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: "Dan Kenigsberg" <[email protected]>
>>>>>>>> To: "arch" <[email protected]>
>>>>>>>> Sent: Tuesday, December 25, 2012 2:27:22 PM
>>>>>>>> Subject: feature suggestion: initial generation of management network
>>>>>>>>
>>>>>>>> Current condition:
>>>>>>>> ==================
>>>>>>>> The management network, named ovirtmgmt, is created during host
>>>>>>>> bootstrap.
>>>>>>>> It consists of a bridge device, connected to the network device
>>>>>>>> that was used to communicate with Engine (nic, bonding or vlan).
>>>>>>>> It inherits its ip settings from the latter device.
>>>>>>>>
>>>>>>>> Why Is the Management Network Needed?
>>>>>>>> =====================================
>>>>>>>> Understandably, some may ask why we need to have a management
>>>>>>>> network - why having a host with IPv4 configured on it is not
>>>>>>>> enough.
>>>>>>>> The answer is twofold:
>>>>>>>> 1. In oVirt, a network is an abstraction of the resources required
>>>>>>>> for connectivity of a host for a specific usage. This is true for
>>>>>>>> the management network just as it is for VM network or a display
>>>>>>>> network. The network entity is the key for adding/changing nics
>>>>>>>> and IP address.
>>>>>>>> 2. On many occasions (such as small setups) the management network
>>>>>>>> is used as a VM/display network as well.
>>>>>>>>
>>>>>>>> Problems in current connectivity:
>>>>>>>> =================================
>>>>>>>> According to alonbl of ovirt-host-deploy fame, and with no conflict
>>>>>>>> to my own experience, creating the management network is the most
>>>>>>>> fragile, error-prone step of bootstrap.
>>>>>>>
>>>>>>> +1,
>>>>>>> I've raised that repeatedly in the past: bootstrap should not create
>>>>>>> the management network but pick up the existing configuration and
>>>>>>> let the engine override later with its own configuration if it
>>>>>>> differs. I'm glad that we finally get to that.
>>>>>>>
>>>>>>>>
>>>>>>>> Currently it always creates a bridged network (even if the DC
>>>>>>>> requires a non-bridged ovirtmgmt), it knows nothing about the
>>>>>>>> defined MTU for ovirtmgmt, it uses ping to guess on top of which
>>>>>>>> device to build (and thus requires Vdsm-to-Engine reverse
>>>>>>>> connectivity), and is the sole remaining user of the
>>>>>>>> addNetwork/vdsm-store-net-conf scripts.
>>>>>>>>
>>>>>>>> Suggested feature:
>>>>>>>> ==================
>>>>>>>> Bootstrap would avoid creating a management network. Instead, after
>>>>>>>> bootstrapping a host, Engine would send a getVdsCaps probe to the
>>>>>>>> installed host, receiving a complete picture of the network
>>>>>>>> configuration on the host. Among this picture is the device that
>>>>>>>> holds the host's management IP address.
>>>>>>>>
>>>>>>>> Engine would send setupNetwork command to generate ovirtmgmt with
>>>>>>>> details devised from this picture, and according to the DC
>>>>>>>> definition of ovirtmgmt. For example, if Vdsm reports:
>>>>>>>>
>>>>>>>> - vlan bond4.3000 has the host's IP, configured to use dhcp.
>>>>>>>> - bond4 comprises eth2 and eth3
>>>>>>>> - ovirtmgmt is defined as a VM network with MTU 9000
>>>>>>>>
>>>>>>>> then Engine sends the likes of:
>>>>>>>> setupNetworks(ovirtmgmt: {bridged=True, vlan=3000, iface=bond4,
>>>>>>>> bonding=bond4: {eth2,eth3}, MTU=9000})
>>>>>>>
>>>>>>> Just one comment here,
>>>>>>> In order to save time and confusion - if the ovirtmgmt is defined
>>>>>>> with default values, meaning the user did not bother to touch it,
>>>>>>> let it pick up the VLAN configuration from the first host added in
>>>>>>> the Data Center.
>>>>>>>
>>>>>>> Otherwise, you may override the host VLAN and lose connectivity.
>>>>>>>
>>>>>>> This will also solve the situation many users encounter today.
>>>>>>> 1. The engine is on a host that actually has VLAN defined
>>>>>>> 2. The ovirtmgmt network was not updated in the DC
>>>>>>> 3. A host with VLAN already defined is added - everything works
>>>>>>> fine
>>>>>>> 4. Any number of hosts are now added, again everything seems to
>>>>>>> work fine.
>>>>>>>
>>>>>>> But, now try to use setupNetworks, and you'll find out that you
>>>>>>> can't do much on the interface that contains the ovirtmgmt since
>>>>>>> the definition does not match. You can't sync (since this will
>>>>>>> remove the VLAN and cause connectivity loss), you can't add more
>>>>>>> networks on top since it already has non-VLAN network on top
>>>>>>> according to the DC definition, etc.
>>>>>>>
>>>>>>> On the other hand you can't update the ovirtmgmt definition on the
>>>>>>> DC since there are clusters in the DC that use the network.
>>>>>>>
>>>>>>> The only workaround not involving DB hack to change the VLAN on the
>>>>>>> network is to:
>>>>>>> 1. Create new DC
>>>>>>> 2. Do not use the wizard that pops up to create your cluster.
>>>>>>> 3. Modify the ovirtmgmt network to have VLANs
>>>>>>> 4. Now create a cluster and add your hosts.
>>>>>>>
>>>>>>> If you insist on using the default DC and cluster then before
>>>>>>> adding the first host, create an additional DC and move the
>>>>>>> Default cluster over there. You may then change the network on the
>>>>>>> Default cluster and then move the Default cluster back
>>>>>>>
>>>>>>> Both are ugly. And should be solved by the proposal above.
>>>>>>>
>>>>>>> We do something similar for the Default cluster CPU level, where we
>>>>>>> set the initial level based on the first host added to the cluster.
>>>>>>
>>>>>> I'm not sure what Engine has for Default cluster CPU level. But I
>>>>>> have a reservation about the hysteresis in your proposal - after a
>>>>>> host is added, the DC cannot forget ovirtmgmt's vlan.
>>>>>>
>>>>>> How about letting the admin edit ovirtmgmt's vlan in the DC level,
>>>>>> thus rendering all hosts out-of-sync.
>>>>>> Then the admin could manually, or through a script, or in the
>>>>>> future through a distributed operation, sync all the hosts to the
>>>>>> definition?
>>>>>
>>>>> Usually if you do that you will lose connectivity to the hosts.
>>>>
>>>> Yes, changing the management vlan id (or ip address) is never fun, and
>>>> requires out-of-band intervention.
>>>>
>>>>> I'm not insisting on the automatic adjustment of the ovirtmgmt network to
>>>>> match the hosts' (that is just a nice touch); we can take the allow-edit
>>>>> approach.
>>>>>
>>>>> But allowing a VLAN change on the ovirtmgmt network will indeed solve the
>>>>> issue I'm trying to solve while creating another issue of users expecting
>>>>> that we'll be able to re-tag the host from the engine side, which is
>>>>> challenging to do.
>>>>>
>>>>> On the other hand, if we allow changing the VLAN as long as the change
>>>>> matches the hosts' configuration, it will solve the issue while not
>>>>> misleading the user into thinking that we really can solve the chicken
>>>>> and egg issue of re-tagging the entire system.
>>>>>
>>>>> Now with the above ability you do get a flow to do the re-tag.
>>>>> 1. Place all the hosts in maintenance
>>>>> 2. Re-tag the ovirtmgmt on all the hosts
>>>>> 3. Re-tag the host on which the engine is running
>>>>> 4. Activate the hosts - this should work well now since connectivity
>>>>> exists
>>>>> 5. Change the tag on ovirtmgmt on the engine to match the hosts'
>>>>>
>>>>> Simple and clear process.
>>>>>
>>>>> When the workaround of creating another DC was not possible since the
>>>>> system was already long in use and the need was to re-tag the network,
>>>>> the above is what I've recommended, except that steps 4-5 were done
>>>>> as:
>>>>> 4. Stop the engine
>>>>> 5. Change the tag in the DB
>>>>> 6. Start the engine
>>>>> 7. Activate the hosts
>>>>
>>>> Sounds reasonable to me - but as far as I am aware this is not tightly
>>>> related to the $Subject, which is the post-boot ovirtmgmt definition.
>>>>
>>>> I've added a few details to
>>>> http://www.ovirt.org/Features/Normalized_ovirtmgmt_Initialization#Engine
>>>> and I would appreciate a review from someone with intimate Engine
>>>> know-how.
>>>>
>>>> Dan.
>>>>
>>> _______________________________________________
>>> Arch mailing list
>>> [email protected]
>>> http://lists.ovirt.org/mailman/listinfo/arch
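
For reference, the derivation in the original proposal quoted above (find
the device holding the management IP, then combine it with the DC
definition of ovirtmgmt) could be sketched roughly as follows. This is only
an illustration under stated assumptions: the layout of the `devices`
report, the `dc_definition` keys and the function name
`build_ovirtmgmt_request` are all hypothetical and are not the real
getVdsCaps output or the real setupNetworks signature. The VLAN is taken
from the host rather than from the DC, so the request does not override a
tag the host already depends on.

```python
def build_ovirtmgmt_request(devices, mgmt_ip, dc_definition):
    """Derive setupNetworks-style arguments for ovirtmgmt.

    devices: hypothetical, simplified capabilities report (name -> attributes).
    mgmt_ip: the address Engine used to reach the host.
    dc_definition: hypothetical DC-level definition of ovirtmgmt.
    """
    # Find the device that currently holds the management address.
    # Raises StopIteration if no reported device carries that address.
    name = next(n for n, d in devices.items() if d.get("addr") == mgmt_ip)
    dev = devices[name]

    network = {
        "bridged": dc_definition["vm_network"],  # VM network => bridged
        "mtu": dc_definition["mtu"],
        "bootproto": dev.get("bootproto", "none"),
    }

    # If the address sits on a vlan device, keep the host's tag and
    # build on top of the vlan's parent device.
    base = name
    if dev["type"] == "vlan":
        network["vlan"] = dev["vlan_id"]
        base = dev["parent"]

    # The underlying device is either a bond (with its slaves) or a plain nic.
    bondings = {}
    if devices[base]["type"] == "bond":
        network["bonding"] = base
        bondings[base] = {"nics": sorted(devices[base]["slaves"])}
    else:
        network["nic"] = base

    return {"networks": {"ovirtmgmt": network}, "bondings": bondings}


# The example from the proposal: bond4.3000 holds the IP (dhcp),
# bond4 comprises eth2 and eth3, ovirtmgmt is a VM network with MTU 9000.
report = {
    "bond4.3000": {"type": "vlan", "parent": "bond4", "vlan_id": 3000,
                   "addr": "192.0.2.10", "bootproto": "dhcp"},
    "bond4": {"type": "bond", "slaves": ["eth2", "eth3"]},
    "eth2": {"type": "nic"},
    "eth3": {"type": "nic"},
}
request = build_ovirtmgmt_request(
    report, "192.0.2.10", {"vm_network": True, "mtu": 9000})
print(request)
```

The address 192.0.2.10 is a documentation placeholder, not a value from the
thread.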
