Your suggestion requires transparently passing extra_specs to ironic, which is something the nova team has objected to for quite some time.

On Mon, Oct 23, 2017 at 4:09 PM, Eric Fried <openst...@fried.cc> wrote:

> We discussed this a little bit further in IRC [1]. We're all in
> agreement, but it's worth being precise on a couple of points:
>
> * We're distinguishing between a "feature" and the "trait" that
>   represents it in placement. For the sake of this discussion, a
>   "feature" can (maybe) be switched on or off, but a "trait" can either be
>   present or absent on a RP.
> * It matters *who* can turn a feature on/off.
>   * If it can be done by virt at spawn time, then it makes sense to have
>     the trait on the RP, and you can switch the feature on/off via a
>     separate extra_spec.
>   * But if it's e.g. an admin action, and spawn has no control, then the
>     trait needs to be *added* whenever the feature is *on*, and *removed*
>     whenever the feature is *off*.
>
> [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-23.log.html#t2017-10-23T13:12:13
>
> On 10/23/2017 08:15 AM, Sylvain Bauza wrote:
> >
> > On Mon, Oct 23, 2017 at 2:54 PM, Eric Fried <openst...@fried.cc> wrote:
> >
> >     I agree with Sean. In general terms:
> >
> >     * A resource provider should be marked with a trait if that feature
> >       * Can be turned on or off (whether it's currently on or not); or
> >       * Is always on and can't ever be turned off.
> >
> > No, traits are not boolean. If a resource provider stops providing a
> > capability, then the existing related trait should just be removed,
> > that's it.
> > If you see a trait, that just means that the related capability for
> > the Resource Provider is supported, that's it too.
> >
> > MHO.
> >
> > -Sylvain
> >
> >     * A consumer wanting that feature present (doesn't matter whether it's
> >       on or off) should specify it as a required *trait*.
> >     * A consumer wanting that feature present and turned on should
> >       * Specify it as a required trait; AND
> >       * Indicate that it be turned on via some other mechanism (e.g. a
> >         separate extra_spec).
> >
> >     I believe this satisfies Dmitry's (Ironic's) needs, but also Jay's drive
> >     for placement purity.
> >
> >     Please invite me to the hangout or whatever.
> >
> >     Thanks,
> >     Eric
> >
> >     On 10/23/2017 07:22 AM, Mooney, Sean K wrote:
> > >
> > > *From:* Jay Pipes [mailto:jaypi...@gmail.com]
> > > *Sent:* Monday, October 23, 2017 12:20 PM
> > > *To:* OpenStack Development Mailing List <openstack-dev@lists.openstack.org>
> > > *Subject:* Re: [openstack-dev] [ironic] ironic and traits
> > >
> > > Writing from my phone... May I ask that before you proceed with any plan
> > > that uses traits for state information that we have a hangout or
> > > videoconference to discuss this? Unfortunately today and tomorrow I'm
> > > not able to do a hangout but I can do one on Wednesday any time of the day.
> > >
> > > [Mooney, Sean K] On the UEFI boot topic, I did bring up at the PTG that
> > > we wanted to standardize traits for “verified boot” that included a
> > > trait for UEFI secure boot enabled and to indicate a hardware root of
> > > trust, e.g. Intel Boot Guard or similar. We distinctly wanted to be able
> > > to tag nova compute hosts with those new traits so we could require that
> > > VMs that request a host with UEFI secure boot enabled and a hardware
> > > root of trust are scheduled only to those nodes.
> > >
> > > There are many other examples that affect both VMs and bare metal, such
> > > as ECC/interleaved memory, cluster on die, L3 cache code and data
> > > prioritization, VT-d/VT-c, HPET, hyper-threading, power states … all of
> > > these features may be present on the platform, but I also need to know
> > > if they are turned on. Ruling out state in traits means all of this
> > > logic will eventually get pushed to scheduler filters, which will be
> > > suboptimal long term as more state is tracked. Software defined
> > > infrastructure may be the future but hardware defined software is sadly
> > > the present…
> > >
> > > I do however think there should be a separation between asking for a
> > > host that provides x with a trait and asking for x to be configured via
> > > a trait. The trait secure_boot_enabled should never result in the
> > > feature being enabled. It should just find a host with it on. If you
> > > want to request it to be turned on, you would request a host with
> > > secure_boot_capable as a trait and have a flavor extra spec or image
> > > property to request Ironic to enable it. These are two very different
> > > requests and should not be treated the same.
> > >
> > > Lemme know!
> > >
> > > -jay
> > >
> > > On Oct 23, 2017 5:01 AM, "Dmitry Tantsur" <dtant...@redhat.com> wrote:
> > >
> > >     Hi Jay!
> > >
> > >     I appreciate your comments, but I think you're approaching the
> > >     problem from purely VM point of view. Things simply don't work the
> > >     same way in bare metal, at least not if we want to provide the same
> > >     user experience.
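
Sean's distinction earlier in the thread, between asking for a host that already has a feature on and asking for the feature to be switched on, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Request` and `Node` classes, the `CUSTOM_SECURE_BOOT_*` trait names and the `secure_boot` extra spec key are all hypothetical and are not nova or ironic APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical bare metal node with the traits reported for it."""
    name: str
    traits: set


@dataclass
class Request:
    """Hypothetical scheduling request.

    required_traits is used only to find a host; extra_specs is a
    separate channel asking the deploy step to switch something on.
    """
    required_traits: set = field(default_factory=set)
    extra_specs: dict = field(default_factory=dict)


def schedule(request, nodes):
    """Select nodes purely on traits; extra specs play no part here."""
    return [n.name for n in nodes if request.required_traits <= n.traits]


def features_to_enable(request):
    """Features the deploy step is asked to turn on (not a trait lookup)."""
    return sorted(k for k, v in request.extra_specs.items() if v == "true")


nodes = [
    Node("a", {"CUSTOM_SECURE_BOOT_CAPABLE"}),
    Node("b", {"CUSTOM_SECURE_BOOT_CAPABLE", "CUSTOM_SECURE_BOOT_ENABLED"}),
    Node("c", set()),
]

# "Find me a host where it is already on": trait only, nothing is configured.
find_on = Request(required_traits={"CUSTOM_SECURE_BOOT_ENABLED"})

# "Find me a host that can do it, and turn it on": trait plus extra spec.
turn_on = Request(
    required_traits={"CUSTOM_SECURE_BOOT_CAPABLE"},
    extra_specs={"secure_boot": "true"},
)
```

In this sketch the first request never causes any configuration change, while the second one selects on the capability and carries the "please enable" part separately, which is the separation Sean argues for.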
> > >
> > >     On Sun, Oct 22, 2017 at 2:25 PM, Jay Pipes <jaypi...@gmail.com> wrote:
> > >
> > >         Sorry for delay, took a week off before starting a new job.
> > >         Comments inline.
> > >
> > >         On 10/16/2017 12:24 PM, Dmitry Tantsur wrote:
> > >
> > >             Hi all,
> > >
> > >             I promised John to dump my thoughts on traits to the ML, so
> > >             here we go :)
> > >
> > >             I see two roles of traits (or kinds of traits) for bare metal:
> > >             1. traits that say what the node can do already (e.g. "the
> > >                node is doing UEFI boot")
> > >             2. traits that say what the node can be *configured* to do
> > >                (e.g. "the node can boot in UEFI mode")
> > >
> > >         There's only one role for traits. #2 above. #1 is state
> > >         information. Traits are not for state information. Traits are
> > >         only for communicating capabilities of a resource provider
> > >         (baremetal node).
> > >
> > >     These are not different, that's what I'm talking about here. No
> > >     users care about the difference between "this node was put in UEFI
> > >     mode by an operator in advance", "this node was put in UEFI mode by
> > >     an ironic driver on demand" and "this node is always in UEFI mode,
> > >     because it's AARCH64 and it does not have BIOS". These situations
> > >     produce the same result (the node is booted in UEFI mode), and thus
> > >     it's up to ironic to hide this difference.
> > >
> > >     My suggestion with traits is one way to do it, I'm not sure what you
> > >     suggest though.
> > >
> > >         For example, let's say we add the following to the os-traits
> > >         library [1]
> > >
> > >         * STORAGE_RAID_0
> > >         * STORAGE_RAID_1
> > >         * STORAGE_RAID_5
> > >         * STORAGE_RAID_6
> > >         * STORAGE_RAID_10
> > >
> > >         The Ironic administrator would add all RAID-related traits to
> > >         the baremetal nodes that had the *capability* of supporting that
> > >         particular RAID setup [2]
> > >
> > >         When provisioned, the baremetal node would either have RAID
> > >         configured in a certain level or not configured at all.
> > >
> > >         A very important note: the Placement API and Nova scheduler (or
> > >         future Ironic scheduler) doesn't care about this. At all. I know
> > >         it sounds like I'm being callous, but I'm not. Placement and
> > >         scheduling doesn't care about the state of things. It only cares
> > >         about the capabilities of target destinations. That's it.
> > >
> > >     Yes, because VMs always start with a clean state, and hypervisor is
> > >     there to ensure that. We don't have this luxury in ironic :) E.g.
> > >     our SNMP driver is not even aware of boot modes (or RAID, or BIOS
> > >     configuration), which does not mean that a node using it cannot be
> > >     in UEFI mode (have a RAID or BIOS pre-configured, etc, etc).
> > >
> > >             This seems confusing, but it's actually very useful. Say, I
> > >             have a flavor that requests UEFI boot via a trait. It will
> > >             match both the nodes that are already in UEFI mode, as well
> > >             as nodes that can be put in UEFI mode.
> > >
> > >         No :) It will only match nodes that have the UEFI capability.
> > >         The set of providers that have the ability to be booted via UEFI
> > >         is *always* a superset of the set of providers that *have been
> > >         booted via UEFI*. Placement and scheduling decisions only care
> > >         about that superset -- the providers with a particular capability.
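
Jay's model above, in which a required trait is matched only against capabilities and current state lives elsewhere, can be illustrated with a small sketch. The provider names, the `CUSTOM_BOOT_MODE_UEFI` trait name and the `current_boot_mode` mapping are assumptions made up for this example; this is not the Placement API.

```python
def providers_with_traits(provider_traits, required):
    """Return the names of providers whose trait set contains every required trait."""
    required = set(required)
    return {
        name
        for name, traits in provider_traits.items()
        if required <= set(traits)
    }


# Capability traits per provider (hypothetical data).
provider_traits = {
    "node-1": {"CUSTOM_BOOT_MODE_UEFI", "STORAGE_RAID_5"},
    "node-2": {"CUSTOM_BOOT_MODE_UEFI"},
    "node-3": {"STORAGE_RAID_5"},
}

# State is tracked separately from traits in this model (hypothetical data).
current_boot_mode = {"node-1": "uefi", "node-2": "bios", "node-3": "bios"}

capable = providers_with_traits(provider_traits, ["CUSTOM_BOOT_MODE_UEFI"])
booted = {name for name, mode in current_boot_mode.items() if mode == "uefi"}
```

Here `capable` contains node-1 and node-2 while `booted` contains only node-1, so the capable set is a superset of the booted set. That holds in this sample data because the traits were chosen to describe capabilities; Dmitry's objection in the reply that follows is precisely that some ironic drivers cannot change the state, so for them the two sets do not relate this simply.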
> > >
> > >     Well, no, it will. Again, you're purely basing this on the VM idea,
> > >     where a VM is always *put* in UEFI mode, no matter what the
> > >     hypervisor looks like. It is simply not the case for us. You have to
> > >     care what state the node is in, because many drivers cannot change
> > >     this state.
> > >
> > >             This idea goes further with deploy templates (new concept
> > >             we've been thinking about). A flavor can request something
> > >             like CUSTOM_RAID_5, and it will match the nodes that already
> > >             have RAID 5, or, more interestingly, the nodes on which we
> > >             can build RAID 5 before deployment. The UEFI example above
> > >             can be treated in a similar way.
> > >
> > >             This ends up with two sources of knowledge about traits in
> > >             ironic:
> > >             1. Operators setting something they know about hardware
> > >                ("this node is in UEFI mode"),
> > >             2. Ironic drivers reporting something they
> > >                2.1. know about hardware ("this node is in UEFI mode" -
> > >                     again)
> > >                2.2. can do about hardware ("I can put this node in UEFI
> > >                     mode")
> > >
> > >         You're correct that both pieces of information are important.
> > >         However, only the "can do about hardware" part is relevant to
> > >         Placement and Nova.
> > >
> > >             For case #1 we are planning on a new CRUD API to set/unset
> > >             traits for a node.
> > >
> > >         I would *strongly* advise against this. Traits are not for state
> > >         information.
> > >
> > >         Instead, consider having a DB (or JSON) schema that lists state
> > >         information in fields that are explicitly for that state
> > >         information.
> > >
> > >         For example, a schema that looks like this:
> > >
> > >         {
> > >           "boot": {
> > >             "mode": <one of 'bios' or 'uefi'>,
> > >             "params": <dict>
> > >           },
> > >           "disk": {
> > >             "raid": {
> > >               "level": <int>,
> > >               "controller": <one of 'sw' or 'hw'>,
> > >               "driver": <string>,
> > >               "params": <dict>
> > >             }, ...
> > >           },
> > >           "network": {
> > >             ...
> > >           }
> > >         }
> > >
> > >         etc, etc.
> > >
> > >         Don't use trait strings to represent state information.
> > >
> > >     I don't see an alternative proposal that will satisfy what we have
> > >     to solve.
> > >
> > >         Best,
> > >         -jay
> > >
> > >             Case #2 is more interesting. We have two options, I think:
> > >
> > >             a) Operators still set traits on nodes, drivers are simply
> > >             validating them. E.g. an operator sets CUSTOM_RAID_5, and
> > >             the node's RAID interface checks if it is possible to do.
> > >             The downside is obvious - with a lot of deploy templates
> > >             available it can be a lot of manual work.
> > >
> > >             b) Drivers report the traits, and they get somehow added to
> > >             the traits provided by an operator. Technically, there are
> > >             sub-cases again:
> > >               b.1) The new traits API returns a union of
> > >                    operator-provided and driver-provided traits
> > >               b.2) The new traits API returns only operator-provided
> > >                    traits; driver-provided traits are returned e.g. via
> > >                    a new field (node.driver_traits). Then nova will
> > >                    have to merge the lists itself.
> > >
> > >             My personal favorite is the last option: I'd like a clear
> > >             distinction between different "sources" of traits, but I'd
> > >             also like to reduce manual work for operators.
> > >
> > >             A valid counter-argument is: what if an operator wants to
> > >             override a driver-provided trait? E.g. a node can do RAID 5,
> > >             but I don't want this particular node to do it for any
> > >             reason.
> > >             I'm not sure if it's a valid case, and what to do about it.
> > >
> > >             Let me know what you think.
> > >
> > >             Dmitry
> > >
> > >         [1] http://git.openstack.org/cgit/openstack/os-traits/tree/
> > >         [2] Based on how many attached disks the node had, the presence
> > >             and abilities of a hardware RAID controller, etc
> > >
> > > __________________________________________________________________________
> > > OpenStack Development Mailing List (not for usage questions)
> > > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev