Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2014/05/02 15:27, Tzu-Mainn Chen wrote:
> Hi,
>
> In parallel to Jarda's updated wireframes, and based on various discussions
> over the past weeks, here are the updated Tuskar requirements for Icehouse:
> https://wiki.openstack.org/wiki/TripleO/TuskarIcehouseRequirements
>
> Any feedback is appreciated. Thanks!
>
> Tzu-Mainn Chen

+1 looks good to me!

-- Jarda

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Hi,

In parallel to Jarda's updated wireframes, and based on various discussions over the past weeks, here are the updated Tuskar requirements for Icehouse:
https://wiki.openstack.org/wiki/TripleO/TuskarIcehouseRequirements

Any feedback is appreciated. Thanks!

Tzu-Mainn Chen
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 11/12/13 21:42, Robert Collins wrote:
> On 12 December 2013 01:17, Jaromir Coufal wrote:
>> On 2013/10/12 23:09, Robert Collins wrote:
[snip]
>>> Thats speculation. We don't know if they will or will not because we
>>> haven't given them a working system to test.
>>
>> Some part of that is speculation, some part of that is feedback from people
>> who are doing deployments (of course its just very limited audience).
>> Anyway, it is not just pure theory.
>
> Sure. Let be me more precise. There is a hypothesis that lack of
> direct control will be a significant adoption blocker for a primary
> group of users.

I'm sorry for butting in, but I think I can see where your disagreement comes from, and maybe explaining it will help resolve it.

It's not a hypothesis, but a well-documented and researched fact, that transparency has a huge impact on the ease of use of any information artifact. In particular, the easier you can see what is actually happening and how your actions affect the outcome, the faster you can learn to use it, and the more efficient you are in using it and resolving any problems with it. It's no surprise that "closeness of mapping" and "hidden dependencies" are two important cognitive dimensions that are often measured when assessing the usability of an artifact. Humans simply find it nice when they can tell what is happening, even if theoretically they don't need that knowledge when everything works correctly.

This doesn't come from any direct requirements of Tuskar itself, and I am sure that all the workarounds that Robert gave will work somehow in every real-world problem that arises. But the whole will not necessarily be easy or pleasant to learn and use.

I am aware that the requirement to be able to see what is happening is a fundamental problem, because it destroys one of the most important rules in system engineering -- separation of concerns. The parts in the upper layers should simply not care how the parts in the lower layers do their jobs, as long as they work properly.

I know that it is a kind of tradition in Open Source software to create software with the assumption that it's enough for it to do its job, and that if every use case can be somehow done, directly or indirectly, then it's good enough. We have a lot of working tools designed with this principle in mind, such as CSS, autotools or our favorite git. They do their job, and they do it well (except when they break horribly). But I think we can put a little bit more effort into also ensuring that the common use cases are not just doable, but also easy to implement and maintain. And that means that we will sometimes have a requirement that comes from how people think, and not from any particular technical need.

I know that it sounds like speculation, or theory, but I think we need to trust Jarda's experience with usability and his judgement about what works better -- unless of course we are willing to learn all that ourselves, which may take quite some time. What is the point of having an expert, if we know better, after all?

-- Radomir Dopieralski
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 13/12/13 09:41 -0500, Jay Dobies wrote:
>>> * ability to 'preview' changes going to the scheduler
>>
>> What does this give you? How detailed a preview do you need? What
>> information is critical there? Have you seen the proposed designs for a
>> heat template preview feature - would that be sufficient?
>
> Will will probably have a better answer to this, but I feel like at the very
> least this goes back to the psychology point raised earlier (I think in this
> thread, but if not, definitely one of the TripleO ones).
>
> A weird parallel is whenever I do a new install of Fedora. I never accept
> their default disk partitioning without electing to review/modify it. Even
> if I didn't expect to change anything, I want to see what they are going to
> give me. And then I compulsively review the summary of what actual changes
> will be applied in the follow-up screen that's displayed after I say I'm
> happy with the layout.
>
> Perhaps that's more a commentary on my own OCD and cynicism, that I feel
> dirty accepting the magic defaults blindly. I love the idea of anaconda
> doing the heavy lifting of figuring out sane defaults for home/root/swap and
> so on (similarly, I love the idea of the Nova scheduler rationing out where
> instances are deployed), but I at least want to know I've seen it before it
> happens.
>
> I fully admit to not knowing how common that sort of thing is. I suspect I'm
> in the majority of geeks and tame by sysadmin standards, but I honestly
> don't know. So I acknowledge that my entire argument for the preview here is
> based on my own personality.

Jay, I mirror your sentiments exactly here; the Fedora example is a good one, and is more so the case when it comes to node allocation/details and proposed changes in a deployment scenario. Though 9/10 times the defaults the Nova scheduler chooses will be fine, there's a 'human' need to review them, changing as necessary.

-will
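The preview Will and Jay are asking for is essentially a dry run of the scheduler's decision. A minimal sketch follows; all names here (`propose_allocation`, the node/role shapes) are invented for illustration and are not a Nova or Tuskar API:

```python
# Sketch of a "preview before apply" step: compute the proposed
# role -> node mapping without touching anything, show it to the
# operator, and only then apply it. Hypothetical data shapes.

def propose_allocation(nodes, roles):
    """Dry-run: pair each requested role instance with a free node,
    returning the proposed plan instead of acting on it."""
    free = list(nodes)
    plan = []
    for role, count in roles.items():
        for _ in range(count):
            if not free:
                raise RuntimeError("not enough free nodes for role %s" % role)
            plan.append((role, free.pop(0)))
    return plan

# The operator reviews this (the Fedora-partitioning moment), then
# either accepts the defaults or adjusts and re-previews.
nodes = ["node-%d" % i for i in range(4)]
plan = propose_allocation(nodes, {"control": 1, "compute": 2})
for role, node in plan:
    print("%s -> %s" % (role, node))
```

The point is not the trivial assignment logic but the shape of the interaction: the same planning code backs both the preview screen and the apply step, so what you review is exactly what gets deployed.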
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 13/12/13 19:06 +1300, Robert Collins wrote:
> On 13 December 2013 06:24, Will Foster wrote:
>> I just wanted to add a few thoughts:
>
> Thank you!
>
>> For some comparative information here "from the field", I work
>> extensively on deployments of large OpenStack implementations, most
>> recently with a ~220 node / 9 rack deployment (scaling up to 42 racks /
>> 1024 nodes soon). My primary role is of a DevOps/sysadmin nature, and not a
>> specific development area, so rapid provisioning/tooling/automation is an
>> area I almost exclusively work within (mostly API-driven using
>> Foreman/Puppet). The infrastructure our small team designs/builds supports
>> our development and business.
>>
>> I am the target user base you'd probably want to cater to.
>
> Absolutely!
>
>> I can tell you the philosophy and mechanics of Tuskar/OOO are great,
>> something I'd love to start using extensively, but there are some needed
>> aspects in the areas of control that I feel should be added (though
>> arguably less for me and more for my ilk who are looking to expand their
>> OpenStack footprint).
>>
>> * ability to 'preview' changes going to the scheduler
>
> What does this give you? How detailed a preview do you need? What
> information is critical there? Have you seen the proposed designs for a heat
> template preview feature - would that be sufficient?

Thanks for the reply. Preview-wise it'd be useful to see node allocation prior to deployment - nothing too in-depth. I have not seen the heat template preview features; are you referring to the YAML templating[1] or something else[2]? I'd like to learn more.

[1] - http://docs.openstack.org/developer/heat/template_guide/hot_guide.html
[2] - https://github.com/openstack/heat-templates

>> * ability to override/change some aspects within node assignment
>
> What would this be used to do? How often do those situations turn up? What's
> the impact if you can't do that?
One scenario might be that autodiscovery does not pick up an available node in your pool of resources, or detects it incorrectly - you could manually change things as you like. Another (more common) scenario is that you don't have an isolated, flat network with which to deploy, and nodes are picked that you do not want included in the provisioning - you could remove those from the set of resources prior to launching overcloud creation. The impact would be that the tooling would seem inflexible to those lacking a thoughtfully prepared network/infrastructure; more commonly, in cases where the existing network design is too inflexible, the usefulness and quick/seamless provisioning benefits would fall short.

>> * ability to view at least minimal logging from within Tuskar UI
>
> Logging of what - the deployment engine? The heat event-log? Nova
> undercloud logs? Logs from the deployed instances? If it's not there in V1,
> but you can get, or already have, credentials for the [instances that hold
> the logs that you wanted], would that be a big adoption blocker, or just a
> nuisance?

Logging of the deployment engine status during the bootstrapping process initially, and some rudimentary node success/failure indication. It should be simplistic enough not to rival existing monitoring/log systems, but at least provide deployment logs as the overcloud is being built and a general node health 'check-in' that it's complete. Afterwards, as you mentioned, the logs are available on the deployed systems. Think of it as providing some basic written navigational signs for people crossing a small bridge before they get to the highway: there's continuity from start -> finish and a clear sense of what's occurring. From my perspective, absence of this type of verbosity may impede adoption by new users (who are used to this type of information with deployment tooling).
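Will's second override scenario above - removing nodes you do not want provisioned from the discovered set before launching overcloud creation - might look like this minimal sketch. The node dicts and field names are made up for illustration, not a Tuskar or Ironic schema:

```python
# Hypothetical discovered inventory; "network" marks where each node
# was found. On a non-isolated network, discovery can pick up machines
# that are not ours to deploy on.
discovered = [
    {"name": "node1", "network": "provisioning"},
    {"name": "node2", "network": "provisioning"},
    {"name": "node3", "network": "lab-shared"},  # not ours - must be excluded
]

def exclude(nodes, predicate):
    """Return only the nodes the operator actually wants in the pool;
    anything matching the predicate is dropped before deployment."""
    return [n for n in nodes if not predicate(n)]

pool = exclude(discovered, lambda n: n["network"] != "provisioning")
print([n["name"] for n in pool])
```

The same hook could serve the first scenario too: an operator adding a node that autodiscovery missed is just an append to `pool` before the plan is submitted.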
>> Here's the main reason - most new adopters of OpenStack/IaaS are going to
>> be running legacy/mixed hardware, and while they might have an initiative
>> to explore and invest, and even a decent budget, most of them are not going
>> to have completely identical hardware, isolated/flat networks and things
>> set aside in such a way that blind auto-discovery/deployment will just work
>> all the time.
>
> Thats great information (and something I reasonably well expected, to a
> degree). We have a hard dependency on no wildcard DHCP servers in the
> environment (or we can't deploy). Autodiscovery is something we don't have
> yet, but certainly debugging deployment failures is a very important use
> case, and one we need to improve both at the plumbing layer and in the
> stories around it in the UI.
>
>> There will be a need to sometimes adjust, and those coming from a more
>> vertically-scaling infrastructure (most large orgs.) will not have
>> 100% matching standards in place of vendor, machine spec and network
>> design, which may make Tuskar/OOO seem inflexible and 'one-way'. This may
>> just be a carry-over or fear of the old ways of deployment, but nonetheless
>> it is present.
>
> I'm not sure what you mean by matching standards here :). Ironic is designed
> to support extremely varied environments with arbitrary mixes of
> IPMI/drac/ilo/what-have-you, and abstract that away for us.
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Mon Dec 9 15:22:04 2013, Robert Collins wrote:
> On 9 December 2013 23:56, Jaromir Coufal wrote:
>>
>>> Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
>>> stats
>>
>> For registration it is just Management MAC address which is needed right?
>> Or does Ironic need also IP? I think that MAC address might be enough, we
>> can display IP in details of node later on.
>
> Ironic needs all the details I listed today. Management MAC is not
> currently used at all, but would be needed in future when we tackle
> IPMI IP managed by Neutron.

I think what happened here is that two separate things we need got conflated. We need the IP address of the management (IPMI) interface, for power control, etc. We also need the MAC of the host system (*not* its IPMI/management interface) for PXE to serve it the appropriate content.

-- Matt Wagner
Software Engineer, Red Hat
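Matt's distinction can be made concrete as data. The field names below are illustrative only, not Ironic's actual node schema: the management interface contributes an *IP* (for power control), while the host system's NICs contribute *MACs* (for PXE to match the machine at boot):

```python
# Sketch of what node registration needs to capture, per the thread:
# IPMI IP for power control, host NIC MACs for PXE, plus the
# disk/cpu/memory stats Robert listed. Field names are invented.
from dataclasses import dataclass, field

@dataclass
class NodeRegistration:
    ipmi_address: str                     # management interface IP: power on/off etc.
    host_nic_macs: list = field(default_factory=list)  # PXE matches on these
    cpus: int = 0
    memory_mb: int = 0
    local_gb: int = 0

n = NodeRegistration("10.0.0.5", ["52:54:00:aa:bb:cc"],
                     cpus=8, memory_mb=16384, local_gb=500)
# Two different interfaces: the IPMI card's own MAC is a third thing
# entirely, and (per Robert) unused until IPMI-IP-via-Neutron lands.
print(n.ipmi_address, n.host_nic_macs)
```

Keeping the two fields separate in the data model is exactly the de-conflation Matt is asking for.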
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
>> * ability to 'preview' changes going to the scheduler
>
> What does this give you? How detailed a preview do you need? What
> information is critical there? Have you seen the proposed designs for a heat
> template preview feature - would that be sufficient?

Will will probably have a better answer to this, but I feel like at the very least this goes back to the psychology point raised earlier (I think in this thread, but if not, definitely one of the TripleO ones).

A weird parallel is whenever I do a new install of Fedora. I never accept their default disk partitioning without electing to review/modify it. Even if I didn't expect to change anything, I want to see what they are going to give me. And then I compulsively review the summary of what actual changes will be applied in the follow-up screen that's displayed after I say I'm happy with the layout.

Perhaps that's more a commentary on my own OCD and cynicism, that I feel dirty accepting the magic defaults blindly. I love the idea of anaconda doing the heavy lifting of figuring out sane defaults for home/root/swap and so on (similarly, I love the idea of the Nova scheduler rationing out where instances are deployed), but I at least want to know I've seen it before it happens.

I fully admit to not knowing how common that sort of thing is. I suspect I'm in the majority of geeks and tame by sysadmin standards, but I honestly don't know. So I acknowledge that my entire argument for the preview here is based on my own personality.
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 13 December 2013 06:24, Will Foster wrote:
> I just wanted to add a few thoughts:

Thank you!

> For some comparative information here "from the field", I work
> extensively on deployments of large OpenStack implementations, most
> recently with a ~220 node / 9 rack deployment (scaling up to 42 racks /
> 1024 nodes soon). My primary role is of a DevOps/sysadmin nature, and not a
> specific development area, so rapid provisioning/tooling/automation is an
> area I almost exclusively work within (mostly API-driven using
> Foreman/Puppet). The infrastructure our small team designs/builds supports
> our development and business.
>
> I am the target user base you'd probably want to cater to.

Absolutely!

> I can tell you the philosophy and mechanics of Tuskar/OOO are great,
> something I'd love to start using extensively, but there are some needed
> aspects in the areas of control that I feel should be added (though arguably
> less for me and more for my ilk who are looking to expand their OpenStack
> footprint).
>
> * ability to 'preview' changes going to the scheduler

What does this give you? How detailed a preview do you need? What information is critical there? Have you seen the proposed designs for a heat template preview feature - would that be sufficient?

> * ability to override/change some aspects within node assignment

What would this be used to do? How often do those situations turn up? What's the impact if you can't do that?

> * ability to view at least minimal logging from within Tuskar UI

Logging of what - the deployment engine? The heat event-log? Nova undercloud logs? Logs from the deployed instances? If it's not there in V1, but you can get, or already have, credentials for the [instances that hold the logs that you wanted], would that be a big adoption blocker, or just a nuisance?

> Here's the main reason - most new adopters of OpenStack/IaaS are going to be
> running legacy/mixed hardware, and while they might have an initiative to
> explore and invest, and even a decent budget, most of them are not going to
> have completely identical hardware, isolated/flat networks and things set
> aside in such a way that blind auto-discovery/deployment will just work all
> the time.

That's great information (and something I reasonably well expected, to a degree). We have a hard dependency on no wildcard DHCP servers in the environment (or we can't deploy). Autodiscovery is something we don't have yet, but certainly debugging deployment failures is a very important use case, and one we need to improve both at the plumbing layer and in the stories around it in the UI.

> There will be a need to sometimes adjust, and those coming from a more
> vertically-scaling infrastructure (most large orgs.) will not have
> 100% matching standards in place of vendor, machine spec and network design,
> which may make Tuskar/OOO seem inflexible and 'one-way'. This may just be a
> carry-over or fear of the old ways of deployment, but nonetheless it is
> present.

I'm not sure what you mean by matching standards here :). Ironic is designed to support extremely varied environments with arbitrary mixes of IPMI/drac/ilo/what-have-you, and abstract that away for us.

From a network perspective I've been arguing the following:
- we need routable access to the mgmt cards
- if we don't have that (say there are 5 different mgmt domains with no routing between them), then we install 5 deployment layers (5 underclouds), which could be as small as one machine each
- within the machines that are served by one routable region of mgmt cards, we need no wildcard DHCP servers, for our DHCP server to serve PXE to the machines (for the PXE driver in Ironic)
- building a single-region overcloud from multiple undercloud regions will involve manually injecting well-known endpoints (such as the floating virtual IP for API endpoints) into some of the regions, but it's in principle straightforward to do and use with the plumbing layer today

> In my case, we're lucky enough to have dedicated, near-identical
> equipment and a flexible network design we've architected prior that
> makes Tuskar/OOO a great fit. Most people will not have this
> greenfield ability and will use what they have lying around initially,
> as to not make a big investment until familiarity and trust of
> something new is permeated.
>
> That said, I've been working with Jaromir Coufal on some UI mockups of
> Tuskar with some of this 'advanced' functionality included, and from
> my perspective it looks like something to consider pulling in sooner rather
> than later if you want to maximize the adoption of new users.

So, for Tuskar my short-term goals are to support RH in shipping a polished product while still architecting and building something sustainable and suitable for integration into the OpenStack ecosystem. (For instance, one of the requirements for integration is that we don't [significantly] overlap other projects - and thats why I've been pushing so hard on the do
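Robert's second network rule above - one deployment layer per routable region of management cards - reduces to a simple count. A toy sketch, with invented data shapes (pairs of card IP and routing domain; nothing here is an Ironic or Tuskar structure):

```python
# One undercloud per routable region of mgmt cards: if the management
# networks cannot route to each other, each isolated domain needs its
# own (possibly single-machine) deployment layer.

def underclouds_needed(mgmt_cards):
    """mgmt_cards: iterable of (card_ip, routing_domain) pairs.
    Each distinct routing domain needs its own undercloud."""
    return len({domain for _ip, domain in mgmt_cards})

cards = [
    ("10.1.0.2", "dom-a"), ("10.1.0.3", "dom-a"),
    ("10.2.0.2", "dom-b"),
    ("10.3.0.2", "dom-c"), ("10.3.0.9", "dom-c"),
]
print(underclouds_needed(cards))  # three unrouted domains -> three underclouds
```

This is the "5 mgmt domains with no routing between them -> 5 underclouds" arithmetic from the bullet list, nothing more.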
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 13 December 2013 10:05, Jay Dobies wrote:
>> Maybe this is a valid use case?
>
> You mention three specific nodes, but what you're describing is more likely
> three concepts:
> - Balanced Nodes
> - High Disk I/O Nodes
> - Low-End Appliance Nodes
>
> They may have one node in each, but I think your example of three nodes is
> potentially *too* simplified to be considered as proper sample size. I'd
> guess there are more than three in play commonly, in which case the concepts
> breakdown starts to be more appealing.
>
> I think the disk flavor in particular has quite a few use cases, especially
> until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the
> disk hotness") as hosting the data-intensive portions, but where I had
> previously been viewing that as manual allocation, it sounds like the
> approach is to properly categorize them for what they are and teach Nova
> how to use them.
>
> Robert - Please correct me if I misread any of what your intention was, I
> don't want to drive people down the wrong path if I'm misinterpreting
> anything.

You nailed it, no butchering involved at all!

-Rob

-- 
Robert Collins
Distinguished Technologist
HP Converged Cloud
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 13 December 2013 06:13, Keith Basil wrote:
> On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:
>>> My question is - can't we help them now? To enable users to use our app
>>> even when we don't have enough smartness to help them 'auto' way?
>>
>> I understand the question: but I can't answer it until we have *an*
>> example that is both real and not deliverable today. At the moment the
>> only one we know of is HA, and thats certainly an important feature on
>> the nova scheduled side, so doing manual control to deliver a future
>> automatic feature doesn't make a lot of sense to me. Crawl, walk, run.
>
> Maybe this is a valid use case?
>
> Cloud operator has several core service nodes of differing configuration
> types.
>
> [node1] <-- balanced mix of disk/cpu/ram for general core services
> [node2] <-- lots of disks for Ceilometer data storage
> [node3] <-- low-end "appliance like" box for a specialized/custom core
>             service (SIEM box for example)
>
> All nodes[1,2,3] are in the same deployment grouping ("core services"). As
> such, this is a heterogeneous deployment grouping. Heterogeneity in this
> case defined by differing roles and hardware configurations.
>
> This is a real use case.
>
> How do we handle this?

Ok, so node1 gets flavor A, node2 gets flavor B, node3 gets flavor C. We have three disk images: one with general core services on it (imageA), one with ceilometer backend storage (imageB), one with SIEM on it (imageC). And we have three service groups: one that binds imageA to {flavors: [FlavorA], count: 1}, one that binds imageB to {flavors: [FlavorB], count: 1}, and one that binds imageC to {flavors: [FlavorC], count: 1}.

That's doable by the plumbing today, without any bypass of the Nova scheduler. FlavorB might be the same as the flavor for gluster boxes for instance, in which case you'll get commonality - if one fails, we can schedule onto another.
-Rob

-- 
Robert Collins
Distinguished Technologist
HP Converged Cloud
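Robert's three bindings above, written out as data. The dict layout is ours, chosen purely to mirror his prose; it is not an actual Tuskar structure:

```python
# Three flavors, three images, three service groups: each group binds
# one image to {flavors, count}, exactly as in the worked example.
flavors = {"FlavorA": "balanced", "FlavorB": "many-disks", "FlavorC": "low-end"}

service_groups = [
    {"image": "imageA", "flavors": ["FlavorA"], "count": 1},  # general core services
    {"image": "imageB", "flavors": ["FlavorB"], "count": 1},  # ceilometer storage
    {"image": "imageC", "flavors": ["FlavorC"], "count": 1},  # SIEM appliance
]

def groups_using(flavor):
    """Which images are bound to a given flavor. If FlavorB is shared
    with (say) gluster boxes, any group bound to it can reschedule onto
    another node of that flavor when one fails - Robert's commonality point."""
    return [g["image"] for g in service_groups if flavor in g["flavors"]]

print(groups_using("FlavorB"))
```

The heterogeneous "core services" grouping is then just the union of the three groups; nothing bypasses the Nova scheduler, because each group only ever asks for nodes by flavor.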
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 13 December 2013 05:35, Keith Basil wrote:
> On Dec 10, 2013, at 5:09 PM, Robert Collins wrote:
>>>> unallocated | available | undeployed
>>>
>>> +1 unallocated
>>
>> I think available is most accurate, but undeployed works too. I really
>> don't like unallocated, sorry!
>
> Would "available" introduce/denote that the service is deployed
> and operational? It could lead to that confusion.

Jaromir suggested "free" in the other thread; I think that would work well and avoid the confusion with 'working service' that "available" has.

>> Brainstorming: role is something like 'KVM compute', but we may have
>> two sets of that role differing only in configuration. In a very
>> technical sense it's actually:
>>   image + configuration -> scaling group in Heat.
>> So perhaps:
>>   Role + Service group?
>> e.g. GPU KVM Hypervisor would be a service group, using the KVM
>> Compute role aka disk image.
>>
>> Or perhaps we should actually surface image all the way up:
>>   Image + Service group?
>> image = what things we build into the image
>> service group = what runtime configuration we're giving, including how
>> many machines we want in the group
>
> How about just leaving it as Resource Class? The things you've
> brainstormed about are in line with the original thinking around
> the resource class concept.
>
> role (assumes role specific image) +
> service/resource grouping +
> hardware that can provide that service/resource

So, Resource Class as originally communicated really is quite different to me, though obviously there is some overlap. I can drill into that if you want... however the implications of the words, and how folk can map from them back to the plumbing, is what really concerns me, so that's what I'll focus on here.
Specifically: Resource Class was focused on the resources being offered into the overcloud, but the image + (service config / service group / group config) idea applies to all the things we deploy equally - it's relevant to management instances and control plane instances, as well as Nova and Cinder. So the Resource part of it doesn't really fit. Using 'Class' is just jargon - I would expect it to be pretty impenetrable to non-programmers.

Ideally I think we want something that:
- has a fairly obvious mapping back to Nova/Heat terminology (e.g. if the concepts are the same, let's call them the same)
- doesn't overlap other terms unless they are compatible

For instance, Heat has a concept 'resourcegroup', where resource means 'the object that Heat has created and is managing' and the group refers to scaling to some N of them. This is what we will eventually back a particular image + config onto - that becomes one resourcegroup in Heat; using "resource class" to refer to that, when the resource referred to is the delivered service rather than the 'Instance's (the Nova baremetal instances we create through the resourcegroup), is going to cause significant confusion at minimum :)

-Rob

-- 
Robert Collins
Distinguished Technologist
HP Converged Cloud
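Robert's "image + configuration -> scaling group in Heat" mapping can be sketched as the shape of a Heat resource. This is a hand-written dict echoing an `OS::Heat::ResourceGroup` definition, not output generated by Heat, and the `metadata` key carrying the role config is only a guess at where such configuration might live:

```python
# One deployed "service group": N instances of one image + one runtime
# configuration, backed by a single Heat ResourceGroup.

def service_group(image, config, count):
    """Shape of the Heat resource the thread maps a service group onto."""
    return {
        "type": "OS::Heat::ResourceGroup",
        "properties": {
            "count": count,                      # how many machines in the group
            "resource_def": {
                "type": "OS::Nova::Server",
                "properties": {"image": image, "metadata": config},
            },
        },
    }

# e.g. the 'GPU KVM Hypervisor' service group: KVM Compute image plus
# a config placing its members in a (hypothetical) gpu host aggregate.
gpu_kvm = service_group("kvm-compute-image", {"host_aggregate": "gpu"}, 4)
print(gpu_kvm["properties"]["count"])
```

This also illustrates Robert's naming worry: the "resources" Heat groups here are the Nova servers, not the delivered service, so reusing "resource class" for the latter invites exactly the confusion he describes.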
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 12/12/2013 04:25 PM, Keith Basil wrote:
> On Dec 12, 2013, at 4:05 PM, Jay Dobies wrote:
>>> Maybe this is a valid use case?
>>>
>>> Cloud operator has several core service nodes of differing configuration
>>> types.
>>>
>>> [node1] <-- balanced mix of disk/cpu/ram for general core services
>>> [node2] <-- lots of disks for Ceilometer data storage
>>> [node3] <-- low-end "appliance like" box for a specialized/custom core
>>>             service (SIEM box for example)
>>>
>>> All nodes[1,2,3] are in the same deployment grouping ("core services").
>>> As such, this is a heterogeneous deployment grouping. Heterogeneity in
>>> this case defined by differing roles and hardware configurations.
>>>
>>> This is a real use case.
>>>
>>> How do we handle this?
>>
>> This is the sort of thing I had been concerned with, but I think this is
>> just a variation on Robert's GPU example. Rather than butcher it by
>> paraphrasing, I'll just include the relevant part:
>>
>> "The basic stuff we're talking about so far is just about saying each
>> role can run on some set of undercloud flavors. If that new bit of kit
>> has the same coarse metadata as other kit, Nova can't tell it apart.
>> So the way to solve the problem is:
>> - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
>> - b) teach Nova that there is a flavor that maps to the presence of
>>   that specialness, and
>> - c) teach Nova that other flavors may not map to that specialness
>>
>> then in Tuskar whatever Nova configuration is needed to use that GPU
>> is a special role ('GPU compute' for instance) and only that role
>> would be given that flavor to use. That special config is probably
>> being in a host aggregate, with an overcloud flavor that specifies
>> that aggregate, which means at the TripleO level we need to put the
>> aggregate in the config metadata for that role, and the admin does a
>> one-time setup in the Nova Horizon UI to configure their GPU compute
>> flavor."
>
> Yes, the core services example is a variation on the above. The idea of
> _undercloud_ flavor assignment (flavor to role mapping) escaped me when I
> read that earlier. It appears to be very elegant and provides another
> attribute for Tuskar's notion of resource classes. So +1 here.
>
>> You mention three specific nodes, but what you're describing is more
>> likely three concepts:
>> - Balanced Nodes
>> - High Disk I/O Nodes
>> - Low-End Appliance Nodes
>>
>> They may have one node in each, but I think your example of three nodes is
>> potentially *too* simplified to be considered as proper sample size. I'd
>> guess there are more than three in play commonly, in which case the
>> concepts breakdown starts to be more appealing.
>
> Correct - definitely more than three, I just wanted to illustrate the use
> case.

I'm not sure I explained what I was getting at properly. I wasn't implying you thought it was limited to just three. I do the same thing, simplify down for discussion purposes (I've done so in my head about this very topic). But I think this may be a rare case where simplifying actually masks the concept rather than exposing it. Manual feels a bit more desirable in small sample groups, but when looking at larger sets of nodes, the flavor concept feels less odd than it does when defining a flavor for a single machine. That's all. :)

Maybe that was clear already, but I wanted to make sure I didn't come off as attacking your example. It certainly wasn't my intention. The balanced v. disk machine thing is the sort of thing I'd been thinking about for a while but hadn't found a good way to make concrete.

>> I think the disk flavor in particular has quite a few use cases, especially
>> until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the
>> disk hotness") as hosting the data-intensive portions, but where I had
>> previously been viewing that as manual allocation, it sounds like the
>> approach is to properly categorize them for what they are and teach Nova
>> how to use them.
>>
>> Robert - Please correct me if I misread any of what your intention was, I
>> don't want to drive people down the wrong path if I'm misinterpreting
>> anything.
>
> -k
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 12, 2013, at 4:05 PM, Jay Dobies wrote: >> Maybe this is a valid use case? >> >> Cloud operator has several core service nodes of differing configuration >> types. >> >> [node1] <-- balanced mix of disk/cpu/ram for general core services >> [node2] <-- lots of disks for Ceilometer data storage >> [node3] <-- low-end "appliance like" box for a specialized/custom core >> service >> (SIEM box for example) >> >> All nodes[1,2,3] are in the same deployment grouping ("core services)". As >> such, >> this is a heterogenous deployment grouping. Heterogeneity in this case >> defined by >> differing roles and hardware configurations. >> >> This is a real use case. >> >> How do we handle this? > > This is the sort of thing I had been concerned with, but I think this is just > a variation on Robert's GPU example. Rather than butcher it by paraphrasing, > I'll just include the relevant part: > > > "The basic stuff we're talking about so far is just about saying each > role can run on some set of undercloud flavors. If that new bit of kit > has the same coarse metadata as other kit, Nova can't tell it apart. > So the way to solve the problem is: > - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU') > - b) teach Nova that there is a flavor that maps to the presence of > that specialness, and > c) teach Nova that other flavors may not map to that specialness > > then in Tuskar whatever Nova configuration is needed to use that GPU > is a special role ('GPU compute' for instance) and only that role > would be given that flavor to use. That special config is probably > being in a host aggregate, with an overcloud flavor that specifies > that aggregate, which means at the TripleO level we need to put the > aggregate in the config metadata for that role, and the admin does a > one-time setup in the Nova Horizon UI to configure their GPU compute > flavor." > Yes, the core services example is a variation on the above. 
The idea of _undercloud_ flavor assignment (flavor to role mapping) escaped me when I read that earlier. It appears to be very elegant and provides another attribute for Tuskar's notion of resource classes. So +1 here.

> You mention three specific nodes, but what you're describing is more likely three concepts:
> - Balanced Nodes
> - High Disk I/O Nodes
> - Low-End Appliance Nodes
>
> They may have one node in each, but I think your example of three nodes is potentially *too* simplified to be considered a proper sample size. I'd guess there are more than three in play commonly, in which case the concept breakdown starts to be more appealing.

Correct - definitely more than three, I just wanted to illustrate the use case.

> I think the disk flavor in particular has quite a few use cases, especially until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the disk hotness") as hosting the data-intensive portions, but where I had previously been viewing that as manual allocation, it sounds like the approach is to properly categorize them for what they are and teach Nova how to use them.
>
> Robert - Please correct me if I misread any of what your intention was; I don't want to drive people down the wrong path if I'm misinterpreting anything.

-k

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
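[Editorial note: the flavor/aggregate mechanics quoted above can be illustrated with the CLI of that era. This is an illustrative sketch only - the node UUID placeholder, the flavor names, the aggregate name, and the 'gpu' capability are all invented, not from the thread - and it assumes ComputeCapabilitiesFilter is enabled in the undercloud Nova scheduler and AggregateInstanceExtraSpecsFilter in the overcloud one.]

```shell
# (a) Tag the special node in Ironic via a capability (hypothetical tag name):
ironic node-update <node-uuid> add properties/capabilities='gpu:true'

# (b) Undercloud: a baremetal flavor whose extra spec requires that capability,
#     which ComputeCapabilitiesFilter matches against the node's capabilities:
nova flavor-create gpu-baremetal auto 16384 100 8
nova flavor-key gpu-baremetal set capabilities:gpu='true'
# (c) is the trickier half: flavors *without* the spec can still land on the
#     tagged node unless the scheduler is also configured to keep them off it.

# Overcloud: confine the 'GPU compute' role's flavor to a host aggregate,
# the one-time admin setup Robert describes:
nova aggregate-create gpu-hosts
nova aggregate-set-metadata gpu-hosts gpu=true
nova flavor-key gpu.compute set aggregate_instance_extra_specs:gpu=true
```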
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Maybe this is a valid use case?

Cloud operator has several core service nodes of differing configuration types.

[node1] <-- balanced mix of disk/cpu/ram for general core services
[node2] <-- lots of disks for Ceilometer data storage
[node3] <-- low-end "appliance like" box for a specialized/custom core service (SIEM box for example)

All nodes[1,2,3] are in the same deployment grouping ("core services"). As such, this is a heterogeneous deployment grouping. Heterogeneity in this case is defined by differing roles and hardware configurations.

This is a real use case.

How do we handle this?

This is the sort of thing I had been concerned with, but I think this is just a variation on Robert's GPU example. Rather than butcher it by paraphrasing, I'll just include the relevant part:

"The basic stuff we're talking about so far is just about saying each role can run on some set of undercloud flavors. If that new bit of kit has the same coarse metadata as other kit, Nova can't tell it apart. So the way to solve the problem is:
- a) teach Ironic about the specialness of the node (e.g. a tag 'GPU'),
- b) teach Nova that there is a flavor that maps to the presence of that specialness, and
- c) teach Nova that other flavors may not map to that specialness

then in Tuskar whatever Nova configuration is needed to use that GPU is a special role ('GPU compute' for instance) and only that role would be given that flavor to use. That special config probably means being in a host aggregate, with an overcloud flavor that specifies that aggregate, which means at the TripleO level we need to put the aggregate in the config metadata for that role, and the admin does a one-time setup in the Nova Horizon UI to configure their GPU compute flavor."
You mention three specific nodes, but what you're describing is more likely three concepts:
- Balanced Nodes
- High Disk I/O Nodes
- Low-End Appliance Nodes

They may have one node in each, but I think your example of three nodes is potentially *too* simplified to be considered a proper sample size. I'd guess there are more than three in play commonly, in which case the concept breakdown starts to be more appealing.

I think the disk flavor in particular has quite a few use cases, especially until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the disk hotness") as hosting the data-intensive portions, but where I had previously been viewing that as manual allocation, it sounds like the approach is to properly categorize them for what they are and teach Nova how to use them.

Robert - Please correct me if I misread any of what your intention was; I don't want to drive people down the wrong path if I'm misinterpreting anything.

-k
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 12/12/13 09:42 +1300, Robert Collins wrote: On 12 December 2013 01:17, Jaromir Coufal wrote: On 2013/10/12 23:09, Robert Collins wrote:

The 'easiest' way is to support bigger companies with huge deployments, tailored infrastructure, everything connected properly. But there are tons of companies/users who are running on old heterogeneous hardware. Very likely even more than the number of companies having already mentioned large deployments. And giving them only the way of 'setting up rules' in order to get the service on the node - this type of user is not gonna use our deployment system.

Thats speculation. We don't know if they will or will not because we haven't given them a working system to test.

Some part of that is speculation, some part of that is feedback from people who are doing deployments (of course its just very limited audience). Anyway, it is not just pure theory.

Sure. Let me be more precise. There is a hypothesis that lack of direct control will be a significant adoption blocker for a primary group of users.

I think it's safe to say that some users in the group 'sysadmins having to deploy an OpenStack cloud' will find it a bridge too far and not use a system without direct control. Call this group A.

I think it's also safe to say that some users will not care in the slightest, because their deployment is too small for them to be particularly worried (e.g. about occasional downtime (but they would worry a lot about data loss)). Call this group B.

I suspect we don't need to consider group C - folk who won't use a system if it *has* manual control, but thats only a suspicion. It may be that the side effect of adding direct control is to reduce usability below the threshold some folk need...

To assess 'significant adoption blocker' we basically need to find the % of users who will care sufficiently that they don't use TripleO. How can we do that?
We can do questionnaires, and get such folk to come talk with us, but that suffers from selection bias - group B can use the system with or without direct manual control, so have little motivation to argue vigorously in any particular direction. Group A however have to argue because they won't use the system at all without that feature, and they may want to use the system for other reasons, so that becomes a crucial aspect for them.

A much better way IMO is to test it - to get a bunch of volunteers and see who responds positively to a demo *without* direct manual control.

To do that we need a demoable thing, which might just be mockups that show a set of workflows (and include things like Jay's shiny-new-hardware use case in the demo).

I rather suspect we're building that anyway as part of doing UX work, so maybe what we do is put a tweet or blog post up asking for sysadmins who a) have not yet deployed openstack, b) want to, and c) are willing to spend 20-30 minutes with us, walk them through a demo showing no manual control, and record what questions they ask, whether they would like to have that product, and if not, then (a) what use cases they can't address with the mockups and (b) what other reasons they have for not using it.

This is a bunch of work though! So, do we need to do that work?

*If* we can layer manual control on later, then we could defer this testing until we are at the point where we can say 'the nova scheduled version is ready, now lets decide if we add the manual control'.

OTOH, if we *cannot* layer manual control on later - if it has tentacles through too much of the code base, then we need to decide earlier, because it will be significantly harder to add later and that may be too late of a ship date for vendors shipping on top of TripleO.
So with that as a prelude, my technical sense is that we can layer manual scheduling on later: we provide an advanced screen, show the list of N instances we're going to ask for and allow each instance to be directly customised with a node id selected from either the current node it's running on or an available node. It's significant work both UI and plumbing, but it's not going to be made harder by the other work we're doing AFAICT. -> My proposal is that we shelve this discussion until we have the nova/heat scheduled version in 'and now we polish' mode, and then pick it back up and assess user needs. An alternative argument is to say that group A is a majority of the userbase and that doing an automatic version is entirely unnecessary. Thats also possible, but I'm extremely skeptical, given the huge cost of staff time, and the complete lack of interest my sysadmin friends (and my former sysadmin self) have in doing automatable things by hand. I just wanted to add a few thoughts: For some comparative information here "from the field" I work extensively on deployments of large OpenStack implementations, most recently with a ~220node/9rack deployment (scaling up to 42racks / 1024 nodes soon). My primary role is of a Devops/Sysadmin nature, and no
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:

> On 12 December 2013 01:17, Jaromir Coufal wrote:
>> On 2013/10/12 23:09, Robert Collins wrote:
>>>> The 'easiest' way is to support bigger companies with huge deployments, tailored infrastructure, everything connected properly. But there are tons of companies/users who are running on old heterogeneous hardware. Very likely even more than the number of companies having already mentioned large deployments. And giving them only the way of 'setting up rules' in order to get the service on the node - this type of user is not gonna use our deployment system.
>>>
>>> Thats speculation. We don't know if they will or will not because we haven't given them a working system to test.
>>
>> Some part of that is speculation, some part of that is feedback from people who are doing deployments (of course its just very limited audience). Anyway, it is not just pure theory.
>
> Sure. Let me be more precise. There is a hypothesis that lack of direct control will be a significant adoption blocker for a primary group of users.
>
> I think it's safe to say that some users in the group 'sysadmins having to deploy an OpenStack cloud' will find it a bridge too far and not use a system without direct control. Call this group A.
>
> I think it's also safe to say that some users will not care in the slightest, because their deployment is too small for them to be particularly worried (e.g. about occasional downtime (but they would worry a lot about data loss)). Call this group B.
>
> I suspect we don't need to consider group C - folk who won't use a system if it *has* manual control, but thats only a suspicion. It may be that the side effect of adding direct control is to reduce usability below the threshold some folk need...
>
> To assess 'significant adoption blocker' we basically need to find the % of users who will care sufficiently that they don't use TripleO.
>
> How can we do that?
> We can do questionnaires, and get such folk to come talk with us, but that suffers from selection bias - group B can use the system with or without direct manual control, so have little motivation to argue vigorously in any particular direction. Group A however have to argue because they won't use the system at all without that feature, and they may want to use the system for other reasons, so that becomes a crucial aspect for them.
>
> A much better way IMO is to test it - to get a bunch of volunteers and see who responds positively to a demo *without* direct manual control.
>
> To do that we need a demoable thing, which might just be mockups that show a set of workflows (and include things like Jay's shiny-new-hardware use case in the demo).
>
> I rather suspect we're building that anyway as part of doing UX work, so maybe what we do is put a tweet or blog post up asking for sysadmins who a) have not yet deployed openstack, b) want to, and c) are willing to spend 20-30 minutes with us, walk them through a demo showing no manual control, and record what questions they ask, whether they would like to have that product, and if not, then (a) what use cases they can't address with the mockups and (b) what other reasons they have for not using it.
>
> This is a bunch of work though!
>
> So, do we need to do that work?
>
> *If* we can layer manual control on later, then we could defer this testing until we are at the point where we can say 'the nova scheduled version is ready, now lets decide if we add the manual control'.
>
> OTOH, if we *cannot* layer manual control on later - if it has tentacles through too much of the code base, then we need to decide earlier, because it will be significantly harder to add later and that may be too late of a ship date for vendors shipping on top of TripleO.
> > So with that as a prelude, my technical sense is that we can layer > manual scheduling on later: we provide an advanced screen, show the > list of N instances we're going to ask for and allow each instance to > be directly customised with a node id selected from either the current > node it's running on or an available node. It's significant work both > UI and plumbing, but it's not going to be made harder by the other > work we're doing AFAICT. > > -> My proposal is that we shelve this discussion until we have the > nova/heat scheduled version in 'and now we polish' mode, and then pick > it back up and assess user needs. > > An alternative argument is to say that group A is a majority of the > userbase and that doing an automatic version is entirely unnecessary. > Thats also possible, but I'm extremely skeptical, given the huge cost > of staff time, and the complete lack of interest my sysadmin friends > (and my former sysadmin self) have in doing automatable things by > hand. > >>> Lets break the concern into two halves: >>> A) Users who could have t
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 10, 2013, at 5:09 PM, Robert Collins wrote: > On 11 December 2013 05:42, Jaromir Coufal wrote: >> On 2013/09/12 23:38, Tzu-Mainn Chen wrote: >>> The disagreement comes from whether we need manual node assignment or not. >>> I would argue that we >>> need to step back and take a look at the real use case: heterogeneous >>> nodes. If there are literally >>> no characteristics that differentiate nodes A and B, then why do we care >>> which gets used for what? Why >>> do we need to manually assign one? >> >> >> Ideally, we don't. But with this approach we would take out the possibility >> to change something or decide something from the user. > > So, I think this is where the confusion is. Using the nova scheduler > doesn't prevent change or control. It just ensures the change and > control happen in the right place: the Nova scheduler has had years of > work, of features and facilities being added to support HPC, HA and > other such use cases. It should have everything we need [1], without > going down to manual placement. For clarity: manual placement is when > any of the user, Tuskar, or Heat query Ironic, select a node, and then > use a scheduler hint to bypass the scheduler. > >> The 'easiest' way is to support bigger companies with huge deployments, >> tailored infrastructure, everything connected properly. >> >> But there are tons of companies/users who are running on old heterogeneous >> hardware. Very likely even more than the number of companies having already >> mentioned large deployments. And giving them only the way of 'setting up >> rules' in order to get the service on the node - this type of user is not >> gonna use our deployment system. > > Thats speculation. We don't know if they will or will not because we > haven't given them a working system to test. > > Lets break the concern into two halves: > A) Users who could have their needs met, but won't use TripleO because > meeting their needs in this way is too hard/complex/painful. 
> > B) Users who have a need we cannot meet with the current approach. > > For category B users, their needs might be specific HA things - like > the oft discussed failure domains angle, where we need to split up HA > clusters across power bars, aircon, switches etc. Clearly long term we > want to support them, and the undercloud Nova scheduler is entirely > capable of being informed about this, and we can evolve to a holistic > statement over time. Lets get a concrete list of the cases we can > think of today that won't be well supported initially, and we can > figure out where to do the work to support them properly. > > For category A users, I think that we should get concrete examples, > and evolve our design (architecture and UX) to make meeting those > needs pleasant. > > What we shouldn't do is plan complex work without concrete examples > that people actually need. Jay's example of some shiny new compute > servers with special parts that need to be carved out was a great one > - we can put that in category A, and figure out if it's easy enough, > or obvious enough - and think about whether we document it or make it > a guided workflow or $whatever. > >> Somebody might argue - why do we care? If user doesn't like TripleO >> paradigm, he shouldn't use the UI and should use another tool. But the UI is >> not only about TripleO. Yes, it is underlying concept, but we are working on >> future *official* OpenStack deployment tool. We should care to enable people >> to deploy OpenStack - large/small scale, homo/heterogeneous hardware, >> typical or a bit more specific use-cases. > > The difficulty I'm having is that the discussion seems to assume that > 'heterogeneous implies manual', but I don't agree that that > implication is necessary! > >> As an underlying paradigm of how to install cloud - awesome idea, awesome >> concept, it works. But user doesn't care about how it is being deployed for >> him. He cares about getting what he wants/needs. 
And we shouldn't go that >> far that we violently force him to treat his infrastructure as cloud. I >> believe that possibility to change/control - if needed - is very important >> and we should care. > > I propose that we make concrete use cases: 'Fred cannot use TripleO > without manual assignment because XYZ'. Then we can assess how > important XYZ is to our early adopters and go from there. > >> And what is key for us is to *enable* users - not to prevent them from using >> our deployment tool, because it doesn't work for their requirements. > > Totally agreed :) > >>> If we can agree on that, then I think it would be sufficient to say that >>> we want a mechanism to allow >>> UI users to deal with heterogeneous nodes, and that mechanism must use >>> nova-scheduler. In my mind, >>> that's what resource classes and node profiles are intended for. >> >> >> Not arguing on this point. Though that mechanism should support also cases, >> where user specifies a role for a node / removes node fro
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 12 December 2013 01:17, Jaromir Coufal wrote:
> On 2013/10/12 23:09, Robert Collins wrote:
>>> The 'easiest' way is to support bigger companies with huge deployments, tailored infrastructure, everything connected properly.
>>>
>>> But there are tons of companies/users who are running on old heterogeneous hardware. Very likely even more than the number of companies having already mentioned large deployments. And giving them only the way of 'setting up rules' in order to get the service on the node - this type of user is not gonna use our deployment system.
>>
>> Thats speculation. We don't know if they will or will not because we haven't given them a working system to test.
>
> Some part of that is speculation, some part of that is feedback from people who are doing deployments (of course its just very limited audience). Anyway, it is not just pure theory.

Sure. Let me be more precise. There is a hypothesis that lack of direct control will be a significant adoption blocker for a primary group of users.

I think it's safe to say that some users in the group 'sysadmins having to deploy an OpenStack cloud' will find it a bridge too far and not use a system without direct control. Call this group A.

I think it's also safe to say that some users will not care in the slightest, because their deployment is too small for them to be particularly worried (e.g. about occasional downtime (but they would worry a lot about data loss)). Call this group B.

I suspect we don't need to consider group C - folk who won't use a system if it *has* manual control, but thats only a suspicion. It may be that the side effect of adding direct control is to reduce usability below the threshold some folk need...

To assess 'significant adoption blocker' we basically need to find the % of users who will care sufficiently that they don't use TripleO. How can we do that?
We can do questionnaires, and get such folk to come talk with us, but that suffers from selection bias - group B can use the system with or without direct manual control, so have little motivation to argue vigorously in any particular direction. Group A however have to argue because they won't use the system at all without that feature, and they may want to use the system for other reasons, so that becomes a crucial aspect for them.

A much better way IMO is to test it - to get a bunch of volunteers and see who responds positively to a demo *without* direct manual control.

To do that we need a demoable thing, which might just be mockups that show a set of workflows (and include things like Jay's shiny-new-hardware use case in the demo).

I rather suspect we're building that anyway as part of doing UX work, so maybe what we do is put a tweet or blog post up asking for sysadmins who a) have not yet deployed openstack, b) want to, and c) are willing to spend 20-30 minutes with us, walk them through a demo showing no manual control, and record what questions they ask, whether they would like to have that product, and if not, then (a) what use cases they can't address with the mockups and (b) what other reasons they have for not using it.

This is a bunch of work though! So, do we need to do that work?

*If* we can layer manual control on later, then we could defer this testing until we are at the point where we can say 'the nova scheduled version is ready, now lets decide if we add the manual control'.

OTOH, if we *cannot* layer manual control on later - if it has tentacles through too much of the code base, then we need to decide earlier, because it will be significantly harder to add later and that may be too late of a ship date for vendors shipping on top of TripleO.
So with that as a prelude, my technical sense is that we can layer manual scheduling on later: we provide an advanced screen, show the list of N instances we're going to ask for and allow each instance to be directly customised with a node id selected from either the current node it's running on or an available node. It's significant work both UI and plumbing, but it's not going to be made harder by the other work we're doing AFAICT. -> My proposal is that we shelve this discussion until we have the nova/heat scheduled version in 'and now we polish' mode, and then pick it back up and assess user needs. An alternative argument is to say that group A is a majority of the userbase and that doing an automatic version is entirely unnecessary. Thats also possible, but I'm extremely skeptical, given the huge cost of staff time, and the complete lack of interest my sysadmin friends (and my former sysadmin self) have in doing automatable things by hand. >> Lets break the concern into two halves: >> A) Users who could have their needs met, but won't use TripleO because >> meeting their needs in this way is too hard/complex/painful. >> >> B) Users who have a need we cannot meet with the current approach. >> >> For category B users, their needs might be speci
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
> On 2013/10/12 19:39, Tzu-Mainn Chen wrote: > >> > >> Ideally, we don't. But with this approach we would take out the > >> possibility to change something or decide something from the user. > >> > >> The 'easiest' way is to support bigger companies with huge deployments, > >> tailored infrastructure, everything connected properly. > >> > >> But there are tons of companies/users who are running on old > >> heterogeneous hardware. Very likely even more than the number of > >> companies having already mentioned large deployments. And giving them > >> only the way of 'setting up rules' in order to get the service on the > >> node - this type of user is not gonna use our deployment system. > >> > >> Somebody might argue - why do we care? If user doesn't like TripleO > >> paradigm, he shouldn't use the UI and should use another tool. But the > >> UI is not only about TripleO. Yes, it is underlying concept, but we are > >> working on future *official* OpenStack deployment tool. We should care > >> to enable people to deploy OpenStack - large/small scale, > >> homo/heterogeneous hardware, typical or a bit more specific use-cases. > > > > I think this is a very important clarification, and I'm glad you made it. > > It sounds > > like manual assignment is actually a sub-requirement, and the feature > > you're arguing > > for is: supporting non-TripleO deployments. > > Mostly but not only. The other argument is - keeping control on stuff I > am doing. Note that undercloud user is different from overcloud user. Sure, but again, that argument seems to me to be a non-TripleO approach. I'm not saying that it's not a possible use case, I'm saying that you're advocating for a deployment strategy that fundamentally diverges from the TripleO philosophy - and as such, that strategy will likely require a separate UI, underlying architecture, etc, and should not be planned for in the Icehouse timeframe. 
>> That might be a worthy goal, but I think it's a distraction for the Icehouse timeframe. Each new deployment strategy requires not only a new UI, but different deployment architectures that could have very little in common with each other. Designing them all to work in the same space is a recipe for disaster, a convoluted gnarl of code that doesn't do any one thing particularly well. To use an analogy: there's a reason why no one makes a flying boat car.
>>
>> I'm going to strongly advocate that for Icehouse, we focus exclusively on large scale TripleO deployments, working to make that UI and architecture as sturdy as we can. Future deployment strategies should be discussed in the future, and if they're not TripleO based, they should be discussed with the proper OpenStack group.
>
> One concern here is - it is quite likely that we get people excited about this approach - it will be a new boom - 'wow', there is automagic doing everything for me. But then the question would be reality - how many of those excited users will actually use TripleO for their real deployments (I mean in the early stages)? Would it be only a couple of them (because of covered use cases, concerns about maturity, lack of control)? Can we assure them that if anything goes wrong, they have control over it?
>
> -- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/10/12 19:39, Tzu-Mainn Chen wrote:

Ideally, we don't. But with this approach we would take out the possibility to change something or decide something from the user.

The 'easiest' way is to support bigger companies with huge deployments, tailored infrastructure, everything connected properly.

But there are tons of companies/users who are running on old heterogeneous hardware. Very likely even more than the number of companies having already mentioned large deployments. And giving them only the way of 'setting up rules' in order to get the service on the node - this type of user is not gonna use our deployment system.

Somebody might argue - why do we care? If user doesn't like TripleO paradigm, he shouldn't use the UI and should use another tool. But the UI is not only about TripleO. Yes, it is underlying concept, but we are working on future *official* OpenStack deployment tool. We should care to enable people to deploy OpenStack - large/small scale, homo/heterogeneous hardware, typical or a bit more specific use-cases.

I think this is a very important clarification, and I'm glad you made it. It sounds like manual assignment is actually a sub-requirement, and the feature you're arguing for is: supporting non-TripleO deployments.

Mostly but not only. The other argument is - keeping control on stuff I am doing. Note that undercloud user is different from overcloud user.

That might be a worthy goal, but I think it's a distraction for the Icehouse timeframe. Each new deployment strategy requires not only a new UI, but different deployment architectures that could have very little in common with each other. Designing them all to work in the same space is a recipe for disaster, a convoluted gnarl of code that doesn't do any one thing particularly well. To use an analogy: there's a reason why no one makes a flying boat car.
I'm going to strongly advocate that for Icehouse, we focus exclusively on large scale TripleO deployments, working to make that UI and architecture as sturdy as we can. Future deployment strategies should be discussed in the future, and if they're not TripleO based, they should be discussed with the proper OpenStack group.

One concern here is - it is quite likely that we get people excited about this approach - it will be a new boom - 'wow', there is automagic doing everything for me. But then the question would be reality - how many of those excited users will actually use TripleO for their real deployments (I mean in the early stages)? Would it be only a couple of them (because of covered use cases, concerns about maturity, lack of control)? Can we assure them that if anything goes wrong, they have control over it?

-- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/10/12 23:09, Robert Collins wrote: On 11 December 2013 05:42, Jaromir Coufal wrote: On 2013/09/12 23:38, Tzu-Mainn Chen wrote: The disagreement comes from whether we need manual node assignment or not. I would argue that we need to step back and take a look at the real use case: heterogeneous nodes. If there are literally no characteristics that differentiate nodes A and B, then why do we care which gets used for what? Why do we need to manually assign one? Ideally, we don't. But with this approach we would take out the possibility to change something or decide something from the user. So, I think this is where the confusion is. Using the nova scheduler doesn't prevent change or control. It just ensures the change and control happen in the right place: the Nova scheduler has had years of work, of features and facilities being added to support HPC, HA and other such use cases. It should have everything we need [1], without going down to manual placement. For clarity: manual placement is when any of the user, Tuskar, or Heat query Ironic, select a node, and then use a scheduler hint to bypass the scheduler. This is very well written. I am all for things going to right places. The 'easiest' way is to support bigger companies with huge deployments, tailored infrastructure, everything connected properly. But there are tons of companies/users who are running on old heterogeneous hardware. Very likely even more than the number of companies having already mentioned large deployments. And giving them only the way of 'setting up rules' in order to get the service on the node - this type of user is not gonna use our deployment system. Thats speculation. We don't know if they will or will not because we haven't given them a working system to test. Some part of that is speculation, some part of that is feedback from people who are doing deployments (of course its just very limited audience). Anyway, it is not just pure theory. 
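[Editorial note: Robert's distinction above - scheduler-driven placement versus manual placement, where a node is picked first and a hint bypasses the scheduler - can be sketched with a toy model. This is purely illustrative, not Tuskar or Nova code; the function, node names, and metadata keys are all invented.]

```python
def schedule(nodes, flavor, pinned_node=None):
    """Return the node an instance lands on.

    nodes: dict of node id -> coarse metadata (resources and tags)
    flavor: metadata the instance requires
    pinned_node: manual placement - skips the matching step entirely,
                 the way a scheduler hint bypasses the Nova scheduler
    """
    if pinned_node is not None:
        return pinned_node
    for node_id, meta in nodes.items():
        # scheduler-style placement: first node whose metadata satisfies
        # every requirement the flavor expresses
        if all(meta.get(k) == v for k, v in flavor.items()):
            return node_id
    return None  # no valid host


nodes = {
    "node1": {"ram": 16, "gpu": False},
    "node2": {"ram": 16, "gpu": True},  # the 'special' kit, tagged in Ironic
}

# With only coarse metadata, node1 and node2 are indistinguishable:
print(schedule(nodes, {"ram": 16}))                       # picks node1
# Teaching the flavor about the specialness routes the GPU role correctly:
print(schedule(nodes, {"ram": 16, "gpu": True}))          # picks node2
# Manual placement, as defined above, just bypasses all of this:
print(schedule(nodes, {"ram": 16}, pinned_node="node2"))  # picks node2
```

The point of the sketch is Robert's: once the special hardware is expressed in metadata the flavor can require, the automatic path reaches the same answer manual pinning would, without bypassing the scheduler.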
Lets break the concern into two halves: A) Users who could have their needs met, but won't use TripleO because meeting their needs in this way is too hard/complex/painful. B) Users who have a need we cannot meet with the current approach. For category B users, their needs might be specific HA things - like the oft discussed failure domains angle, where we need to split up HA clusters across power bars, aircon, switches etc. Clearly long term we want to support them, and the undercloud Nova scheduler is entirely capable of being informed about this, and we can evolve to a holistic statement over time. Lets get a concrete list of the cases we can think of today that won't be well supported initially, and we can figure out where to do the work to support them properly. My question is - can't we help them now? To enable users to use our app even when we don't have enough smartness to help them 'auto' way? For category A users, I think that we should get concrete examples, and evolve our design (architecture and UX) to make meeting those needs pleasant. +1... I tried to pull some operators into this discussion thread, will try to get more. What we shouldn't do is plan complex work without concrete examples that people actually need. Jay's example of some shiny new compute servers with special parts that need to be carved out was a great one - we can put that in category A, and figure out if it's easy enough, or obvious enough - and think about whether we document it or make it a guided workflow or $whatever. Somebody might argue - why do we care? If user doesn't like TripleO paradigm, he shouldn't use the UI and should use another tool. But the UI is not only about TripleO. Yes, it is underlying concept, but we are working on future *official* OpenStack deployment tool. We should care to enable people to deploy OpenStack - large/small scale, homo/heterogeneous hardware, typical or a bit more specific use-cases. 
The difficulty I'm having is that the discussion seems to assume that 'heterogeneous implies manual', but I don't agree that that implication is necessary! No, I don't agree with this either. Heterogeneous hardware can be very well managed automatically as well as homogeneous (classes, node profiles). As an underlying paradigm of how to install cloud - awesome idea, awesome concept, it works. But user doesn't care about how it is being deployed for him. He cares about getting what he wants/needs. And we shouldn't go that far that we violently force him to treat his infrastructure as cloud. I believe that possibility to change/control - if needed - is very important and we should care. I propose that we make concrete use cases: 'Fred cannot use TripleO without manual assignment because XYZ'. Then we can assess how important XYZ is to our early adopters and go from there. +1, yes. I will try to bug more relevant people, who could contribute at this area. And what is key for us is to *enable* users - not to prevent them from using our deployme
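The distinction Robert draws between scheduler-driven and manual placement can be sketched in a few lines. This is a toy illustration only, not Nova code; all names and data shapes are invented. Automatic placement filters nodes by their characteristics (so heterogeneity alone does not force manual work), while a "force" argument models a scheduler hint that bypasses filtering entirely:

```python
def schedule(nodes, requirements, force_node=None):
    """Pick a node for one instance (toy model, not Nova).

    force_node models a scheduler hint that bypasses the scheduler.
    """
    if force_node is not None:
        # Manual placement: the caller already chose a node.
        return force_node
    # Automatic placement: filter on node characteristics.
    candidates = [n for n in nodes
                  if all(n.get(k) == v for k, v in requirements.items())]
    if not candidates:
        raise RuntimeError("no valid host found")
    # Stand-in for the weighing step: pick the least-loaded candidate.
    return min(candidates, key=lambda n: n["load"])

nodes = [
    {"name": "node-a", "has_gpu": False, "load": 0.2},
    {"name": "node-b", "has_gpu": True,  "load": 0.7},
]

# Heterogeneous hardware handled automatically: the GPU requirement
# differentiates the nodes, so no manual assignment is needed.
chosen = schedule(nodes, {"has_gpu": True})
print(chosen["name"])  # node-b
```

The point of the sketch is that "heterogeneous" becomes a filtering input rather than a reason to pin instances to nodes by hand.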
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 11 December 2013 05:42, Jaromir Coufal wrote: > On 2013/09/12 23:38, Tzu-Mainn Chen wrote: >> The disagreement comes from whether we need manual node assignment or not. >> I would argue that we >> need to step back and take a look at the real use case: heterogeneous >> nodes. If there are literally >> no characteristics that differentiate nodes A and B, then why do we care >> which gets used for what? Why >> do we need to manually assign one? > > > Ideally, we don't. But with this approach we would take out the possibility > to change something or decide something from the user. So, I think this is where the confusion is. Using the nova scheduler doesn't prevent change or control. It just ensures the change and control happen in the right place: the Nova scheduler has had years of work, of features and facilities being added to support HPC, HA and other such use cases. It should have everything we need [1], without going down to manual placement. For clarity: manual placement is when any of the user, Tuskar, or Heat query Ironic, select a node, and then use a scheduler hint to bypass the scheduler. > The 'easiest' way is to support bigger companies with huge deployments, > tailored infrastructure, everything connected properly. > > But there are tons of companies/users who are running on old heterogeneous > hardware. Very likely even more than the number of companies having already > mentioned large deployments. And giving them only the way of 'setting up > rules' in order to get the service on the node - this type of user is not > gonna use our deployment system. Thats speculation. We don't know if they will or will not because we haven't given them a working system to test. Lets break the concern into two halves: A) Users who could have their needs met, but won't use TripleO because meeting their needs in this way is too hard/complex/painful. B) Users who have a need we cannot meet with the current approach. 
For category B users, their needs might be specific HA things - like the oft discussed failure domains angle, where we need to split up HA clusters across power bars, aircon, switches etc. Clearly long term we want to support them, and the undercloud Nova scheduler is entirely capable of being informed about this, and we can evolve to a holistic statement over time. Lets get a concrete list of the cases we can think of today that won't be well supported initially, and we can figure out where to do the work to support them properly. For category A users, I think that we should get concrete examples, and evolve our design (architecture and UX) to make meeting those needs pleasant. What we shouldn't do is plan complex work without concrete examples that people actually need. Jay's example of some shiny new compute servers with special parts that need to be carved out was a great one - we can put that in category A, and figure out if it's easy enough, or obvious enough - and think about whether we document it or make it a guided workflow or $whatever. > Somebody might argue - why do we care? If user doesn't like TripleO > paradigm, he shouldn't use the UI and should use another tool. But the UI is > not only about TripleO. Yes, it is underlying concept, but we are working on > future *official* OpenStack deployment tool. We should care to enable people > to deploy OpenStack - large/small scale, homo/heterogeneous hardware, > typical or a bit more specific use-cases. The difficulty I'm having is that the discussion seems to assume that 'heterogeneous implies manual', but I don't agree that that implication is necessary! > As an underlying paradigm of how to install cloud - awesome idea, awesome > concept, it works. But user doesn't care about how it is being deployed for > him. He cares about getting what he wants/needs. And we shouldn't go that > far that we violently force him to treat his infrastructure as cloud. 
I > believe that possibility to change/control - if needed - is very important > and we should care. I propose that we make concrete use cases: 'Fred cannot use TripleO without manual assignment because XYZ'. Then we can assess how important XYZ is to our early adopters and go from there. > And what is key for us is to *enable* users - not to prevent them from using > our deployment tool, because it doesn't work for their requirements. Totally agreed :) >> If we can agree on that, then I think it would be sufficient to say that >> we want a mechanism to allow >> UI users to deal with heterogeneous nodes, and that mechanism must use >> nova-scheduler. In my mind, >> that's what resource classes and node profiles are intended for. > > Not arguing on this point. Though that mechanism should support also cases, > where user specifies a role for a node / removes node from a role. The rest > of nodes which I don't care about should be handled by nova-scheduler. Why! What is a use case for removing a role from a node while leaving that node in service? Lets be specific, always
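The "failure domains" case discussed for category B users (splitting an HA cluster across power bars, aircon, switches) is exactly the kind of constraint an informed undercloud scheduler could honour automatically. A toy sketch of the idea, with hypothetical data shapes and no real Nova filter code:

```python
def place_ha_cluster(nodes, size):
    """Pick `size` nodes, each from a distinct failure domain.

    Toy anti-affinity placement: no two cluster members may share
    a power bar. Attribute names here are invented for illustration.
    """
    used_domains = set()
    placement = []
    for node in sorted(nodes, key=lambda n: n["load"]):
        if node["power_bar"] not in used_domains:
            placement.append(node)
            used_domains.add(node["power_bar"])
        if len(placement) == size:
            return placement
    raise RuntimeError("not enough independent failure domains")

nodes = [
    {"name": "n1", "power_bar": "A", "load": 0.1},
    {"name": "n2", "power_bar": "A", "load": 0.2},  # shares domain with n1
    {"name": "n3", "power_bar": "B", "load": 0.3},
    {"name": "n4", "power_bar": "C", "load": 0.4},
]

cluster = place_ha_cluster(nodes, 3)
print([n["name"] for n in cluster])  # ['n1', 'n3', 'n4'] -- n2 is skipped
```

The user expresses the policy ("spread my HA cluster") rather than picking machines; the scheduler enforces it.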
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
> Thanks for the explanation! I'm going to claim that the thread revolves
> around two main areas of disagreement. Then I'm going to propose a way
> through:
>
> a) Manual Node Assignment
>
> I think that everyone is agreed that automated node assignment through
> nova-scheduler is by far the most ideal case; there's no disagreement
> there. The disagreement comes from whether we need manual node
> assignment or not. I would argue that we need to step back and take a
> look at the real use case: heterogeneous nodes. If there are literally
> no characteristics that differentiate nodes A and B, then why do we
> care which gets used for what? Why do we need to manually assign one?

This is a better way of verbalizing my concerns. I suspect there are
going to be quite a few heterogeneous environments built from legacy
pieces in the near term and fewer built from the ground up with all new
matching hotness.

On the other side of it, instead of handling legacy hardware I was
worried about the new hotness (not sure why I keep using that term)
specialized for a purpose. This is exactly what Robert described in his
GPU example. I think his explanation of how to use the scheduler to
accommodate that makes a lot of sense, so I'm much less behind the idea
of a strict manual assignment than I previously was.

> If we can agree on that, then I think it would be sufficient to say
> that we want a mechanism to allow UI users to deal with heterogeneous
> nodes, and that mechanism must use nova-scheduler. In my mind, that's
> what resource classes and node profiles are intended for.
>
> One possible objection might be: nova scheduler doesn't have the
> appropriate filter that we need to separate out two nodes. In that
> case, I would say that needs to be taken up with nova developers.
>
> b) Terminology
>
> It feels a bit like some of the disagreement comes from people using
> different words for the same thing. For example, the wireframes already
> detail a UI where Robert's roles come first, but I think that message
> was confused because I mentioned "node types" in the requirements.
>
> So could we come to some agreement on what the most exact terminology
> would be? I've listed some examples below, but I'm sure there are more.
>
> node type | role
> management node | ?
> resource node | ?
> unallocated | available | undeployed
> create a node distribution | size the deployment
> resource classes | ?
> node profiles | ?
>
> Mainn
>
> ----- Original Message -----
>> On 10 December 2013 09:55, Tzu-Mainn Chen wrote:
>>>>> * created as part of undercloud install process
>>>>
>>>> By that note I meant, that Nodes are not resources, Resource
>>>> instances run on Nodes. Nodes are the generic pool of hardware we can
>>>> deploy things onto.
>>>
>>> I don't think "resource nodes" is intended to imply that nodes are
>>> resources; rather, it's supposed to indicate that it's a node where a
>>> resource instance runs. It's supposed to separate it from "management
>>> node" and "unallocated node".
>>
>> So the question is are we looking at /nodes/ that have a /current
>> role/, or are we looking at /roles/ that have some /current nodes/.
>>
>> My contention is that the role is the interesting thing, and the nodes
>> is the incidental thing. That is, as a sysadmin, my hierarchy of
>> concerns is something like:
>> A: are all services running
>> B: are any of them in a degraded state where I need to take prompt
>> action to prevent a service outage [might mean many things: software
>> update/disk space criticals/a machine failed and we need to scale the
>> cluster back up/too much load]
>> C: are there any planned changes I need to make [new software deploy,
>> feature request from user, replacing a faulty machine]
>> D: are there long term issues sneaking up on me [capacity planning,
>> machine obsolescence]
>>
>> If we take /nodes/ as the interesting thing, and what they are doing
>> right now as the incidental thing, it's much harder to map that onto
>> the sysadmin concerns. If we start with /roles/ then we can answer:
>> A: by showing the list of roles and the summary stats (how many
>> machines, service status aggregate), role level alerts (e.g. nova-api
>> is not responding)
>> B: by showing the list of roles and more detailed stats (overall load,
>> response times of services, tickets against services, and a list of
>> in-trouble instances in each role - instances with alerts against them
>> - low disk, overload, failed service, early-detection alerts from
>> hardware)
>> C: probably out of our remit for now in the general case, but we need
>> to enable some things here like replacing faulty machines
>> D: by looking at trend graphs for roles (not machines), but also by
>> looking at the hardware in aggregate - breakdown by age of machines,
>> summary data for tickets filed against instances that were deployed to
>> a particular machine
>>
>> C: and D: are (F) category work, but for all but the very last thing,
>> it seems clear how to approach this from a roles perspective.
>>
>> I've tried to approach this using /nodes/ as the starting point, and
>> after two terrible drafts I've deleted the section. I'd love it if
>> someone could show me how it would work :)
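Robert's roles-first hierarchy of concerns ("A: are all services running" and so on) maps naturally onto grouping instances by role rather than walking a flat node list. A minimal illustrative sketch of that summary view, with invented data shapes - not any actual Tuskar code:

```python
from collections import defaultdict

# Hypothetical monitoring data: each instance is a role deployed on a node.
instances = [
    {"node": "node-1", "role": "overcloud control plane", "status": "ok"},
    {"node": "node-2", "role": "overcloud control plane", "status": "degraded"},
    {"node": "node-3", "role": "overcloud compute",       "status": "ok"},
]

# Group by role: the role is the interesting thing, the node is incidental.
by_role = defaultdict(list)
for inst in instances:
    by_role[inst["role"]].append(inst)

# Answer concern A ("are all services running?") per role, with role-level
# alerts that point at the instances in trouble.
for role, insts in sorted(by_role.items()):
    trouble = [i["node"] for i in insts if i["status"] != "ok"]
    summary = "ALERT: " + ", ".join(trouble) if trouble else "all ok"
    print(f"{role}: {len(insts)} instance(s), {summary}")
```

The same grouping answers concern B by attaching detailed stats to each role's instance list, instead of forcing the admin to infer service health from individual machines.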
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Thanks for the reply! Comments in-line: > > The disagreement comes from whether we need manual node assignment or not. > > I would argue that we > > need to step back and take a look at the real use case: heterogeneous > > nodes. If there are literally > > no characteristics that differentiate nodes A and B, then why do we care > > which gets used for what? Why > > do we need to manually assign one? > > Ideally, we don't. But with this approach we would take out the > possibility to change something or decide something from the user. > > The 'easiest' way is to support bigger companies with huge deployments, > tailored infrastructure, everything connected properly. > > But there are tons of companies/users who are running on old > heterogeneous hardware. Very likely even more than the number of > companies having already mentioned large deployments. And giving them > only the way of 'setting up rules' in order to get the service on the > node - this type of user is not gonna use our deployment system. > > Somebody might argue - why do we care? If user doesn't like TripleO > paradigm, he shouldn't use the UI and should use another tool. But the > UI is not only about TripleO. Yes, it is underlying concept, but we are > working on future *official* OpenStack deployment tool. We should care > to enable people to deploy OpenStack - large/small scale, > homo/heterogeneous hardware, typical or a bit more specific use-cases. I think this is a very important clarification, and I'm glad you made it. It sounds like manual assignment is actually a sub-requirement, and the feature you're arguing for is: supporting non-TripleO deployments. That might be a worthy goal, but I think it's a distraction for the Icehouse timeframe. Each new deployment strategy requires not only a new UI, but different deployment architectures that could have very little common with each other. 
Designing them all to work in the same space is a recipe for disaster, a convoluted gnarl of code that doesn't do any one thing particularly well. To use an analogy: there's a reason why no one makes a flying boat car. I'm going to strongly advocate that for Icehouse, we focus exclusively on large scale TripleO deployments, working to make that UI and architecture as sturdy as we can. Future deployment strategies should be discussed in the future, and if they're not TripleO based, they should be discussed with the proper OpenStack group. > As an underlying paradigm of how to install cloud - awesome idea, > awesome concept, it works. But user doesn't care about how it is being > deployed for him. He cares about getting what he wants/needs. And we > shouldn't go that far that we violently force him to treat his > infrastructure as cloud. I believe that possibility to change/control - > if needed - is very important and we should care. > > And what is key for us is to *enable* users - not to prevent them from > using our deployment tool, because it doesn't work for their requirements. > > > > If we can agree on that, then I think it would be sufficient to say that we > > want a mechanism to allow > > UI users to deal with heterogeneous nodes, and that mechanism must use > > nova-scheduler. In my mind, > > that's what resource classes and node profiles are intended for. > > Not arguing on this point. Though that mechanism should support also > cases, where user specifies a role for a node / removes node from a > role. The rest of nodes which I don't care about should be handled by > nova-scheduler. > > > One possible objection might be: nova scheduler doesn't have the > > appropriate filter that we need to > > separate out two nodes. In that case, I would say that needs to be taken > > up with nova developers. > > Give it to Nova guys to fix it... What if that user's need would be > undercloud specific requirement? Why should Nova guys care? 
What should > our unhappy user do until then? Use other tool? Will he be willing to > get back to use our tool once it is ready? > > I can also see other use-cases. It can be distribution based on power > sockets, networking connections, etc. We can't think about all the ways > which our user will need. In this case - it would be our job to make the Nova guys care and to work with them to develop the feature. Creating parallel services with the same fundamental purpose - I think that runs counter to what OpenStack is designed for. > > > b) Terminology > > > > It feels a bit like some of the disagreement comes from people using > > different words for the same thing. > > For example, the wireframes already detail a UI where Robert's roles come > > first, but I think that message > > was confused because I mentioned "node types" in the requirements. > > > > So could we come to some agreement on what the most exact terminology would > > be? I've listed some examples below, > > but I'm sure there are more. > > > > node type | role > +1 role > > > management node | ? > > resource node | ?
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/09/12 23:38, Tzu-Mainn Chen wrote:
> Thanks for the explanation! I'm going to claim that the thread revolves
> around two main areas of disagreement. Then I'm going to propose a way
> through:
>
> a) Manual Node Assignment
>
> I think that everyone is agreed that automated node assignment through
> nova-scheduler is by far the most ideal case; there's no disagreement
> there.

+1

> The disagreement comes from whether we need manual node assignment or
> not. I would argue that we need to step back and take a look at the
> real use case: heterogeneous nodes. If there are literally no
> characteristics that differentiate nodes A and B, then why do we care
> which gets used for what? Why do we need to manually assign one?

Ideally, we don't. But with this approach we would take out the
possibility to change something or decide something from the user.

The 'easiest' way is to support bigger companies with huge deployments,
tailored infrastructure, everything connected properly.

But there are tons of companies/users who are running on old
heterogeneous hardware. Very likely even more than the number of
companies having already mentioned large deployments. And giving them
only the way of 'setting up rules' in order to get the service on the
node - this type of user is not gonna use our deployment system.

Somebody might argue - why do we care? If a user doesn't like the
TripleO paradigm, he shouldn't use the UI and should use another tool.
But the UI is not only about TripleO. Yes, it is the underlying concept,
but we are working on the future *official* OpenStack deployment tool.
We should care to enable people to deploy OpenStack - large/small scale,
homo/heterogeneous hardware, typical or a bit more specific use-cases.

As an underlying paradigm of how to install cloud - awesome idea,
awesome concept, it works. But the user doesn't care about how it is
being deployed for him. He cares about getting what he wants/needs. And
we shouldn't go that far that we violently force him to treat his
infrastructure as cloud. I believe that the possibility to
change/control - if needed - is very important and we should care.

And what is key for us is to *enable* users - not to prevent them from
using our deployment tool, because it doesn't work for their
requirements.

> If we can agree on that, then I think it would be sufficient to say
> that we want a mechanism to allow UI users to deal with heterogeneous
> nodes, and that mechanism must use nova-scheduler. In my mind, that's
> what resource classes and node profiles are intended for.

Not arguing on this point. Though that mechanism should also support
cases where the user specifies a role for a node / removes a node from a
role. The rest of the nodes, which I don't care about, should be handled
by nova-scheduler.

> One possible objection might be: nova scheduler doesn't have the
> appropriate filter that we need to separate out two nodes. In that
> case, I would say that needs to be taken up with nova developers.

Give it to Nova guys to fix it... What if that user's need is an
undercloud-specific requirement? Why should Nova guys care? What should
our unhappy user do until then? Use another tool? Will he be willing to
come back to our tool once it is ready?

I can also see other use-cases. It can be distribution based on power
sockets, networking connections, etc. We can't think of all the ways our
users will need.

> b) Terminology
>
> It feels a bit like some of the disagreement comes from people using
> different words for the same thing. For example, the wireframes already
> detail a UI where Robert's roles come first, but I think that message
> was confused because I mentioned "node types" in the requirements.
>
> So could we come to some agreement on what the most exact terminology
> would be? I've listed some examples below, but I'm sure there are more.
>
> node type | role

+1 role

> management node | ?
> resource node | ?
> unallocated | available | undeployed

+1 unallocated

> create a node distribution | size the deployment

* Distribute nodes

> resource classes | ?

Service classes?

> node profiles | ?

> So when we talk about 'unallocated Nodes', the implication is that
> users 'allocate Nodes', but they don't: they size roles, and after
> doing all that there may be some Nodes that are - yes - unallocated, or
> have nothing scheduled to them. So... I'm not debating that we should
> have a list of free hardware - we totally should - I'm debating how we
> frame it. 'Available Nodes' or 'Undeployed machines' or whatever.

The allocation can happen automatically, so from my point of view I
don't see a big problem with the 'allocate' term.

> I just want to get away from talking about something ([manual]
> allocation) that we don't offer.

We don't at the moment, but we should :)

-- Jarda

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/09/12 21:22, Robert Collins wrote:
>>> Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
>>> stats
>>
>> For registration it is just the Management MAC address which is needed,
>> right? Or does Ironic also need the IP? I think that the MAC address
>> might be enough; we can display the IP in the node details later on.
>
> Ironic needs all the details I listed today. Management MAC is not
> currently used at all, but would be needed in future when we tackle
> IPMI IP managed by Neutron.

OK, I will reflect that in the wireframes for the UI.

>>>> * Auto-discovery during undercloud install process (M)
>>>> * Monitoring
>>>>   * assignment, availability, status
>>>>   * capacity, historical statistics (M)
>>>
>>> Why is this under 'nodes'? I challenge the idea that it should be
>>> there. We will need to surface some stuff about nodes, but the
>>> underlying idea is to take a cloud approach here - so we're monitoring
>>> services, that happen to be on nodes. There is room to monitor nodes,
>>> as an undercloud feature set, but lets be very very specific about
>>> what is sitting at what layer.
>>
>> We need both - we need to track services but also the state of nodes
>> (CPU, RAM, network bandwidth, etc). So in the node detail you should be
>> able to track both.
>
> Those are instance characteristics, not node characteristics. An
> instance is software running on a Node, and the amount of CPU/RAM/NIC
> utilisation is specific to that software while it's on that Node, not
> to future or past instances running on that Node.

I think this is a minor detail. A Node has a certain CPU/RAM/NIC
capacity and an instance is consuming it. Either way it is important for
us to display this utilization in the UI as well as service statistics.

>>>> * Resource nodes
>>>
>>> ^ nodes is again confusing layers - nodes are what things are deployed
>>> to, but they aren't the entry point
>>
>> Can you, please, be a bit more specific here? I don't understand this
>> note.
>
> By the way, can you get your email client to insert > before the text
> you are replying to rather than HTML | marks? Hard to tell what I wrote
> and what you did :).

Oh right, sure, sorry. Should be fixed ;)

> By that note I meant, that Nodes are not resources, Resource instances
> run on Nodes. Nodes are the generic pool of hardware we can deploy
> things onto.

Well right, this is the terminology. From my point of view, resources
for the overcloud are the instances which are running on Nodes. Once we
deploy the nodes with appropriate software they become Resource Nodes
(from the unallocated pool). If this terminology is confusing already
then we should fix it. Any suggestions for improvements?

>>>> * Unallocated nodes
>>>
>>> This implies an 'allocation' step, that we don't have - how about
>>> 'Idle nodes' or something.
>>
>> It can be auto-allocation. I don't see a problem with the 'unallocated'
>> term.
>
> Ok, it's not a biggy. I do think it will frame things poorly and lead
> to an expectation about how TripleO works that doesn't match how it
> does, but we can change it later if I'm right, and if I'm wrong, well
> it won't be the first time :).

I think we will figure it out in the other thread (where we talk about
allocation). Anyway - I am interested in how differently you would
formulate Unallocated / Resource / Management Nodes? Maybe yours is
better :)

-- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/09/12 17:15, Tzu-Mainn Chen wrote:
>>>> - As an infrastructure administrator, Anna wants to be able to
>>>> unallocate a node from a deployment.
>>>
>>> Why? What's her motivation? One plausible one for me is 'a machine
>>> needs to be serviced, so Anna wants to remove it from the deployment
>>> to avoid causing user-visible downtime.' So let's say that: Anna needs
>>> to be able to take machines out of service so they can be maintained
>>> or disposed of.
>>
>> A node being serviced is a different user story for me. I believe we
>> are still 'fighting' here with two approaches, and I believe we need
>> both. We can't only provide a way of 'give us resources, we will do the
>> magic'. Yes, this is the preferred way - especially for large
>> deployments - but we also need a fallback so that the user can say: no,
>> this node doesn't belong to the class, I don't want it there -
>> unassign. Or: I need to have this node there - assign.
>
> Just for clarification - the wireframes don't cover individual nodes
> being manually assigned, do they? I thought the concession to manual
> control was entirely through resource classes and node profiles, which
> are still parameters to be passed through to the nova-scheduler filter.
> To me, that's very different from manual assignment.
>
> Mainn

It's all doable, and the wireframes are prepared for manual assignment
as well, Mainn. I just was not designing the details for now, since we
are going to focus on auto-distribution first. But I will cover this use
case in later iterations of the wireframes.

Cheers
-- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Thanks for the explanation! I'm going to claim that the thread revolves around two main areas of disagreement. Then I'm going to propose a way through: a) Manual Node Assignment I think that everyone is agreed that automated node assignment through nova-scheduler is by far the most ideal case; there's no disagreement there. The disagreement comes from whether we need manual node assignment or not. I would argue that we need to step back and take a look at the real use case: heterogeneous nodes. If there are literally no characteristics that differentiate nodes A and B, then why do we care which gets used for what? Why do we need to manually assign one? If we can agree on that, then I think it would be sufficient to say that we want a mechanism to allow UI users to deal with heterogeneous nodes, and that mechanism must use nova-scheduler. In my mind, that's what resource classes and node profiles are intended for. One possible objection might be: nova scheduler doesn't have the appropriate filter that we need to separate out two nodes. In that case, I would say that needs to be taken up with nova developers. b) Terminology It feels a bit like some of the disagreement come from people using different words for the same thing. For example, the wireframes already details a UI where Robert's roles come first, but I think that message was confused because I mentioned "node types" in the requirements. So could we come to some agreement on what the most exact terminology would be? I've listed some examples below, but I'm sure there are more. node type | role management node | ? resource node | ? unallocated | available | undeployed create a node distribution | size the deployment resource classes | ? node profiles | ? Mainn - Original Message - > On 10 December 2013 09:55, Tzu-Mainn Chen wrote: > >> >* created as part of undercloud install process > > >> By that note I meant, that Nodes are not resources, Resource instances > >> run on Nodes. 
Nodes are the generic pool of hardware we can deploy > >> things onto. > > > > I don't think "resource nodes" is intended to imply that nodes are > > resources; rather, it's supposed to > > indicate that it's a node where a resource instance runs. It's supposed to > > separate it from "management node" > > and "unallocated node". > > So the question is are we looking at /nodes/ that have a /current > role/, or are we looking at /roles/ that have some /current nodes/. > > My contention is that the role is the interesting thing, and the nodes > is the incidental thing. That is, as a sysadmin, my hierarchy of > concerns is something like: > A: are all services running > B: are any of them in a degraded state where I need to take prompt > action to prevent a service outage [might mean many things: - software > update/disk space criticals/a machine failed and we need to scale the > cluster back up/too much load] > C: are there any planned changes I need to make [new software deploy, > feature request from user, replacing a faulty machine] > D: are there long term issues sneaking up on me [capacity planning, > machine obsolescence] > > If we take /nodes/ as the interesting thing, and what they are doing > right now as the incidental thing, it's much harder to map that onto > the sysadmin concerns. If we start with /roles/ then can answer: > A: by showing the list of roles and the summary stats (how many > machines, service status aggregate), role level alerts (e.g. 
nova-api > is not responding) > B: by showing the list of roles and more detailed stats (overall > load, response times of services, tickets against services > and a list of in trouble instances in each role - instances with > alerts against them - low disk, overload, failed service, > early-detection alerts from hardware > C: probably out of our remit for now in the general case, but we need > to enable some things here like replacing faulty machines > D: by looking at trend graphs for roles (not machines), but also by > looking at the hardware in aggregate - breakdown by age of machines, > summary data for tickets filed against instances that were deployed to > a particular machine > > C: and D: are (F) category work, but for all but the very last thing, > it seems clear how to approach this from a roles perspective. > > I've tried to approach this using /nodes/ as the starting point, and > after two terrible drafts I've deleted the section. I'd love it if > someone could show me how it would work:) > > >> > * Unallocated nodes > >> > > >> > This implies an 'allocation' step, that we don't have - how about > >> > 'Idle nodes' or something. > >> > > >> > It can be auto-allocation. I don't see problem with 'unallocated' term. > >> > >> Ok, it's not a biggy. I do think it will frame things poorly and lead > >> to an expectation about how TripleO works that doesn't match how it > >> does, but we can change it later if I'm right, and if I'm wrong, well it won't be the first time :)
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 10 December 2013 10:57, Jay Dobies wrote: >> >> So we have: >> - node - a physical general purpose machine capable of running in >> many roles. Some nodes may have hardware layout that is particularly >> useful for a given role. >> - role - a specific workload we want to map onto one or more nodes. >> Examples include 'undercloud control plane', 'overcloud control >> plane', 'overcloud storage', 'overcloud compute' etc. >> - instance - A role deployed on a node - this is where work actually >> happens. >> - scheduling - the process of deciding which role is deployed on which >> node. > > > This glossary is really handy to make sure we're all speaking the same > language. > > >> The way TripleO works is that we defined a Heat template that lays out >> policy: 5 instances of 'overcloud control plane please', '20 >> hypervisors' etc. Heat passes that to Nova, which pulls the image for >> the role out of Glance, picks a node, and deploys the image to the >> node. >> >> Note in particular the order: Heat -> Nova -> Scheduler -> Node chosen. >> >> The user action is not 'allocate a Node to 'overcloud control plane', >> it is 'size the control plane through heat'. >> >> So when we talk about 'unallocated Nodes', the implication is that >> users 'allocate Nodes', but they don't: they size roles, and after >> doing all that there may be some Nodes that are - yes - unallocated, > > > I'm not sure if I should ask this here or to your point above, but what > about multi-role nodes? Is there any piece in here that says "The policy > wants 5 instances but I can fit two of them on this existing underutilized > node and three of them on unallocated nodes" or since it's all at the image > level you get just what's in the image and that's the finest-level of > granularity? The way we handle that today is to create a composite role that says 'overcloud-compute+cinder storage', for instance - because image is the level of granularity. 
If/when we get automatic container subdivision - see the other really interesting long-term thread - we could subdivide, but I'd still do that using image as the level of granularity, it's just that we'd have the host image + the container images. >> or have nothing scheduled to them. So... I'm not debating that we >> should have a list of free hardware - we totally should - I'm debating >> how we frame it. 'Available Nodes' or 'Undeployed machines' or >> whatever. I just want to get away from talking about something >> ([manual] allocation) that we don't offer. > > > My only concern here is that we're not talking about cloud users, we're > talking about admins adminning (we'll pretend it's a word, come with me) a > cloud. To a cloud user, "give me some power so I can do some stuff" is a > safe use case if I trust the cloud I'm running on. I trust that the cloud > provider has taken the proper steps to ensure that my CPU isn't in New York > and my storage in Tokyo. Sure :) > To the admin setting up an overcloud, they are the ones providing that trust > to eventual cloud users. That's where I feel like more visibility and > control are going to be desired/appreciated. > > I admit what I just said isn't at all concrete. Might even be flat out > wrong. I was never an admin, I've just worked on sys management software > long enough to have the opinion that their levels of OCD are legendary. I > can't shake this feeling that someone is going to slap some fancy new > jacked-up piece of hardware onto the network and have a specific purpose > they are going to want to use it for. But maybe that's antiquated thinking > on my part. I think concrete use cases are the only way we'll get light at the end of the tunnel. So lets say someone puts a new bit of fancy kit onto their network and wants it for e.g. GPU VM instances only. Thats a reasonable desire. The basic stuff we're talking about so far is just about saying each role can run on some set of undercloud flavors. 
If that new bit of kit has the same coarse metadata as other kit, Nova can't tell it apart. So the way to solve the problem is to: a) teach Ironic about the specialness of the node (e.g. a tag 'GPU'); b) teach Nova that there is a flavor that maps to the presence of that specialness; and c) teach Nova that other flavors may not map to that specialness. Then in Tuskar whatever Nova configuration is needed to use that GPU is a special role ('GPU compute' for instance), and only that role would be given that flavor to use. That special config probably means membership in a host aggregate, with an overcloud flavor that specifies that aggregate, which means at the TripleO level we need to put the aggregate in the config metadata for that role, and the admin does a one-time setup in the Nova Horizon UI to configure their GPU compute flavor. This isn't 'manual allocation' to me - it's surfacing the capabilities from the bottom ('has GPU') and the constraints from the top ('needs GPU') and letting Nova and Heat sort it out. -Rob --
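The bottom-up capability / top-down constraint matching described here can be sketched as a toy filter. Everything below is invented for illustration - the tag names, the flavor structure, and the function are not Nova's actual filter-scheduler API:

```python
# Toy sketch of capability/constraint matching: Ironic surfaces a
# capability tag ('has GPU'), the flavor expresses a constraint
# ('needs GPU'), and the scheduler intersects the two.
# All names here are hypothetical, not real Nova/Ironic data structures.

def schedulable_nodes(nodes, flavor):
    """Return the nodes whose capability tags satisfy the flavor's constraints."""
    return [n for n in nodes
            if flavor["required_tags"] <= n["tags"]]  # set-subset test

nodes = [
    {"name": "node-1", "tags": {"GPU"}},
    {"name": "node-2", "tags": set()},
]
gpu_flavor = {"name": "gpu.compute", "required_tags": {"GPU"}}
plain_flavor = {"name": "baremetal", "required_tags": set()}

print([n["name"] for n in schedulable_nodes(nodes, gpu_flavor)])    # ['node-1']
print([n["name"] for n in schedulable_nodes(nodes, plain_flavor)])  # ['node-1', 'node-2']
```

In a real deployment the 'GPU' capability would live in Ironic's node properties and the constraint would be expressed via flavors and host aggregates, as Rob describes - the point of the sketch is only that neither end involves a human assigning nodes by hand.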
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 10 December 2013 09:55, Tzu-Mainn Chen wrote: >> >* created as part of undercloud install process >> By that note I meant, that Nodes are not resources, Resource instances >> run on Nodes. Nodes are the generic pool of hardware we can deploy >> things onto. > > I don't think "resource nodes" is intended to imply that nodes are resources; > rather, it's supposed to > indicate that it's a node where a resource instance runs. It's supposed to > separate it from "management node" > and "unallocated node". So the question is: are we looking at /nodes/ that have a /current role/, or are we looking at /roles/ that have some /current nodes/? My contention is that the role is the interesting thing, and the nodes are the incidental thing. That is, as a sysadmin, my hierarchy of concerns is something like: A: are all services running B: are any of them in a degraded state where I need to take prompt action to prevent a service outage [might mean many things: software update/disk space criticals/a machine failed and we need to scale the cluster back up/too much load] C: are there any planned changes I need to make [new software deploy, feature request from user, replacing a faulty machine] D: are there long-term issues sneaking up on me [capacity planning, machine obsolescence] If we take /nodes/ as the interesting thing, and what they are doing right now as the incidental thing, it's much harder to map that onto the sysadmin concerns. If we start with /roles/ then we can answer: A: by showing the list of roles and the summary stats (how many machines, service status aggregate), role level alerts (e.g. 
nova-api is not responding) B: by showing the list of roles and more detailed stats (overall load, response times of services, tickets against services, and a list of in-trouble instances in each role - instances with alerts against them - low disk, overload, failed service, early-detection alerts from hardware) C: probably out of our remit for now in the general case, but we need to enable some things here like replacing faulty machines D: by looking at trend graphs for roles (not machines), but also by looking at the hardware in aggregate - breakdown by age of machines, summary data for tickets filed against instances that were deployed to a particular machine C: and D: are (F) category work, but for all but the very last thing, it seems clear how to approach this from a roles perspective. I've tried to approach this using /nodes/ as the starting point, and after two terrible drafts I've deleted the section. I'd love it if someone could show me how it would work :) >> > * Unallocated nodes >> > >> > This implies an 'allocation' step, that we don't have - how about >> > 'Idle nodes' or something. >> > >> > It can be auto-allocation. I don't see problem with 'unallocated' term. >> >> Ok, it's not a biggy. I do think it will frame things poorly and lead >> to an expectation about how TripleO works that doesn't match how it >> does, but we can change it later if I'm right, and if I'm wrong, well >> it won't be the first time :). >> > > I'm interested in what the distinction you're making here is. I'd rather get > things > defined correctly the first time, and it's very possible that I'm missing a > fundamental > definition here. So we have: - node - a physical general purpose machine capable of running in many roles. Some nodes may have hardware layout that is particularly useful for a given role. - role - a specific workload we want to map onto one or more nodes. 
Examples include 'undercloud control plane', 'overcloud control plane', 'overcloud storage', 'overcloud compute' etc. - instance - A role deployed on a node - this is where work actually happens. - scheduling - the process of deciding which role is deployed on which node. The way TripleO works is that we define a Heat template that lays out policy: 5 instances of 'overcloud control plane please', '20 hypervisors' etc. Heat passes that to Nova, which pulls the image for the role out of Glance, picks a node, and deploys the image to the node. Note in particular the order: Heat -> Nova -> Scheduler -> Node chosen. The user action is not 'allocate a Node to overcloud control plane', it is 'size the control plane through heat'. So when we talk about 'unallocated Nodes', the implication is that users 'allocate Nodes', but they don't: they size roles, and after doing all that there may be some Nodes that are - yes - unallocated, or have nothing scheduled to them. So... I'm not debating that we should have a list of free hardware - we totally should - I'm debating how we frame it. 'Available Nodes' or 'Undeployed machines' or whatever. I just want to get away from talking about something ([manual] allocation) that we don't offer. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud
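The ordering Rob emphasizes (Heat -> Nova -> Scheduler -> Node chosen) can be made concrete with a toy simulation. Every name below is invented for illustration; none of it is a real Heat or Nova API:

```python
# Minimal simulation of the flow described above: the operator sizes
# roles in a Heat template, Heat asks Nova for that many instances,
# and Nova's scheduler picks the nodes. The operator never allocates
# a node directly - leftover nodes are simply unscheduled.

template = {"overcloud-control": 5, "overcloud-compute": 20}  # policy: role -> count

free_nodes = [f"node-{i}" for i in range(30)]  # stand-in for Ironic's registered pool

def nova_schedule(role, count, pool):
    """Stand-in for Nova: take `count` nodes from the pool for `role`."""
    chosen, pool[:] = pool[:count], pool[count:]
    return [(role, node) for node in chosen]

instances = []
for role, count in template.items():          # Heat walks the sized roles...
    instances += nova_schedule(role, count, free_nodes)  # ...and Nova picks nodes

# 25 instances deployed; 5 nodes remain with nothing scheduled to them.
print(len(instances), len(free_nodes))  # 25 5
```

The point of the sketch is the direction of control: the user sizes roles in the template, and "unallocated" nodes fall out of the process rather than being a user-facing allocation step.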
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
> >* created as part of undercloud install process > >* can create additional management nodes (F) > > * Resource nodes > > > > ^ nodes is again confusing layers - nodes are > > what things are deployed to, but they aren't the entry point > > > > Can you, please be a bit more specific here? I don't understand this note. > > By the way, can you get your email client to insert > before the text > you are replying to rather than HTML | marks? Hard to tell what I > wrote and what you did :). > > By that note I meant, that Nodes are not resources, Resource instances > run on Nodes. Nodes are the generic pool of hardware we can deploy > things onto. I don't think "resource nodes" is intended to imply that nodes are resources; rather, it's supposed to indicate that it's a node where a resource instance runs. It's supposed to separate it from "management node" and "unallocated node". > > * Unallocated nodes > > > > This implies an 'allocation' step, that we don't have - how about > > 'Idle nodes' or something. > > > > It can be auto-allocation. I don't see problem with 'unallocated' term. > > Ok, it's not a biggy. I do think it will frame things poorly and lead > to an expectation about how TripleO works that doesn't match how it > does, but we can change it later if I'm right, and if I'm wrong, well > it won't be the first time :). > I'm interested in what the distinction you're making here is. I'd rather get things defined correctly the first time, and it's very possible that I'm missing a fundamental definition here. Mainn ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 9 December 2013 23:56, Jaromir Coufal wrote: > > On 2013/07/12 01:59, Robert Collins wrote: > >* Creation > * Manual registration > * hardware specs from Ironic based on mac address (M) > > Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory > stats > > For registration it is just Management MAC address which is needed right? Or > does Ironic need also IP? I think that MAC address might be enough, we can > display IP in details of node later on. Ironic needs all the details I listed today. Management MAC is not currently used at all, but would be needed in future when we tackle IPMI IP managed by Neutron. > * Auto-discovery during undercloud install process (M) >* Monitoring >* assignment, availability, status >* capacity, historical statistics (M) > > Why is this under 'nodes'? I challenge the idea that it should be > there. We will need to surface some stuff about nodes, but the > underlying idea is to take a cloud approach here - so we're monitoring > services, that happen to be on nodes. There is room to monitor nodes, > as an undercloud feature set, but lets be very very specific about > what is sitting at what layer. > > We need both - we need to track services but also state of nodes (CPU, RAM, > Network bandwidth, etc). So in node detail you should be able to track both. Those are instance characteristics, not node characteristics. An instance is software running on a Node, and the amount of CPU/RAM/NIC utilisation is specific to that software while it's on that Node, not to future or past instances running on that Node. >* created as part of undercloud install process >* can create additional management nodes (F) > * Resource nodes > > ^ nodes is again confusing layers - nodes are > what things are deployed to, but they aren't the entry point > > Can you, please be a bit more specific here? I don't understand this note. 
By the way, can you get your email client to insert > before the text you are replying to rather than HTML | marks? Hard to tell what I wrote and what you did :). By that note I meant, that Nodes are not resources, Resource instances run on Nodes. Nodes are the generic pool of hardware we can deploy things onto. > * searchable by status, name, cpu, memory, and all attributes from > ironic > * can be allocated as one of four node types > > Not by users though. We need to stop thinking of this as 'what we do > to nodes' - Nova/Ironic operate on nodes, we operate on Heat > templates. > > Discussed in other threads, but I still believe (and I am not alone) that we > need to allow 'force nodes'. I'll respond in the other thread :). > * Unallocated nodes > > This implies an 'allocation' step, that we don't have - how about > 'Idle nodes' or something. > > It can be auto-allocation. I don't see problem with 'unallocated' term. Ok, it's not a biggy. I do think it will frame things poorly and lead to an expectation about how TripleO works that doesn't match how it does, but we can change it later if I'm right, and if I'm wrong, well it won't be the first time :). -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
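Rob's list of what Ironic wants for manual registration today (IPMI address + a MAC per NIC + disk/cpu/memory stats) could be bundled roughly as below. This is a hypothetical helper, not the actual Ironic enrollment API; the property keys are merely modeled on Ironic-style names:

```python
# Sketch of the minimum data a manual-registration form would need to
# hand to Ironic, per the list in the message above. Illustrative only.

def registration_record(ipmi_address, nic_macs, disk_gb, cpus, ram_mb):
    """Validate and bundle the data needed to enroll one node."""
    if not nic_macs:
        raise ValueError("at least one NIC MAC is required")
    return {
        "ipmi_address": ipmi_address,   # management interface, not just a MAC
        "nic_macs": list(nic_macs),     # one entry per NIC
        "properties": {                  # hardware stats used for scheduling
            "local_gb": disk_gb,
            "cpus": cpus,
            "memory_mb": ram_mb,
        },
    }

node = registration_record("10.0.0.5", ["52:54:00:12:34:56"], 500, 8, 16384)
print(sorted(node))  # ['ipmi_address', 'nic_macs', 'properties']
```

Note the distinction made in the thread: the management MAC alone is not enough today, and the IPMI IP itself would only become Neutron-managed in the future.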
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
> > - As an infrastructure administrator, Anna wants to be able to unallocate a > > node from a deployment. > > > Why? Whats her motivation. One plausible one for me is 'a machine > > > needs to be serviced so Anna wants to remove it from the deployment to > > > avoid causing user visible downtime.' So lets say that: Anna needs to > > > be able to take machines out of service so they can be maintained or > > > disposed of. > > Node being serviced is a different user story for me. > I believe we are still 'fighting' here with two approaches and I believe we > need both. We can't only provide a way 'give us resources we will do a > magic'. Yes this is preferred way - especially for large deployments, but we > also need a fallback so that user can say - no, this node doesn't belong to > the class, I don't want it there - unassign. Or I need to have this node > there - assign. Just for clarification - the wireframes don't cover individual nodes being manually assigned, do they? I thought the concession to manual control was entirely through resource classes and node profiles, which are still parameters to be passed through to the nova-scheduler filter. To me, that's very different from manual assignment. Mainn ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
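Mainn's distinction - node profiles as parameters handed to a nova-scheduler filter, rather than manual assignment of individual nodes - can be illustrated with a toy minimum-spec filter. The names and interface are invented; this is not the real scheduler filter API:

```python
# Toy version of a resource-class "node profile" acting as a scheduler
# filter: the profile states minimum hardware specs, and only nodes
# meeting all of them remain candidates for that class. No node is
# ever picked by hand - the profile only narrows the scheduler's choices.

def profile_filter(nodes, profile):
    """Keep only nodes that meet every minimum in the profile."""
    return [n for n in nodes
            if all(n.get(key, 0) >= minimum for key, minimum in profile.items())]

nodes = [
    {"name": "big", "cpus": 16, "memory_mb": 65536},
    {"name": "small", "cpus": 2, "memory_mb": 4096},
]
storage_profile = {"cpus": 4, "memory_mb": 8192}  # hypothetical class profile

print([n["name"] for n in profile_filter(nodes, storage_profile)])  # ['big']
```

An empty profile filters nothing out, which matches the default behaviour when a resource class has no profile attached.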
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 09/12/13 18:01, Jay Dobies wrote: >> I believe we are still 'fighting' here with two approaches and I believe >> we need both. We can't only provide a way 'give us resources we will do >> a magic'. Yes this is preferred way - especially for large deployments, >> but we also need a fallback so that user can say - no, this node doesn't >> belong to the class, I don't want it there - unassign. Or I need to have >> this node there - assign. > > +1 to this. I think there are still a significant amount of admins out > there that are really opposed to magic and want that fine-grained > control. Even if they don't use it that frequently, in my experience > they want to know it's there in the event they need it (and will often > dream up a case that they'll need it). +1 to the responses to the 'automagic' vs 'manual' discussion. The latter is in fact only really possible in small deployments. But that's not to say it is not a valid use case. Perhaps we need to split it altogether into two use cases. At least we should have a level of agreement here and register blueprints for both: for Icehouse the auto selection of which services go onto which nodes (i.e. allocation of services to nodes is entirely transparent). For post Icehouse allow manual allocation of services to nodes. This last bit may also coincide with any work being done in Ironic/Nova scheduler which will make this allocation prettier than the current force_nodes situation. > > I'm absolutely for pushing the magic approach as the preferred use. And > in large deployments that's where people are going to see the biggest > gain. The fine-grained approach can even be pushed off as a future > feature. But I wouldn't be surprised to see people asking for it and I'd > like to at least be able to say it's been talked about. > - As an infrastructure administrator, Anna wants to be able to view the history of nodes that have been in a deployment. >>> Why? This is super generic and could mean anything. 
>> I believe this has something to do with 'archived nodes'. But correct me >> if I am wrong. >> >> -- Jarda ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 12/06/2013 09:39 PM, Tzu-Mainn Chen wrote: Thanks for the comments and questions! I fully expect that this list of requirements will need to be fleshed out, refined, and heavily modified, so the more the merrier. Comments inline: *** Requirements are assumed to be targeted for Icehouse, unless marked otherwise: (M) - Maybe Icehouse, dependency on other in-development features (F) - Future requirement, after Icehouse * NODES Note that everything in this section should be Ironic API calls. * Creation * Manual registration * hardware specs from Ironic based on mac address (M) Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory stats * IP auto populated from Neutron (F) Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here. * Auto-discovery during undercloud install process (M) * Monitoring * assignment, availability, status * capacity, historical statistics (M) Why is this under 'nodes'? I challenge the idea that it should be there. We will need to surface some stuff about nodes, but the underlying idea is to take a cloud approach here - so we're monitoring services, that happen to be on nodes. There is room to monitor nodes, as an undercloud feature set, but lets be very very specific about what is sitting at what layer. That's a fair point. At the same time, the UI does want to monitor both services and the nodes that the services are running on, correct? I would think that a user would want this. Would it be better to explicitly split this up into two separate requirements? That was my understanding as well, that Tuskar would not only care about the services of the undercloud but the health of the actual hardware on which it's running. As I write that I think you're correct, two separate requirements feels much more explicit in how that's different from elsewhere in OpenStack. 
* Management node (where triple-o is installed) This should be plural :) - TripleO isn't a single service to be installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron, etc. I misspoke here - this should be "where the undercloud is installed". My current understanding is that our initial release will only support the undercloud being installed onto a single node, but my understanding could very well be flawed. * created as part of undercloud install process * can create additional management nodes (F) * Resource nodes ^ nodes is again confusing layers - nodes are what things are deployed to, but they aren't the entry point * searchable by status, name, cpu, memory, and all attributes from ironic * can be allocated as one of four node types Not by users though. We need to stop thinking of this as 'what we do to nodes' - Nova/Ironic operate on nodes, we operate on Heat templates. Right, I didn't mean to imply that users would be doing this allocation. But once Nova does this allocation, the UI does want to be aware of how the allocation is done, right? That's what this requirement meant. * compute * controller * object storage * block storage * Resource class - allows for further categorization of a node type * each node type specifies a single default resource class * allow multiple resource classes per node type (M) Whats a node type? Compute/controller/object storage/block storage. Is another term besides "node type" more accurate? * optional node profile for a resource class (M) * acts as filter for nodes that can be allocated to that class (M) I'm not clear on this - you can list the nodes that have had a particular thing deployed on them; we probably can get a good answer to being able to see what nodes a particular flavor can deploy to, but we don't want to be second guessing the scheduler.. 
Correct; the goal here is to provide a way through the UI to send additional filtering requirements that will eventually be passed into the scheduler, allowing the scheduler to apply additional filters. * nodes can be viewed by node types * additional group by status, hardware specification *Instances* - e.g. hypervisors, storage, block storage etc. * controller node type Again, need to get away from node type here. * each controller node will run all openstack services * allow each node to run specified service (F) * breakdown by workload (percentage of cpu used per node) (M) * Unallocated nodes This implies an 'allocation' step, that we don't have - how about 'Idle nodes' or something. Is it imprecise to say that nodes are allocated by the scheduler? Would something like 'active/idle' be better? * Archived nodes (F) * Will be se
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
I believe we are still 'fighting' here with two approaches and I believe we need both. We can't only provide a way 'give us resources we will do a magic'. Yes this is preferred way - especially for large deployments, but we also need a fallback so that user can say - no, this node doesn't belong to the class, I don't want it there - unassign. Or I need to have this node there - assign. +1 to this. I think there are still a significant amount of admins out there that are really opposed to magic and want that fine-grained control. Even if they don't use it that frequently, in my experience they want to know it's there in the event they need it (and will often dream up a case that they'll need it). I'm absolutely for pushing the magic approach as the preferred use. And in large deployments that's where people are going to see the biggest gain. The fine-grained approach can even be pushed off as a future feature. But I wouldn't be surprised to see people asking for it and I'd like to at least be able to say it's been talked about. - As an infrastructure administrator, Anna wants to be able to view the history of nodes that have been in a deployment. Why? This is super generic and could mean anything. I believe this has something to do with 'archived nodes'. But correct me if I am wrong. -- Jarda ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 07/12/13 04:42, Tzu-Mainn Chen wrote: >> On 7 December 2013 08:15, Jay Dobies wrote: >>> Disclaimer: I'm very new to the project, so apologies if some of my >>> questions have been already answered or flat out don't make sense. >> >> >> NP :) >> >> * optional node profile for a resource class (M) * acts as filter for nodes that can be allocated to that class (M) >>> >>> >>> To my understanding, once this is in Icehouse, we'll have to support >>> upgrades. If this filtering is pushed off, could we get into a situation >>> where an allocation created in Icehouse would no longer be valid in >>> Icehouse+1 once these filters are in place? If so, we might want to make it >>> more of a priority to get them in place earlier and not eat the headache of >>> addressing these sorts of integrity issues later. >> >> We need to be wary of over-implementing now; a lot of the long term >> picture is moving Tuskar prototype features into proper homes like >> Heat and Nova; so the more we implement now the more we have to move. >> * Unallocated nodes >>> >>> >>> Is there more still being flushed out here? Things like: >>> * Listing unallocated nodes >>> * Unallocating a previously allocated node (does this make it a vanilla >>> resource or does it retain the resource type? is this the only way to >>> change >>> a node's resource type?) >> >> Nodes don't have resource types. Nodes are machines Ironic knows >> about, and thats all they are. > > Once nodes are assigned by nova scheduler, would it be accurate to say that > they > have an implicit resource type? Or am I missing the point entirely? > >>> * Unregistering nodes from Tuskar's inventory (I put this under >>> unallocated >>> under the assumption that the workflow will be an explicit unallocate >>> before >>> unregister; I'm not sure if this is the same as "archive" below). >> >> Tuskar shouldn't have an inventory of nodes. 
> > Would it be correct to say that Ironic has an inventory of nodes, and that we > may > want to remove a node from Ironic's inventory? Right, in which case (this needs to be clarified): Tuskar doesn't store info about nodes, but Tuskar (or the Tuskar UI?) uses a client to fetch info directly from Ironic on demand (from the UI). > > Mainn > >> -Rob >> >> >> -- >> Robert Collins >> Distinguished Technologist >> HP Converged Cloud
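Since Tuskar keeps no inventory of its own and the requirements elsewhere in the thread call for nodes "searchable by status, name, cpu, memory, and all attributes from ironic", one plausible reading is that the UI fetches node records from Ironic on demand and filters them client-side. A minimal sketch of that filtering step (the dict fields below are hypothetical stand-ins, not Ironic's actual schema):

```python
# Illustrative sketch only: filter node records (as fetched on demand
# from Ironic) by arbitrary attribute criteria, client-side in the UI.
def filter_nodes(nodes, **criteria):
    """Return the nodes whose attributes match every given criterion."""
    return [
        node for node in nodes
        if all(node.get(key) == value for key, value in criteria.items())
    ]

nodes = [
    {"name": "node-1", "status": "unallocated", "cpu": 8, "memory": 32},
    {"name": "node-2", "status": "allocated", "cpu": 8, "memory": 64},
    {"name": "node-3", "status": "unallocated", "cpu": 16, "memory": 64},
]
print(filter_nodes(nodes, status="unallocated", cpu=8))
```

Because the criteria are plain keyword arguments, "all attributes from ironic" comes for free - any field present in the fetched records is filterable without Tuskar storing anything.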
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 6, 2013, at 8:20 PM, Robert Collins wrote: > On 7 December 2013 09:31, Liz Blanchard wrote: >> This list is great, thanks very much for taking the time to write this up! I >> think a big part of the User Experience design is to take a step back and >> understand the requirements from an end user's point of view…what would they >> want to accomplish by using this UI? This might influence the design in >> certain ways, so I've taken a cut at a set of user stories for the Icehouse >> timeframe based on these requirements that I hope will be useful during >> discussions. >> >> Based on the OpenStack Personas[1], I think that Anna would be the main >> consumer of the TripleO UI, but please let me know if you think otherwise. >> >> - As an infrastructure administrator, Anna needs to deploy or update a set >> of resources that will run OpenStack (This isn't a very specific use case, >> but more of the larger end goal of Anna coming into the UI.) >> - As an infrastructure administrator, Anna expects that the management node >> for the deployment services is already up and running and the status of this >> node is shown in the UI. >> - As an infrastructure administrator, Anna wants to be able to quickly see >> the set of unallocated nodes that she could use for her deployment of >> OpenStack. Ideally, she would not have to manually tell the system about >> these nodes. If she needs to manually register nodes for whatever reason, >> Anna would only want to have to define the essential data needed to register >> these nodes. > > I want to challenge this one. There are two concerns conflated: A) > seeing available resources for scaling up her cloud. B) minimising > effort to enroll additional resources. B) is a no-brainer. For A) > though, as phrased, we're talking about seeing a set of individual > items: but actually, wouldn't aggregated capacity be more useful, > with optional drill down - '400 cores, 2TB RAM, 1PB of disk'? Good point. 
I will update this to read that the user wants to see the available capacity and have the option to drill in further. [1] > >> - As an infrastructure administrator, Anna needs to assign a role to each of >> the necessary nodes in her OpenStack deployment. The nodes could be either >> controller, compute, networking, or storage resources depending on the needs >> of this deployment. > > Definitely not: she needs to deliver a running cloud. Manually saying > 'machine X is a compute node' is confusing an implementation with a > need. She needs to know that her cloud will have enough capacity to > meet her users' needs; she needs to know that it will be resilient > against a wide set of failures (and this might be a dial with > different clouds having different uptime guarantees); she may need to > ensure that some specific hardware configuration is used for storage, > as a performance optimisation. None of those needs imply assigning > roles to machines. > >> - As an infrastructure administrator, Anna wants to review the distribution >> of the nodes that she has assigned before kicking off the "Deploy" task. > > If by distribution you mean the top level stats (15 control nodes, 200 > hypervisors, etc) - then I agree. If you mean 'node X will be a > hypervisor' - I thoroughly disagree. What does that do for her? We are in agreement, I'd expect the former. I've updated the use case to be more specific. [1] > >> - As an infrastructure administrator, Anna wants to monitor the deployment >> process of all of the nodes that she has assigned. > > I don't think she wants to do that. I think she wants to be told if > there is a problem that needs her intervention to solve - e.g. bad > IPMI details for a node, or a node not responding when asked to boot > via PXE. > >> - As an infrastructure administrator, Anna needs to be able to troubleshoot >> any errors that may occur during the node deployment process. > > Definitely. 
> >> - As an infrastructure administrator, Anna wants to monitor the availability >> and status of each node in her deployment. > > Yes, with the caveat that I think instance is the key thing here for > now; there is a lifecycle aspect where being able to say 'machine X is > having persistent network issues' is very important, and as a long term > thing we should totally aim at that. > >> - As an infrastructure administrator, Anna wants to be able to unallocate a >> node from a deployment. > > Why? What's her motivation? One plausible one for me is 'a machine > needs to be serviced so Anna wants to remove it from the deployment to > avoid causing user visible downtime.' So let's say that: Anna needs to > be able to take machines out of service so they can be maintained or > disposed of. > >> - As an infrastructure administrator, Anna wants to be able to view the >> history of nodes that have been in a deployment. > > Why? This is super generic and could mean anything. > >> - As an infrastructure administrator, Anna ne
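Robert's aggregated-capacity suggestion earlier in the thread ('400 cores, 2TB RAM, 1PB of disk', with optional drill-down to individual nodes) amounts to summing hardware specs across the unallocated nodes. A minimal sketch, with hypothetical node records and field names:

```python
# Illustrative sketch only: present aggregated capacity instead of a
# flat per-node listing, as Robert suggests. Drill-down would simply
# show the underlying `nodes` list.
def aggregate_capacity(nodes):
    """Sum hardware specs across nodes into one capacity summary."""
    totals = {"cores": 0, "ram_gb": 0, "disk_gb": 0}
    for node in nodes:
        for key in totals:
            totals[key] += node[key]
    return totals

nodes = [
    {"cores": 8, "ram_gb": 64, "disk_gb": 1000},
    {"cores": 16, "ram_gb": 128, "disk_gb": 2000},
]
print(aggregate_capacity(nodes))  # {'cores': 24, 'ram_gb': 192, 'disk_gb': 3000}
```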
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 9, 2013, at 8:58 AM, James Slagle wrote: > On Fri, Dec 6, 2013 at 4:55 PM, Matt Wagner wrote: >>> - As an infrastructure administrator, Anna expects that the >>> management node for the deployment services is already up and running >>> and the status of this node is shown in the UI. >> >> The 'management node' here is the undercloud node that Anna is >> interacting with, as I understand it. (Someone correct me if I'm wrong.) >> So it's not a bad idea to show its status, but I guess the mere fact >> that she's using it will indicate that it's operational. > > That's how I read it as well, which assumes that you're using the > undercloud to manage itself. > > FWIW, based on the OpenStack personas I think that Anna would be the > one doing the undercloud setup. So, maybe this use case should be: > > - As an infrastructure administrator, Anna wants to install the > undercloud so she can use the UI. > > That piece is going to be a pretty big part of the entire deployment > process, so I think having a use case for it makes sense. +1. I've added this as the very first use case. > > Nice work on the use cases Liz, thanks for pulling them together. Thanks to all for the great discussion on these use cases. The questions/comments that they've generated are exactly what I was hoping for. I will continue to make updates and refine these[1] based on discussions. Of course, feel free to add to/change these yourself as well. Liz [1] https://wiki.openstack.org/wiki/TripleO/Tuskar/IcehouseUserStories > > -- > -- James Slagle > --
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 06/12/13 04:31, Tzu-Mainn Chen wrote:
> Hey all,
>
> I've attempted to spin out the requirements behind Jarda's excellent wireframes
> (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> Hopefully this can add some perspective on both the wireframes and the needed
> changes to the tuskar-api.
>
> All comments are welcome!
>
> Thanks,
> Tzu-Mainn Chen
>
> *** Requirements are assumed to be targeted for Icehouse, unless marked otherwise:
>    (M) - Maybe Icehouse, dependency on other in-development features
>    (F) - Future requirement, after Icehouse
>
> * NODES
>    * Creation
>       * Manual registration
>          * hardware specs from Ironic based on mac address (M)
>          * IP auto populated from Neutron (F)
>       * Auto-discovery during undercloud install process (M)
>    * Monitoring
>       * assignment, availability, status
>       * capacity, historical statistics (M)
>    * Management node (where triple-o is installed)
>       * created as part of undercloud install process
>       * can create additional management nodes (F)
>    * Resource nodes
>       * searchable by status, name, cpu, memory, and all attributes from ironic
>       * can be allocated as one of four node types
>          * compute
>          * controller
>          * object storage
>          * block storage
>       * Resource class - allows for further categorization of a node type
>          * each node type specifies a single default resource class
>          * allow multiple resource classes per node type (M)
>          * optional node profile for a resource class (M)
>             * acts as filter for nodes that can be allocated to that class (M)
>       * nodes can be viewed by node types
>          * additional group by status, hardware specification
>       * controller node type
>          * each controller node will run all openstack services
>             * allow each node to run specified service (F)
>          * breakdown by workload (percentage of cpu used per node) (M)
>    * Unallocated nodes
>    * Archived nodes (F)
>       * Will be separate openstack service (F)
>
> * DEPLOYMENT
>    * multiple deployments allowed (F)
>       * initially just one
>    * deployment specifies a node distribution across node types
>       * node distribution can be updated after creation
>    * deployment configuration, used for initial creation only
>       * defaulted, with no option to change
>       * allow modification (F)
>    * review distribution map (F)
>    * notification when a deployment is ready to go or whenever something changes
>
> * DEPLOYMENT ACTION
>    * Heat template generated on the fly
>       * hardcoded images
>          * allow image selection (F)
>       * pre-created template fragments for each node type
>       * node type distribution affects generated template

Sorry, I am a bit late to the discussion. FYI, there are two sides to these previous points: 1) a temporary solution using merge.py from Tuskar and the tripleo-heat-templates repo (Icehouse, imo), and 2) doing it 'properly' with the merge functionality pushed into Heat (F, imo). For 1), various bits are in play, if interested: /#/c/56947/ (Make merge.py invokable), /#/c/58823/ (Make merge.py installable) and /#/c/52045/ (WIP: sketch of what using merge.py looks like for tuskar) - this last one needs updating and thought. Also /#/c/58229/ and /#/c/57210/, which need some more thought.

>    * nova scheduler allocates nodes
>       * filters based on resource class and node profile information (M)
>    * Deployment action can create or update
>    * status indicator to determine overall state of deployment
>       * status indicator for nodes as well
>       * status includes 'time left' (F)
>
> * NETWORKS (F)
> * IMAGES (F)
> * LOGS (F)
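The 'Heat template generated on the fly' requirement above combines pre-created template fragments per node type with the requested node distribution. The sketch below is not the actual merge.py from the reviews listed above - just an illustration of the idea, with made-up fragment contents:

```python
# Illustrative sketch only - NOT the real merge.py. The idea behind the
# temporary solution: keep a pre-created template fragment per node
# type and stamp out one copy per requested node, merging the results
# into a single Heat-style template.
def generate_template(fragments, distribution):
    """Build a combined resources dict from per-node-type fragments.

    fragments:    {node_type: resource_definition}
    distribution: {node_type: count}
    """
    resources = {}
    for node_type, count in distribution.items():
        for index in range(count):
            resources["%s%d" % (node_type, index)] = fragments[node_type]
    return {"resources": resources}

fragments = {
    "controller": {"type": "OS::Nova::Server", "image": "overcloud-control"},
    "compute": {"type": "OS::Nova::Server", "image": "overcloud-compute"},
}
template = generate_template(fragments, {"controller": 1, "compute": 2})
print(sorted(template["resources"]))  # ['compute0', 'compute1', 'controller0']
```

This also shows why 'node type distribution affects generated template': changing the counts changes the set of generated resources, which is exactly the part that will eventually move into Heat proper.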
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 9, 2013, at 4:57 AM, Jaromir Coufal wrote: > > On 2013/07/12 02:20, Robert Collins wrote: >>> - As an infrastructure administrator, Anna needs to assign a role to each >>> of the necessary nodes in her OpenStack deployment. The nodes could be >>> either controller, compute, networking, or storage resources depending on >>> the needs of this deployment. >> Definitely not: she needs to deliver a running cloud. Manually saying >> 'machine X is a compute node' is confusing an implementation with a >> need. She needs to know that her cloud will have enough capacity to >> meet her users' needs; she needs to know that it will be resilient >> against a wide set of failures (and this might be a dial with >> different clouds having different uptime guarantees); she may need to >> ensure that some specific hardware configuration is used for storage, >> as a performance optimisation. None of those needs imply assigning >> roles to machines. > Yes, in an ideal world and in large deployments. But there might be cases when Anna > will need to say: deploy storage to this specific node. Not arguing against a > policy-based approach, but we also need to cover manual control > (forcing a node to take some role). Perhaps the use case is that Anna would want to define the different capacities that her cloud deployment will need? You are both right, though; we don't want to force the user to manually select which nodes will run which services, but we should allow it for cases in which it's needed. I've updated the use case as an attempt to clear this up. [1] > >>> - As an infrastructure administrator, Anna wants to monitor the deployment >>> process of all of the nodes that she has assigned. >> I don't think she wants to do that. I think she wants to be told if >> there is a problem that needs her intervention to solve - e.g. bad >> IPMI details for a node, or a node not responding when asked to boot >> via PXE. 
> I think by this user story Liz wanted to capture that Anna wants to see if > the deployment process is still in progress or if it has > finished/failed, etc. Which I agree with. I don't think that she will sit and > watch what is happening. Yes, definitely. I've updated this use case to reflect reality in that Anna would not sit there and actively monitor, but rather she would want to ultimately make sure that there weren't any errors during the deployment process. [1] > >> - As an infrastructure administrator, Anna wants to be able to unallocate a >> node from a deployment. >> Why? What's her motivation? One plausible one for me is 'a machine >> needs to be serviced so Anna wants to remove it from the deployment to >> avoid causing user visible downtime.' So let's say that: Anna needs to >> be able to take machines out of service so they can be maintained or >> disposed of. > Node being serviced is a different user story for me. > > I believe we are still 'fighting' here with two approaches and I believe we > need both. We can't only provide a way 'give us resources we will do a > magic'. Yes this is preferred way - especially for large deployments, but we > also need a fallback so that user can say - no, this node doesn't belong to > the class, I don't want it there - unassign. Or I need to have this node > there - assign. This is a great question, Robert. I think the reason you bring up for Anna wanting to remove a node is actually more of a "Disable node" action. This way she could potentially bring it back up after the maintenance is done. I will add some more details to this use case to try to clarify. [1] > >>> - As an infrastructure administrator, Anna wants to be able to view the >>> history of nodes that have been in a deployment. >> Why? This is super generic and could mean anything. > I believe this has something to do with 'archived nodes'. But correct me if I > am wrong. 
I was assuming it would be in case the user wants to go back to view the history of a certain node. Potentially the user could bring an archived node back online? Although maybe at this point it would just be rediscovered? Thanks, Liz [1] https://wiki.openstack.org/wiki/TripleO/Tuskar/IcehouseUserStories > > -- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 9, 2013, at 4:29 AM, Jaromir Coufal wrote: > > On 2013/06/12 22:55, Matt Wagner wrote: >>> - As an infrastructure administrator, Anna wants to review the >>> distribution of the nodes that she has assigned before kicking off >>> the "Deploy" task. >> What does she expect to see here on the review screen that she didn't >> see on the previous screens, if anything? Is this just a summation, or >> is she expecting to see things like which node will get which role? (I'd >> argue for the former; I don't know that we can predict the latter.) > At the beginning, just summation. Later (when we have nova-scheduler > reservation) we can get the real distribution of which node is taking which > role. Yes, the idea is that Anna wants to see some representation of what the distribution of nodes would be (how many would be assigned to each profile) before kicking off the "deploy" action. > >>> - As an infrastructure administrator, Anna wants to monitor the >>> deployment process of all of the nodes that she has assigned. >> I think there's an implied "...through the UI" here, versus tailing log >> files to watch state. Does she just expect to see states like "Pending", >> "Deploying", or "Finished", versus, say, having the full logs shown in >> the UI? (I'd vote 'yes'.) > For simplified view - yes, only change of states and progress bar. However > log should be available. I'd vote 'yes' as well. These are definitely design decisions we should be making based on what we know of our end user. Although some use cases like troubleshooting might point towards using logs, this one definitely seems like a UI addition. I'll update the use case to be more specific. [1] > >>> - As an infrastructure administrator, Anna needs to be able to >>> troubleshoot any errors that may occur during the deployment of nodes >>> process. >> I'm not sure that the "...through the UI" implication I mentioned above >> extends here. 
(IMHO) I assume that if things fail, Anna might be okay >> with us showing a message that $foo failed on $bar, and she should try >> looking in /var/log/$baz for full details. Does that seem fair? (At >> least early on.) > As said above, for simplified views, it is OK to say $foo failed on $bar, but > she should be able to track the problem - a logs section in the UI. Yes, this is meant to be through the UI. I've updated the use case. [1] > >>> - As an infrastructure administrator, Anna wants to be able to view >>> the history of nodes that have been in a deployment. >> Why does she want to view history of past nodes? >> >> Note that I'm not arguing against this; it's just not abundantly clear >> to me what she'll be using this information for. Does she want a history >> to check off an "Audit log" checkbox, or will she be looking to extract >> certain data from this history? > The short answer is graphs - history of utilization of the class, etc. I've updated this one to be more specific about the reasons why historic nodes are important to Anna. [1] Thanks for all of the feedback, Liz [1] https://wiki.openstack.org/wiki/TripleO/Tuskar/IcehouseUserStories > > -- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Fri, Dec 6, 2013 at 4:55 PM, Matt Wagner wrote: >> - As an infrastructure administrator, Anna expects that the >> management node for the deployment services is already up and running >> and the status of this node is shown in the UI. > > The 'management node' here is the undercloud node that Anna is > interacting with, as I understand it. (Someone correct me if I'm wrong.) > So it's not a bad idea to show its status, but I guess the mere fact > that she's using it will indicate that it's operational. That's how I read it as well, which assumes that you're using the undercloud to manage itself. FWIW, based on the OpenStack personas I think that Anna would be the one doing the undercloud setup. So, maybe this use case should be: - As an infrastructure administrator, Anna wants to install the undercloud so she can use the UI. That piece is going to be a pretty big part of the entire deployment process, so I think having a use case for it makes sense. Nice work on the use cases Liz, thanks for pulling them together. -- -- James Slagle --
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Mainn, Thanks for pulling this together. > * NODES >* Management node (where triple-o is installed) >* created as part of undercloud install process I think getting the undercloud installed/deployed should be a requirement for Icehouse. I'm not sure if you meant that or were assuming that it would already be done :). I'd like to see a simpler process than building the seed vm, starting it, deploying the undercloud, etc. But, that's something we can work to define if others agree as well. >* can create additional management nodes (F) By this, do you mean using the undercloud to scale itself? e.g., using nova on the undercloud to launch an additional undercloud compute node, etc. I like that concept, and don't see any reason why that wouldn't be technically possible. > * DEPLOYMENT ACTION >* Heat template generated on the fly > * hardcoded images > * allow image selection (F) So, I think this may be what Robert was getting at, but I think this one should be M or possibly even committed to Icehouse. I think it's very likely we're going to need to update which image is used to do the deployment, e.g., if you build a new image to pick up a security update. IIRC, the image is just referenced by name in the template. So, maybe the process is just: * build the new image * rename/delete the old image * upload the new image with the required name (overcloud-compute, overcloud-control) However, having a more polished image selection process would be nice. -- -- James Slagle --
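James's point that the image is just referenced by name means a rebuilt image uploaded under the same Glance name gets picked up on the next deployment without editing the template. A simplified, HOT-style illustration of that name reference (the actual tripleo-heat-templates of the time used the older CFN-style syntax; the resource and property values here are illustrative, not the real templates):

```yaml
# Simplified, illustrative fragment - not the actual tripleo-heat-templates.
# The server resource names its image, so replacing the image registered in
# Glance as "overcloud-compute" changes what gets deployed, with no template
# edit required.
resources:
  NovaCompute0:
    type: OS::Nova::Server
    properties:
      image: overcloud-compute   # looked up by name in Glance
      flavor: baremetal
```

This is why the replace-under-the-same-name workflow works as an interim answer until real image selection lands.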
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 12/09/2013 11:56 AM, Jaromir Coufal wrote: On 2013/07/12 01:59, Robert Collins wrote: * Monitoring * assignment, availability, status * capacity, historical statistics (M) Why is this under 'nodes'? I challenge the idea that it should be there. We will need to surface some stuff about nodes, but the underlying idea is to take a cloud approach here - so we're monitoring services, that happen to be on nodes. There is room to monitor nodes, as an undercloud feature set, but let's be very, very specific about what is sitting at what layer. We need both - we need to track services but also the state of nodes (CPU, RAM, network bandwidth, etc). So in the node detail you should be able to track both. I agree. Monitoring services and monitoring nodes are both valid features for Tuskar. I think splitting it into two separate requirements as Mainn suggested would make a lot of sense. * searchable by status, name, cpu, memory, and all attributes from ironic * can be allocated as one of four node types Not by users though. We need to stop thinking of this as 'what we do to nodes' - Nova/Ironic operate on nodes, we operate on Heat templates. Discussed in other threads, but I still believe (and I am not alone) that we need to allow 'force nodes'. Yeah, having both approaches would be nice to have. Instead of using the existing 'force nodes' implementation, wouldn't it be better/cleaner to implement support for it in Nova and Heat? Imre
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/07/12 01:59, Robert Collins wrote: * Creation * Manual registration * hardware specs from Ironic based on mac address (M) Ironic today will want the IPMI address + MAC for each NIC + disk/cpu/memory stats. For registration, it is just the management MAC address which is needed, right? Or does Ironic also need an IP? I think that the MAC address might be enough; we can display the IP in the node's details later on. * IP auto populated from Neutron (F) Do you mean the IPMI IP? I'd say IPMI address managed by Neutron here. +1 * Auto-discovery during undercloud install process (M) * Monitoring * assignment, availability, status * capacity, historical statistics (M) Why is this under 'nodes'? I challenge the idea that it should be there. We will need to surface some stuff about nodes, but the underlying idea is to take a cloud approach here - so we're monitoring services, that happen to be on nodes. There is room to monitor nodes, as an undercloud feature set, but let's be very, very specific about what is sitting at what layer. We need both - we need to track services but also the state of nodes (CPU, RAM, network bandwidth, etc). So in the node detail you should be able to track both. * Management node (where triple-o is installed) This should be plural :) - TripleO isn't a single service to be installed - we've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron, etc. * created as part of undercloud install process * can create additional management nodes (F) * Resource nodes ^ nodes is again confusing layers - nodes are what things are deployed to, but they aren't the entry point. Can you please be a bit more specific here? I don't understand this note. * searchable by status, name, cpu, memory, and all attributes from ironic * can be allocated as one of four node types Not by users though. We need to stop thinking of this as 'what we do to nodes' - Nova/Ironic operate on nodes, we operate on Heat templates. 
Discussed in other threads, but I still believe (and I am not alone) that we need to allow 'force nodes'. * Unallocated nodes This implies an 'allocation' step that we don't have - how about 'Idle nodes' or something? It can be auto-allocation. I don't see a problem with the 'unallocated' term. * defaulted, with no option to change * allow modification (F) * review distribution map (F) * notification when a deployment is ready to go or whenever something changes Is this an (M)? Might be M but with higher priority. I see it in the middle. But if we have to decide, it can be M. -- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/07/12 02:20, Robert Collins wrote: - As an infrastructure administrator, Anna needs to assign a role to each of the necessary nodes in her OpenStack deployment. The nodes could be either controller, compute, networking, or storage resources depending on the needs of this deployment. Definitely not: she needs to deliver a running cloud. Manually saying 'machine X is a compute node' is confusing an implementation with a need. She needs to know that her cloud will have enough capacity to meet her users' needs; she needs to know that it will be resilient against a wide set of failures (and this might be a dial with different clouds having different uptime guarantees); she may need to ensure that some specific hardware configuration is used for storage, as a performance optimisation. None of those needs imply assigning roles to machines. Yes, in an ideal world and in large deployments. But there might be cases when Anna will need to say: deploy storage to this specific node. Not arguing against a policy-based approach, but we also need to cover manual control (forcing a node to take some role). - As an infrastructure administrator, Anna wants to monitor the deployment process of all of the nodes that she has assigned. I don't think she wants to do that. I think she wants to be told if there is a problem that needs her intervention to solve - e.g. bad IPMI details for a node, or a node not responding when asked to boot via PXE. I think by this user story Liz wanted to capture that Anna wants to see if the deployment process is still in progress or if it has finished/failed, etc. Which I agree with. I don't think that she will sit and watch what is happening. - As an infrastructure administrator, Anna wants to be able to unallocate a node from a deployment. Why? What's her motivation? One plausible one for me is 'a machine needs to be serviced so Anna wants to remove it from the deployment to avoid causing user visible downtime.' 
So let's say that: Anna needs to be able to take machines out of service so they can be maintained or disposed of. Node being serviced is a different user story for me. I believe we are still 'fighting' here with two approaches, and I believe we need both. We can't only provide a 'give us resources, we will do the magic' way. Yes, this is the preferred way - especially for large deployments - but we also need a fallback so that the user can say: no, this node doesn't belong to the class, I don't want it there - unassign. Or: I need to have this node there - assign. - As an infrastructure administrator, Anna wants to be able to view the history of nodes that have been in a deployment. Why? This is super generic and could mean anything. I believe this has something to do with 'archived nodes'. But correct me if I am wrong. -- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/06/12 22:55, Matt Wagner wrote: - As an infrastructure administrator, Anna wants to review the distribution of the nodes that she has assigned before kicking off the "Deploy" task. What does she expect to see here on the review screen that she didn't see on the previous screens, if anything? Is this just a summation, or is she expecting to see things like which node will get which role? (I'd argue for the former; I don't know that we can predict the latter.) At the beginning, just a summation. Later (when we have nova-scheduler reservation) we can get the real distribution of which node is taking which role. - As an infrastructure administrator, Anna wants to monitor the deployment process of all of the nodes that she has assigned. I think there's an implied "...through the UI" here, versus tailing log files to watch state. Does she just expect to see states like "Pending", "Deploying", or "Finished", versus, say, having the full logs shown in the UI? (I'd vote 'yes'.) For the simplified view - yes, only change of states and a progress bar. However, logs should be available. - As an infrastructure administrator, Anna needs to be able to troubleshoot any errors that may occur during the node deployment process. I'm not sure that the "...through the UI" implication I mentioned above extends here. 
Does she want a history to check off an "Audit log" checkbox, or will she be looking to extract certain data from this history? The short answer is graphs - history of utilization of the class, etc. -- Jarda
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 2013/06/12 21:26, Tzu-Mainn Chen wrote: * can be allocated as one of four node types It's pretty clear by the current verbiage but I'm going to ask anyway: "one and only one"? Yep, that's right! Confirming. One and only one. My gut reaction is that we want to bite this off sooner rather than later. This will have data model and API implications that, even if we don't commit to it for Icehouse, should still be in our minds during it, so it might make sense to make it a first class thing to just nail down now. That is entirely correct, which is one reason it's on the list of requirements. The forthcoming API design will have to account for it. Not recreating the entire data model between releases is a key goal :) Well yeah, that's why we should try to think in a longer-term and wireframes are covering also a bit more than might land in Icehouse. So that we are aware of future direction and we don't have to completely rebuild underlying models later on. * optional node profile for a resource class (M) * acts as filter for nodes that can be allocated to that class (M) To my understanding, once this is in Icehouse, we'll have to support upgrades. If this filtering is pushed off, could we get into a situation where an allocation created in Icehouse would no longer be valid in Icehouse+1 once these filters are in place? If so, we might want to make it more of a priority to get them in place earlier and not eat the headache of addressing these sorts of integrity issues later. Hm, can you be a bit more specific about how the allocation created in I might no longer be valid in I+1? That's true. The problem is that to my understanding, the filters we'd need in nova-scheduler are not yet fully in place. I think at the moment there are 'extra params' which we might use to some level. But yes, AFAIK there is missing part for filtered scheduling in nova. I also think that this is an issue that we'll need to address no matter what. 
Even once filters exist, if a user applies a filter *after* nodes are allocated, we'll need to do something clever if the already-allocated nodes don't meet the filter criteria.

Well, here is the thing. Once nodes are allocated, you can get a warning that those nodes in the resource class are not fulfilling the criteria (if they were changed), but that's all. It will be up to the user to decide whether to keep them in or unallocate them. The profiles are important when the decision 'which node can get in' is being made.

* nodes can be viewed by node types
  * additional group by status, hardware specification
* controller node type
  * each controller node will run all openstack services
    * allow each node to run specified service (F)
  * breakdown by workload (percentage of cpu used per node) (M)
* Unallocated nodes

Is there more still being fleshed out here? Things like:
* Listing unallocated nodes
* Unallocating a previously allocated node (does this make it a vanilla resource or does it retain the resource type? is this the only way to change a node's resource type?)

If we use a policy-based approach then yes, this is correct. First unallocate a node, and then increase the number of resources in another class. But I believe that we need to keep control over the infrastructure and not rely only on policies. So I hope we can get to something like 'reallocate'/'allocate manually', which will force a node to be part of a specific class.

* Unregistering nodes from Tuskar's inventory (I put this under unallocated under the assumption that the workflow will be an explicit unallocate before unregister; I'm not sure if this is the same as "archive" below).

Ah, you're entirely right. I'll add these to the list.

* Archived nodes (F)

Can you elaborate a bit more on what this is?

To be honest, I'm a bit fuzzy about this myself; Jarda mentioned that there was an OpenStack service in the process of being planned that would handle this requirement. Jarda, can you detail a bit? 
So this is based on historical data. At the moment, there is no service which would keep this type of data (might be a new project?). Since Tuskar will not only be deploying but also monitoring your deployment, it is important to have historical data available. If a user removes some nodes from the infrastructure, they would lose all the data and we would not be able to generate graphs. That's why archived nodes = nodes which were registered in the past but are no longer available. -- Jarda ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
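The profile-as-filter behaviour discussed in this thread — a profile gates which nodes *can* be allocated to a resource class, while already-allocated nodes that stop matching only trigger a warning, leaving the decision to the user — can be sketched in plain Python. Every name here is illustrative only; this is not Tuskar or nova-scheduler API:

```python
# Illustrative sketch of the "profile as filter" idea from this thread.
# A profile is a dict of minimum hardware requirements; nodes are dicts
# of their specs. All names are hypothetical, not actual project API.

def node_matches_profile(node, profile):
    """True if the node meets every minimum declared in the profile."""
    return all(node.get(key, 0) >= minimum for key, minimum in profile.items())

def apply_profile(profile, unallocated, allocated):
    """Filter unallocated nodes; warn (only) about mismatched allocated ones.

    Returns (nodes eligible for allocation, list of warning strings).
    Already-allocated nodes are never ejected automatically.
    """
    allocatable = [n for n in unallocated if node_matches_profile(n, profile)]
    warnings = [
        "node %s no longer matches the class profile" % n["id"]
        for n in allocated
        if not node_matches_profile(n, profile)
    ]
    return allocatable, warnings
```

For example, a profile of `{"cpus": 8, "memory_mb": 16384}` would admit only nodes with at least 8 CPUs and 16 GB of RAM, while an already-allocated 4-CPU node would merely produce a warning.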
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
> On 7 December 2013 08:15, Jay Dobies wrote: > > Disclaimer: I'm very new to the project, so apologies if some of my > > questions have been already answered or flat out don't make sense. > > > NP :) > > > >> * optional node profile for a resource class (M) > >> * acts as filter for nodes that can be allocated to that > >> class (M) > > > > > > To my understanding, once this is in Icehouse, we'll have to support > > upgrades. If this filtering is pushed off, could we get into a situation > > where an allocation created in Icehouse would no longer be valid in > > Icehouse+1 once these filters are in place? If so, we might want to make it > > more of a priority to get them in place earlier and not eat the headache of > > addressing these sorts of integrity issues later. > > We need to be wary of over-implementing now; a lot of the long term > picture is moving Tuskar prototype features into proper homes like > Heat and Nova; so the more we implement now the more we have to move. > > >> * Unallocated nodes > > > > > > Is there more still being flushed out here? Things like: > > * Listing unallocated nodes > > * Unallocating a previously allocated node (does this make it a vanilla > > resource or does it retain the resource type? is this the only way to > > change > > a node's resource type?) > > Nodes don't have resource types. Nodes are machines Ironic knows > about, and thats all they are. Once nodes are assigned by nova scheduler, would it be accurate to say that they have an implicit resource type? Or am I missing the point entirely? > > * Unregistering nodes from Tuskar's inventory (I put this under > > unallocated > > under the assumption that the workflow will be an explicit unallocate > > before > > unregister; I'm not sure if this is the same as "archive" below). > > Tuskar shouldn't have an inventory of nodes. Would it be correct to say that Ironic has an inventory of nodes, and that we may want to remove a node from Ironic's inventory? 
Mainn > -Rob > > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Thanks for the comments and questions! I fully expect that this list of requirements will need to be fleshed out, refined, and heavily modified, so the more the merrier. Comments inline: > > > > *** Requirements are assumed to be targeted for Icehouse, unless marked > > otherwise: > >(M) - Maybe Icehouse, dependency on other in-development features > >(F) - Future requirement, after Icehouse > > > > * NODES > > Note that everything in this section should be Ironic API calls. > > >* Creation > > * Manual registration > > * hardware specs from Ironic based on mac address (M) > > Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory > stats > > > * IP auto populated from Neutron (F) > > Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here. > > > * Auto-discovery during undercloud install process (M) > >* Monitoring > >* assignment, availability, status > >* capacity, historical statistics (M) > > Why is this under 'nodes'? I challenge the idea that it should be > there. We will need to surface some stuff about nodes, but the > underlying idea is to take a cloud approach here - so we're monitoring > services, that happen to be on nodes. There is room to monitor nodes, > as an undercloud feature set, but lets be very very specific about > what is sitting at what layer. That's a fair point. At the same time, the UI does want to monitor both services and the nodes that the services are running on, correct? I would think that a user would want this. Would it be better to explicitly split this up into two separate requirements? > >* Management node (where triple-o is installed) > > This should be plural :) - TripleO isn't a single service to be > installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron, > etc. I misspoke here - this should be "where the undercloud is installed". 
My current understanding is that our initial release will only support the undercloud being installed onto a single node, but my understanding could very well be flawed. > >* created as part of undercloud install process > >* can create additional management nodes (F) > > * Resource nodes > > ^ nodes is again confusing layers - nodes are > what things are deployed to, but they aren't the entry point > > > * searchable by status, name, cpu, memory, and all attributes from > > ironic > > * can be allocated as one of four node types > > Not by users though. We need to stop thinking of this as 'what we do > to nodes' - Nova/Ironic operate on nodes, we operate on Heat > templates. Right, I didn't mean to imply that users would be doing this allocation. But once Nova does this allocation, the UI does want to be aware of how the allocation is done, right? That's what this requirement meant. > > * compute > > * controller > > * object storage > > * block storage > > * Resource class - allows for further categorization of a node type > > * each node type specifies a single default resource class > > * allow multiple resource classes per node type (M) > > Whats a node type? Compute/controller/object storage/block storage. Is another term besides "node type" more accurate? > > > * optional node profile for a resource class (M) > > * acts as filter for nodes that can be allocated to that > > class (M) > > I'm not clear on this - you can list the nodes that have had a > particular thing deployed on them; we probably can get a good answer > to being able to see what nodes a particular flavor can deploy to, but > we don't want to be second guessing the scheduler.. Correct; the goal here is to provide a way through the UI to send additional filtering requirements that will eventually be passed into the scheduler, allowing the scheduler to apply additional filters. 
> > * nodes can be viewed by node types > > * additional group by status, hardware specification > > *Instances* - e.g. hypervisors, storage, block storage etc. > > > * controller node type > > Again, need to get away from node type here. > > >* each controller node will run all openstack services > > * allow each node to run specified service (F) > >* breakdown by workload (percentage of cpu used per node) (M) > > * Unallocated nodes > > This implies an 'allocation' step, that we don't have - how about > 'Idle nodes' or something. Is it imprecise to say that nodes are allocated by the scheduler? Would something like 'active/idle' be better? > > * Archived nodes (F) > > * Will be separate openstack service (F) > > > > * DEPLOYMENT > >* multiple deployments allowed (F) > > * initially just one > >* deployment specifies a node distribution across no
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 7 December 2013 10:55, Matt Wagner wrote: > The 'management node' here is the undercloud node that Anna is > interacting with, as I understand it. (Someone correct me if I'm wrong.) > So it's not a bad idea to show its status, but I guess the mere fact > that she's using it will indicate that it's operational. There are potentially many such nodes, and Anna will be interacting with some of them; I don't think we can make too many assumptions about what the UI working implies. >> - As an infrastructure administrator, Anna needs to be able to >> troubleshoot any errors that may occur during the deployment of nodes >> process. > > I'm not sure that the "...through the UI" implication I mentioned above > extends here. (IMHO) I assume that if things fail, Anna might be okay > with us showing a message that $foo failed on $bar, and she should try > looking in /var/log/$baz for full details. Does that seem fair? (At > least early on.) I don't think we necessarily need to do anything here other than make sure the system is a) well documented and b) Anna has all the normal sysadmin access to the infrastructure. Her needs can be met by us getting out of the way gracefully; at least in the short term. -Rob ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 7 December 2013 09:31, Liz Blanchard wrote: > This list is great, thanks very much for taking the time to write this up! I > think a big part of the User Experience design is to take a step back and > understand the requirements from an end user's point of view…what would they > want to accomplish by using this UI? This might influence the design in > certain ways, so I've taken a cut at a set of user stories for the Icehouse > timeframe based on these requirements that I hope will be useful during > discussions. > > Based on the OpenStack Personas[1], I think that Anna would be the main > consumer of the TripleO UI, but please let me know if you think otherwise. > > - As an infrastructure administrator, Anna needs to deploy or update a set of > resources that will run OpenStack (This isn't a very specific use case, but > more of the larger end goal of Anna coming into the UI.) > - As an infrastructure administrator, Anna expects that the management node > for the deployment services is already up and running and the status of this > node is shown in the UI. > - As an infrastructure administrator, Anna wants to be able to quickly see > the set of unallocated nodes that she could use for her deployment of > OpenStack. Ideally, she would not have to manually tell the system about > these nodes. If she needs to manually register nodes for whatever reason, > Anna would only want to have to define the essential data needed to register > these nodes. I want to challenge this one. There are two concerns conflated. A) seeing available resources for scaling up her cloud. B) minimising effort to enroll additional resources. B) is a no-brainer. For A) though, as phrased, we're talking about seeing a set of individual items: but actually, wouldn't aggregated capacity be more useful, with optional drill down - '400 cores, 2TB RAM, 1PB of disk'? > - As an infrastructure administrator, Anna needs to assign a role to each of > the necessary nodes in her OpenStack deployment. 
The nodes could be either > controller, compute, networking, or storage resources depending on the needs > of this deployment. Definitely not: she needs to deliver a running cloud. Manually saying 'machine X is a compute node' is confusing an implementation with a need. She needs to know that her cloud will have enough capacity to meet her users needs; she needs to know that it will be resilient against a wide set of failures (and this might be a dial with different clouds having different uptime guarantees); she may need to ensure that some specific hardware configuration is used for storage, as a performance optimisation. None of those needs imply assigning roles to machines. > - As an infrastructure administrator, Anna wants to review the distribution > of the nodes that she has assigned before kicking off the "Deploy" task. If by distribution you mean the top level stats (15 control nodes, 200 hypervisors, etc) - then I agree. If you mean 'node X will be a hypervisor' - I thoroughly disagree. What does that do for her? > - As an infrastructure administrator, Anna wants to monitor the deployment > process of all of the nodes that she has assigned. I don't think she wants to do that. I think she wants to be told if there is a problem that needs her intervention to solve - e.g. bad IPMI details for a node, or a node not responding when asked to boot via PXE. > - As an infrastructure administrator, Anna needs to be able to troubleshoot > any errors that may occur during the deployment of nodes process. Definitely. > - As an infrastructure administrator, Anna wants to monitor the availability > and status of each node in her deployment. Yes, with the caveat that I think instance is the key thing here for now; there is a lifecycle aspect where being able to say 'machine X is having persistent network issues' is very important, as a long term thing we should totally aim at that. 
> - As an infrastructure administrator, Anna wants to be able to unallocate a > node from a deployment. Why? What's her motivation? One plausible one for me is 'a machine needs to be serviced so Anna wants to remove it from the deployment to avoid causing user-visible downtime.' So let's say that: Anna needs to be able to take machines out of service so they can be maintained or disposed of. > - As an infrastructure administrator, Anna wants to be able to view the > history of nodes that have been in a deployment. Why? This is super generic and could mean anything. > - As an infrastructure administrator, Anna needs to be notified of any > important changes to nodes that are in the OpenStack deployment. She does not > want to be spammed with non-important notifications. What sort of changes do you mean here? Thanks for putting this together, I love Personas as a way to make designs concrete and connected to user needs. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud
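Rob's aggregated-capacity idea in this message ('400 cores, 2TB RAM, 1PB of disk', with optional drill-down to individual nodes) boils down to summing per-node specs. A minimal illustrative sketch — the field names are assumptions, not Ironic's actual schema:

```python
# Sketch of an "aggregated capacity" view: summarize registered nodes as
# totals instead of listing each machine, as Rob suggests above.
# Purely illustrative; field names are not the real Ironic node schema.

def aggregate_capacity(nodes):
    """Sum per-node hardware specs into a single capacity summary."""
    totals = {"cores": 0, "memory_mb": 0, "disk_gb": 0}
    for node in nodes:
        for key in totals:
            totals[key] += node.get(key, 0)
    return totals
```

The UI could show only these totals by default and keep the per-node list as the drill-down view.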
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 7 December 2013 09:26, Tzu-Mainn Chen wrote: >> > * Archived nodes (F) >> >> Can you elaborate a bit more on what this is? > > To be honest, I'm a bit fuzzy about this myself; Jarda mentioned that there > was > an OpenStack service in the process of being planned that would handle this > requirement. Jarda, can you detail a bit? Ironic is a hypervisor service, roughly like libvirt+kvm for virtual machines : so it doesn't keep a deep history of whats been deployed where and other similar things : it's not a CMDB. Historical reporting is something to push data for into ceilometer, for instance. Nova has some support for historical data, and it's possible that that minimal approach might fit in Ironic, but I'm super skeptical - Ironic would have nothing to *do* for historical data, so why keep it there at all? -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
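To make Rob's point concrete: the historical data Tuskar wants for graphs (per-class utilization over time, surviving node removal) is just timestamped samples per node — exactly the kind of data a metering service like Ceilometer retains, rather than Ironic. A toy in-memory version of such a store, purely illustrative and not the Ceilometer API:

```python
# Toy sketch of the utilization history a metering service would keep,
# so the UI can still graph a node's past even after it is unregistered
# from Ironic. In-memory and illustrative only; not Ceilometer's API.

from collections import defaultdict

class UtilizationHistory:
    def __init__(self):
        # node_id -> list of (timestamp, cpu_percent) samples
        self._samples = defaultdict(list)

    def record(self, node_id, timestamp, cpu_percent):
        """Append one utilization sample for a node."""
        self._samples[node_id].append((timestamp, cpu_percent))

    def average(self, node_id):
        """Mean CPU utilization for a node, or None if no samples exist."""
        samples = self._samples[node_id]
        if not samples:
            return None
        return sum(pct for _, pct in samples) / len(samples)
```

Because the store is keyed by node id rather than by live inventory, "archived" nodes are simply ids that no longer appear in Ironic but still have samples here.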
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On 7 December 2013 08:15, Jay Dobies wrote: > Disclaimer: I'm very new to the project, so apologies if some of my > questions have been already answered or flat out don't make sense. NP :) >> * optional node profile for a resource class (M) >> * acts as filter for nodes that can be allocated to that >> class (M) > > > To my understanding, once this is in Icehouse, we'll have to support > upgrades. If this filtering is pushed off, could we get into a situation > where an allocation created in Icehouse would no longer be valid in > Icehouse+1 once these filters are in place? If so, we might want to make it > more of a priority to get them in place earlier and not eat the headache of > addressing these sorts of integrity issues later. We need to be wary of over-implementing now; a lot of the long term picture is moving Tuskar prototype features into proper homes like Heat and Nova; so the more we implement now the more we have to move. >> * Unallocated nodes > > > Is there more still being flushed out here? Things like: > * Listing unallocated nodes > * Unallocating a previously allocated node (does this make it a vanilla > resource or does it retain the resource type? is this the only way to change > a node's resource type?) Nodes don't have resource types. Nodes are machines Ironic knows about, and thats all they are. > * Unregistering nodes from Tuskar's inventory (I put this under unallocated > under the assumption that the workflow will be an explicit unallocate before > unregister; I'm not sure if this is the same as "archive" below). Tuskar shouldn't have an inventory of nodes. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Thanks for doing this! On 6 December 2013 15:31, Tzu-Mainn Chen wrote: > Hey all, > > I've attempted to spin out the requirements behind Jarda's excellent > wireframes > (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html). > Hopefully this can add some perspective on both the wireframes and the needed > changes to the tuskar-api. > > All comments are welcome! > > Thanks, > Tzu-Mainn Chen > > > > *** Requirements are assumed to be targeted for Icehouse, unless marked > otherwise: >(M) - Maybe Icehouse, dependency on other in-development features >(F) - Future requirement, after Icehouse > > * NODES Note that everything in this section should be Ironic API calls. >* Creation > * Manual registration > * hardware specs from Ironic based on mac address (M) Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory stats > * IP auto populated from Neutron (F) Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here. > * Auto-discovery during undercloud install process (M) >* Monitoring >* assignment, availability, status >* capacity, historical statistics (M) Why is this under 'nodes'? I challenge the idea that it should be there. We will need to surface some stuff about nodes, but the underlying idea is to take a cloud approach here - so we're monitoring services, that happen to be on nodes. There is room to monitor nodes, as an undercloud feature set, but lets be very very specific about what is sitting at what layer. >* Management node (where triple-o is installed) This should be plural :) - TripleO isn't a single service to be installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron, etc. 
>* created as part of undercloud install process >* can create additional management nodes (F) > * Resource nodes ^ nodes is again confusing layers - nodes are what things are deployed to, but they aren't the entry point > * searchable by status, name, cpu, memory, and all attributes from > ironic > * can be allocated as one of four node types Not by users though. We need to stop thinking of this as 'what we do to nodes' - Nova/Ironic operate on nodes, we operate on Heat templates. > * compute > * controller > * object storage > * block storage > * Resource class - allows for further categorization of a node type > * each node type specifies a single default resource class > * allow multiple resource classes per node type (M) Whats a node type? > * optional node profile for a resource class (M) > * acts as filter for nodes that can be allocated to that > class (M) I'm not clear on this - you can list the nodes that have had a particular thing deployed on them; we probably can get a good answer to being able to see what nodes a particular flavor can deploy to, but we don't want to be second guessing the scheduler.. > * nodes can be viewed by node types > * additional group by status, hardware specification *Instances* - e.g. hypervisors, storage, block storage etc. > * controller node type Again, need to get away from node type here. >* each controller node will run all openstack services > * allow each node to run specified service (F) >* breakdown by workload (percentage of cpu used per node) (M) > * Unallocated nodes This implies an 'allocation' step, that we don't have - how about 'Idle nodes' or something. > * Archived nodes (F) > * Will be separate openstack service (F) > > * DEPLOYMENT >* multiple deployments allowed (F) > * initially just one >* deployment specifies a node distribution across node types I can't parse this. Deployments specify how many instances to deploy in what roles (e.g. 
2 control, 2 storage, 4 block storage, 20 hypervisors), some minor metadata about the instances (such as 'kvm' for the hypervisor, and what undercloud flavors to deploy on). > * node distribution can be updated after creation >* deployment configuration, used for initial creation only Can you enlarge on what you mean here? > * defaulted, with no option to change > * allow modification (F) >* review distribution map (F) >* notification when a deployment is ready to go or whenever something > changes Is this an (M) ? > * DEPLOYMENT ACTION >* Heat template generated on the fly > * hardcoded images > * allow image selection (F) We'll be spinning images up as part of the deployment, I presume - so this is really M, isn't it? or do you mean 'allow supplying images rather than building just in time' ? Or --- I dunno, but lets get some clarity here. > * pre-created template fragments for each node type > * nod
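Rob notes earlier in this message that for manual registration Ironic will want an IPMI address plus a MAC for each NIC plus disk/cpu/memory stats. That suggests a simple completeness check the UI could run before submitting a registration. The field names below are hypothetical, not the actual Ironic schema:

```python
# Sketch of validating a manual node registration against the fields Rob
# lists (IPMI address, MAC per NIC, disk/cpu/memory stats). The field
# names are illustrative assumptions, not Ironic's real API schema.

REQUIRED_FIELDS = ("ipmi_address", "ipmi_username", "ipmi_password",
                   "macs", "cpus", "memory_mb", "disk_gb")

def validate_registration(payload):
    """Return the required fields that are missing or empty."""
    return [field for field in REQUIRED_FIELDS if not payload.get(field)]
```

The UI would refuse to submit (or prompt for) any fields this returns, rather than letting an incomplete node reach Ironic.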
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
The relevant wiki page is here: https://wiki.openstack.org/wiki/TripleO/Tuskar#Icehouse_Planning - Original Message - > That looks really good, thanks for putting that together! > > I'm going to put together a wiki page that consolidates the various Tuskar > planning documents - requirements, user stories, wireframes, etc - so it's > easier to see the whole planning picture. > > Mainn > > - Original Message - > > > > On Dec 5, 2013, at 9:31 PM, Tzu-Mainn Chen wrote: > > > > > Hey all, > > > > > > I've attempted to spin out the requirements behind Jarda's excellent > > > wireframes > > > (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html). > > > Hopefully this can add some perspective on both the wireframes and the > > > needed changes to the tuskar-api. > > > > This list is great, thanks very much for taking the time to write this up! > > I > > think a big part of the User Experience design is to take a step back and > > understand the requirements from an end user's point of view…what would > > they > > want to accomplish by using this UI? This might influence the design in > > certain ways, so I've taken a cut at a set of user stories for the Icehouse > > timeframe based on these requirements that I hope will be useful during > > discussions. > > > > Based on the OpenStack Personas[1], I think that Anna would be the main > > consumer of the TripleO UI, but please let me know if you think otherwise. > > > > - As an infrastructure administrator, Anna needs to deploy or update a set > > of > > resources that will run OpenStack (This isn't a very specific use case, but > > more of the larger end goal of Anna coming into the UI.) > > - As an infrastructure administrator, Anna expects that the management node > > for the deployment services is already up and running and the status of > > this > > node is shown in the UI. 
> > - As an infrastructure administrator, Anna wants to be able to quickly see > > the set of unallocated nodes that she could use for her deployment of > > OpenStack. Ideally, she would not have to manually tell the system about > > these nodes. If she needs to manually register nodes for whatever reason, > > Anna would only want to have to define the essential data needed to > > register > > these nodes. > > - As an infrastructure administrator, Anna needs to assign a role to each > > of > > the necessary nodes in her OpenStack deployment. The nodes could be either > > controller, compute, networking, or storage resources depending on the > > needs > > of this deployment. > > - As an infrastructure administrator, Anna wants to review the distribution > > of the nodes that she has assigned before kicking off the "Deploy" task. > > - As an infrastructure administrator, Anna wants to monitor the deployment > > process of all of the nodes that she has assigned. > > - As an infrastructure administrator, Anna needs to be able to troubleshoot > > any errors that may occur during the deployment of nodes process. > > - As an infrastructure administrator, Anna wants to monitor the > > availability > > and status of each node in her deployment. > > - As an infrastructure administrator, Anna wants to be able to unallocate a > > node from a deployment. > > - As an infrastructure administrator, Anna wants to be able to view the > > history of nodes that have been in a deployment. > > - As an infrastructure administrator, Anna needs to be notified of any > > important changes to nodes that are in the OpenStack deployment. She does > > not want to be spammed with non-important notifications. > > > > Please feel free to comment, change, or add to this list. > > > > [1]https://docs.google.com/document/d/16rkiXWxxgzGT47_Wc6hzIPzO2-s2JWAPEKD0gP2mt7E/edit?pli=1# > > > > Thanks, > > Liz > > > > > > > > All comments are welcome! 
> > > > > > Thanks, > > > Tzu-Mainn Chen > > > > > > > > > > > > *** Requirements are assumed to be targeted for Icehouse, unless marked > > > otherwise: > > > (M) - Maybe Icehouse, dependency on other in-development features > > > (F) - Future requirement, after Icehouse > > > > > > * NODES > > > * Creation > > > * Manual registration > > > * hardware specs from Ironic based on mac address (M) > > > * IP auto populated from Neutron (F) > > > * Auto-discovery during undercloud install process (M) > > > * Monitoring > > > * assignment, availability, status > > > * capacity, historical statistics (M) > > > * Management node (where triple-o is installed) > > > * created as part of undercloud install process > > > * can create additional management nodes (F) > > >* Resource nodes > > >* searchable by status, name, cpu, memory, and all attributes from > > >ironic > > >* can be allocated as one of four node types > > >* compute > > >* controller > > >* object storage > > >
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Thanks, Liz! Seeing things this way is really helpful. (I actually feel like wireframes -> requirements -> user stories is exactly the opposite of how this normally goes, but hitting all of the steps either way makes things much clearer.)

I've raised some questions below. I think many of them aren't aimed at you per se, but are more general things that seeing the user stories has helped me realize we could clarify.

On Fri Dec 6 15:31:36 2013, Liz Blanchard wrote:
> - As an infrastructure administrator, Anna expects that the
> management node for the deployment services is already up and running
> and the status of this node is shown in the UI.

The 'management node' here is the undercloud node that Anna is interacting with, as I understand it. (Someone correct me if I'm wrong.) So it's not a bad idea to show its status, but I guess the mere fact that she's using it will indicate that it's operational.

> - As an infrastructure administrator, Anna wants to review the
> distribution of the nodes that she has assigned before kicking off
> the "Deploy" task.

What does she expect to see here on the review screen that she didn't see on the previous screens, if anything? Is this just a summation, or is she expecting to see things like which node will get which role? (I'd argue for the former; I don't know that we can predict the latter.)

> - As an infrastructure administrator, Anna wants to monitor the
> deployment process of all of the nodes that she has assigned.

I think there's an implied "...through the UI" here, versus tailing log files to watch state. Does she just expect to see states like "Pending", "Deploying", or "Finished", versus, say, having the full logs shown in the UI? (I'd vote 'yes'.)

> - As an infrastructure administrator, Anna needs to be able to
> troubleshoot any errors that may occur during the deployment of nodes
> process.

I'm not sure that the "...through the UI" implication I mentioned above extends here. (IMHO) I assume that if things fail, Anna might be okay with us showing a message that $foo failed on $bar, and she should try looking in /var/log/$baz for full details. Does that seem fair? (At least early on.)

> - As an infrastructure administrator, Anna wants to be able to view
> the history of nodes that have been in a deployment.

Why does she want to view the history of past nodes? Note that I'm not arguing against this; it's just not abundantly clear to me what she'll be using this information for. Does she want a history to check off an "Audit log" checkbox, or will she be looking to extract certain data from this history?

Thanks again for creating these user stories, Liz!

-- 
Matt Wagner
Software Engineer, Red Hat

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
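The coarse per-node states Matt suggests surfacing in the UI ("Pending", "Deploying", "Finished") could be rolled up into a single deployment-level indicator along these lines. This is purely a hypothetical sketch of the idea, not actual Tuskar code; the state names and roll-up rules are assumptions.

```python
from enum import Enum

class NodeState(Enum):
    """Illustrative per-node deployment states (hypothetical, not Tuskar's)."""
    PENDING = "Pending"
    DEPLOYING = "Deploying"
    FINISHED = "Finished"
    ERROR = "Error"

def overall_status(node_states):
    """Roll per-node states up into one deployment-wide status indicator."""
    states = set(node_states)
    if NodeState.ERROR in states:       # any failure dominates
        return NodeState.ERROR
    if NodeState.DEPLOYING in states:   # work in progress anywhere
        return NodeState.DEPLOYING
    if NodeState.PENDING in states:     # nothing started yet on some node
        return NodeState.PENDING
    return NodeState.FINISHED           # every node completed

# One node still deploying -> the deployment as a whole shows "Deploying".
print(overall_status([NodeState.FINISHED, NodeState.DEPLOYING]).value)
```

The point of the sketch is only that a handful of coarse states is enough for the UI Matt describes, with full logs left on disk for troubleshooting.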
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
That looks really good, thanks for putting that together! I'm going to put together a wiki page that consolidates the various Tuskar planning documents - requirements, user stories, wireframes, etc - so it's easier to see the whole planning picture.

Mainn

----- Original Message -----
>
> On Dec 5, 2013, at 9:31 PM, Tzu-Mainn Chen wrote:
>
> > Hey all,
> >
> > I've attempted to spin out the requirements behind Jarda's excellent wireframes
> > (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> > Hopefully this can add some perspective on both the wireframes and the needed changes to the tuskar-api.
>
> This list is great, thanks very much for taking the time to write this up! I think a big part of the User Experience design is to take a step back and understand the requirements from an end user's point of view…what would they want to accomplish by using this UI? This might influence the design in certain ways, so I've taken a cut at a set of user stories for the Icehouse timeframe based on these requirements that I hope will be useful during discussions.
>
> Based on the OpenStack Personas[1], I think that Anna would be the main consumer of the TripleO UI, but please let me know if you think otherwise.
>
> - As an infrastructure administrator, Anna needs to deploy or update a set of resources that will run OpenStack (This isn't a very specific use case, but more of the larger end goal of Anna coming into the UI.)
> - As an infrastructure administrator, Anna expects that the management node for the deployment services is already up and running and the status of this node is shown in the UI.
> - As an infrastructure administrator, Anna wants to be able to quickly see the set of unallocated nodes that she could use for her deployment of OpenStack. Ideally, she would not have to manually tell the system about these nodes. If she needs to manually register nodes for whatever reason, Anna would only want to have to define the essential data needed to register these nodes.
> - As an infrastructure administrator, Anna needs to assign a role to each of the necessary nodes in her OpenStack deployment. The nodes could be either controller, compute, networking, or storage resources depending on the needs of this deployment.
> - As an infrastructure administrator, Anna wants to review the distribution of the nodes that she has assigned before kicking off the "Deploy" task.
> - As an infrastructure administrator, Anna wants to monitor the deployment process of all of the nodes that she has assigned.
> - As an infrastructure administrator, Anna needs to be able to troubleshoot any errors that may occur during the deployment of nodes process.
> - As an infrastructure administrator, Anna wants to monitor the availability and status of each node in her deployment.
> - As an infrastructure administrator, Anna wants to be able to unallocate a node from a deployment.
> - As an infrastructure administrator, Anna wants to be able to view the history of nodes that have been in a deployment.
> - As an infrastructure administrator, Anna needs to be notified of any important changes to nodes that are in the OpenStack deployment. She does not want to be spammed with non-important notifications.
>
> Please feel free to comment, change, or add to this list.
>
> [1] https://docs.google.com/document/d/16rkiXWxxgzGT47_Wc6hzIPzO2-s2JWAPEKD0gP2mt7E/edit?pli=1#
>
> Thanks,
> Liz
>
> > All comments are welcome!
> >
> > Thanks,
> > Tzu-Mainn Chen
> >
> > *** Requirements are assumed to be targeted for Icehouse, unless marked otherwise:
> >    (M) - Maybe Icehouse, dependency on other in-development features
> >    (F) - Future requirement, after Icehouse
> >
> > * NODES
> >   * Creation
> >     * Manual registration
> >       * hardware specs from Ironic based on mac address (M)
> >       * IP auto populated from Neutron (F)
> >     * Auto-discovery during undercloud install process (M)
> >   * Monitoring
> >     * assignment, availability, status
> >     * capacity, historical statistics (M)
> >   * Management node (where triple-o is installed)
> >     * created as part of undercloud install process
> >     * can create additional management nodes (F)
> >   * Resource nodes
> >     * searchable by status, name, cpu, memory, and all attributes from ironic
> >     * can be allocated as one of four node types
> >       * compute
> >       * controller
> >       * object storage
> >       * block storage
> >     * Resource class - allows for further categorization of a node type
> >       * each node type specifies a single default resource class
> >       * allow multiple resource classes per node type (M)
> >       * optional node profile for a resource class (M)
> >         * acts as filter for nodes that can be allocated to that class (M)
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
On Dec 5, 2013, at 9:31 PM, Tzu-Mainn Chen wrote:

> Hey all,
>
> I've attempted to spin out the requirements behind Jarda's excellent wireframes
> (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> Hopefully this can add some perspective on both the wireframes and the needed changes to the tuskar-api.

This list is great, thanks very much for taking the time to write this up! I think a big part of the User Experience design is to take a step back and understand the requirements from an end user's point of view…what would they want to accomplish by using this UI? This might influence the design in certain ways, so I've taken a cut at a set of user stories for the Icehouse timeframe based on these requirements that I hope will be useful during discussions.

Based on the OpenStack Personas[1], I think that Anna would be the main consumer of the TripleO UI, but please let me know if you think otherwise.

- As an infrastructure administrator, Anna needs to deploy or update a set of resources that will run OpenStack (This isn't a very specific use case, but more of the larger end goal of Anna coming into the UI.)
- As an infrastructure administrator, Anna expects that the management node for the deployment services is already up and running and the status of this node is shown in the UI.
- As an infrastructure administrator, Anna wants to be able to quickly see the set of unallocated nodes that she could use for her deployment of OpenStack. Ideally, she would not have to manually tell the system about these nodes. If she needs to manually register nodes for whatever reason, Anna would only want to have to define the essential data needed to register these nodes.
- As an infrastructure administrator, Anna needs to assign a role to each of the necessary nodes in her OpenStack deployment. The nodes could be either controller, compute, networking, or storage resources depending on the needs of this deployment.
- As an infrastructure administrator, Anna wants to review the distribution of the nodes that she has assigned before kicking off the "Deploy" task.
- As an infrastructure administrator, Anna wants to monitor the deployment process of all of the nodes that she has assigned.
- As an infrastructure administrator, Anna needs to be able to troubleshoot any errors that may occur during the deployment of nodes process.
- As an infrastructure administrator, Anna wants to monitor the availability and status of each node in her deployment.
- As an infrastructure administrator, Anna wants to be able to unallocate a node from a deployment.
- As an infrastructure administrator, Anna wants to be able to view the history of nodes that have been in a deployment.
- As an infrastructure administrator, Anna needs to be notified of any important changes to nodes that are in the OpenStack deployment. She does not want to be spammed with non-important notifications.

Please feel free to comment, change, or add to this list.

[1] https://docs.google.com/document/d/16rkiXWxxgzGT47_Wc6hzIPzO2-s2JWAPEKD0gP2mt7E/edit?pli=1#

Thanks,
Liz

> All comments are welcome!
>
> Thanks,
> Tzu-Mainn Chen
>
> *** Requirements are assumed to be targeted for Icehouse, unless marked otherwise:
>    (M) - Maybe Icehouse, dependency on other in-development features
>    (F) - Future requirement, after Icehouse
>
> * NODES
>   * Creation
>     * Manual registration
>       * hardware specs from Ironic based on mac address (M)
>       * IP auto populated from Neutron (F)
>     * Auto-discovery during undercloud install process (M)
>   * Monitoring
>     * assignment, availability, status
>     * capacity, historical statistics (M)
>   * Management node (where triple-o is installed)
>     * created as part of undercloud install process
>     * can create additional management nodes (F)
>   * Resource nodes
>     * searchable by status, name, cpu, memory, and all attributes from ironic
>     * can be allocated as one of four node types
>       * compute
>       * controller
>       * object storage
>       * block storage
>     * Resource class - allows for further categorization of a node type
>       * each node type specifies a single default resource class
>       * allow multiple resource classes per node type (M)
>       * optional node profile for a resource class (M)
>         * acts as filter for nodes that can be allocated to that class (M)
>     * nodes can be viewed by node types
>       * additional group by status, hardware specification
>     * controller node type
>       * each controller node will run all openstack services
>         * allow each node to run specified service (F)
>       * breakdown by workload (percentage of cpu used per node) (M)
>   * Unallocated nodes
>   * Archived nodes (F)
>     * Will be separate openstack service (F)
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Thanks for the comments! Responses inline:

> Disclaimer: I'm very new to the project, so apologies if some of my questions have been already answered or flat out don't make sense.
>
> As I proofread, some of my comments may drift a bit past basic requirements, so feel free to tell me to take certain questions out of this thread into specific discussion threads if I'm getting too detailed.
>
> > *** Requirements are assumed to be targeted for Icehouse, unless marked otherwise:
> >    (M) - Maybe Icehouse, dependency on other in-development features
> >    (F) - Future requirement, after Icehouse
> >
> > * NODES
> >   * Creation
> >     * Manual registration
> >       * hardware specs from Ironic based on mac address (M)
> >       * IP auto populated from Neutron (F)
> >     * Auto-discovery during undercloud install process (M)
> >   * Monitoring
> >     * assignment, availability, status
> >     * capacity, historical statistics (M)
> >   * Management node (where triple-o is installed)
> >     * created as part of undercloud install process
> >     * can create additional management nodes (F)
> >   * Resource nodes
> >     * searchable by status, name, cpu, memory, and all attributes from ironic
> >     * can be allocated as one of four node types
>
> It's pretty clear by the current verbiage but I'm going to ask anyway: "one and only one"?

Yep, that's right!

> >       * compute
> >       * controller
> >       * object storage
> >       * block storage
> >     * Resource class - allows for further categorization of a node type
> >       * each node type specifies a single default resource class
> >       * allow multiple resource classes per node type (M)
>
> My gut reaction is that we want to bite this off sooner rather than later. This will have data model and API implications that, even if we don't commit to it for Icehouse, should still be in our minds during it, so it might make sense to make it a first class thing to just nail down now.

That is entirely correct, which is one reason it's on the list of requirements. The forthcoming API design will have to account for it. Not recreating the entire data model between releases is a key goal :)

> >       * optional node profile for a resource class (M)
> >         * acts as filter for nodes that can be allocated to that class (M)
>
> To my understanding, once this is in Icehouse, we'll have to support upgrades. If this filtering is pushed off, could we get into a situation where an allocation created in Icehouse would no longer be valid in Icehouse+1 once these filters are in place? If so, we might want to make it more of a priority to get them in place earlier and not eat the headache of addressing these sorts of integrity issues later.

That's true. The problem is that, to my understanding, the filters we'd need in nova-scheduler are not yet fully in place. I also think that this is an issue that we'll need to address no matter what. Even once filters exist, if a user applies a filter *after* nodes are allocated, we'll need to do something clever if the already-allocated nodes don't meet the filter criteria.

> >     * nodes can be viewed by node types
> >       * additional group by status, hardware specification
> >     * controller node type
> >       * each controller node will run all openstack services
> >         * allow each node to run specified service (F)
> >       * breakdown by workload (percentage of cpu used per node) (M)
> >   * Unallocated nodes
>
> Is there more still being fleshed out here? Things like:
>  * Listing unallocated nodes
>  * Unallocating a previously allocated node (does this make it a vanilla resource or does it retain the resource type? is this the only way to change a node's resource type?)
>  * Unregistering nodes from Tuskar's inventory (I put this under unallocated under the assumption that the workflow will be an explicit unallocate before unregister; I'm not sure if this is the same as "archive" below).

Ah, you're entirely right. I'll add these to the list.

> >   * Archived nodes (F)
>
> Can you elaborate a bit more on what this is?

To be honest, I'm a bit fuzzy about this myself; Jarda mentioned that there was an OpenStack service in the process of being planned that would handle this requirement. Jarda, can you detail a bit?

Thanks again for the comments!

Mainn
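The "filter applied after allocation" concern Mainn raises above can be made concrete with a small sketch: when a node profile is attached to a resource class that already has nodes allocated, the system could at least detect which existing allocations no longer satisfy the filter. All names here (`Node`, `NodeProfile`, the attribute set) are hypothetical illustrations, not the Tuskar data model or API.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Hypothetical registered node with a couple of Ironic-style attributes."""
    name: str
    cpus: int
    memory_mb: int

@dataclass
class NodeProfile:
    """Hypothetical node profile acting as an allocation filter."""
    min_cpus: int
    min_memory_mb: int

    def matches(self, node):
        return node.cpus >= self.min_cpus and node.memory_mb >= self.min_memory_mb

def nonconforming(allocated_nodes, profile):
    """Names of already-allocated nodes that violate a newly applied profile."""
    return [n.name for n in allocated_nodes if not profile.matches(n)]

nodes = [Node("node-1", 8, 16384), Node("node-2", 2, 4096)]
profile = NodeProfile(min_cpus=4, min_memory_mb=8192)
print(nonconforming(nodes, profile))  # -> ['node-2']
```

What "something clever" means for such nodes (flag them in the UI, trigger re-allocation, or grandfather them in) is exactly the open question in the thread; the sketch only shows that detecting the conflict is cheap.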
Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Disclaimer: I'm very new to the project, so apologies if some of my questions have been already answered or flat out don't make sense.

As I proofread, some of my comments may drift a bit past basic requirements, so feel free to tell me to take certain questions out of this thread into specific discussion threads if I'm getting too detailed.

> *** Requirements are assumed to be targeted for Icehouse, unless marked otherwise:
>    (M) - Maybe Icehouse, dependency on other in-development features
>    (F) - Future requirement, after Icehouse
>
> * NODES
>   * Creation
>     * Manual registration
>       * hardware specs from Ironic based on mac address (M)
>       * IP auto populated from Neutron (F)
>     * Auto-discovery during undercloud install process (M)
>   * Monitoring
>     * assignment, availability, status
>     * capacity, historical statistics (M)
>   * Management node (where triple-o is installed)
>     * created as part of undercloud install process
>     * can create additional management nodes (F)
>   * Resource nodes
>     * searchable by status, name, cpu, memory, and all attributes from ironic
>     * can be allocated as one of four node types

It's pretty clear by the current verbiage but I'm going to ask anyway: "one and only one"?

>       * compute
>       * controller
>       * object storage
>       * block storage
>     * Resource class - allows for further categorization of a node type
>       * each node type specifies a single default resource class
>       * allow multiple resource classes per node type (M)

My gut reaction is that we want to bite this off sooner rather than later. This will have data model and API implications that, even if we don't commit to it for Icehouse, should still be in our minds during it, so it might make sense to make it a first class thing to just nail down now.

>       * optional node profile for a resource class (M)
>         * acts as filter for nodes that can be allocated to that class (M)

To my understanding, once this is in Icehouse, we'll have to support upgrades. If this filtering is pushed off, could we get into a situation where an allocation created in Icehouse would no longer be valid in Icehouse+1 once these filters are in place? If so, we might want to make it more of a priority to get them in place earlier and not eat the headache of addressing these sorts of integrity issues later.

>     * nodes can be viewed by node types
>       * additional group by status, hardware specification
>     * controller node type
>       * each controller node will run all openstack services
>         * allow each node to run specified service (F)
>       * breakdown by workload (percentage of cpu used per node) (M)
>   * Unallocated nodes

Is there more still being fleshed out here? Things like:
 * Listing unallocated nodes
 * Unallocating a previously allocated node (does this make it a vanilla resource or does it retain the resource type? is this the only way to change a node's resource type?)
 * Unregistering nodes from Tuskar's inventory (I put this under unallocated under the assumption that the workflow will be an explicit unallocate before unregister; I'm not sure if this is the same as "archive" below).

>   * Archived nodes (F)

Can you elaborate a bit more on what this is?

>     * Will be separate openstack service (F)
>
> * DEPLOYMENT
>   * multiple deployments allowed (F)
>     * initially just one
>   * deployment specifies a node distribution across node types
>     * node distribution can be updated after creation
>   * deployment configuration, used for initial creation only
>     * defaulted, with no option to change
>     * allow modification (F)
>   * review distribution map (F)
>   * notification when a deployment is ready to go or whenever something changes
>
> * DEPLOYMENT ACTION
>   * Heat template generated on the fly
>     * hardcoded images
>       * allow image selection (F)
>     * pre-created template fragments for each node type
>     * node type distribution affects generated template
>   * nova scheduler allocates nodes
>     * filters based on resource class and node profile information (M)
>   * Deployment action can create or update
>   * status indicator to determine overall state of deployment
>     * status indicator for nodes as well
>     * status includes 'time left' (F)
>
> * NETWORKS (F)
> * IMAGES (F)
> * LOGS (F)
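Matt's unallocate-before-unregister question could be made concrete with a tiny state machine over node lifecycle states. The states and transitions below are purely hypothetical — none of this is settled Tuskar design, and whether an unallocated node retains its resource type is exactly what the question leaves open.

```python
# Hypothetical node lifecycle, assuming the workflow Matt surmises:
# a node must be explicitly unallocated before it can be unregistered.
VALID_TRANSITIONS = {
    "unallocated": {"allocated", "unregistered"},
    "allocated": {"unallocated"},  # no direct allocated -> unregistered jump
    "unregistered": set(),         # terminal: node has left the inventory
}

def transition(state, new_state):
    """Apply a lifecycle transition, rejecting anything not whitelisted."""
    if new_state not in VALID_TRANSITIONS.get(state, set()):
        raise ValueError(f"cannot go from {state!r} to {new_state!r}")
    return new_state

state = "unallocated"
state = transition(state, "allocated")
state = transition(state, "unallocated")   # back to the pool first...
state = transition(state, "unregistered")  # ...only then out of the inventory
print(state)  # -> unregistered
```

Writing the workflow down this way also shows where "archive" might fit: either as a fourth state or as what "unregistered" really means, which is the ambiguity Matt flags.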
[openstack-dev] [TripleO][Tuskar] Icehouse Requirements
Hey all,

I've attempted to spin out the requirements behind Jarda's excellent wireframes
(http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
Hopefully this can add some perspective on both the wireframes and the needed changes to the tuskar-api.

All comments are welcome!

Thanks,
Tzu-Mainn Chen


*** Requirements are assumed to be targeted for Icehouse, unless marked otherwise:
   (M) - Maybe Icehouse, dependency on other in-development features
   (F) - Future requirement, after Icehouse

* NODES
  * Creation
    * Manual registration
      * hardware specs from Ironic based on mac address (M)
      * IP auto populated from Neutron (F)
    * Auto-discovery during undercloud install process (M)
  * Monitoring
    * assignment, availability, status
    * capacity, historical statistics (M)
  * Management node (where triple-o is installed)
    * created as part of undercloud install process
    * can create additional management nodes (F)
  * Resource nodes
    * searchable by status, name, cpu, memory, and all attributes from ironic
    * can be allocated as one of four node types
      * compute
      * controller
      * object storage
      * block storage
    * Resource class - allows for further categorization of a node type
      * each node type specifies a single default resource class
      * allow multiple resource classes per node type (M)
      * optional node profile for a resource class (M)
        * acts as filter for nodes that can be allocated to that class (M)
    * nodes can be viewed by node types
      * additional group by status, hardware specification
    * controller node type
      * each controller node will run all openstack services
        * allow each node to run specified service (F)
      * breakdown by workload (percentage of cpu used per node) (M)
  * Unallocated nodes
  * Archived nodes (F)
    * Will be separate openstack service (F)

* DEPLOYMENT
  * multiple deployments allowed (F)
    * initially just one
  * deployment specifies a node distribution across node types
    * node distribution can be updated after creation
  * deployment configuration, used for initial creation only
    * defaulted, with no option to change
    * allow modification (F)
  * review distribution map (F)
  * notification when a deployment is ready to go or whenever something changes

* DEPLOYMENT ACTION
  * Heat template generated on the fly
    * hardcoded images
      * allow image selection (F)
    * pre-created template fragments for each node type
    * node type distribution affects generated template
  * nova scheduler allocates nodes
    * filters based on resource class and node profile information (M)
  * Deployment action can create or update
  * status indicator to determine overall state of deployment
    * status indicator for nodes as well
    * status includes 'time left' (F)

* NETWORKS (F)
* IMAGES (F)
* LOGS (F)
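The "Heat template generated on the fly" requirement — pre-created template fragments per node type, scaled by the node type distribution — can be sketched in a few lines. The fragment contents, function names, and merge scheme below are illustrative assumptions, not the actual tuskar-api implementation.

```python
# Hypothetical per-node-type template fragments (hardcoded images for now,
# per the requirement; image selection is a future item).
FRAGMENTS = {
    "controller": {"type": "OS::Nova::Server",
                   "properties": {"flavor": "baremetal-control"}},
    "compute": {"type": "OS::Nova::Server",
                "properties": {"flavor": "baremetal-compute"}},
}

def generate_template(distribution):
    """Build a Heat-style template dict from a node type -> count mapping.

    The node type distribution directly determines how many copies of each
    fragment end up in the generated template's resources section.
    """
    resources = {}
    for node_type, count in distribution.items():
        for i in range(count):
            resources[f"{node_type}{i}"] = FRAGMENTS[node_type]
    return {"heat_template_version": "2013-05-23", "resources": resources}

template = generate_template({"controller": 1, "compute": 2})
print(sorted(template["resources"]))  # -> ['compute0', 'compute1', 'controller0']
```

Updating the node distribution after creation would then just mean regenerating the template and letting the deployment action do a Heat stack update rather than a create.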