On 13/12/13 19:06 +1300, Robert Collins wrote:
> On 13 December 2013 06:24, Will Foster <[email protected]> wrote:
>
>> I just wanted to add a few thoughts:
>
> Thank you!
>
>> For some comparative information here "from the field": I work extensively on deployments of large OpenStack implementations, most recently a ~220-node/9-rack deployment (scaling up to 42 racks / 1024 nodes soon). My primary role is of a DevOps/sysadmin nature rather than a specific development area, so rapid provisioning/tooling/automation is an area I work in almost exclusively (mostly API-driven, using Foreman/Puppet). The infrastructure our small team designs/builds supports our development and business. I am the target user base you'd probably want to cater to.
>
> Absolutely!
>
>> I can tell you the philosophy and mechanics of Tuskar/OOO are great, something I'd love to start using extensively, but there are some needed aspects in the areas of control that I feel should be added (though arguably less for me and more for my ilk who are looking to expand their OpenStack footprint).
>>
>> * ability to 'preview' changes going to the scheduler
>
> What does this give you? How detailed a preview do you need? What information is critical there? Have you seen the proposed designs for a heat template preview feature - would that be sufficient?

Thanks for the reply. Preview-wise it'd be useful to see node allocation prior to deployment - nothing too in-depth. I have not seen the heat template preview features; are you referring to the YAML templating [1] or something else [2]? I'd like to learn more.

[1] http://docs.openstack.org/developer/heat/template_guide/hot_guide.html
[2] https://github.com/openstack/heat-templates
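
(To make "preview" a bit more concrete from my side, something along these lines, scriptable, would be plenty. This is only a rough sketch of mine - it assumes python-heatclient's token/endpoint constructor and a stacks.preview call wrapping Heat's stack-preview API, which may well still be the proposed feature you're referring to; HEAT_ENDPOINT, TOKEN and the tiny HOT template are placeholders.)

# Sketch: ask Heat to "preview" a stack so the would-be resources (and hence
# node allocation) can be inspected before anything is actually deployed.
from heatclient.client import Client

HEAT_ENDPOINT = 'http://undercloud:8004/v1/TENANT_ID'   # placeholder
TOKEN = 'KEYSTONE_TOKEN'                                # placeholder

heat = Client('1', endpoint=HEAT_ENDPOINT, token=TOKEN)

# A deliberately tiny HOT template standing in for the real overcloud one.
template = {
    'heat_template_version': '2013-05-23',
    'resources': {
        'compute_0': {
            'type': 'OS::Nova::Server',
            'properties': {'flavor': 'baremetal',
                           'image': 'overcloud-compute'},
        },
    },
}

# preview returns the stack description without creating any resources.
stack = heat.stacks.preview(stack_name='overcloud-preview',
                            template=template)
print(stack)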

>> * ability to override/change some aspects within node assignment
>
> What would this be used to do? How often do those situations turn up? What's the impact if you can't do that?

One scenario might be that autodiscovery does not pick up an available node in your pool of resources, or detects it incorrectly - you could manually change things as you like. Another (more common) scenario is that you don't have an isolated, flat network to deploy on, and nodes are picked up that you do not want included in the provisioning - you could remove those from the set of resources prior to launching overcloud creation (see the sketch below). The impact would be that the tooling would seem inflexible to those lacking a thoughtfully prepared network/infrastructure; more commonly, where the existing network design is too inflexible, the usefulness and quick/seamless provisioning benefits would fall short.
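
(For that second scenario, this is roughly what I'd expect to be able to do at the plumbing layer today - a sketch only, assuming python-ironicclient's get_client with a pre-made token; IRONIC_ENDPOINT, TOKEN and the UUIDs are placeholders.)

# Sketch: pull unwanted nodes out of the schedulable pool by putting them in
# maintenance mode before kicking off overcloud creation.
from ironicclient import client

IRONIC_ENDPOINT = 'http://undercloud:6385/'   # placeholder
TOKEN = 'KEYSTONE_TOKEN'                      # placeholder

ironic = client.get_client(1, os_auth_token=TOKEN,
                           ironic_url=IRONIC_ENDPOINT)

# Nodes (by UUID) that were picked up but should not be provisioned.
exclude = {'NODE_UUID_1', 'NODE_UUID_2'}      # placeholders

for node in ironic.node.list():
    if node.uuid in exclude:
        ironic.node.set_maintenance(node.uuid, 'true')
        print('excluded %s from scheduling' % node.uuid)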

>> * ability to view at least minimal logging from within Tuskar UI
>
> Logging of what - the deployment engine? The heat event-log? Nova undercloud logs? Logs from the deployed instances? If it's not there in V1, but you can get (or already have) credentials for the instances that hold the logs you wanted, would that be a big adoption blocker, or just a nuisance?

Logging of the deployment engine status during the bootstrapping process initially, plus some rudimentary node success/failure indication. It should be simple enough not to rival existing monitoring/log systems, but it should at least provide deployment logs as the overcloud is being built and a general node/health 'check-in' that it's complete. Afterwards, as you mentioned, the logs are available on the deployed systems. Think of it as providing some basic written navigational signs for people crossing a small bridge before they get to the highway: there's continuity from start -> finish and a clear sense of what's occurring. From my perspective, the absence of this type of verbosity may impede adoption by new users (who are used to this kind of information from deployment tooling).
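
(As a strawman for those 'basic signs': even just surfacing what Heat already records would go a long way. A sketch only, assuming python-heatclient, an overcloud stack literally named 'overcloud', and placeholder endpoint/token.)

# Sketch: surface what Heat already records - overall stack status plus the
# per-resource event log - while the overcloud is being built.
from heatclient.client import Client

HEAT_ENDPOINT = 'http://undercloud:8004/v1/TENANT_ID'   # placeholder
TOKEN = 'KEYSTONE_TOKEN'                                # placeholder

heat = Client('1', endpoint=HEAT_ENDPOINT, token=TOKEN)

stack = heat.stacks.get('overcloud')
print('%s: %s (%s)' % (stack.stack_name, stack.stack_status,
                       stack.stack_status_reason))

# Rudimentary per-node success/failure indication from the event log.
for event in heat.events.list('overcloud'):
    print('%s %s %s %s' % (event.event_time, event.resource_name,
                           event.resource_status,
                           event.resource_status_reason))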

>> Here's the main reason - most new adopters of OpenStack/IaaS are going to be running legacy/mixed hardware, and while they might have an initiative to explore and invest, and even a decent budget, most of them are not going to have completely identical hardware, isolated/flat networks and things set aside in such a way that blind auto-discovery/deployment will just work all the time.
>
> That's great information (and something I reasonably well expected, to a degree). We have a hard dependency on no wildcard DHCP servers in the environment (or we can't deploy). Autodiscovery is something we don't have yet, but certainly debugging deployment failures is a very important use case and one we need to improve both at the plumbing layer and in the stories around it in the UI.
>
>> There will be a need to sometimes adjust, and those coming from a more vertically-scaling infrastructure (most large orgs.) will not have 100% matching standards in place for vendor, machine spec and network design, which may make Tuskar/OOO seem inflexible and 'one-way'. This may just be a carry-over or fear of the old ways of deployment, but nonetheless it is present.
>
> I'm not sure what you mean by matching standards here :). Ironic is designed to support extremely varied environments with arbitrary mixes of IPMI/drac/ilo/what-have-you, and abstract that away for us. From a network perspective I've been arguing the following:
>  - we need routable access to the mgmt cards
>  - if we don't have that (say there are 5 different mgmt domains with no routing between them) then we install 5 deployment layers (5 underclouds), which could be as small as one machine each
>  - within the machines that are served by one routable region of mgmt cards, we need no wildcard DHCP servers, and we need our DHCP server to serve PXE to the machines (for the PXE driver in Ironic)
>  - building a single-region overcloud from multiple undercloud regions will involve manually injecting well-known endpoints (such as the floating virtual IP for API endpoints) into some of the regions, but it's in principle straightforward to do and use with the plumbing layer today

Ah yes, this would be a great thing - alas, not all IPMI/OOB manufacturers are created equal, with varying degrees of incomplete implementation of the specs. There is no real way to fix that, but creating/maintaining 'profiles' in Ironic for the various ways hardware vendors want to do their OOB seems the way to go. The layered approach (multiple underclouds) would also seem to work around designs that can't be traversed.
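
(By 'profiles' I'm picturing something carried at enrollment time, roughly like the sketch below - it assumes python-ironicclient and the plain pxe_ipmitool driver, with placeholder addresses/credentials; a drac- or ilo-style driver would presumably take a different driver_info dict where one exists.)

# Sketch: enrollment is where a per-vendor 'profile' could live - the driver
# name plus its driver_info dict carry the OOB specifics for that hardware.
from ironicclient import client

IRONIC_ENDPOINT = 'http://undercloud:6385/'   # placeholder
TOKEN = 'KEYSTONE_TOKEN'                      # placeholder

ironic = client.get_client(1, os_auth_token=TOKEN,
                           ironic_url=IRONIC_ENDPOINT)

node = ironic.node.create(
    driver='pxe_ipmitool',                    # plain IPMI-over-LAN
    driver_info={
        'ipmi_address': '10.0.0.21',          # placeholder BMC address
        'ipmi_username': 'admin',             # placeholder credentials
        'ipmi_password': 'secret',
    },
    properties={'cpus': 16, 'memory_mb': 65536, 'local_gb': 500})

# Register the NIC that will PXE boot so the deploy ramdisk can be served.
ironic.port.create(node_uuid=node.uuid, address='aa:bb:cc:dd:ee:01')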

>> In my case, we're lucky enough to have dedicated, near-identical equipment and a flexible network design we architected beforehand, which makes Tuskar/OOO a great fit. Most people will not have this greenfield ability and will initially use what they have lying around, so as not to make a big investment until familiarity and trust of something new has permeated. That said, I've been working with Jaromir Coufal on some UI mockups of Tuskar with some of this 'advanced' functionality included, and from my perspective it looks like something to consider pulling in sooner rather than later if you want to maximize the adoption of new users.
>
> So, for Tuskar my short term goals are to support RH in shipping a polished product while still architecting and building something sustainable and suitable for integration into the OpenStack ecosystem. (For instance, one of the requirements for integration is that we don't [significantly] overlap other projects - and that's why I've been pushing so hard on the don't-reimplement-the-scheduler aspect of the discussion). Medium term I think we need to look at surfacing all the capabilities of the plumbing we have today, and long term we need to look at making it vendor- and local-user-extensible.
>
> -Rob

I understand the perspective here; it makes sense to me. It's also important that folks don't see this as overlapping with file-based provisioning (Foreman/kickstart/Puppet, etc.): there are occasions where I'd prefer a mostly hands-off, image-based deployment methodology using Tuskar/OOO instead of the former, and vice versa. Looking forward to further progress and to providing some tangible feedback once we start using it for mass deployment.

Thanks,
-will
