Hey folks,

As we decided at today's IRC meeting in #fuel-dev, the FFE is granted on the following conditions (if I got them right):
* the feature is marked as experimental
* patches should be merged by the end of next week

Thanks,
igor

On Tue, Dec 1, 2015 at 10:01 PM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
> Hi, Folks
>
> * Intro
>
> During Iteration 3 our Enhancements Team, as well as other folks, worked on
> the feature called "Task Based Deployment with Astute". Here is a link to
> its blueprint:
> https://blueprints.launchpad.net/fuel/+spec/task-based-deployment-astute
>
> The major implication of completing this feature is that our deployment
> process will be drastically optimized, allowing us to decrease the deployment
> time of typical clusters by at least 2.5 times (for BVT/CI cases) and by an
> order of magnitude for 100-node clusters.
>
> This is achieved by real parallelization of deployment task execution: we do
> not wait for the whole 'deployment group/role' to deploy, but only for
> particular tasks to finish. For example, we could deploy the 'database' task
> on secondary controllers as soon as the 'database' task is ready on the first
> controller. As our deployment workflow contains only a small number of
> synchronization points such as the 'database' task, we will be able to deploy
> the majority of deployment tasks in parallel, shrinking deployment time to
> "time-of-deployment-of-the-longest-node". This actually means that our
> standard deployment case for development and testing will take 30 minutes on
> our CI servers, thus drastically improving developer and user experience, as
> well as shrinking the overall time of acceptance testing, the time for bug
> reproduction and so on. This feature also allows one to use the 7.0
> role-as-a-plugin feature in a much more effective way, as the current
> split-services-with-plugins feature may lead to a very suboptimal deployment
> flow which might take up to 6 hours even for the simplest HA cluster, while
> it would again take 30 minutes with the Task-Based approach.
> Also, when multi-roles were used, we ran several tasks for each role each
> time it was used, making deployment suboptimal again.
>
>
> * Short List of Work Items
>
> As we started a little late during Iteration 3, we worked on the design and
> specification of this feature so that its introduction brings almost zero
> chance of regression, with the ability to disable it. Here is the summary.
>
> So far we are introducing several pieces of code:
> 1. New version of the task format introducing cross-node dependencies
>    between tasks
> 2. Changes to Nailgun
>    a. deduplication of tasks for roles [In Progress]
>    b. support for the new task format [In Progress]
>    c. new engine that generates an array of hashes of task info consumable
>       by the new Astute engine [In Progress]
> 3. Changes to Astute
>    a. Task dependency parser and visualizer [Ready for Review]
>    b. Deployment engine capable of graph traversal and reporting [Ready for
>       Review]
>    c. Async wrapper for shell-based tasks [Ready for Review]
> 4. Changes to Fuel Library
>    a. Add additional fields to existing Fuel Library deployment tasks for
>       cross-dependencies [In Progress]
>
> * Ensuring Little Regression and Backward Compatibility
>
> As we have worked on being backward-compatible from day one, this engine is
> enabled ONLY when 2 requirements are met:
>
> 1. It is globally enabled in Nailgun settings.yaml
> 2. ALL tasks scheduled for deployment execution have v2.0.0
>
> The list of changes above seems a little bit huge, but these changes are
> isolated and granular, and actually affect the sequence in which tasks are
> executed on the nodes. This means that there will be no difference in the
> resulting functioning of the cluster. This feature can be safely disabled if
> the user does not want to use it.
>
> But if the user wants to work with it, he can gain an enormous improvement in
> speed and in his own engineering/development/testing velocity, as well as in
> Fuel user experience.
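To make sure I read the two enabling requirements above correctly, here is a minimal sketch of the check in Python. It is not the actual Nailgun code: the settings key `TASK_DEPLOY_ENABLED`, the task field `version`, and the function name are hypothetical, and I am assuming "have v2.0.0" means an exact version match.

```python
from typing import Iterable, Mapping

# Task format version named in the message above.
REQUIRED_TASK_VERSION = "2.0.0"


def use_task_based_engine(settings: Mapping, tasks: Iterable[Mapping]) -> bool:
    """Return True only if both enabling requirements hold."""
    # Requirement 1: the engine is globally enabled in the settings.
    # "TASK_DEPLOY_ENABLED" is a hypothetical key used for illustration.
    if not settings.get("TASK_DEPLOY_ENABLED", False):
        return False
    # Requirement 2: every scheduled task uses the new format version.
    # A single older task falls back to the existing engine.
    # Note: an empty task list trivially satisfies this requirement.
    return all(task.get("version") == REQUIRED_TASK_VERSION for task in tasks)
```

With this reading, a cluster that mixes old and new tasks keeps using the existing engine, which is what makes the feature safe to ship disabled by default.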
>
> * Additional Pros of the Feature
>
> Moreover, this feature also improves how the following use cases are
> addressed:
>
> 1. When the user deploys a specific set of nodes or tasks
> It will be possible to introduce an additional flag for the deploy/task run
> handler in Nailgun to pick up dependencies of the specified tasks, even if
> they are not currently present in the deployment graph. This means that
> instead of running
>
> fuel nodes --node-id 2,3 --deploy
>
> and seeing it fail because node-1 contains some of the tasks that are
> required by nodes 2 and 3, the user can stay calm, as he will be able to
> specify an option to populate the deployment flow with the needed tasks.
> No more
>
> fuel nodes --node-id 2 --tasks netconfig -> Fail, because you forgot to
> specify some of the required tasks, e.g. hiera, globals.
>
> 2. Post-deployment plugin installation
>
> This feature also makes post-deployment plugin installation much easier, as
> plugin installation will take a matter of minutes instead of hours.
>
> 3. Support for cluster re-deployment in some LCM cases
>
> Whenever the user changes settings on the nodes and triggers a full cluster
> redeployment, or whenever he wants a tainted cluster to converge back to
> the previous state deployed by Fuel, he will get his cluster back into an
> operational state in 30 minutes.
>
> 4. Better capabilities for separated services plugins
>
> The task-based approach allows one to deploy things with separate services
> in much more flexible ways. E.g. one will not have to introduce 2 roles in
> the plugin for the controller to detach keystone services, e.g.
> pre-keystone-controller-tasks and post-keystone-controller-tasks. All he
> will need is to introduce a "skipped" keystone task for controllers so that
> keystone is deployed only on the node with the keystone role.
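For use case 1 above, "picking up dependencies of the specified tasks" amounts, as far as I understand it, to taking the transitive closure of the requested tasks over the cross-node task graph. A rough Python sketch follows; the function name, the `(node, task)` keys, and the example graph are all made up for illustration and are not the real Fuel task definitions.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Task = Tuple[str, str]  # (node id, task name)


def with_dependencies(requested: Iterable[Task],
                      requires: Dict[Task, List[Task]]) -> Set[Task]:
    """Return the requested tasks plus everything they transitively require."""
    selected: Set[Task] = set()
    queue = deque(requested)
    while queue:
        task = queue.popleft()
        if task in selected:
            continue  # already handled; also protects against cycles
        selected.add(task)
        # Dependencies may live on the same node or on another node.
        queue.extend(requires.get(task, []))
    return selected


# Hypothetical graph: netconfig needs globals, globals needs hiera, and the
# node-2 database task waits for the database task on node 1.
requires = {
    ("2", "netconfig"): [("2", "globals")],
    ("2", "globals"): [("2", "hiera")],
    ("2", "database"): [("2", "netconfig"), ("1", "database")],
    ("1", "database"): [("1", "netconfig")],
}

# Asking only for netconfig on node 2 also selects globals and hiera.
plan = with_dependencies([("2", "netconfig")], requires)
```

In this sketch, requesting the database task on node 2 would also pull in the node-1 tasks it depends on, which is the case where a plain per-node run fails today.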
>
> * Merge plan
>
> Merge Astute changes - ETA Dec 4th
> Merge Nailgun changes - ETA Dec 4th
> Prepare Fuel Library changes - ETA Dec 3rd
> Test this feature on Scale Lab and against swarm - ETA SCF
> Make decision whether to enable task-based deployment engine by default -
> SCF
>
> * Summary
>
> This feature brings a lot of benefits for everyone. Its current
> implementation introduces zero chance of regressions, as it will be disabled
> by default and will require specific actions from a user to start using
> it. In the meantime we will test this feature at Scale Lab and against
> swarm and custom tests. By SCF we may decide whether to switch to it based
> on the reported results. If it happens before SCF, we will be able to
> significantly ramp up our development and bugfixing velocity.
>
> --
> Yours Faithfully,
> Vladimir Kuklin,
> Fuel Library Tech Lead,
> Mirantis, Inc.
> +7 (495) 640-49-04
> +7 (926) 702-39-68
> Skype kuklinvv
> 35bk3, Vorontsovskaya Str.
> Moscow, Russia,
> www.mirantis.com
> www.mirantis.ru
> vkuk...@mirantis.com
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev