-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi neutron folks,

I'd like to discuss a plan for getting support for online db schema upgrades into neutron.

*What is it even about?*

Currently, any major version upgrade, or master-to-master upgrade, requires a neutron-server shutdown. After shutdown, operators apply db migration rules (if any) to their database, and once that is complete, they can start their neutron-server service(s) again.

This has several drawbacks:

- while the db is being upgraded, API endpoints are not available (a user-visible out-of-service period);
- a db upgrade may take significant time, so the out-of-service period can become quite long.

For rolling master-based environments it's especially painful, since you get the scheduled offline time more often than once per 6 months. (Though even once per 6 months is not ideal.)

*Proposal*

Make neutron-server resilient to under-the-hood db schema changes.

How can we achieve this? There are multiple things to touch, both code- and culture-wise:

- if we want an old neutron-server to continue working with a db that is potentially upgraded to a newer schema, we should stop applying non-additive changes to the schema in migration rules. (Note that we still have a way to collect fossils once they are unused, e.g. during the next cycle.)
- we should stop applying live data changes to the database as part of migration rules. The only changes allowed should touch the schema, and should not insert/update/delete actual records. (I know neutron has been especially guilty of this in the past, but I believe we can stop doing it.)
- instead of migrating data with alembic rules, migrate it at runtime. There should be an abstraction layer that makes sure data is migrated into new schema fields and objects, while preserving data originally stored in 'old' schema elements. That would allow old neutron-server code to run against the new schema (it will just ignore the new additions), and new neutron-server code to gradually migrate data into new columns/fields/tables while serving users.

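To make the runtime migration idea above more concrete, here is a minimal sketch using only the Python standard library (sqlite3), not neutron's actual db layer or alembic. The `routers` table, the `gw_info` and `gateway_ip` columns, and the `RouterStore` class are all hypothetical names invented for illustration. The migration step only adds a nullable column; the new code fills it lazily and never removes what old code reads.

```python
import sqlite3


def create_old_schema(conn):
    # Schema as "old" neutron-server code knows it (hypothetical table/column).
    conn.execute("CREATE TABLE routers (id INTEGER PRIMARY KEY, gw_info TEXT)")


def apply_additive_migration(conn):
    # Schema-only and additive: one new nullable column, no rows touched.
    conn.execute("ALTER TABLE routers ADD COLUMN gateway_ip TEXT")


class RouterStore:
    """Hypothetical access layer used by "new" code.

    Reads prefer the new column and fall back to the old one, copying the
    value over on first access. Writes update both columns, so code that
    only knows about gw_info keeps seeing valid data.
    """

    def __init__(self, conn):
        self.conn = conn

    def set_gateway(self, router_id, cidr):
        ip = cidr.split("/")[0]
        self.conn.execute(
            "UPDATE routers SET gw_info = ?, gateway_ip = ? WHERE id = ?",
            (cidr, ip, router_id))

    def get_gateway_ip(self, router_id):
        row = self.conn.execute(
            "SELECT gw_info, gateway_ip FROM routers WHERE id = ?",
            (router_id,)).fetchone()
        if row is None:
            raise KeyError(router_id)
        old_value, new_value = row
        if new_value is None and old_value is not None:
            # Lazy, per-record migration; the old column is left intact.
            new_value = old_value.split("/")[0]
            self.conn.execute(
                "UPDATE routers SET gateway_ip = ? "
                "WHERE id = ? AND gateway_ip IS NULL",
                (new_value, router_id))
        return new_value


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_old_schema(conn)
    # A record written by old code, before the schema upgrade.
    conn.execute("INSERT INTO routers (id, gw_info) VALUES (1, '10.0.0.1/24')")
    apply_additive_migration(conn)
    print(RouterStore(conn).get_gateway_ip(1))
```

Old code that selects only `gw_info` is unaffected by the extra column, which is the property the additive-only rule is meant to guarantee.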
Note that all neutron-server instances are still expected to restart at the same time. There should be no neutron-servers of different versions running at once; otherwise older instances will undo migration work applied by newer ones, which may result in data loss, db conflicts, and hell being raised. We may think about how to support iterative controller restarts without any downtime, but that's out of scope for this proposal.

*Isn't it too crazy?*

Not really. Other projects have achieved this already. Specifically, Nova has been doing it since Liberty. Heat and Cinder are considering it now.

Nova needed to stop doing data migrations and non-additive schema changes in Kilo already. That suggests the earliest we can get actual online migrations in neutron is M, and that's assuming we adopt stricter rules for migrations *now*, before anything incompatible is merged in Liberty. Also note that I haven't checked the *aas migration rules yet: if there are incompatible changes there, then for setups that rely on those services, online migrations will become reality in Nausea only.

Since neutron joins the game late, we are in a better position than nova was, because a lot of the tooling and practices are already in place. Specifically, I mean oslo.versionedobjects, which would serve as an abstraction object middleware between the db and the rest of neutron.

*The plan for Liberty*

We can't technically achieve online migrations in Liberty, for the reasons stated above. That does not mean we have nothing to do this cycle, though. We should prepare ourselves by doing the following:

- adopt stricter rules for migrations;
- adopt oslo.versionedobjects to represent neutron resources. (It will buy us more benefits, like an object interface instead of passing dicts around; clear versioning on the RPC side of things; and potentially, assuming we apply the corresponding practices, transparent remote calls to the controller from the agent side using the same objects defined on the neutron-server side.)

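As a rough illustration of what "clear versioning on the RPC side" could look like, here is a hand-rolled sketch of a versioned resource object. It deliberately does not use oslo.versionedobjects or reproduce its API; it only shows the underlying idea of an object that carries a version and can be downgraded for an older peer. The `Port` class, its field set, the `mtu` field being "added in 1.1", and the primitive layout are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


def _version_tuple(version):
    # "1.10" -> (1, 10), so versions compare numerically, not as strings.
    return tuple(int(part) for part in version.split("."))


@dataclass
class Port:
    """Hypothetical versioned resource object (not the real library API)."""

    id: str
    mac_address: str
    mtu: Optional[int] = None  # assumed to have been added in version 1.1

    # Un-annotated class attributes are not treated as dataclass fields.
    VERSION = "1.1"
    FIELD_ADDED_IN = {"mtu": "1.1"}

    def to_primitive(self, target_version=None):
        """Serialize for a peer that understands target_version at most."""
        version = target_version or self.VERSION
        data = {"id": self.id, "mac_address": self.mac_address,
                "mtu": self.mtu}
        for name, added_in in self.FIELD_ADDED_IN.items():
            if _version_tuple(added_in) > _version_tuple(version):
                # The older peer does not know this field; drop it.
                data.pop(name, None)
        return {"version": version, "data": data}

    @classmethod
    def from_primitive(cls, primitive):
        # Fields missing from an older primitive fall back to their defaults.
        return cls(**primitive["data"])


if __name__ == "__main__":
    port = Port(id="p1", mac_address="fa:16:3e:00:00:01", mtu=1450)
    print(port.to_primitive())        # full, current-version form
    print(port.to_primitive("1.0"))   # form safe to send to a 1.0 peer
```

The point of the design is that callers pass objects around instead of raw dicts, and the decision about which fields an older agent or server can accept lives in one place.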
===

So, keeping in mind that there may be concerns or conflicts with existing efforts (e.g. plugin decomp part 2) that I don't fully realize, or maybe some architectural issues that would not allow us to start down this road just now, I'd like to hear from others on whether the strict rules even make sense in the context of neutron. Of course, I especially look forward to hearing from our db gods: Henry, Ann, and others.

Ihar
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJVgEOpAAoJEC5aWaUY1u57rzIIAKg6tgJ23OUzEx9WEWly8Evy
YCRRSYAPjgX5rQ8UY1BLIPEH1j/FAdbE7RKuHuW+b2fcsKafFbh7EqW0HkCy75w7
5cja5VKZMoZ8MzR4A3TyLfR0C1IQ6FB9U+ISgaaDyqjrp/2pmr6Sobv+f9gtT6IR
viLASdvsFyC8fQOGPNNG4Q2I5mnl+q1l8oji6jxp1uL49PETdStH6R88h6LWYBJg
lGztStcVcAq1l0WVVdhgnJU8UaSJVYzlkUkTxzWiHscd8JSelCgR+Zq7rc6bx6RY
+5uDmk8ZGVXDZIz9TEZbP2KgaF9tcIhYCPajCqS5wHFoJj/8UTz1MdsaqjHBv6w=
=J+iJ
-----END PGP SIGNATURE-----

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev