On 20/06/17 12:56, Curtis wrote:
> On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar <[email protected]> wrote:
>> Trove has evolved rapidly over the past several years, since integration in
>> IceHouse when it only supported single instances of a few databases. Today
>> it supports a dozen databases including clusters and replication.
>>
>> The user survey [1] indicates that while there is strong interest in the
>> project, there are few large production deployments that are known of (by
>> the development team).
>>
>> Recent changes in the OpenStack community at large (company realignments,
>> acquisitions, layoffs) and the Trove community in particular, coupled with a
>> mounting burden of technical debt have prompted me to make this proposal to
>> re-architect Trove.
>>
>> This email summarizes several of the issues that face the project, both
>> structurally and architecturally. This email does not claim to include a
>> detailed specification for what the new Trove would look like, merely the
>> recommendation that the community should come together and develop one so
>> that the project can be sustainable and useful to those who wish to use it
>> in the future.
>>
>> TL;DR
>>
>> Trove, with support for a dozen or so databases today, finds itself in a
>> bind because there are few developers, and a code-base with a significant
>> amount of technical debt.
>>
>> Some architectural choices which the team made over the years have
>> consequences which make the project less than ideal for deployers.
>>
>> Given that there are no major production deployments of Trove at present,
>> this provides us an opportunity to reset the project, learn from our v1 and
>> come up with a strong v2.
>>
>> An important aspect of making this proposal work is that we seek to
>> eliminate the effort (planning, and coding) involved in migrating existing
>> Trove v1 deployments to the proposed Trove v2.
>> Effectively, with work
>> beginning on Trove v2 as proposed here, Trove v1 as released with Pike will
>> be marked as deprecated and users will have to migrate to Trove v2 when it
>> becomes available.
>>
>> While I would very much like to continue to support the users on Trove v1
>> through this transition, the simple fact is that absent community
>> participation this will be impossible. Furthermore, given that there are no
>> production deployments of Trove at this time, it seems pointless to build
>> that upgrade path from Trove v1 to Trove v2; it would be the proverbial
>> bridge from nowhere.
>>
>> This (previous) statement is, I realize, contentious. There are those who
>> have told me that an upgrade path must be provided, and there are those who
>> have told me of unnamed deployments of Trove that would suffer. To this, all
>> I can say is that if an upgrade path is of value to you, then please commit
>> the development resources to participate in the community to make that
>> possible. But equally, preventing a v2 of Trove or delaying it will only
>> make the v1 that we have today less valuable.
>>
>> We have learned a lot from v1, and the hope is that we can address that in
>> v2. Some of the more significant things that I have learned are:
>>
>> - We should adopt a versioned front-end API from the very beginning; making
>> the REST API versioned is not a ‘v2 feature’
>>
>> - A guest agent running on a tenant instance, with connectivity to a shared
>> management message bus is a security loophole; encrypting traffic,
>> per-tenant-passwords, and any other scheme is merely lipstick on a security
>> hole
>>
>> - Reliance on Nova for compute resources is fine, but dependence on Nova VM
>> specific capabilities (like instance rebuild) is not; it makes things like
>> containers or bare-metal second class citizens
>>
>> - A fair portion of what Trove does is resource orchestration; don’t
>> reinvent the wheel, there’s Heat for that.
>> Admittedly, Heat wasn’t as far
>> along when Trove got started but that’s not the case today and we have an
>> opportunity to fix that now
>>
>> - A similarly significant portion of what Trove does is to implement a
>> state-machine that will perform specific workflows involved in implementing
>> database specific operations. This makes the Trove taskmanager a stateful
>> entity. Some of the operations could take a fair amount of time. This is a
>> serious architectural flaw
>>
>> - Tenants should not ever be able to directly interact with the underlying
>> storage and compute used by database instances; that should be the default
>> configuration, not an untested deployment alternative
>>
> As an operator I wouldn't run Trove as it is, unless I absolutely had to.
>
> I think it is a good idea to reboot the project. I really think the
> concept of "service VMs" should be a thing. I'm not sure where the
> OpenStack community has landed on that, my fault for not paying close
> attention, but we should be able to create VMs for a tenant that are
> not managed by the tenant but that could be billed to them in some
> fashion. At least that's my opinion.

Re the 'service VMs', yep, it could be very useful. In Zaqar, we're working
on a spec to support a 'service queue', similar to the 'service VMs', so that
the service user can create queues in the user's tenant. I can imagine Trove
could benefit from that feature as well.

>
>> - The CI should test all databases that are considered to be ‘supported’
>> without excessive use of resources in the gate; better code modularization
>> will help determine the tests which can safely be skipped in testing changes
>>
>> - Clusters should be first class citizens not an afterthought, single
>> instance databases may be the ‘special case’, not the other way around
> Definitely agree on that. Cluster first model.
>
>> - The project must provide guest images (or at least complete tooling for
>> deployers to build these); while the project can’t distribute operating
>> systems and database software, the current deployment model merely impedes
>> adoption
>>
>> - Clusters spanning OpenStack deployments are a real thing that must be
>> supported
>>
> I'm curious as to how this will be done. This is a requirement in
> NFV-land as well for other services. Would be very powerful and is
> needed in other areas.
>
> Thanks,
> Curtis.
>
>> This might sound harsh, that isn’t the intent. Each of these is the
>> consequence of one or more perfectly rational decisions. Some of those
>> decisions have had unintended consequences, and others were made knowing
>> that we would be incurring some technical debt; debt we have not had the
>> time or resources to address. Fixing all these is not impossible, it just
>> takes the dedication of resources by the community.
>>
>> I do not have a complete design for what the new Trove would look like. For
>> example, I don’t know how we will interact with other projects (like Heat).
>> Many questions remain to be explored and answered.
>>
>> Would it suffice to just use the existing Heat resources and build templates
>> around those, or will it be better to implement custom Trove resources and
>> then orchestrate things based on those resources?
>>
>> Would Trove implement the workflows required for multi-stage database
>> operations by itself, or would it rely on some other project (say Mistral)
>> for this? Is Mistral really a workflow service, or just cron on steroids? I
>> don’t know the answer but I would like to find out.
>>
>> While we don’t have the answers to these questions, I think this is a
>> conversation that we must have, one that we must decide on, and then as a
>> community commit the resources required to make a Trove v2 which delivers on
>> the mission of the project; “To provide scalable and reliable Cloud Database
>> as a Service provisioning functionality for both relational and
>> non-relational database engines, and to continue to improve its
>> fully-featured and extensible open source framework.”[2]
>>
>> Thanks,
>>
>> -amrith
>>
>>
>> [1] https://www.openstack.org/assets/survey/April2017SurveyReport.pdf
>> [2] https://wiki.openstack.org/wiki/Trove#Mission_Statement
>>
>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: [email protected]?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>

--
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: [email protected]
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------
