Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Doug, I agree. VM/baremetal, shared VM, object storage or block device for backup storage: those are implementation choices. We should not have quota or billing depend on them; Trove should expose counters and space-time products for the things that Trove users should be billed for. At issue is the old habit of taking compute out of the tenant's compute quota (or not), and storage out of the tenant's storage quota (or not). We did it one way, and in retrospect I think it was the wrong way. Trove should consume the resources it needs, and users should be billed for databasey things (not compute, block and object storage, and network traffic, but database cluster time, backups, data, queries, etc.).

Thanks,

-amrith

On Wed, Jul 12, 2017 at 9:57 AM, Doug Hellmann wrote:

> I still don't understand. If you have entities that represent the
> DBaaS "host" or "database" or "database backup" or whatever, then
> you put a quota on those entities and you bill for them. If the
> database actually runs in a VM or the backup is a snapshot, those
> are implementation details. You don't want to have to rewrite your
> quota management or billing integration if those details change.
>
> Doug
>
> [earlier quoted text trimmed]
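Doug's point about decoupling quota and billing from implementation choices can be sketched roughly like this (a minimal, hypothetical model; the entity and resource names are illustrative, not part of any Trove API):

```python
# Minimal sketch of quota tracking keyed on DBaaS-level entities
# ("databases", "backups") rather than on implementation details
# (VM vs. container, snapshot vs. object storage). All names here
# are hypothetical, for illustration only.

class QuotaExceeded(Exception):
    pass

class DbaasQuota:
    def __init__(self, limits):
        # limits: e.g. {"databases": 5, "backups": 20}
        self.limits = limits
        self.usage = {resource: 0 for resource in limits}

    def reserve(self, resource, count=1):
        """Charge `count` units of a DBaaS-level resource."""
        if self.usage[resource] + count > self.limits[resource]:
            raise QuotaExceeded(resource)
        self.usage[resource] += count

# The quota/billing layer never mentions VMs or snapshots, so swapping
# the backend (VM -> container, snapshot -> object store) changes nothing here.
quota = DbaasQuota({"databases": 2, "backups": 4})
quota.reserve("databases")
quota.reserve("backups", 3)
```

The same abstraction serves billing: meter the space-time product of "database cluster" and "backup" entities, however they happen to be implemented underneath.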
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Yeah, understood. I was just responding to the question of why you would ever want to do X; there are reasons. Being out of scope is an OK answer, though.

Thanks,
Kevin

From: Amrith Kumar [amrith.ku...@gmail.com]
Sent: Thursday, July 13, 2017 9:22 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove

> Kevin,
>
> In the interests of 'keeping it simple', I'm going to try to prioritize
> the use-cases and pick implementation strategies which target the
> higher-priority ones without needlessly excluding other (lower-priority)
> ones.
>
> Thanks,
>
> -amrith
>
> [earlier quoted text trimmed]
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Kevin,

In the interests of 'keeping it simple', I'm going to try to prioritize the use-cases and pick implementation strategies which target the higher-priority ones without needlessly excluding other (lower-priority) ones.

Thanks,

-amrith

--
Amrith Kumar

P.S. Verizon is hiring OpenStack engineers nationwide. If you are interested, please contact me or visit https://t.co/gGoUzYvqbE

On Wed, Jul 12, 2017 at 5:46 PM, Fox, Kevin M wrote:

> There is a use case where some sites have folks buy whole bricks of
> compute nodes that get added to the overarching cloud, but use AZs or
> host aggregates/flavors to dedicate the hardware to those users.
>
> You might want to land the db VM on the hardware for that project, and
> one would expect the normal quota to be dinged for it rather than a
> special Trove quota. Otherwise they may have more quota than the hosts
> can actually handle.
>
> Thanks,
> Kevin
>
> [earlier quoted text trimmed]
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
There is a use case where some sites have folks buy whole bricks of compute nodes that get added to the overarching cloud, but use AZs or host aggregates/flavors to dedicate the hardware to those users.

You might want to land the db VM on the hardware for that project, and one would expect the normal quota to be dinged for it rather than a special Trove quota. Otherwise they may have more quota than the hosts can actually handle.

Thanks,
Kevin

From: Doug Hellmann [d...@doughellmann.com]
Sent: Wednesday, July 12, 2017 6:57 AM
To: openstack-dev
Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove

[quoted text trimmed]
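The host-aggregate mechanics Kevin alludes to can be sketched roughly as follows (a simplified, hypothetical version of the kind of matching Nova's AggregateInstanceExtraSpecsFilter performs; this is not Nova code, and the flavor/aggregate names are made up):

```python
# Toy sketch of flavor-to-aggregate matching: a flavor carries extra
# specs, a host aggregate carries metadata, and a host is eligible only
# when the aggregate metadata satisfies the flavor's requirements.
# Simplified and hypothetical; real Nova scheduling has many more rules.

def host_matches(flavor_extra_specs, aggregate_metadata):
    return all(aggregate_metadata.get(k) == v
               for k, v in flavor_extra_specs.items())

# A flavor pinned to the project's dedicated hardware:
db_flavor = {"dedicated": "project-a"}

aggregates = {
    "project-a-bricks": {"dedicated": "project-a"},
    "shared-pool": {},
}

eligible = [name for name, meta in aggregates.items()
            if host_matches(db_flavor, meta)]
# Only the dedicated aggregate matches, so the DB VM lands on
# project-a's hardware and is counted against the normal compute quota.
```

This illustrates why the normal quota matters in this use case: the instance consumes capacity from a finite, project-dedicated pool, so a separate Trove-only quota could overcommit those hosts.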
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Excerpts from Amrith Kumar's message of 2017-07-12 06:14:28 -0500:

> Amrith: Doug, yes, because we're looking not just for service VMs but all
> resources provisioned by a service. So, to Matt's comment about a
> blackbox DBaaS, the VMs, storage, snapshots, ... they should all be
> owned by the service, charged to a user's quota, but not visible to the
> user directly.

I still don't understand. If you have entities that represent the DBaaS "host" or "database" or "database backup" or whatever, then you put a quota on those entities and you bill for them. If the database actually runs in a VM or the backup is a snapshot, those are implementation details. You don't want to have to rewrite your quota management or billing integration if those details change.

Doug

[remainder of quoted text trimmed]
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
All:

First, let me thank all of you who responded and provided feedback on what I wrote. I've summarized what I heard below and am posting it as one consolidated response rather than responding to each of your messages and making this thread even deeper.

As I say at the end of this email, I will be setting up a session at the Denver PTG to specifically continue this conversation and hope you will all be able to attend. As soon as time slots for the PTG are announced, I will try to pick this slot and request that you please attend.

Thierry: naming issue; call it Hoard if it does not have a migration path.

Kevin: use a container approach with k8s as the orchestration mechanism; this addresses multiple issues including performance. Trove would provide containers for multiple components which cooperate to provide a single instance of a database or cluster. Don't put all components (agent, monitoring, database) in a single VM; decoupling makes migration and upgrades easier and allows Trove to reuse database-vendor-supplied containers. Performance of databases in VMs is poor compared to databases on bare metal.

Doug Hellmann:

> Does "service VM" need to be a first-class thing? Akanda creates
> them, using a service user. The VMs are tied to a "router" which is
> the billable resource that the user understands and interacts with
> through the API.

Amrith: Doug, yes, because we're looking not just for service VMs but all resources provisioned by a service. So, to Matt's comment about a blackbox DBaaS, the VMs, storage, snapshots, ... they should all be owned by the service, charged to a user's quota, but not visible to the user directly.

Jay:

> Frankly, I believe all of these types of services should be built
> as applications that run on OpenStack (or other)
> infrastructure. In other words, they should not be part of the
> infrastructure itself.
>
> There's really no need for a user of a DBaaS to have access to the
> host or hosts the DB is running on. If the user really wanted
> that, they would just spin up a VM/baremetal server and install
> the thing themselves.

and subsequently in follow-up with Zane:

> Think only in terms of what a user of a DBaaS really wants. At the
> end of the day, all they want is an address in the cloud where they
> can point their application to write and read data from.
> ...
> At the end of the day, I think Trove is best implemented as a hosted
> application that exposes an API to its users that is entirely
> separate from the underlying infrastructure APIs like
> Cinder/Nova/Neutron.

Amrith: Yes, I agree, +1000

Clint (in response to Jay's proposal regarding the service making all resources multi-tenant) raised a concern about having multi-tenant shared resources. The issue is with ensuring separation between tenants (I don't want to use the word isolation because this is database related).

Amrith: yes, definitely a concern, and one that we don't have today because each DB is a VM of its own. Personally, I'd rather stick with that construct, one DB per VM/container/baremetal, and leave that be the separation boundary.

Zane: Discomfort over throwing out working code; the grass is greener on the other side; is there anything to salvage?

Amrith: Yes, there is certainly a 'grass is greener with a rewrite' fallacy. But there is stuff that can be salvaged. The elements are still good; they are separable and can be used with the new project. Much of the controller logic, however, will fall by the wayside.

In a similar vein, Clint asks about the elements that Trove provides, "how has that worked out".

Amrith: Honestly, not well. Trove only provided reference elements suitable for development use, never really production-hardened ones. For example, the image elements Trove provides don't bake the guest agent in; they assume that at VM launch, the guest agent code will be slurped (technical term) from the controller and launched. Great for debugging, not great for production. That is something that should change. But, equally, I've heard disagreements saying that slurping the guest agent at runtime is clever and good in production.

Zane: consider using Mistral for workflow.

> The disadvantage, obviously, is that it requires the cloud to offer
> Mistral as-a-Service, which currently doesn't include nearly as many
> clouds as I'd like.

Amrith: Yes, as we discussed, we are in agreement with both parts of this recommendation.

Zane, Jay and Dims: a subtle distinction between Tessmaster and Magnum (I want a database, figure out the lower layers, vs. I want a k8s cluster).

Zane: Fun fact: Trove started out as a *complete fork* of Nova(!).

Amrith: Not fun at all :) Never, ever, ever, ever f5g do that again. Yeah, sure, if you can have i18n, and k8s, I can have f5g :)

Thierry:

> We generally need to be very careful about creating dependencies
> between OpenStack projects.
> ...
> I understand it's a hard trade-off: you want to reuse functional
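Kevin's decoupling suggestion from the summary above could look something like this in Kubernetes terms (a hypothetical pod manifest built as a plain Python dict; the sidecar split and all image names are illustrative assumptions, not an actual Trove design):

```python
# Sketch of a single database instance as one pod with decoupled
# containers: a vendor-supplied database image plus separate guest-agent
# and monitoring sidecars, instead of baking everything into one VM
# image. All names and images below are hypothetical.
import json

def make_db_pod(name, db_image, agent_image, monitor_image):
    containers = [
        {"name": "database", "image": db_image},        # vendor image, reused as-is
        {"name": "guest-agent", "image": agent_image},  # upgradable independently of the DB
        {"name": "monitoring", "image": monitor_image},
    ]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": containers},
    }

pod = make_db_pod("mysql-0", "mysql:5.7", "trove-agent:dev", "prom-exporter:dev")
print(json.dumps(pod, indent=2))
```

Because the agent and monitoring pieces live in their own containers, each can be upgraded or replaced without rebuilding the database image, which is the migration/upgrade benefit the summary mentions.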
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Manoj,

It would be great if these teams were brought into this conversation. I am not averse to the evolutionary approach, merely observing that in the absence of commitment and contributors who wish to participate in this evolution, we will be unable to sustain the project.

Regarding your view that it is feasible and rational to evolve Trove, I would like to understand the rationale behind those judgements, the resources that you believe it will take to make them possible, and a clear statement of what your/IBM's commitment of resources to the project would be.

Thanks,

-amrith

On Thu, Jun 29, 2017 at 9:33 AM, Manoj Kumar wrote:

> [Manoj's reply trimmed; the original proposal quoted in his message follows]
>
> From: Amrith Kumar
> To: "OpenStack Development Mailing List (not for usage questions)"
> Date: 06/18/2017 06:41 AM
> Subject: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
>
> Trove has evolved rapidly over the past several years, since integration
> in Icehouse when it only supported single instances of a few databases.
> Today it supports a dozen databases including clusters and replication.
>
> The user survey [1] indicates that while there is strong interest in the
> project, there are few large production deployments that are known of (by
> the development team).
>
> Recent changes in the OpenStack community at large (company realignments,
> acquisitions, layoffs) and in the Trove community in particular, coupled
> with a mounting burden of technical debt, have prompted me to make this
> proposal to re-architect Trove.
>
> This email summarizes several of the issues that face the project, both
> structurally and architecturally. It does not claim to include a detailed
> specification for what the new Trove would look like, merely the
> recommendation that the community should come together and develop one so
> that the project can be sustainable and useful to those who wish to use it
> in the future.
>
> TL;DR
>
> Trove, with support for a dozen or so databases today, finds itself in a
> bind because there are few developers and a code base with a significant
> amount of technical debt.
>
> Some architectural choices which the team made over the years have
> consequences which make the project less than ideal for deployers.
>
> Given that there are no major production deployments of Trove at present,
> this provides us an opportunity to reset the project, learn from our v1,
> and come up with a strong v2.
>
> An important aspect of making this proposal work is that we seek to
> eliminate the effort (planning and coding) involved in migrating existing
> Trove v1 deployments to the proposed Trove v2. Effectively, with work
> beginning on Trove v2 as proposed here, Trove v1 as released with Pike
> will be marked as deprecated and users will have to migrate to Trove v2
> when it becomes available.
>
> While I would very much like to continue to support the users on Trove v1
> through this transition, the simple fact is that absent community
> participation this will be impossible. Furthermore, given that there are
> no production deployments of Trove at this time, it seems pointless to
> build that upgrade path from Trove v1 to Trove v2; it would be the
> proverbial bridge from nowhere.
>
> This (previous) statement is, I realize, contentious. There are those who
> have told me that an upgrade path must be provided, and there are those
> who have told me of unnamed deployments of Trove that would suffer. To
> this, all I can say is that
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Amrith: Some comments regarding the scarcity of deployments, and the proposed approach. We know of multiple teams that are now independently charging down and investing in a Trove path. They are at various stages of deployment and are beyond tire-kicking. They are beginning to build dev/test environments, some are building commercial products, and we fully expect some people to be in production with Trove by the end of the year. Collectively, we need to start bridging and engaging these people into the Trove community. We also strongly believe that we need an evolutionary approach to moving Trove forward vs. the revolutionary approach that is being proposed. Our deeply held view is that it is feasible and rational to evolve Trove as it exists today. We agree that there are architectural issues that have to be addressed. Let's start working on addressing these issues as well as the current currency issues, but in an evolutionary way. The revolutionary approach will halt all progress and set a bad precedent, and we believe that it will cause people to walk away from the community and likely OpenStack as well. - Manoj From: Amrith Kumar To: "OpenStack Development Mailing List (not for usage questions)" Date: 06/18/2017 06:41 AM Subject: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove Trove has evolved rapidly over the past several years, since integration in Icehouse when it only supported single instances of a few databases. Today it supports a dozen databases including clusters and replication. The user survey [1] indicates that while there is strong interest in the project, there are few large production deployments that are known of (by the development team). Recent changes in the OpenStack community at large (company realignments, acquisitions, layoffs) and the Trove community in particular, coupled with a mounting burden of technical debt, have prompted me to make this proposal to re-architect Trove.
This email summarizes several of the issues that face the project, both structurally and architecturally. This email does not claim to include a detailed specification for what the new Trove would look like, merely the recommendation that the community should come together and develop one so that the project can be sustainable and useful to those who wish to use it in the future. TL;DR Trove, with support for a dozen or so databases today, finds itself in a bind because there are few developers, and a code-base with a significant amount of technical debt. Some architectural choices which the team made over the years have consequences which make the project less than ideal for deployers. Given that there are no major production deployments of Trove at present, this provides us an opportunity to reset the project, learn from our v1 and come up with a strong v2. An important aspect of making this proposal work is that we seek to eliminate the effort (planning, and coding) involved in migrating existing Trove v1 deployments to the proposed Trove v2. Effectively, with work beginning on Trove v2 as proposed here, Trove v1 as released with Pike will be marked as deprecated and users will have to migrate to Trove v2 when it becomes available. While I would very much like to continue to support the users on Trove v1 through this transition, the simple fact is that absent community participation this will be impossible. Furthermore, given that there are no production deployments of Trove at this time, it seems pointless to build that upgrade path from Trove v1 to Trove v2; it would be the proverbial bridge from nowhere. This (previous) statement is, I realize, contentious. There are those who have told me that an upgrade path must be provided, and there are those who have told me of unnamed deployments of Trove that would suffer. 
To this, all I can say is that if an upgrade path is of value to you, then please commit the development resources to participate in the community to make that possible. But equally, preventing a v2 of Trove or delaying it will only make the v1 that we have today less valuable. We have learned a lot from v1, and the hope is that we can address that in v2. Some of the more significant things that I have learned are: - We should adopt a versioned front-end API from the very beginning; making the REST API versioned is not a ‘v2 feature’ - A guest agent running on a tenant instance, with connectivity to a shared management message bus, is a security loophole; encrypting traffic, per-tenant passwords, and any other scheme is merely lipstick on a security hole - Reliance on Nova for compute resources is fine, but dependence on Nova VM-specific capabilities (like instance rebuild) is not; it makes things like containers or bare-metal second-class citizens - A fair portion of what Trove does is resource orchestration; don’t reinvent the wheel, there’s Heat for that.
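The first lesson above — version the REST API from day one — can be illustrated with a minimal sketch. This is not Trove's actual code; the route table, handler name, and payload are hypothetical, and a real service would use a proper web framework. The point is only that every route carries an explicit version segment from the first release, so a `/v2/` tree can later be added alongside `/v1/` without breaking existing clients:

```python
# Hypothetical handler -- not Trove's real API. Illustrates URL-versioned
# dispatch, where the version prefix is mandatory from the very beginning.
def list_instances_v1():
    return {"instances": [], "api_version": "v1"}

ROUTES = {
    ("GET", "/v1/instances"): list_instances_v1,
    # A later release adds ("GET", "/v2/instances") alongside, never instead.
}

def dispatch(method, path):
    """Resolve a (method, path) pair against the versioned route table."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return 404, {"error": "unknown route (is the version prefix present?)"}
    return 200, handler()

if __name__ == "__main__":
    print(dispatch("GET", "/v1/instances"))   # versioned: handled
    print(dispatch("GET", "/instances"))      # unversioned: rejected
```

Because unversioned paths are rejected outright, no client can come to depend on an implicit "default" version — which is exactly what makes retrofitting versioning later so painful.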
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
hard to get all of it, users can assume it's all there, >>> and devs don't have many silos to cross to implement features that touch >>> multiple pieces. >>> >> >> I think it's kind of hysterical that you're advocating a monolithic >> approach when the thing you're advocating (k8s) is all about enabling >> non-monolithic microservices architectures. >> >> Look, the fact of the matter is that OpenStack's mission is larger than >> that of Kubernetes. And to say that "Ops don't have to work hard" to get >> and maintain a Kubernetes deployment (which, frankly, tends to be dozens of >> Kubernetes deployments, one for each tenant/project/namespace) is >> completely glossing over the fact that by abstracting away the >> infrastructure (k8s' "cloud provider" concept), Kubernetes developers >> simply get to ignore some of the hardest and trickiest parts of operations. >> >> So, let's try to compare apples to apples, shall we? >> >> It sounds like the end goal that you're advocating -- more than anything >> else -- is an easy-to-install package of OpenStack services that provides a >> Kubernetes-like experience for application developers. >> >> I 100% agree with that goal. 100%. >> >> But pulling Neutron, Cinder, Keystone, Designate, Barbican, and Octavia >> back into Nova is not the way to do that. You're trying to solve a >> packaging and installation problem with a code structure solution. >> >> In fact, if you look at the Kubernetes development community, you see the >> *opposite* direction being taken: they have broken out and are actively >> breaking out large pieces of the Kubernetes repository/codebase into >> separate repositories and addons/plugins. And this is being done to >> *accelerate* development of Kubernetes in very much the same way that >> splitting services out of Nova was done to accelerate the development of >> those various pieces of infrastructure code.
>> >> This core functionality being combined has allowed them to land features >>> that are really important to users but has proven difficult for OpenStack >>> to do because of the silos. OpenStack's general pattern has been, stand up >>> a new service for a new feature, then no one wants to depend on it so it's >>> ignored and each silo reimplements a lesser version of it themselves. >>> >> >> I disagree. I believe the reason Kubernetes is able to land features that >> are "really important to users" is primarily due to the following reasons: >> >> 1) The Kubernetes technical leadership strongly resists pressure from >> vendors to add yet-another-specialized-feature to the codebase. This >> ability to say "No" pays off in spades with regards to stability and focus. >> >> 2) The mission of Kubernetes is much smaller than OpenStack. If the >> OpenStack community were able to say "OpenStack is a container >> orchestration system", and not "OpenStack is a ubiquitous open source cloud >> operating system", we'd probably be able to deliver features in a more >> focused fashion. >> >> The OpenStack commons then continues to suffer. >>> >>> We need to stop this destructive cycle. >>> >>> OpenStack needs to figure out how to increase its commons. Both >>> internally and externally. etcd as a common service was a step in the right >>> direction. >>> >>> I think k8s needs to be another common service all the others can rely >>> on. That could greatly simplify the rest of the OpenStack projects as a lot >>> of its functionality no longer has to be implemented in each project. >>> >> >> I don't disagree with the goal of being able to rely on Kubernetes for >> many things. But relying on Kubernetes doesn't solve the "I want some >> easy-to-install infrastructure" problem. Nor does it solve the types of >> advanced networking scenarios that the NFV community requires.
>> >> We also need a way to break down the silo walls and allow more >>> cross-project collaboration for features. I fear the new push for letting >>> projects run standalone will make this worse, not better, further >>> fracturing OpenStack. >>> >> >> Perhaps you are referring to me with the above? As I said on Twitter, >> "Make your #OpenStack project usable by and useful for things outside of >> the OpenStack ecosystem. Fewer deps. Do one thing well. Solid APIs."
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Kevin, just one comment inline below. On Thu, Jun 22, 2017 at 3:33 PM, Fox, Kevin M wrote: > No, I'm not necessarily advocating a monolithic approach. > > I'm saying that they have decided to start with functionality and accept > what's needed to get the task done. There's not really such strong walls > between the various functionality, rbac/secrets/kubelet/etc. They don't > spawn off a whole new project just to add functionality. They do so only > when needed. They also don't balk at one feature depending on another. > > rbac's important, so they implemented it. ssl cert management was > important, so they added that. Adding a feature that restricts secret > downloads only to the physical nodes that need them could then reuse the rbac > system and ssl cert management. > > Their sigs are more oriented to features/functionality (or categories > thereof), not as much specific components. We need to do X. X may involve > changes to components A and B. > > OpenStack now tends to start with A and B and we try and work backwards > towards implementing X, which is hard due to the strong walls and unclear > ownership of the feature. And the general solution has been to try and make > C but not commit to C being in the core so users can't depend on it, which > hasn't proven to be a very successful pattern. > > You're right, they are breaking up their code base as needed, like nova did. > I'm coming around to that being a pretty good approach to some things. > Starting things is simpler, and if it ends up not needing its own whole > project, then it doesn't get one. If it needs one, then it gets one. It's > not, by default, start a whole new project with db user, db schema, api, > scheduler, etc. And the project might not end up with daemons split up in > exactly the way you would expect if you prematurely optimized by breaking off a > project not knowing exactly how it might integrate with everything else. > > Maybe the porcelain api that's been discussed for a while is part of the > solution.
initial stuff can be prototyped/started there and break off as needed > to separate projects and moved around without the user needing to know > where it ends up. > > You're right that OpenStack's scope is much greater, and I think that the > commons are even more important in that case. If it doesn't have a solid > base, every project has to re-implement its own base. That takes a huge > amount of manpower all around. It's not sustainable. > > I guess we've gotten pretty far away from discussing Trove at this point. > Please keep the conversation going. > > Thanks, > Kevin > ____________________ > From: Jay Pipes [jaypi...@gmail.com] > Sent: Thursday, June 22, 2017 10:05 AM > To: openstack-dev@lists.openstack.org > Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect > Trove > > On 06/22/2017 11:59 AM, Fox, Kevin M wrote: > > My $0.02. > > > > That view of dependencies is why Kubernetes development is outpacing > OpenStack's and some users are leaving IMO. Not trying to be mean here but > trying to shine some light on this issue. > > > > Kubernetes at its core has essentially something kind of equivalent to > keystone (k8s rbac), nova (container mgmt), cinder (pv/pvc/storageclasses), > heat with convergence (deployments/daemonsets/etc), barbican (secrets), > designate (kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops > don't have to work hard to get all of it, users can assume it's all there, > and devs don't have many silos to cross to implement features that touch > multiple pieces. > > I think it's kind of hysterical that you're advocating a monolithic > approach when the thing you're advocating (k8s) is all about enabling > non-monolithic microservices architectures. > > Look, the fact of the matter is that OpenStack's mission is larger than > that of Kubernetes.
And to say that "Ops don't have to work hard" to get > and maintain a Kubernetes deployment (which, frankly, tends to be dozens > of Kubernetes deployments, one for each tenant/project/namespace) is > completely glossing over the fact that by abstracting away the > infrastructure (k8s' "cloud provider" concept), Kubernetes developers > simply get to ignore some of the hardest and trickiest parts of operations. > > So, let's try to compare apples to apples, shall we? > > It sounds like the end goal that you're advocating -- more than anything > else -- is an easy-to-install package of OpenStack services that > provides a Kubernetes-like experience for application developers. > > I 100% agree with that goal. 100%. > > But
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 06/22/2017 11:59 AM, Fox, Kevin M wrote: My $0.02. That view of dependencies is why Kubernetes development is outpacing OpenStack's and some users are leaving IMO. Not trying to be mean here but trying to shine some light on this issue. Kubernetes at its core has essentially something kind of equivalent to keystone (k8s rbac), nova (container mgmt), cinder (pv/pvc/storageclasses), heat with convergence (deployments/daemonsets/etc), barbican (secrets), designate (kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops don't have to work hard to get all of it, users can assume it's all there, and devs don't have many silos to cross to implement features that touch multiple pieces. This core functionality being combined has allowed them to land features that are really important to users but has proven difficult for OpenStack to do because of the silos. OpenStack's general pattern has been, stand up a new service for a new feature, then no one wants to depend on it so it's ignored and each silo reimplements a lesser version of it themselves. The OpenStack commons then continues to suffer. We need to stop this destructive cycle. OpenStack needs to figure out how to increase its commons. Both internally and externally. etcd as a common service was a step in the right direction. +1 to this, and it's a similar theme to my dismay a few weeks ago when I realized projects are looking to ditch oslo rather than improve it; since then I got to chase down a totally avoidable problem in Zaqar that's been confusing dozens of people because Zaqar implemented its database layer as direct-to-SQLAlchemy rather than using oslo.db (https://bugs.launchpad.net/tripleo/+bug/1691951) and missed out on some basic stability features that oslo.db turns on. There is a balance to be struck between monolithic and expansive for sure, but I think the monolith-phobia may be affecting the quality of the product.
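The "stability features that oslo.db turns on" are things like connection liveness checks and retry-on-disconnect that every service would otherwise hand-roll. As a rough, stdlib-only sketch of that idea (not oslo.db's actual API, which works through SQLAlchemy engine facades and configuration options), here is the kind of retry wrapper a shared database layer can provide once for everyone:

```python
import sqlite3
import time

def execute_with_retry(conn_factory, sql, params=(), attempts=3, delay=0.0):
    """Run a statement, reconnecting and retrying on operational errors.

    A toy stand-in for the kind of stability behavior a shared database
    layer (like oslo.db) switches on for every project at once, instead
    of each service reimplementing it against SQLAlchemy directly.
    """
    last_error = None
    for _ in range(attempts):
        try:
            conn = conn_factory()  # fresh connection per attempt
            try:
                cur = conn.execute(sql, params)
                rows = cur.fetchall()
                conn.commit()
                return rows
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            last_error = exc
            time.sleep(delay)  # back off before reconnecting
    raise last_error

if __name__ == "__main__":
    make_conn = lambda: sqlite3.connect(":memory:")
    print(execute_with_retry(make_conn, "SELECT 1 + 1"))
```

A project calling SQLAlchemy directly gets none of this unless it writes and maintains it itself, which is exactly the "each silo reimplements a lesser version" pattern being criticized.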
It is possible to have clean modularity and separation of concerns in a codebase while still having tighter dependencies; it just takes more discipline to monitor the boundaries. I think k8s needs to be another common service all the others can rely on. That could greatly simplify the rest of the OpenStack projects as a lot of its functionality no longer has to be implemented in each project. We also need a way to break down the silo walls and allow more cross-project collaboration for features. I fear the new push for letting projects run standalone will make this worse, not better, further fracturing OpenStack. Thanks, Kevin From: Thierry Carrez [thie...@openstack.org] Sent: Thursday, June 22, 2017 12:58 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove Fox, Kevin M wrote: [...] If you build a Tessmaster clone just to do mariadb, then you share nothing with the other communities and have to reinvent the wheel, yet again. Operators' load increases because the tool doesn't function like other tools. If you rely on a container orchestration engine that's already cross-cloud, that can be easily deployed by user or cloud operator, and fill in the gaps with what Trove wants to support, easy management of db's, you get to reuse a lot of the commons, and the users' slight increase in investment in dealing with the bit of extra plumbing in there allows other things to also be easily added to their cluster. It's very rare that a user would need to deploy/manage only a database. The net load on the operator decreases, not increases. I think the user-side tool could totally deploy on Kubernetes clusters -- if that was the only possible target that would make it a Kubernetes tool more than an open infrastructure tool, but that's definitely a possibility. I'm not sure work is needed there though, there are already tools (or charts) doing that?
For a server-side approach where you want to provide a DB-provisioning API, I fear that making the functionality depend on K8s would mean that TroveV2/Hoard would not only depend on Heat and Nova, but also on something that would deploy a Kubernetes cluster (Magnum?), which would likely hurt its adoption (and reusability in simpler setups). Since databases would just work perfectly well in VMs, it feels like a gratuitous dependency addition? We generally need to be very careful about creating dependencies between OpenStack projects. On one side there are base services (like Keystone) that we said it was alright to depend on, but depending on anything else is likely to reduce adoption. Magnum adoption suffers from its dependency on Heat. If Heat starts depending on Zaqar, we make the problem worse. I understand it's a hard trade-off: you want to reuse functionality rather than reinvent it in every project... we just need
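To make Kevin's "fill in the gaps" suggestion from earlier in the thread concrete: managing a database on Kubernetes largely amounts to emitting a manifest and letting the orchestrator converge on it. The sketch below builds a minimal StatefulSet-shaped manifest for a single MariaDB node as a plain Python dict; the image tag, secret name, and field values are illustrative assumptions, not a supported Trove or chart layout, and a real chart/operator would add a selector, labels, probes, PVC templates, backups, and more:

```python
import json

def mariadb_manifest(name, password_secret="mariadb-root"):
    """Build a minimal StatefulSet-shaped manifest for one MariaDB node.

    Illustrative only: real deployments need selector/labels, readiness
    probes, volumeClaimTemplates, anti-affinity, backup hooks, etc.
    """
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "template": {
                "spec": {
                    "containers": [{
                        "name": "mariadb",
                        # Vendor-supplied image reused as-is, rather than a
                        # Trove-built guest image.
                        "image": "mariadb:10.11",
                        "env": [{
                            "name": "MARIADB_ROOT_PASSWORD",
                            "valueFrom": {"secretKeyRef": {
                                "name": password_secret, "key": "password"}},
                        }],
                        "volumeMounts": [{"name": "data",
                                          "mountPath": "/var/lib/mysql"}],
                    }],
                },
            },
        },
    }

if __name__ == "__main__":
    print(json.dumps(mariadb_manifest("trove-db-1"), indent=2))
```

This also illustrates Thierry's dependency concern: everything above presupposes a Kubernetes cluster already exists, which is precisely the extra dependency a server-side Trove would be taking on.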
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 23/06/17 05:31, Thierry Carrez wrote: Zane Bitter wrote: But back in the day we had a process (incubation) for adding stuff to OpenStack that it made sense to depend on being there. It was a highly imperfect process. We got rid of that process with the big tent reform, but didn't really replace it with anything at all. Tags never evolved into a replacement as I hoped they would. So now we have a bunch of things that are integral to building a "Kubernetes-like experience for application developers" - secret storage, DNS, load balancing, asynchronous messaging - that exist but are not in most clouds. Yet another tangent in that thread, but you seem to regret a past that never happened. It kind of did. The TC used to require that new projects graduating into OpenStack didn't reimplement anything that an existing project in the integrated release already did. e.g. Sahara and Trove were required to use Heat for orchestration rather than rolling their own orchestration. The very strong implication was that once something was officially included in OpenStack you didn't develop the same thing again. It's true that nothing was ever enforced against existing projects (the only review was at incubation/graduation), but then again I can't think of a situation where it would have come up at that time. The "integrated release" was never about stuff that you can "depend on being there". It was about things that were tested to work well together, and released together. Projects were incubating until they were deemed mature-enough (and embedded-enough in our community) that it was fine for other projects to take the hit to be tested with them, and take the risk of being released together. I don't blame you for thinking otherwise: since the integrated release was the only answer we gave, everyone assumed it answered their specific question[1]. And that was why we needed to get rid of it. I agree and I supported getting rid of it. 
But not all of the roles it fulfilled (intended or otherwise) were replaced with anything. One of the things that fell by the wayside was the sense some of us had that we were building an integrated product with flexible deployment options, rather than a series of disconnected islands. If it was really about stuff you can "depend on being there" then most OpenStack clouds would have had Swift, Ceilometer, Trove and Sahara. Stuff you can "depend on being there" is a relatively new concept: https://governance.openstack.org/tc/reference/base-services.html Yes, we can (and should) add more of those when they are relevant to most OpenStack deployments, otherwise projects will never start depending on Barbican and continue to NIH secrets management locally. But since any addition comes with a high operational cost, we need to consider them very carefully. +1 We should also consider use cases and group projects together (a concept we start to call "constellations"). Yes, it would be great that, if you have an IaaS/Compute use case, you could assume Designate is part of the mix. +1 [1] https://ttx.re/facets-of-the-integrated-release.html __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Zane Bitter wrote: > But back in the day we had a process (incubation) for adding stuff to > OpenStack that it made sense to depend on being there. It was a highly > imperfect process. We got rid of that process with the big tent reform, > but didn't really replace it with anything at all. Tags never evolved > into a replacement as I hoped they would. > > So now we have a bunch of things that are integral to building a > "Kubernetes-like experience for application developers" - secret > storage, DNS, load balancing, asynchronous messaging - that exist but > are not in most clouds. Yet another tangent in that thread, but you seem to regret a past that never happened. The "integrated release" was never about stuff that you can "depend on being there". It was about things that were tested to work well together, and released together. Projects were incubating until they were deemed mature-enough (and embedded-enough in our community) that it was fine for other projects to take the hit to be tested with them, and take the risk of being released together. I don't blame you for thinking otherwise: since the integrated release was the only answer we gave, everyone assumed it answered their specific question[1]. And that was why we needed to get rid of it. If it was really about stuff you can "depend on being there" then most OpenStack clouds would have had Swift, Ceilometer, Trove and Sahara. Stuff you can "depend on being there" is a relatively-new concept: https://governance.openstack.org/tc/reference/base-services.html Yes, we can (and should) add more of those when they are relevant to most OpenStack deployments, otherwise projects will never start depending on Barbican and continue to NIH secrets management locally. But since any addition comes with a high operational cost, we need to consider them very carefully. We should also consider use cases and group projects together (a concept we start to call "constellations"). 
Yes, it would be great that, if you have an IaaS/Compute use case, you could assume Designate is part of the mix. [1] https://ttx.re/facets-of-the-integrated-release.html -- Thierry Carrez (ttx)
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
of Kubernetes deployments, one for each tenant/project/namespace) is completely glossing over the fact that by abstracting away the infrastructure (k8s' "cloud provider" concept), Kubernetes developers simply get to ignore some of the hardest and trickiest parts of operations. So, let's try to compare apples to apples, shall we? It sounds like the end goal that you're advocating -- more than anything else -- is an easy-to-install package of OpenStack services that provides a Kubernetes-like experience for application developers. I 100% agree with that goal. 100%. But pulling Neutron, Cinder, Keystone, Designate, Barbican, and Octavia back into Nova is not the way to do that. You're trying to solve a packaging and installation problem with a code structure solution. In fact, if you look at the Kubernetes development community, you see the *opposite* direction being taken: they have broken out and are actively breaking out large pieces of the Kubernetes repository/codebase into separate repositories and addons/plugins. And this is being done to *accelerate* development of Kubernetes in very much the same way that splitting services out of Nova was done to accelerate the development of those various pieces of infrastructure code. This core functionality being combined has allowed them to land features that are really important to users but has proven difficult for OpenStack to do because of the silos. OpenStack's general pattern has been, stand up a new service for a new feature, then no one wants to depend on it so it's ignored and each silo reimplements a lesser version of it themselves. I disagree. I believe the reason Kubernetes is able to land features that are "really important to users" is primarily due to the following reasons: 1) The Kubernetes technical leadership strongly resists pressure from vendors to add yet-another-specialized-feature to the codebase. This ability to say "No" pays off in spades with regards to stability and focus.
2) The mission of Kubernetes is much smaller than OpenStack. If the OpenStack community were able to say "OpenStack is a container orchestration system", and not "OpenStack is a ubiquitous open source cloud operating system", we'd probably be able to deliver features in a more focused fashion. The OpenStack commons then continues to suffer. We need to stop this destructive cycle. OpenStack needs to figure out how to increase its commons. Both internally and externally. etcd as a common service was a step in the right direction. I think k8s needs to be another common service all the others can rely on. That could greatly simplify the rest of the OpenStack projects as a lot of its functionality no longer has to be implemented in each project. I don't disagree with the goal of being able to rely on Kubernetes for many things. But relying on Kubernetes doesn't solve the "I want some easy-to-install infrastructure" problem. Nor does it solve the types of advanced networking scenarios that the NFV community requires. We also need a way to break down the silo walls and allow more cross project collaboration for features. I fear the new push for letting projects run standalone will make this worse, not better, further fracturing OpenStack. Perhaps you are referring to me with the above? As I said on Twitter, "Make your #OpenStack project usable by and useful for things outside of the OpenStack ecosystem. Fewer deps. Do one thing well. Solid APIs." I don't think that the above leads to "further fracturing OpenStack". I think it leads to solid, reusable components. Best, -jay Thanks, Kevin From: Thierry Carrez [thie...@openstack.org] Sent: Thursday, June 22, 2017 12:58 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove Fox, Kevin M wrote: [...] If you build a Tessmaster clone just to do mariadb, then you share nothing with the other communities and have to reinvent the wheel, yet again. 
Operators' load increases because the tool doesn't function like other tools. If you rely on a container orchestration engine that's already cross-cloud, that can be easily deployed by user or cloud operator, and fill in the gaps with what Trove wants to support, easy management of db's, you get to reuse a lot of the commons, and the users' slight increase in investment in dealing with the bit of extra plumbing in there allows other things to also be easily added to their cluster. It's very rare that a user would need to deploy/manage only a database. The net load on the operator decreases, not increases. I think the user-side tool could totally deploy on Kubernetes clusters -- if that was the only possible target that would make it a Kubernetes tool more than an open infrastructure tool, but that's definitely a possibility.
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
No, I'm not necessarily advocating a monolithic approach. I'm saying that they have decided to start with functionality and accept whats needed to get the task done. Theres not really such strong walls between the various functionality, rbac/secrets/kublet/etc. They don't spawn off a whole new project just to add functionality. they do so only when needed. They also don't balk at one feature depending on another. rbac's important, so they implemented it. ssl cert management was important. so they added that. adding a feature that restricts secret downloads only to the physical nodes need them, could then reuse the rbac system and ssl cert management. Their sigs are more oriented to features/functionality (or catagories there of), not as much specific components. We need to do X. X may involve changes to components A and B. OpenStack now tends to start with A and B and we try and work backwards towards implementing X, which is hard due to the strong walls and unclear ownership of the feature. And the general solution has been to try and make C but not commit to C being in the core so users cant depend on it which hasn't proven to be a very successful pattern. Your right, they are breaking up their code base as needed, like nova did. I'm coming around to that being a pretty good approach to some things. starting things is simpler, and if it ends up not needing its own whole project, then it doesn't get one. if it needs one, then it gets one. Its not by default, start whole new project with db user, db schema, api, scheduler, etc. And the project might not end up with daemons split up in exactly the way you would expect if you prepoptomized breaking off a project not knowing exactly how it might integrate with everything else. Maybe the porcelain api that's been discussed for a while is part of the solution. initial stuff can prototyped/start there and break off as needed to separate projects and moved around without the user needing to know where it ends up. 
You're right that OpenStack's scope is much greater, and I think that the commons are even more important in that case. If it doesn't have a solid base, every project has to re-implement its own base. That takes a huge amount of manpower all around. It's not sustainable. I guess we've gotten pretty far away from discussing Trove at this point. Thanks, Kevin From: Jay Pipes [jaypi...@gmail.com] Sent: Thursday, June 22, 2017 10:05 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove On 06/22/2017 11:59 AM, Fox, Kevin M wrote: > My $0.02. > > That view of dependencies is why Kubernetes development is outpacing > OpenStack's and some users are leaving, IMO. Not trying to be mean here but > trying to shine some light on this issue. > > Kubernetes at its core has essentially something kind of equivalent to > keystone (k8s rbac), nova (container mgmt), cinder (pv/pvc/storageclasses), > heat with convergence (deployments/daemonsets/etc), barbican (secrets), > designate (kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops > don't have to work hard to get all of it, users can assume it's all there, and > devs don't have many silos to cross to implement features that touch > multiple pieces. I think it's kind of hysterical that you're advocating a monolithic approach when the thing you're advocating (k8s) is all about enabling non-monolithic microservices architectures. Look, the fact of the matter is that OpenStack's mission is larger than that of Kubernetes. And to say that "Ops don't have to work hard" to get and maintain a Kubernetes deployment (which, frankly, tends to be dozens of Kubernetes deployments, one for each tenant/project/namespace) is completely glossing over the fact that by abstracting away the infrastructure (k8s' "cloud provider" concept), Kubernetes developers simply get to ignore some of the hardest and trickiest parts of operations. 
So, let's try to compare apples to apples, shall we? It sounds like the end goal that you're advocating -- more than anything else -- is an easy-to-install package of OpenStack services that provides a Kubernetes-like experience for application developers. I 100% agree with that goal. 100%. But pulling Neutron, Cinder, Keystone, Designate, Barbican, and Octavia back into Nova is not the way to do that. You're trying to solve a packaging and installation problem with a code structure solution. In fact, if you look at the Kubernetes development community, you see the *opposite* direction being taken: they have broken out and are actively breaking out large pieces of the Kubernetes repository/codebase into separate repositories and addons/plugins. And this is being done to *accelerate* development of Kubernetes in very much the same way that splitting services out of Nova was done to accelerate the development of those various pieces of infrastructure code.
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
> -----Original Message----- > From: Zane Bitter [mailto:zbit...@redhat.com] > Sent: June-20-17 4:57 PM > To: openstack-dev@lists.openstack.org > Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect > Trove > > On 20/06/17 11:45, Jay Pipes wrote: > > Good discussion, Zane. Comments inline. > > ++ > > > On 06/20/2017 11:01 AM, Zane Bitter wrote: > >> On 20/06/17 10:08, Jay Pipes wrote: > >>> On 06/20/2017 09:42 AM, Doug Hellmann wrote: > >>>> Does "service VM" need to be a first-class thing? Akanda creates > >>>> them, using a service user. The VMs are tied to a "router" which is > >>>> the billable resource that the user understands and interacts with > >>>> through the API. > >>> > >>> Frankly, I believe all of these types of services should be built as > >>> applications that run on OpenStack (or other) infrastructure. In > >>> other words, they should not be part of the infrastructure itself. > >>> > >>> There's really no need for a user of a DBaaS to have access to the > >>> host or hosts the DB is running on. If the user really wanted that, > >>> they would just spin up a VM/baremetal server and install the thing > >>> themselves. > >> > >> Hey Jay, > >> I'd be interested in exploring this idea with you, because I think > >> everyone agrees that this would be a good goal, but at least in my > >> mind it's not obvious what the technical solution should be. > >> (Actually, I've read your email a bunch of times now, and I go back > >> and forth on which one you're actually advocating for.) The two > >> options, as I see it, are as follows: > >> > >> 1) The database VMs are created in the user's tena^W project. They > >> connect directly to the tenant's networks, are governed by the user's > >> quota, and are billed to the project as Nova VMs (on top of whatever > >> additional billing might come along with the management services). 
A > >> [future] feature in Nova (https://review.openstack.org/#/c/438134/) > >> allows the Trove service to lock down access so that the user cannot > >> actually interact with the server using Nova, but must go through the > >> Trove API. On a cloud that doesn't include Trove, a user could run > >> Trove as an application themselves and all it would have to do > >> differently is not pass the service token to lock down the VM. > >> > >> alternatively: > >> > >> 2) The database VMs are created in a project belonging to the > >> operator of the service. They're connected to the user's network > >> through <?>, and isolated from other users' databases running in > >> the same project through <magic?>. > >> Trove has its own quota management and billing. The user cannot > >> interact with the server using Nova since it is owned by a different > >> project. On a cloud that doesn't include Trove, a user could run > >> Trove as an application themselves, by giving it credentials for > >> their own project and disabling all of the cross-tenant networking > stuff. > > > > None of the above :) > > > > Don't think about VMs at all. Or networking plumbing. Or volume > > storage or any of that. > > OK, but somebody has to ;) > > > Think only in terms of what a user of a DBaaS really wants. At the end > > of the day, all they want is an address in the cloud where they can > > point their application to write and read data from. > > > > Do they want that data connection to be fast and reliable? Of course, > > but how that happens is irrelevant to them. > > > > Do they want that data to be safe and backed up? Of course, but how > > that happens is irrelevant to them. > > Fair enough. The world has changed a lot since RDS (which was the model > for Trove) was designed, it's certainly worth reviewing the base > assumptions before embarking on a new design. 
> > > The problem with many of these high-level *aaS projects is that they > > consider their user to be a typical tenant of general cloud > > infrastructure -- focused on launching VMs and creating volumes and > > networks etc. And the discussions around the implementation of these > > projects always come back to minutiae about how to set up secure > > communication channels between a control plane message bus and the > > service VMs. >
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
tl;dr - I think Trove's successor has a future, but there are two conflicting ideas presented and Trove should pick one or the other. Excerpts from Amrith Kumar's message of 2017-06-18 07:35:49 -0400: > > We have learned a lot from v1, and the hope is that we can address that in > v2. Some of the more significant things that I have learned are: > > - We should adopt a versioned front-end API from the very beginning; making > the REST API versioned is not a ‘v2 feature’ > +1 > - A guest agent running on a tenant instance, with connectivity to a shared > management message bus is a security loophole; encrypting traffic, > per-tenant-passwords, and any other scheme is merely lipstick on a security > hole > This is a broad statement, and I'm not sure I understand the actual risk you're presenting here as "a security loophole". How else would you administer a database server than through some kind of agent? Whether that agent is a Python daemon of our making, sshd, or whatever Kubernetes component lets you change things, they're all administrative pieces that sit next to the resource. > - Reliance on Nova for compute resources is fine, but dependence on Nova VM > specific capabilities (like instance rebuild) is not; it makes things like > containers or bare-metal second class citizens > I wholeheartedly agree that rebuild is a poor choice for database servers. In fact, I believe it is a completely non-scalable feature that should not even exist in Nova. This is kind of a "we shouldn't be this". What should we be running database clusters on? > - A fair portion of what Trove does is resource orchestration; don’t > reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far > along when Trove got started but that’s not the case today and we have an > opportunity to fix that now > Yeah. You can do that. I'm not really sure what it gets you at this level. 
There was an effort a few years ago to use Heat for Trove and some other pieces, but they fell short at the point where they had to ask Heat for a few features like, oddly enough, rebuild confirmation after test. Also, it increases friction to your project if your project requires Heat in a cloud. That's a whole new service that one would have to choose to expose or not to users and manage just for Trove. That's a massive dependency, and it should come with something significant. I don't see what it actually gets you when you already have to keep track of your resources for cluster membership purposes anyway. > - A similarly significant portion of what Trove does is to implement a > state-machine that will perform specific workflows involved in implementing > database specific operations. This makes the Trove taskmanager a stateful > entity. Some of the operations could take a fair amount of time. This is a > serious architectural flaw > A state driven workflow is unavoidable if you're going to do cluster manipulation. So you can defer this off to Mistral or some other workflow engine, but I don't think it's an architectural flaw _that Trove does it_. Clusters have states. They have to be tracked. Do that well and your users will be happy. > - Tenants should not ever be able to directly interact with the underlying > storage and compute used by database instances; that should be the default > configuration, not an untested deployment alternative > Agreed. There's no point in having an "inside the cloud" service if you're just going to hand them the keys to the VMs and volumes anyway. The point of something like Trove is to be able to retain control at the operator level, and only give users the interface you promised, optimized without the limitations of the cloud. 
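The point that "clusters have states, they have to be tracked" is worth making concrete. Below is a minimal, purely illustrative sketch of a state-driven workflow of the kind being debated here; it is not Trove's actual taskmanager, and all state names and the class are hypothetical. The essence is an explicit transition table plus a persisted history, so illegal jumps are rejected rather than silently applied:

```python
# Hypothetical sketch of a state-driven cluster workflow: each long-running
# operation walks a cluster through explicit states, and a transition that
# is not in the table is rejected instead of corrupting cluster state.

ALLOWED_TRANSITIONS = {
    "BUILDING": {"ACTIVE", "ERROR"},
    "ACTIVE": {"GROWING", "BACKING_UP", "ERROR"},
    "GROWING": {"ACTIVE", "ERROR"},
    "BACKING_UP": {"ACTIVE", "ERROR"},
    "ERROR": set(),
}

class ClusterStateMachine:
    def __init__(self, cluster_id, state="BUILDING"):
        self.cluster_id = cluster_id
        self.state = state
        self.history = [state]  # what a real taskmanager would persist

    def advance(self, new_state):
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)
        return self.state

cluster = ClusterStateMachine("galera-1")
cluster.advance("ACTIVE")
cluster.advance("GROWING")   # e.g. adding a member to the cluster
cluster.advance("ACTIVE")
print(cluster.history)
```

Whether this table lives in Trove's taskmanager or is deferred to a workflow engine like Mistral is exactly the trade-off under discussion; either way, the states and transitions have to be modeled somewhere.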
> - The CI should test all databases that are considered to be ‘supported’ > without excessive use of resources in the gate; better code modularization > will help determine the tests which can safely be skipped in testing changes > Take the same approach as the other driver-hosting things. If it's in-tree, it has to have a gate test. > - Clusters should be first class citizens not an afterthought, single > instance databases may be the ‘special case’, not the other way around > +1 > - The project must provide guest images (or at least complete tooling for > deployers to build these); while the project can’t distribute operating > systems and database software, the current deployment model merely impedes > adoption > IIRC the project provides dib elements and a basic command line to build images for your cloud, yes? Has that not worked out? > - Clusters spanning OpenStack deployments are a real thing that must be > supported > This is the most problematic thing you asserted. There are two basic desires I see that drive a Trove adoption: 1) I need database clusters and I don't know how to do them right. 2) I need _high performance/availability/capacity_ databases and my cloud's standard VM flavors/hosts/networks/disks/etc. stand in the way of that. For t
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 06/22/2017 11:59 AM, Fox, Kevin M wrote: My $0.02. That view of dependencies is why Kubernetes development is outpacing OpenStack's and some users are leaving, IMO. Not trying to be mean here but trying to shine some light on this issue. Kubernetes at its core has essentially something kind of equivalent to keystone (k8s rbac), nova (container mgmt), cinder (pv/pvc/storageclasses), heat with convergence (deployments/daemonsets/etc), barbican (secrets), designate (kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops don't have to work hard to get all of it, users can assume it's all there, and devs don't have many silos to cross to implement features that touch multiple pieces. I think it's kind of hysterical that you're advocating a monolithic approach when the thing you're advocating (k8s) is all about enabling non-monolithic microservices architectures. Look, the fact of the matter is that OpenStack's mission is larger than that of Kubernetes. And to say that "Ops don't have to work hard" to get and maintain a Kubernetes deployment (which, frankly, tends to be dozens of Kubernetes deployments, one for each tenant/project/namespace) is completely glossing over the fact that by abstracting away the infrastructure (k8s' "cloud provider" concept), Kubernetes developers simply get to ignore some of the hardest and trickiest parts of operations. So, let's try to compare apples to apples, shall we? It sounds like the end goal that you're advocating -- more than anything else -- is an easy-to-install package of OpenStack services that provides a Kubernetes-like experience for application developers. I 100% agree with that goal. 100%. But pulling Neutron, Cinder, Keystone, Designate, Barbican, and Octavia back into Nova is not the way to do that. You're trying to solve a packaging and installation problem with a code structure solution. 
In fact, if you look at the Kubernetes development community, you see the *opposite* direction being taken: they have broken out and are actively breaking out large pieces of the Kubernetes repository/codebase into separate repositories and addons/plugins. And this is being done to *accelerate* development of Kubernetes in very much the same way that splitting services out of Nova was done to accelerate the development of those various pieces of infrastructure code. This core functionality being combined has allowed them to land features that are really important to users but that have proven difficult for OpenStack to deliver because of the silos. OpenStack's general pattern has been: stand up a new service for a new feature, then no one wants to depend on it, so it's ignored and each silo reimplements a lesser version of it themselves. I disagree. I believe the reason Kubernetes is able to land features that are "really important to users" is primarily due to the following reasons: 1) The Kubernetes technical leadership strongly resists pressure from vendors to add yet-another-specialized-feature to the codebase. This ability to say "No" pays off in spades with regards to stability and focus. 2) The mission of Kubernetes is much smaller than OpenStack's. If the OpenStack community were able to say "OpenStack is a container orchestration system", and not "OpenStack is a ubiquitous open source cloud operating system", we'd probably be able to deliver features in a more focused fashion. The OpenStack commons then continues to suffer. We need to stop this destructive cycle. OpenStack needs to figure out how to increase its commons. Both internally and externally. etcd as a common service was a step in the right direction. I think k8s needs to be another common service all the others can rely on. That could greatly simplify the rest of the OpenStack projects as a lot of its functionality no longer has to be implemented in each project. 
I don't disagree with the goal of being able to rely on Kubernetes for many things. But relying on Kubernetes doesn't solve the "I want some easy-to-install infrastructure" problem. Nor does it solve the types of advanced networking scenarios that the NFV community requires. We also need a way to break down the silo walls and allow more cross project collaboration for features. I fear the new push for letting projects run standalone will make this worse, not better, further fracturing OpenStack. Perhaps you are referring to me with the above? As I said on Twitter, "Make your #OpenStack project usable by and useful for things outside of the OpenStack ecosystem. Fewer deps. Do one thing well. Solid APIs." I don't think that the above leads to "further fracturing OpenStack". I think it leads to solid, reusable components. Best, -jay Thanks, Kevin
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
2017-06-22 18:59 GMT+03:00 Fox, Kevin M : > My $0.02. > > That view of dependencies is why Kubernetes development is outpacing > OpenStack's and some users are leaving, IMO. Not trying to be mean here but > trying to shine some light on this issue. > > Kubernetes at its core has essentially something kind of equivalent to > keystone (k8s rbac), nova (container mgmt), cinder (pv/pvc/storageclasses), > heat with convergence (deployments/daemonsets/etc), barbican (secrets), > designate (kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops > don't have to work hard to get all of it, users can assume it's all there, > and devs don't have many silos to cross to implement features that touch > multiple pieces. > > This core functionality being combined has allowed them to land features > that are really important to users but that have proven difficult for OpenStack > to deliver because of the silos. OpenStack's general pattern has been: stand up > a new service for a new feature, then no one wants to depend on it, so it's > ignored and each silo reimplements a lesser version of it themselves. Totally agree. > The OpenStack commons then continues to suffer. > > We need to stop this destructive cycle. > > OpenStack needs to figure out how to increase its commons. Both internally > and externally. etcd as a common service was a step in the right direction. > > I think k8s needs to be another common service all the others can rely on. > That could greatly simplify the rest of the OpenStack projects as a lot of > its functionality no longer has to be implemented in each project. > > We also need a way to break down the silo walls and allow more cross > project collaboration for features. I fear the new push for letting > projects run standalone will make this worse, not better, further > fracturing OpenStack. 
> > Thanks, > Kevin > > From: Thierry Carrez [thie...@openstack.org] > Sent: Thursday, June 22, 2017 12:58 AM > To: openstack-dev@lists.openstack.org > Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect > Trove > > Fox, Kevin M wrote: > > [...] > > If you build a Tessmaster clone just to do mariadb, then you share > nothing with the other communities and have to reinvent the wheel, yet > again. Operator load increases because the tool doesn't function like > other tools. > > > > If you rely on a container orchestration engine that's already cross > cloud that can be easily deployed by user or cloud operator, and fill in > the gaps with what Trove wants to support, easy management of DBs, you get > to reuse a lot of the commons, and the user's slight increase in investment > in dealing with the bit of extra plumbing in there allows other things to > also be easily added to their cluster. It's very rare that a user would need > to deploy/manage only a database. The net load on the operator decreases, > not increases. > > I think the user-side tool could totally deploy on Kubernetes clusters > -- if that was the only possible target that would make it a Kubernetes > tool more than an open infrastructure tool, but that's definitely a > possibility. I'm not sure work is needed there though, there are already > tools (or charts) doing that? > > For a server-side approach where you want to provide a DB-provisioning > API, I fear that making the functionality depend on K8s would make > TroveV2/Hoard depend not only on Heat and Nova, but also on > something that would deploy a Kubernetes cluster (Magnum?), which would > likely hurt its adoption (and reusability in simpler setups). Since > databases would just work perfectly well in VMs, it feels like a > gratuitous dependency addition? > > We generally need to be very careful about creating dependencies between > OpenStack projects. 
On one side there are base services (like Keystone) > that we said it was alright to depend on, but depending on anything else > is likely to reduce adoption. Magnum adoption suffers from its > dependency on Heat. If Heat starts depending on Zaqar, we make the > problem worse. I understand it's a hard trade-off: you want to reuse > functionality rather than reinvent it in every project... we just need > to recognize the cost of doing that. > > -- > Thierry Carrez (ttx) > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
My $0.02. That view of dependencies is why Kubernetes development is outpacing OpenStack's and some users are leaving, IMO. Not trying to be mean here but trying to shine some light on this issue. Kubernetes at its core has essentially something kind of equivalent to keystone (k8s rbac), nova (container mgmt), cinder (pv/pvc/storageclasses), heat with convergence (deployments/daemonsets/etc), barbican (secrets), designate (kube-dns), and octavia (kube-proxy,svc,ingress) in one unit. Ops don't have to work hard to get all of it, users can assume it's all there, and devs don't have many silos to cross to implement features that touch multiple pieces. This core functionality being combined has allowed them to land features that are really important to users but that have proven difficult for OpenStack to deliver because of the silos. OpenStack's general pattern has been: stand up a new service for a new feature, then no one wants to depend on it, so it's ignored and each silo reimplements a lesser version of it themselves. The OpenStack commons then continues to suffer. We need to stop this destructive cycle. OpenStack needs to figure out how to increase its commons. Both internally and externally. etcd as a common service was a step in the right direction. I think k8s needs to be another common service all the others can rely on. That could greatly simplify the rest of the OpenStack projects as a lot of its functionality no longer has to be implemented in each project. We also need a way to break down the silo walls and allow more cross project collaboration for features. I fear the new push for letting projects run standalone will make this worse, not better, further fracturing OpenStack. Thanks, Kevin From: Thierry Carrez [thie...@openstack.org] Sent: Thursday, June 22, 2017 12:58 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove Fox, Kevin M wrote: > [...] 
> If you build a Tessmaster clone just to do mariadb, then you share nothing > with the other communities and have to reinvent the wheel, yet again. > Operator load increases because the tool doesn't function like other tools. > > If you rely on a container orchestration engine that's already cross cloud > that can be easily deployed by user or cloud operator, and fill in the gaps > with what Trove wants to support, easy management of DBs, you get to reuse a > lot of the commons, and the user's slight increase in investment in dealing > with the bit of extra plumbing in there allows other things to also be easily > added to their cluster. It's very rare that a user would need to deploy/manage > only a database. The net load on the operator decreases, not increases. I think the user-side tool could totally deploy on Kubernetes clusters -- if that was the only possible target that would make it a Kubernetes tool more than an open infrastructure tool, but that's definitely a possibility. I'm not sure work is needed there though, there are already tools (or charts) doing that? For a server-side approach where you want to provide a DB-provisioning API, I fear that making the functionality depend on K8s would make TroveV2/Hoard depend not only on Heat and Nova, but also on something that would deploy a Kubernetes cluster (Magnum?), which would likely hurt its adoption (and reusability in simpler setups). Since databases would just work perfectly well in VMs, it feels like a gratuitous dependency addition? We generally need to be very careful about creating dependencies between OpenStack projects. On one side there are base services (like Keystone) that we said it was alright to depend on, but depending on anything else is likely to reduce adoption. Magnum adoption suffers from its dependency on Heat. If Heat starts depending on Zaqar, we make the problem worse. 
I understand it's a hard trade-off: you want to reuse functionality rather than reinvent it in every project... we just need to recognize the cost of doing that. -- Thierry Carrez (ttx)
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Fox, Kevin M wrote: > [...] > If you build a Tessmaster clone just to do mariadb, then you share nothing > with the other communities and have to reinvent the wheel, yet again. > Operator load increases because the tool doesn't function like other tools. > > If you rely on a container orchestration engine that's already cross cloud > that can be easily deployed by user or cloud operator, and fill in the gaps > with what Trove wants to support, easy management of DBs, you get to reuse a > lot of the commons, and the user's slight increase in investment in dealing > with the bit of extra plumbing in there allows other things to also be easily > added to their cluster. It's very rare that a user would need to deploy/manage > only a database. The net load on the operator decreases, not increases. I think the user-side tool could totally deploy on Kubernetes clusters -- if that was the only possible target that would make it a Kubernetes tool more than an open infrastructure tool, but that's definitely a possibility. I'm not sure work is needed there though, there are already tools (or charts) doing that? For a server-side approach where you want to provide a DB-provisioning API, I fear that making the functionality depend on K8s would make TroveV2/Hoard depend not only on Heat and Nova, but also on something that would deploy a Kubernetes cluster (Magnum?), which would likely hurt its adoption (and reusability in simpler setups). Since databases would just work perfectly well in VMs, it feels like a gratuitous dependency addition? We generally need to be very careful about creating dependencies between OpenStack projects. On one side there are base services (like Keystone) that we said it was alright to depend on, but depending on anything else is likely to reduce adoption. Magnum adoption suffers from its dependency on Heat. If Heat starts depending on Zaqar, we make the problem worse. 
I understand it's a hard trade-off: you want to reuse functionality rather than reinvent it in every project... we just need to recognize the cost of doing that. -- Thierry Carrez (ttx)
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Thank you Kevin. Lots of container (specific?) goodness here. -amrith -- Amrith Kumar Phone: +1-978-563-9590 On Mon, Jun 19, 2017 at 2:34 PM, Fox, Kevin M wrote: > Thanks for starting this difficult discussion. > > I think I agree with all the lessons learned except the nova one. While > you can treat containers and VMs the same, after years of using both, I > really don't think it's a good idea to treat them equally. Containers can't > work properly if used as a VM. (really, really.) > > I agree wholeheartedly with your statement that it's mostly an > orchestration problem and should reuse stuff now that there are options. > > I would propose the following, which I think meets your goals and could > widen your contributor base substantially: > > Look at the Kubernetes (k8s) concept of Operator -> > https://coreos.com/blog/introducing-operators.html > > They allow application-specific logic to be added to Kubernetes while > reusing the rest of k8s to do what it's good at: container orchestration. > etcd is just a clustered database, and if the operator concept works for it, > it should also work for other databases such as Galera. > > Where I think the containers/VM thing is incompatible is the thing I think > will make Trove's life easier. You can think of a member of the database as a > few different components, such as: > * main database process > * metrics gatherer (such as https://github.com/prometheus/mysqld_exporter) > * trove_guest_agent > > With the current approach, all are mixed into the same VM image, making it > very difficult to update the trove_guest_agent without touching the main > database process (needed when you upgrade the Trove controllers). With the > k8s sidecar concept, each would be a separate container loaded into the > same pod. > > So rather than needing to maintain a Trove image for every possible > combination of db version, Trove version, etc., you can reuse upstream > database containers along with Trove-provided guest agents. 
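To make the sidecar layout above concrete, here is a hedged sketch of what such a pod might look like, expressed as a plain Python dict in the shape of a Kubernetes Pod manifest. The image names, container names, and the helper function are all hypothetical illustrations, not a real Trove or Kubernetes artifact; the point is only that the three components live in one pod as separate containers:

```python
# Hypothetical sketch of the pod layout described above: the database, the
# metrics exporter, and the guest agent run as separate containers in one
# pod, so the agent can be upgraded without rebuilding the database image.
# All image/container names here are illustrative assumptions.

def make_db_pod(name, db_image, agent_image):
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": "trove-db"}},
        "spec": {
            "containers": [
                # main database process: an unmodified upstream image
                {"name": "database", "image": db_image},
                # metrics gatherer sidecar (e.g. a mysqld exporter)
                {"name": "metrics", "image": "prom/mysqld-exporter"},
                # guest agent sidecar, versioned with the control plane
                {"name": "guest-agent", "image": agent_image},
            ],
        },
    }

pod = make_db_pod("mysql-0", "mysql:5.7", "trove/guest-agent:v2")
print([c["name"] for c in pod["spec"]["containers"]])
```

Upgrading the control plane would then mean swapping only the `guest-agent` image in this spec, leaving the upstream database container untouched.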
> > There's a secure channel between kube-apiserver and kubelet, so you can > reuse it for secure communications. No need to add anything for secure > communication: trove engine -> kubectl exec x-db -c guest_agent some > command. > > There is k8s federation, so if the operator was started at the federation > level, it can cross multiple OpenStack regions. > > Another big feature that hasn't been mentioned yet, and that I think is > critical: in our performance tests, databases in VMs have never performed > particularly well. Using k8s as a base, bare metal nodes could be pulled in > easily, with dedicated disks or SSDs that the pods land on, very > close to the database. This should give native performance. > > So, my suggestion would be to strongly consider basing Trove v2 on > Kubernetes. It can provide a huge bang for the buck, simplifying the Trove > architecture substantially while gaining the new features you list as > being important. The Trove v2 OpenStack API can be exposed as a very thin > wrapper over k8s Third Party Resources (TPR) and would make Trove entirely > stateless; k8s maintains all state for everything in etcd. > > Please consider this architecture. > > Thanks, > Kevin > > -------------- > From: Amrith Kumar [amrith.ku...@gmail.com] > Sent: Sunday, June 18, 2017 4:35 AM > To: OpenStack Development Mailing List (not for usage questions) > Subject: [openstack-dev] [trove][all][tc] A proposal to rearchitect > Trove > > Trove has evolved rapidly over the past several years, since integration > in IceHouse when it only supported single instances of a few databases. > Today it supports a dozen databases including clusters and replication. > > The user survey [1] indicates that while there is strong interest in the > project, there are few large production deployments that are known of (by > the development team). 
> > Recent changes in the OpenStack community at large (company realignments, > acquisitions, layoffs) and the Trove community in particular, coupled with > a mounting burden of technical debt, have prompted me to make this proposal > to re-architect Trove. > > This email summarizes several of the issues that face the project, both > structurally and architecturally. This email does not claim to include a > detailed specification for what the new Trove would look like, merely the > recommendation that the community should come together and develop one so > that the project can be sustainable and useful to those who wish to use it > in the future. > > TL;DR > > Trove, with support for a dozen or so databases today, finds itself in a > bind because there are few developers, and a code-base with a significant > amount of technical debt.
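The "trove engine -> kubectl exec x-db -c guest_agent some command" channel Kevin proposes could be sketched as simple argv construction. The pod and container names here are hypothetical; the only assumption is the standard `kubectl exec` syntax with its `-c` container flag and `--` separator.

```python
def guest_agent_exec_argv(pod, command):
    # Reuse the existing kube-apiserver <-> kubelet secure channel by
    # shelling out to `kubectl exec` against the guest-agent sidecar,
    # rather than inventing a new secure transport for Trove.
    return ["kubectl", "exec", pod, "-c", "guest_agent", "--"] + list(command)

argv = guest_agent_exec_argv("x-db", ["backup", "--full"])
```

In a real control plane this argv would be handed to a process runner (or replaced by a direct API client); the sketch only shows that no extra secure-communication machinery is needed on the Trove side.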
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
There already are user-side tools for deploying plumbing onto your own cloud, stuff like Tessmaster itself. I think the win is being able to extend that k8s with the ability to declaratively request database clusters and manage them. It's all about the commons. If you build a Tessmaster clone just to do MariaDB, then you share nothing with the other communities and have to reinvent the wheel, yet again. Operators' load increases because the tool doesn't function like other tools. If you rely on a container orchestration engine that's already cross-cloud, that can be easily deployed by user or cloud operator, and fill in the gaps with what Trove wants to support (easy management of DBs), you get to reuse a lot of the commons, and the users' slight increase in investment in dealing with the bit of extra plumbing allows other things to also be easily added to their cluster. It's very rare that a user would need to deploy/manage only a database. The net load on the operator decreases, not increases. Look at Helm apps for some examples. They do complex web applications that have web tiers, database tiers, etc. But they currently suffer from a lack of good support for clustered databases. In the end, the majority of users care about 'helm install my_scalable_app' kinds of things rather than installing all the things by hand. It's a pain. OpenStack itself has this issue. It has lots of API tiers and DB tiers. If Trove were a k8s operator, OpenStack on k8s could use it to deploy the rest of OpenStack. Even more sharing. Thanks, Kevin From: Thierry Carrez [thie...@openstack.org] Sent: Wednesday, June 21, 2017 1:52 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove Zane Bitter wrote: > [...]
> Until then it seems to me that the tradeoff is between decoupling it > from the particular cloud it's running on so that users can optionally > deploy it standalone (essentially Vish's proposed solution for the *aaS > services from many moons ago) vs. decoupling it from OpenStack in > general so that the operator has more flexibility in how to deploy. > > I'd love to be able to cover both - from a user using it standalone to > spin up and manage a DB in containers on a shared PaaS, through to a > user accessing it as a service to provide a DB running on a dedicated VM > or bare metal server, and everything in between. I don't know if such a > thing is feasible. I suspect we're going to have to talk a lot about VMs > and network plumbing and volume storage :) As another data point, we are seeing this very same tradeoff with Magnum vs. Tessmaster (with "I want to get a Kubernetes cluster" rather than "I want to get a database"). Tessmaster is the user-side tool from eBay for deploying Kubernetes on different underlying cloud infrastructures: it takes a bunch of cloud credentials, then deploys, grows and shrinks Kubernetes clusters for you. Magnum is the infrastructure-side tool from OpenStack giving you COE-as-a-service, through a provisioning API. Jay is advocating for Trove to be more like Tessmaster, and less like Magnum. I think I agree with Zane that those are two different approaches: From a public cloud provider perspective serving lots of small users, I think a provisioning API makes sense. The user in that case is in a "black box" approach, so I think the resulting resources should not really be accessible as VMs by the tenant, even if they end up being Nova VMs. The provisioning API could propose several options (K8s or Mesos, MySQL or PostgreSQL). From a private cloud / hybrid cloud / large cloud user perspective, the user-side deployment tool, letting you deploy the software on various types of infrastructure, probably makes more sense.
It's probably more work to run it, but you gain in flexibility. That user-side tool would probably not support multiple options, but be application-specific. So yes, ideally we would cover both. Because they target different users, and both are right... -- Thierry Carrez (ttx) __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
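Kevin's idea of extending k8s "with the ability to declaratively request database clusters" implies a custom-resource shape an operator would watch and reconcile. A purely illustrative sketch follows: every field name is invented for the example, and nothing here is an actual Trove or Kubernetes API.

```python
# Hypothetical custom resource a Trove-style operator might watch.
# All field names are invented for illustration.
cluster_request = {
    "apiVersion": "trove.example.org/v1alpha1",
    "kind": "DatabaseCluster",
    "metadata": {"name": "orders-db"},
    "spec": {
        "engine": "mariadb",
        "version": "10.1",
        "members": 3,
        "storage": {"size": "50Gi", "class": "ssd"},
        "backups": {"schedule": "0 2 * * *", "retain": 7},
    },
}

def validate(req):
    # An operator reconciles declared desired state; this is a minimal
    # sanity check on the request before reconciliation would begin.
    spec = req["spec"]
    return req["kind"] == "DatabaseCluster" and spec["members"] >= 1
```

The user declares *what* they want (a three-member MariaDB cluster with nightly backups); the operator owns *how* that is realized, which is exactly the commons-reuse argument above.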
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 21/06/17 01:49, Mark Kirkwood wrote: On 21/06/17 02:08, Jay Pipes wrote: On 06/20/2017 09:42 AM, Doug Hellmann wrote: Does "service VM" need to be a first-class thing? Akanda creates them, using a service user. The VMs are tied to a "router" which is the billable resource that the user understands and interacts with through the API. Frankly, I believe all of these types of services should be built as applications that run on OpenStack (or other) infrastructure. In other words, they should not be part of the infrastructure itself. There's really no need for a user of a DBaaS to have access to the host or hosts the DB is running on. If the user really wanted that, they would just spin up a VM/baremetal server and install the thing themselves. Yes, I think this area is where some hard thinking would be rewarded. I recall when I first met Trove, in my mind I expected to be 'carving off a piece of database'...and was a bit surprised to discover that it (essentially) leveraged Nova VM + OS + DB (no criticism intended - just saying I was surprised). I think this is a common mistake (I know I've made it with respect to other services) when hearing about a new *aaS thing and making assumptions about the architecture. Here's a helpful way to think about it: A cloud service has to have robust multitenancy. In the case of DBaaS, that gives you two options. You can start with a database that is already multitenant. If that works for your users, great. But many users just want somebody else to manage $MY_FAVOURITE_DATABASE that is not multitenant by design. Your only real option in that case is to give them their own copy and isolate it somehow from everyone else's. This is the use case that RDS and Trove are designed to solve. It's important to note that this hasn't changed and isn't going to change in the foreseeable future. What *has* changed is that there are now more options for "isolate it somehow from everyone else's" - e.g. you can use a container instead of a VM. 
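The two routes to multitenancy described above can be boiled down to a toy decision helper. This is illustrative only, a restatement of the argument in code rather than any real API:

```python
# The two options for robust multitenancy in a DBaaS, per the thread:
# share a natively multitenant database, or give each tenant an
# isolated copy (the RDS/Trove use case). Purely illustrative.
def isolation_strategy(db_is_multitenant_by_design):
    if db_is_multitenant_by_design:
        # The database already isolates tenants internally: share one
        # deployment among many users.
        return "shared-instance"
    # Otherwise each tenant gets their own copy, isolated somehow --
    # historically a VM, today possibly a container.
    return "per-tenant-copy"
```

What has changed since Trove was designed is only the set of isolation mechanisms available for the second branch, not the shape of the decision.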
Of course after delving into how it worked I realized that it did make sense to make use of the various Nova things (schedulers etc). Fun fact: Trove started out as a *complete fork* of Nova(!). *But* now we are thinking about re-architecting (plus more options exist now), so it would make sense to revisit this area.
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On Wed, Jun 21, 2017 at 1:52 AM, Thierry Carrez wrote: > Zane Bitter wrote: >> [...] >> Until then it seems to me that the tradeoff is between decoupling it >> from the particular cloud it's running on so that users can optionally >> deploy it standalone (essentially Vish's proposed solution for the *aaS >> services from many moons ago) vs. decoupling it from OpenStack in >> general so that the operator has more flexibility in how to deploy. >> >> I'd love to be able to cover both - from a user using it standalone to >> spin up and manage a DB in containers on a shared PaaS, through to a >> user accessing it as a service to provide a DB running on a dedicated VM >> or bare metal server, and everything in between. I don't know if such a >> thing is feasible. I suspect we're going to have to talk a lot about VMs >> and network plumbing and volume storage :) > > As another data point, we are seeing this very same tradeoff with Magnum > vs. Tessmaster (with "I want to get a Kubernetes cluster" rather than "I > want to get a database"). > > Tessmaster is the user-side tool from eBay for deploying Kubernetes on > different underlying cloud infrastructures: it takes a bunch of cloud > credentials, then deploys, grows and shrinks Kubernetes clusters for you. > > Magnum is the infrastructure-side tool from OpenStack giving you > COE-as-a-service, through a provisioning API. > > Jay is advocating for Trove to be more like Tessmaster, and less like > Magnum. I think I agree with Zane that those are two different approaches: > > From a public cloud provider perspective serving lots of small users, I > think a provisioning API makes sense. The user in that case is in a > "black box" approach, so I think the resulting resources should not > really be accessible as VMs by the tenant, even if they end up being > Nova VMs. The provisioning API could propose several options (K8s or > Mesos, MySQL or PostgreSQL). I like this!
^^ If we can pull off "different underlying cloud infrastructures" like TessMaster, that would be of more value to folks who may not be using OpenStack (or VMs!) > > From a private cloud / hybrid cloud / large cloud user perspective, the > user-side deployment tool, letting you deploy the software on various > types of infrastructure, probably makes more sense. It's probably more > work to run it, but you gain in flexibility. That user-side tool would > probably not support multiple options, but be application-specific. > > So yes, ideally we would cover both. Because they target different > users, and both are right... > > -- > Thierry Carrez (ttx) -- Davanum Srinivas :: https://twitter.com/dims
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 20/06/17 11:45, Jay Pipes wrote: Good discussion, Zane. Comments inline. ++ On 06/20/2017 11:01 AM, Zane Bitter wrote: On 20/06/17 10:08, Jay Pipes wrote: On 06/20/2017 09:42 AM, Doug Hellmann wrote: Does "service VM" need to be a first-class thing? Akanda creates them, using a service user. The VMs are tied to a "router" which is the billable resource that the user understands and interacts with through the API. Frankly, I believe all of these types of services should be built as applications that run on OpenStack (or other) infrastructure. In other words, they should not be part of the infrastructure itself. There's really no need for a user of a DBaaS to have access to the host or hosts the DB is running on. If the user really wanted that, they would just spin up a VM/baremetal server and install the thing themselves. Hey Jay, I'd be interested in exploring this idea with you, because I think everyone agrees that this would be a good goal, but at least in my mind it's not obvious what the technical solution should be. (Actually, I've read your email a bunch of times now, and I go back and forth on which one you're actually advocating for.) The two options, as I see it, are as follows: 1) The database VMs are created in the user's tena^W project. They connect directly to the tenant's networks, are governed by the user's quota, and are billed to the project as Nova VMs (on top of whatever additional billing might come along with the management services). A [future] feature in Nova (https://review.openstack.org/#/c/438134/) allows the Trove service to lock down access so that the user cannot actually interact with the server using Nova, but must go through the Trove API. On a cloud that doesn't include Trove, a user could run Trove as an application themselves and all it would have to do differently is not pass the service token to lock down the VM. alternatively: 2) The database VMs are created in a project belonging to the operator of the service. 
They're connected to the user's network through , and isolated from other users' databases running in the same project through . Trove has its own quota management and billing. The user cannot interact with the server using Nova since it is owned by a different project. On a cloud that doesn't include Trove, a user could run Trove as an application themselves, by giving it credentials for their own project and disabling all of the cross-tenant networking stuff. None of the above :) Don't think about VMs at all. Or networking plumbing. Or volume storage or any of that. OK, but somebody has to ;) Think only in terms of what a user of a DBaaS really wants. At the end of the day, all they want is an address in the cloud where they can point their application to write and read data from. Do they want that data connection to be fast and reliable? Of course, but how that happens is irrelevant to them. Do they want that data to be safe and backed up? Of course, but how that happens is irrelevant to them. Fair enough. The world has changed a lot since RDS (which was the model for Trove) was designed; it's certainly worth reviewing the base assumptions before embarking on a new design. The problem with many of these high-level *aaS projects is that they consider their user to be a typical tenant of general cloud infrastructure -- focused on launching VMs and creating volumes and networks etc. And the discussions around the implementation of these projects always come back to minutia about how to set up secure communication channels between a control plane message bus and the service VMs. Incidentally, the reason that discussions always come back to that is because OpenStack isn't very good at it, which is a huge problem not only for the *aaS projects but for user applications in general running on OpenStack.
If we had fine-grained authorisation and ubiquitous multi-tenant asynchronous messaging in OpenStack, then I firmly believe that we, and application developers, would be in much better shape. If you create these projects as applications that run on cloud infrastructure (OpenStack, k8s or otherwise), I'm convinced there's an interesting idea here, but the terminology you're using doesn't really capture it. When you say 'as applications that run on cloud infrastructure', it sounds like you mean they should run in a Nova VM, or in a Kubernetes cluster somewhere, rather than on the OpenStack control plane. I don't think that's what you mean though, because you can (and IIUC Rackspace does) deploy OpenStack services that way already, and it has no real effect on the architecture of those services. then the discussions focus instead on how the real end-users -- the ones that actually call the APIs and utilize the service -- would interact with the APIs and not the underlying infrastructure itself. Here's an example to think about... What if a provider of this DBaaS service wanted to jam 100 database instances on a single VM and provide connectivity to those database instances to 100 different tenants?
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Excerpts from Jay Pipes's message of 2017-06-20 10:08:54 -0400: > On 06/20/2017 09:42 AM, Doug Hellmann wrote: > > Does "service VM" need to be a first-class thing? Akanda creates > > them, using a service user. The VMs are tied to a "router" which > > is the billable resource that the user understands and interacts with > > through the API. > > Frankly, I believe all of these types of services should be built as > applications that run on OpenStack (or other) infrastructure. In other > words, they should not be part of the infrastructure itself. > > There's really no need for a user of a DBaaS to have access to the host > or hosts the DB is running on. If the user really wanted that, they > would just spin up a VM/baremetal server and install the thing themselves. > There's one reason, and that is specialized resources that we don't trust to be multi-tenant. Baremetal done multi-tenant is hard, just ask our friends who were/are running OnMetal. But baremetal done for the purposes of running MySQL clusters that only allow users to access MySQL and control everything via an agent of sorts is a lot simpler. You can let them all share a layer 2 with no MAC filtering for instance, since you are in control at the OS level.
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 06/20/2017 11:45 AM, Jay Pipes wrote: Good discussion, Zane. Comments inline. On 06/20/2017 11:01 AM, Zane Bitter wrote: 2) The database VMs are created in a project belonging to the operator of the service. They're connected to the user's network through , and isolated from other users' databases running in the same project through . Trove has its own quota management and billing. The user cannot interact with the server using Nova since it is owned by a different project. On a cloud that doesn't include Trove, a user could run Trove as an application themselves, by giving it credentials for their own project and disabling all of the cross-tenant networking stuff. None of the above :) Don't think about VMs at all. Or networking plumbing. Or volume storage or any of that. Think only in terms of what a user of a DBaaS really wants. At the end of the day, all they want is an address in the cloud where they can point their application to write and read data from. Do they want that data connection to be fast and reliable? Of course, but how that happens is irrelevant to them. Do they want that data to be safe and backed up? Of course, but how that happens is irrelevant to them. Hi, I'm just a newb trying to follow along... isn't that what #2 is proposing? It's just talking about the implementation a bit. (Guess this comes down to the terms "user" and "operator" - e.g. "operator" has the VMs w/ the DBs, "user" gets a login to a DB. "user" is the person who pushes the trove button to "give me a database") The problem with many of these high-level *aaS projects is that they consider their user to be a typical tenant of general cloud infrastructure -- focused on launching VMs and creating volumes and networks etc. And the discussions around the implementation of these projects always come back to minutia about how to set up secure communication channels between a control plane message bus and the service VMs.
If you create these projects as applications that run on cloud infrastructure (OpenStack, k8s or otherwise), then the discussions focus instead on how the real end-users -- the ones that actually call the APIs and utilize the service -- would interact with the APIs and not the underlying infrastructure itself. Here's an example to think about... What if a provider of this DBaaS service wanted to jam 100 database instances on a single VM and provide connectivity to those database instances to 100 different tenants? Would those tenants know if those databases were all serviced from a single database server process running on the VM? Or 100 containers, each running a separate database server process? Or 10 containers running 10 database server processes each? No, of course not. And the tenant wouldn't care at all, because the point of the DBaaS service is to get a database. It isn't to get one or more VMs/containers/baremetal servers. At the end of the day, I think Trove is best implemented as a hosted application that exposes an API to its users that is entirely separate from the underlying infrastructure APIs like Cinder/Nova/Neutron. This is similar to Kevin's k8s Operator idea, which I support but in a generic fashion that isn't specific to k8s. In the same way that k8s abstracts the underlying infrastructure (via its "cloud provider" concept), I think that Trove and similar projects need to use a similar abstraction and focus on providing a different API to their users that doesn't leak the underlying infrastructure API concepts out. Best, -jay Of course the current situation, as Amrith alluded to, where the default is option (1) except without the lock-down feature in Nova, though some operators are deploying option (2) but it's not tested upstream... clearly that's the worst of all possible worlds, and AIUI nobody disagrees with that.
To my mind, (1) sounds more like "applications that run on OpenStack (or other) infrastructure", since it doesn't require stuff like the admin-only cross-project networking that makes it effectively "part of the infrastructure itself" - as evidenced by the fact that unprivileged users can run it standalone with little more than a simple auth middleware change. But I suspect you are going to use similar logic to argue for (2)? I'd be interested to hear your thoughts. cheers, Zane.
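Jay's point that the tenant should see only an address, never the placement, can be sketched as follows. All names here are hypothetical; this is not a real Trove API, just a restatement of the abstraction being argued for:

```python
# Sketch: a DBaaS API returns only a connection endpoint. Whether the
# backing store is a shared VM, a container, or bare metal is an
# operator-side placement detail the tenant never sees.
class DBaaS:
    def __init__(self):
        self._placements = {}  # tenant -> (backend kind, endpoint)

    def create_database(self, tenant, placement="container"):
        # Placement is recorded internally but deliberately omitted
        # from the response the tenant receives.
        endpoint = f"db-{tenant}.dbaas.example.org:3306"
        self._placements[tenant] = (placement, endpoint)
        return {"endpoint": endpoint}

svc = DBaaS()
resp = svc.create_database("tenant-42", placement="shared-vm")
```

The operator could later repack tenants (100 DBs per VM, one per container, anything in between) without the API response, or the tenant's application, changing at all.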
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Good discussion, Zane. Comments inline. On 06/20/2017 11:01 AM, Zane Bitter wrote: On 20/06/17 10:08, Jay Pipes wrote: On 06/20/2017 09:42 AM, Doug Hellmann wrote: Does "service VM" need to be a first-class thing? Akanda creates them, using a service user. The VMs are tied to a "router" which is the billable resource that the user understands and interacts with through the API. Frankly, I believe all of these types of services should be built as applications that run on OpenStack (or other) infrastructure. In other words, they should not be part of the infrastructure itself. There's really no need for a user of a DBaaS to have access to the host or hosts the DB is running on. If the user really wanted that, they would just spin up a VM/baremetal server and install the thing themselves. Hey Jay, I'd be interested in exploring this idea with you, because I think everyone agrees that this would be a good goal, but at least in my mind it's not obvious what the technical solution should be. (Actually, I've read your email a bunch of times now, and I go back and forth on which one you're actually advocating for.) The two options, as I see it, are as follows: 1) The database VMs are created in the user's tena^W project. They connect directly to the tenant's networks, are governed by the user's quota, and are billed to the project as Nova VMs (on top of whatever additional billing might come along with the management services). A [future] feature in Nova (https://review.openstack.org/#/c/438134/) allows the Trove service to lock down access so that the user cannot actually interact with the server using Nova, but must go through the Trove API. On a cloud that doesn't include Trove, a user could run Trove as an application themselves and all it would have to do differently is not pass the service token to lock down the VM. alternatively: 2) The database VMs are created in a project belonging to the operator of the service. 
They're connected to the user's network through , and isolated from other users' databases running in the same project through . Trove has its own quota management and billing. The user cannot interact with the server using Nova since it is owned by a different project. On a cloud that doesn't include Trove, a user could run Trove as an application themselves, by giving it credentials for their own project and disabling all of the cross-tenant networking stuff. None of the above :) Don't think about VMs at all. Or networking plumbing. Or volume storage or any of that. Think only in terms of what a user of a DBaaS really wants. At the end of the day, all they want is an address in the cloud where they can point their application to write and read data from. Do they want that data connection to be fast and reliable? Of course, but how that happens is irrelevant to them. Do they want that data to be safe and backed up? Of course, but how that happens is irrelevant to them. The problem with many of these high-level *aaS projects is that they consider their user to be a typical tenant of general cloud infrastructure -- focused on launching VMs and creating volumes and networks etc. And the discussions around the implementation of these projects always come back to minutia about how to set up secure communication channels between a control plane message bus and the service VMs. If you create these projects as applications that run on cloud infrastructure (OpenStack, k8s or otherwise), then the discussions focus instead on how the real end-users -- the ones that actually call the APIs and utilize the service -- would interact with the APIs and not the underlying infrastructure itself. Here's an example to think about... What if a provider of this DBaaS service wanted to jam 100 database instances on a single VM and provide connectivity to those database instances to 100 different tenants?
Would those tenants know if those databases were all serviced from a single database server process running on the VM? Or 100 containers, each running a separate database server process? Or 10 containers running 10 database server processes each? No, of course not. And the tenant wouldn't care at all, because the point of the DBaaS service is to get a database. It isn't to get one or more VMs/containers/baremetal servers. At the end of the day, I think Trove is best implemented as a hosted application that exposes an API to its users that is entirely separate from the underlying infrastructure APIs like Cinder/Nova/Neutron. This is similar to Kevin's k8s Operator idea, which I support but in a generic fashion that isn't specific to k8s. In the same way that k8s abstracts the underlying infrastructure (via its "cloud provider" concept), I think that Trove and similar projects need to use a similar abstraction and focus on providing a different API to their users that doesn't leak the underlying infrastructure API concepts out. Best, -jay
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 20/06/17 10:08, Jay Pipes wrote: On 06/20/2017 09:42 AM, Doug Hellmann wrote: Does "service VM" need to be a first-class thing? Akanda creates them, using a service user. The VMs are tied to a "router" which is the billable resource that the user understands and interacts with through the API. Frankly, I believe all of these types of services should be built as applications that run on OpenStack (or other) infrastructure. In other words, they should not be part of the infrastructure itself. There's really no need for a user of a DBaaS to have access to the host or hosts the DB is running on. If the user really wanted that, they would just spin up a VM/baremetal server and install the thing themselves. Hey Jay, I'd be interested in exploring this idea with you, because I think everyone agrees that this would be a good goal, but at least in my mind it's not obvious what the technical solution should be. (Actually, I've read your email a bunch of times now, and I go back and forth on which one you're actually advocating for.) The two options, as I see it, are as follows: 1) The database VMs are created in the user's tena^W project. They connect directly to the tenant's networks, are governed by the user's quota, and are billed to the project as Nova VMs (on top of whatever additional billing might come along with the management services). A [future] feature in Nova (https://review.openstack.org/#/c/438134/) allows the Trove service to lock down access so that the user cannot actually interact with the server using Nova, but must go through the Trove API. On a cloud that doesn't include Trove, a user could run Trove as an application themselves and all it would have to do differently is not pass the service token to lock down the VM. alternatively: 2) The database VMs are created in a project belonging to the operator of the service. They're connected to the user's network through , and isolated from other users' databases running in the same project through . 
Trove has its own quota management and billing. The user cannot interact with the server using Nova since it is owned by a different project. On a cloud that doesn't include Trove, a user could run Trove as an application themselves, by giving it credentials for their own project and disabling all of the cross-tenant networking stuff. Of course the current situation, as Amrith alluded to, where the default is option (1) except without the lock-down feature in Nova, though some operators are deploying option (2) but it's not tested upstream... clearly that's the worst of all possible worlds, and AIUI nobody disagrees with that. To my mind, (1) sounds more like "applications that run on OpenStack (or other) infrastructure", since it doesn't require stuff like the admin-only cross-project networking that makes it effectively "part of the infrastructure itself" - as evidenced by the fact that unprivileged users can run it standalone with little more than a simple auth middleware change. But I suspect you are going to use similar logic to argue for (2)? I'd be interested to hear your thoughts. cheers, Zane. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
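The ownership difference between Zane's two options can be sketched with a toy model. This is pure illustration — none of these names are real Trove or Nova APIs, and the `locked` flag merely stands in for the proposed Nova service-token lock-down feature (https://review.openstack.org/#/c/438134/):

```python
# Toy model (hypothetical names, not real Trove/Nova code) contrasting the
# two ownership options for database VMs.
from dataclasses import dataclass, field

@dataclass
class Project:
    name: str
    vms: list = field(default_factory=list)

def boot_db_vm(owner: Project, locked: bool = False) -> dict:
    """Boot a database VM into `owner`. `locked` models the proposed
    service-token lock-down that blocks direct Nova access."""
    vm = {"owner": owner.name, "locked": locked}
    owner.vms.append(vm)
    return vm

def user_can_touch_via_nova(vm: dict, user_project: Project) -> bool:
    # Option 1: the VM lives in the user's own project, so only the
    # lock-down feature prevents direct Nova access.
    # Option 2: the VM lives in the operator's project, so ordinary
    # project scoping already hides it from the user.
    return vm["owner"] == user_project.name and not vm["locked"]

tenant = Project("user-project")
operator = Project("trove-service-project")

option1 = boot_db_vm(tenant, locked=True)  # counted against tenant quota
option2 = boot_db_vm(operator)             # Trove does its own quota/billing

print(user_can_touch_via_nova(option1, tenant))  # False: lock-down blocks it
print(user_can_touch_via_nova(option2, tenant))  # False: different project
```

Either way the user cannot reach the VM through Nova; the options differ in whose quota it consumes and which new mechanisms (lock-down vs. cross-project networking) are needed.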
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 18/06/17 07:35, Amrith Kumar wrote: Trove has evolved rapidly over the past several years, since integration in IceHouse when it only supported single instances of a few databases. Today it supports a dozen databases including clusters and replication. The user survey [1] indicates that while there is strong interest in the project, there are few large production deployments that are known of (by the development team). Recent changes in the OpenStack community at large (company realignments, acquisitions, layoffs) and the Trove community in particular, coupled with a mounting burden of technical debt have prompted me to make this proposal to re-architect Trove. This email summarizes several of the issues that face the project, both structurally and architecturally. This email does not claim to include a detailed specification for what the new Trove would look like, merely the recommendation that the community should come together and develop one so that the project can be sustainable and useful to those who wish to use it in the future. TL;DR Trove, with support for a dozen or so databases today, finds itself in a bind because there are few developers, and a code-base with a significant amount of technical debt. Some architectural choices which the team made over the years have consequences which make the project less than ideal for deployers. Given that there are no major production deployments of Trove at present, this provides us an opportunity to reset the project, learn from our v1 and come up with a strong v2. An important aspect of making this proposal work is that we seek to eliminate the effort (planning, and coding) involved in migrating existing Trove v1 deployments to the proposed Trove v2. Effectively, with work beginning on Trove v2 as proposed here, Trove v1 as released with Pike will be marked as deprecated and users will have to migrate to Trove v2 when it becomes available. 
I'm personally fine with not having a migration path (because I'm not personally running Trove v1 ;) although Thierry's point about choosing a different name is valid and surely something the TC will want to weigh in on. However, I am always concerned about throwing out working code and rewriting from scratch. I'd be more comfortable if I saw some value being salvaged from the existing Trove project, other than as just an extended PoC/learning exercise. Would the API be similar to the current Trove one? Can at least some tests be salvaged to rapidly increase confidence that the new code works as expected? While I would very much like to continue to support the users on Trove v1 through this transition, the simple fact is that absent community participation this will be impossible. Furthermore, given that there are no production deployments of Trove at this time, it seems pointless to build that upgrade path from Trove v1 to Trove v2; it would be the proverbial bridge from nowhere. This (previous) statement is, I realize, contentious. There are those who have told me that an upgrade path must be provided, and there are those who have told me of unnamed deployments of Trove that would suffer. To this, all I can say is that if an upgrade path is of value to you, then please commit the development resources to participate in the community to make that possible. But equally, preventing a v2 of Trove or delaying it will only make the v1 that we have today less valuable. We have learned a lot from v1, and the hope is that we can address that in v2. 
Some of the more significant things that I have learned are: - We should adopt a versioned front-end API from the very beginning; making the REST API versioned is not a ‘v2 feature’ - A guest agent running on a tenant instance, with connectivity to a shared management message bus is a security loophole; encrypting traffic, per-tenant-passwords, and any other scheme is merely lipstick on a security hole Totally agree here, any component of the architecture that is accessed directly by multiple tenants needs to be natively multi-tenant. I believe this has been one of the barriers to adoption. - Reliance on Nova for compute resources is fine, but dependence on Nova VM specific capabilities (like instance rebuild) is not; it makes things like containers or bare-metal second class citizens - A fair portion of what Trove does is resource orchestration; don’t reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far along when Trove got started but that’s not the case today and we have an opportunity to fix that now +1, obviously ;) Although I also think Kevin's suggestion is worthy of serious consideration. - A similarly significant portion of what Trove does is to implement a state-machine that will perform specific workflows involved in implementing database specific operations. This makes the Trove taskmanager a stateful entity. Some of the operations could take a fair amount of time. This is a serious architectural flaw
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 06/20/2017 09:42 AM, Doug Hellmann wrote: Does "service VM" need to be a first-class thing? Akanda creates them, using a service user. The VMs are tied to a "router" which is the billable resource that the user understands and interacts with through the API. Frankly, I believe all of these types of services should be built as applications that run on OpenStack (or other) infrastructure. In other words, they should not be part of the infrastructure itself. There's really no need for a user of a DBaaS to have access to the host or hosts the DB is running on. If the user really wanted that, they would just spin up a VM/baremetal server and install the thing themselves. Best, -jay
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Excerpts from Curtis's message of 2017-06-19 18:56:25 -0600: > On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar wrote: > > [snip] > > As an operator I wouldn't run Trove as it is, unless I absolutely had to. > > I think it is a good idea to reboot the project. I really think the > concept of "service VMs" should be a thing. I'm not sure where the > OpenStack community has landed on that, my fault for not paying close > attention, but we should be able to create VMs for a tenant that are > not managed by the tenant but that could be billed to them in some > fashion.
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 19/06/17 20:56, Curtis wrote: I really think the concept of "service VMs" should be a thing. I'm not sure where the OpenStack community has landed on that, my fault for not paying close attention, but we should be able to create VMs for a tenant that are not managed by the tenant but that could be billed to them in some fashion. At least that's my opinion. https://review.openstack.org/#/c/438134/
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On 20/06/17 12:56, Curtis wrote: > On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar wrote: >> [snip] > As an operator I wouldn't run Trove as it is, unless I absolutely had to. > > I think it is a good idea to reboot the project. I really think the > concept of "service VMs" should be a thing. I'm not sure where the > OpenStack community has landed on that, my fault for not paying close > attention, but we should be able to create VMs for a tenant that are > not managed by the tenant but that could be billed to them in some > fashion.
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar wrote: > [snip] As an operator I wouldn't run Trove as it is, unless I absolutely had to. I think it is a good idea to reboot the project. I really think the concept of "service VMs" should be a thing. I'm not sure where the OpenStack community has landed on that, my fault for not paying close attention, but we should be able to create VMs for a tenant that are not managed by the tenant but that could be billed to them in some fashion. At least that's my opinion. > - The CI should test all databases that are considered to be ‘supported’ > without excessive use of resources
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Amrith, Some good thoughts in your email. I've replied to a few specific pieces below. Overall I think it's a good start to a plan. On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar wrote: > [snip] > > - A guest agent running on a tenant instance, with connectivity to a > shared management message bus is a security loophole; encrypting traffic, > per-tenant-passwords, and any other scheme is merely lipstick on a security > hole > This was a major concern when we deployed it and drove the architectural decisions. I'd be glad to see it resolved or re-architected. 
> > - Reliance on Nova for compute resources is fine, but dependence on Nova > VM specific capabilities (like instance rebuild) is not; it makes things > like containers or bare-metal second class citizens > > - A fair portion of what Trove does is resource orchestration; don’t > reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far > along when Trove got started but that’s not the case today and we have an > opportunity to fix that now > +1 > > - A similarly significant portion of what Trove does is to implement a > state-machine that will perform specific workflows involved in implementing > database specific operations. This makes the Trove taskmanager a stateful > entity. Some of the operations could take a fair amount of time. This is a > serious architectural flaw > > - Tenants should not ever be able to directly interact with the underlying > storage and compute used by database instances; that should be the default > configuration, not an untested deployment alternative > +1 to this also. Trove should offer a black box DB as a Service, not something the user sees as an instance+storage that they feel that they can manipulate. > > - The CI should test all databases that are considered to be ‘supported’ > without excessive use of resources
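The "stateful taskmanager" criticism quoted above is, at its core, about long-running workflows whose progress lives only in one process's memory. A common remedy is to checkpoint each step to a durable store so any worker can resume a workflow after a crash. A minimal sketch of that idea (a toy, not Trove code — the step names and file-based store are invented for illustration):

```python
# Toy sketch of a resumable workflow: each completed step is checkpointed
# to durable storage, so a replacement worker resumes instead of restarting.
import json
import os
import tempfile

STEPS = ["create_volume", "boot_instance", "configure_db", "mark_active"]

def run_workflow(state_file, fail_after=None):
    # Load the checkpoint (empty on the first run).
    done = []
    if os.path.exists(state_file):
        with open(state_file) as f:
            done = json.load(f)
    for step in STEPS:
        if step in done:
            continue  # already completed by a previous (possibly dead) worker
        if fail_after is not None and len(done) >= fail_after:
            return done  # simulate the worker crashing mid-workflow
        done.append(step)
        with open(state_file, "w") as f:
            json.dump(done, f)  # checkpoint before moving on
    return done

path = os.path.join(tempfile.mkdtemp(), "state.json")
first = run_workflow(path, fail_after=2)  # "crashes" after two steps
second = run_workflow(path)               # a fresh worker picks up where it left off
print(first, second)
```

With state externalized like this, the task manager process itself becomes disposable, which is exactly the property a restartable control plane (or a Heat/Kubernetes-based one) needs.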
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Thanks for starting this difficult discussion. I think I agree with all the lessons learned except the Nova one. While you can treat containers and VMs the same, after years of using both, I really don't think it's a good idea to treat them equally. Containers can't work properly if used as a VM. (really, really.) I agree wholeheartedly with your statement that it's mostly an orchestration problem and should reuse existing tooling now that there are options. I would propose the following, which I think meets your goals and could widen your contributor base substantially: look at the Kubernetes (k8s) concept of an Operator -> https://coreos.com/blog/introducing-operators.html Operators allow application-specific logic to be added to Kubernetes while reusing the rest of k8s to do what it's good at: container orchestration. etcd is just a clustered database, and if the operator concept works for it, it should also work for other databases such as Galera. The very place where containers and VMs are incompatible is, I think, what will make Trove's life easier. You can think of a member of the database as a few different components, such as: * main database process * metrics gatherer (such as https://github.com/prometheus/mysqld_exporter) * trove_guest_agent With the current approach, all are mixed into the same VM image, making it very difficult to update the trove_guest_agent without touching the main database process (needed when you upgrade the Trove controllers). With the k8s sidecar concept, each would be a separate container loaded into the same pod. So rather than needing to maintain a Trove image for every possible combination of db version, Trove version, etc., you can reuse upstream database containers along with Trove-provided guest agents. There's a secure channel between kube-apiserver and kubelet, so it can be reused for secure communications; no need to add anything new. trove engine -> kubectl exec x-db -c guest_agent some command. 
There is k8s federation, so if the operator was started at the federation level, it can cross multiple OpenStack regions. Another big feature I that hasn't been mentioned yet that I think is critical. In our performance tests, databases in VM's have never performed particularly well. Using k8s as a base, bare metal nodes could be pulled in easily, with dedicated disk or ssd's that the pods land on that are very very close to the database. This should give native performance. So, my suggestion would be to strongly consider basing Trove v2 on Kubernetes. It can provide a huge bang for the buck, simplifying the Trove architecture substantially while gaining the new features your list as being important. The Trove v2 OpenStack api can be exposed as a very thin wrapper over k8s Third Party Resources (TPR) and would make Trove entirely stateless. k8s maintains all state for everything in etcd. Please consider this architecture. Thanks, Kevin From: Amrith Kumar [amrith.ku...@gmail.com] Sent: Sunday, June 18, 2017 4:35 AM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove Trove has evolved rapidly over the past several years, since integration in IceHouse when it only supported single instances of a few databases. Today it supports a dozen databases including clusters and replication. The user survey [1] indicates that while there is strong interest in the project, there are few large production deployments that are known of (by the development team). Recent changes in the OpenStack community at large (company realignments, acquisitions, layoffs) and the Trove community in particular, coupled with a mounting burden of technical debt have prompted me to make this proposal to re-architect Trove. This email summarizes several of the issues that face the project, both structurally and architecturally. 
This email does not claim to include a detailed specification for what the new Trove would look like, merely the recommendation that the community come together and develop one so that the project can be sustainable and useful to those who wish to use it in the future.

TL;DR

Trove, with support for a dozen or so databases today, finds itself in a bind because there are few developers and a code base with a significant amount of technical debt. Some architectural choices the team made over the years have consequences that make the project less than ideal for deployers. Given that there are no major production deployments of Trove at present, this gives us an opportunity to reset the project, learn from our v1, and come up with a strong v2.

An important aspect of making this proposal work is that we seek to eliminate the effort (planning and coding) involved in migrating existing Trove v1 deployments to the proposed Trove v2. [...]
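Kevin's "thin wrapper over Third Party Resources" suggestion above could look something like the following registration. This is a hypothetical sketch: the resource name and domain are invented, and TPRs were the Kubernetes extension mechanism of that era (later superseded by CustomResourceDefinitions).

```yaml
# Hypothetical sketch: registering a Trove database type as a k8s
# ThirdPartyResource, so each database instance becomes an ordinary
# k8s object whose state lives entirely in etcd.
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  name: mysql-database.trove.example.com   # hypothetical name and domain
description: "A Trove-managed MySQL database instance"
versions:
  - name: v1
```

The Trove v2 API would then translate REST calls into creates, updates, and deletes of these objects, keeping no state of its own; the Operator watching them would do the actual provisioning.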
Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Amrith Kumar wrote:
> [...]
> An important aspect of making this proposal work is that we seek to
> eliminate the effort (planning, and coding) involved in migrating
> existing Trove v1 deployments to the proposed Trove v2. Effectively,
> with work beginning on Trove v2 as proposed here, Trove v1 as released
> with Pike will be marked as deprecated and users will have to migrate to
> Trove v2 when it becomes available.
>
> While I would very much like to continue to support the users on Trove
> v1 through this transition, the simple fact is that absent community
> participation this will be impossible. Furthermore, given that there are
> no production deployments of Trove at this time, it seems pointless to
> build that upgrade path from Trove v1 to Trove v2; it would be the
> proverbial bridge from nowhere.
> [...]

From an OpenStack project naming perspective, IMHO the line between a "v2" and a completely new project (with a new name) is whether you provide an upgrade path. I feel like if you won't support v1 users at all (and I understand the reasons why you wouldn't), the new project should not be called "Trove v2" but "Hoard". I don't really want to set a precedent of breaking users by restarting from scratch and calling it "v2", while everywhere else we encourage projects to never break their users.

In all cases, providing offline tooling to migrate your Trove resources to Hoard equivalents would be a nice plus, but I'd say that this tooling is likely to appear if there is a need. Just be receptive to the idea of adding that in a tools/ directory :)

--
Thierry Carrez (ttx)

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Trove has evolved rapidly over the past several years, since integration in IceHouse when it only supported single instances of a few databases. Today it supports a dozen databases, including clusters and replication. The user survey [1] indicates that while there is strong interest in the project, there are few large production deployments that are known of (by the development team). Recent changes in the OpenStack community at large (company realignments, acquisitions, layoffs) and in the Trove community in particular, coupled with a mounting burden of technical debt, have prompted me to make this proposal to re-architect Trove.

This email summarizes several of the issues that face the project, both structurally and architecturally. It does not claim to include a detailed specification for what the new Trove would look like, merely the recommendation that the community come together and develop one so that the project can be sustainable and useful to those who wish to use it in the future.

TL;DR

Trove, with support for a dozen or so databases today, finds itself in a bind because there are few developers and a code base with a significant amount of technical debt. Some architectural choices the team made over the years have consequences that make the project less than ideal for deployers. Given that there are no major production deployments of Trove at present, this gives us an opportunity to reset the project, learn from our v1, and come up with a strong v2.

An important aspect of making this proposal work is that we seek to eliminate the effort (planning and coding) involved in migrating existing Trove v1 deployments to the proposed Trove v2. Effectively, with work beginning on Trove v2 as proposed here, Trove v1 as released with Pike will be marked as deprecated, and users will have to migrate to Trove v2 when it becomes available.
While I would very much like to continue to support the users on Trove v1 through this transition, the simple fact is that absent community participation this will be impossible. Furthermore, given that there are no production deployments of Trove at this time, it seems pointless to build that upgrade path from Trove v1 to Trove v2; it would be the proverbial bridge from nowhere.

This (previous) statement is, I realize, contentious. There are those who have told me that an upgrade path must be provided, and there are those who have told me of unnamed deployments of Trove that would suffer. To this, all I can say is that if an upgrade path is of value to you, then please commit the development resources to participate in the community to make that possible. But equally, preventing a v2 of Trove or delaying it will only make the v1 that we have today less valuable.

We have learned a lot from v1, and the hope is that we can address that in v2. Some of the more significant things that I have learned are:

- We should adopt a versioned front-end API from the very beginning; making the REST API versioned is not a ‘v2 feature’
- A guest agent running on a tenant instance, with connectivity to a shared management message bus, is a security loophole; encrypting traffic, per-tenant passwords, and any other scheme is merely lipstick on a security hole
- Reliance on Nova for compute resources is fine, but dependence on Nova VM-specific capabilities (like instance rebuild) is not; it makes things like containers or bare metal second-class citizens
- A fair portion of what Trove does is resource orchestration; don’t reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far along when Trove got started, but that’s not the case today and we have an opportunity to fix that now
- A similarly significant portion of what Trove does is to implement a state machine that performs the specific workflows involved in database-specific operations. This makes the Trove taskmanager a stateful entity, and some of the operations can take a fair amount of time. This is a serious architectural flaw
- Tenants should never be able to directly interact with the underlying storage and compute used by database instances; that should be the default configuration, not an untested deployment alternative
- The CI should test all databases that are considered ‘supported’ without excessive use of resources in the gate; better code modularization will help determine which tests can safely be skipped when testing changes
- Clusters should be first-class citizens, not an afterthought; single-instance databases may be the ‘special case’, not the other way around
- The project must provide guest images (or at least complete tooling for deployers to build them); while the project can’t distribute operating systems and database software, the current deployment model merely impedes adoption
- Clusters spanning OpenStack deployments are a real thing that must be supported

This might sound harsh; that isn’t the intent. Each of these is the consequence