> -----Original Message-----
> From: Zane Bitter [mailto:[email protected]]
> Sent: June-20-17 4:57 PM
> To: [email protected]
> Subject: Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
>
> On 20/06/17 11:45, Jay Pipes wrote:
> > Good discussion, Zane. Comments inline.
>
> ++
>
> > On 06/20/2017 11:01 AM, Zane Bitter wrote:
> >> On 20/06/17 10:08, Jay Pipes wrote:
> >>> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
> >>>> Does "service VM" need to be a first-class thing? Akanda creates
> >>>> them, using a service user. The VMs are tied to a "router" which is
> >>>> the billable resource that the user understands and interacts with
> >>>> through the API.
> >>>
> >>> Frankly, I believe all of these types of services should be built as
> >>> applications that run on OpenStack (or other) infrastructure. In
> >>> other words, they should not be part of the infrastructure itself.
> >>>
> >>> There's really no need for a user of a DBaaS to have access to the
> >>> host or hosts the DB is running on. If the user really wanted that,
> >>> they would just spin up a VM/baremetal server and install the thing
> >>> themselves.
> >>
> >> Hey Jay,
> >> I'd be interested in exploring this idea with you, because I think
> >> everyone agrees that this would be a good goal, but at least in my
> >> mind it's not obvious what the technical solution should be.
> >> (Actually, I've read your email a bunch of times now, and I go back
> >> and forth on which one you're actually advocating for.) The two
> >> options, as I see it, are as follows:
> >>
> >> 1) The database VMs are created in the user's tena^W project. They
> >> connect directly to the tenant's networks, are governed by the user's
> >> quota, and are billed to the project as Nova VMs (on top of whatever
> >> additional billing might come along with the management services).
> >> A [future] feature in Nova (https://review.openstack.org/#/c/438134/)
> >> allows the Trove service to lock down access so that the user cannot
> >> actually interact with the server using Nova, but must go through the
> >> Trove API. On a cloud that doesn't include Trove, a user could run
> >> Trove as an application themselves and all it would have to do
> >> differently is not pass the service token to lock down the VM.
> >>
> >> alternatively:
> >>
> >> 2) The database VMs are created in a project belonging to the
> >> operator of the service. They're connected to the user's network
> >> through <magic>, and isolated from other users' databases running in
> >> the same project through <security groups? hierarchical projects? magic?>.
> >> Trove has its own quota management and billing. The user cannot
> >> interact with the server using Nova since it is owned by a different
> >> project. On a cloud that doesn't include Trove, a user could run
> >> Trove as an application themselves, by giving it credentials for
> >> their own project and disabling all of the cross-tenant networking stuff.
> >
> > None of the above :)
> >
> > Don't think about VMs at all. Or networking plumbing. Or volume
> > storage or any of that.
>
> OK, but somebody has to ;)
>
> > Think only in terms of what a user of a DBaaS really wants. At the end
> > of the day, all they want is an address in the cloud where they can
> > point their application to write and read data from.
> >
> > Do they want that data connection to be fast and reliable? Of course,
> > but how that happens is irrelevant to them.
> >
> > Do they want that data to be safe and backed up? Of course, but how
> > that happens is irrelevant to them.
>
> Fair enough. The world has changed a lot since RDS (which was the model
> for Trove) was designed, so it's certainly worth reviewing the base
> assumptions before embarking on a new design.
>
> > The problem with many of these high-level *aaS projects is that they
> > consider their user to be a typical tenant of general cloud
> > infrastructure -- focused on launching VMs and creating volumes and
> > networks etc. And the discussions around the implementation of these
> > projects always come back to minutiae about how to set up secure
> > communication channels between a control plane message bus and the
> > service VMs.
>
> Incidentally, the reason that discussions always come back to that is
> because OpenStack isn't very good at it, which is a huge problem not
> only for the *aaS projects but for user applications in general running
> on OpenStack.
>
> If we had fine-grained authorisation and ubiquitous multi-tenant
> asynchronous messaging in OpenStack then I firmly believe that we, and
> application developers, would be in much better shape.
>
> > If you create these projects as applications that run on cloud
> > infrastructure (OpenStack, k8s or otherwise),
>
> I'm convinced there's an interesting idea here, but the terminology
> you're using doesn't really capture it. When you say 'as applications
> that run on cloud infrastructure', it sounds like you mean they should
> run in a Nova VM, or in a Kubernetes cluster somewhere, rather than on
> the OpenStack control plane. I don't think that's what you mean though,
> because you can (and IIUC Rackspace does) deploy OpenStack services
> that way already, and it has no real effect on the architecture of
> those services.
>
> > then the discussions focus
> > instead on how the real end-users -- the ones that actually call the
> > APIs and utilize the service -- would interact with the APIs and not
> > the underlying infrastructure itself.
> >
> > Here's an example to think about...
> >
> > What if a provider of this DBaaS service wanted to jam 100 database
> > instances on a single VM and provide connectivity to those database
> > instances to 100 different tenants?
> >
> > Would those tenants know if those databases were all serviced from a
> > single database server process running on the VM?
>
> You bet they would when one (or all) of the other 99 decided to run a
> really expensive query at an inopportune moment :)
>
> > Or 100 containers each
> > running a separate database server process? Or 10 containers running
> > 10 database server processes each?
> >
> > No, of course not. And the tenant wouldn't care at all, because the
>
> Well, if they had any kind of regulatory (or even performance)
> requirements then the tenant might care really quite a lot. But I take
> your point that many might not and it would be good to be able to offer
> them lower cost options.
>
> > point of the DBaaS service is to get a database. It isn't to get one
> > or more VMs/containers/baremetal servers.
>
> I'm not sure I entirely agree here. There are two kinds of DBaaS. One
> is a data API: a multitenant database a la DynamoDB. Those are very
> cool, and I'm excited about the potential to reduce the granularity of
> billing to a minimum, in much the same way Swift does for storage, and
> I'm sad that OpenStack's attempt in this space (MagnetoDB) didn't work
> out. But Trove is not that.
>
> People use Trove because they want to use a *particular* database, but
> still have all the upgrades, backups, &c. handled for them. Given that
> the choice of database is explicitly *not* abstracted away from them,
> things like how many different VMs/containers/baremetal servers the
> database is running on are very much relevant IMHO, because what you
> want depends on both the database and how you're trying to use it. And
> because (afaik) none of them have native multitenancy, it's necessary
> that no tenant should have to share with any other.
>
> Essentially Trove operates at a moderate level of abstraction -
> somewhere between managing the database + the infrastructure it runs on
> yourself and just an API endpoint you poke data into.
> It also operates
> at the coarse end of a granularity spectrum running from
> VMs->Containers->pay as you go.
>
> It's reasonable to want to move closer to the middle of the granularity
> spectrum. But you can't go all the way to the high abstraction/fine
> grained ends of the spectra (which turn out to be equivalent) without
> becoming something qualitatively different.
>
> > At the end of the day, I think Trove is best implemented as a hosted
> > application that exposes an API to its users that is entirely separate
> > from the underlying infrastructure APIs like Cinder/Nova/Neutron.
> >
> > This is similar to Kevin's k8s Operator idea, which I support but in a
> > generic fashion that isn't specific to k8s.
> >
> > In the same way that k8s abstracts the underlying infrastructure (via
> > its "cloud provider" concept), I think that Trove and similar projects
> > need to use a similar abstraction and focus on providing a different
> > API to their users that doesn't leak the underlying infrastructure API
> > concepts out.
>
> OK, so trying to summarise (stop me if I'm getting it wrong):
> essentially you support option (2) because it is a closed abstraction.
> Trove has its own quota management, billing, &c. and the user can't see
> the VM, so the operator is free to substitute a different backend that
> allocates compute capacity in finer-grained increments than Nova does.
>
> Interestingly, that's only an issue because there is no finer-grained
> compute resource than a VM available through the OpenStack API. If
> there were an OpenStack API (or even just a Keystone-authenticated API)
> to a shared, multitenant container orchestration cluster, this wouldn't
> be an issue. But apart from OpenShift, I can't think of any cloud

[Hongbin Lu]
I just wanted to clarify that there is such an OpenStack API, which is Zun. Zun's API is container-centric, so it gives you a finer-grained compute resource than a VM: a container. Zun is Keystone-authenticated and multitenant, and it can be combined with Heat [1] (or Senlin in the future) to provide functionality equivalent to container orchestration.

[1] https://review.openstack.org/#/c/437810/

> service that's doing that - AWS, Google, OpenStack are all using the
> model where the COE cluster is deployed on VMs that are owned by a
> particular tenant. Of all the things you could run in containers on
> shared servers, databases have arguably the most to lose (performance,
> security) and the least to gain (since they're by definition stateful).
> So my question is:
> if this is such a good idea for databases, why isn't anybody doing it
> for everything container-based? i.e. instead of Magnum/Zun should we
> just be working on a Keystone auth gateway for OpenShift (a.k.a. the
> _one_ thing that _everyone_ had hitherto agreed was definitely out of
> scope :D )?
>
> Until then it seems to me that the tradeoff is between decoupling it
> from the particular cloud it's running on so that users can optionally
> deploy it standalone (essentially Vish's proposed solution for the *aaS
> services from many moons ago) vs. decoupling it from OpenStack in
> general so that the operator has more flexibility in how to deploy.
>
> I'd love to be able to cover both - from a user using it standalone to
> spin up and manage a DB in containers on a shared PaaS, through to a
> user accessing it as a service to provide a DB running on a dedicated
> VM or bare metal server, and everything in between. I don't know if
> such a thing is feasible. I suspect we're going to have to talk a lot
> about VMs and network plumbing and volume storage :)
>
> cheers,
> Zane.
>
> > Best,
> > -jay
> >
> >> Of course the current situation, as Amrith alluded to, where the
> >> default is option (1) except without the lock-down feature in Nova,
> >> though some operators are deploying option (2) but it's not tested
> >> upstream... clearly that's the worst of all possible worlds, and AIUI
> >> nobody disagrees with that.
> >>
> >> To my mind, (1) sounds more like "applications that run on OpenStack
> >> (or other) infrastructure", since it doesn't require stuff like the
> >> admin-only cross-project networking that makes it effectively "part
> >> of the infrastructure itself" - as evidenced by the fact that
> >> unprivileged users can run it standalone with little more than a
> >> simple auth middleware change. But I suspect you are going to use
> >> similar logic to argue for (2)? I'd be interested to hear your thoughts.
> >>
> >> cheers,
> >> Zane.
> >>
> >> __________________________________________________________________________
> >> OpenStack Development Mailing List (not for usage questions)
> >> Unsubscribe: [email protected]?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
