On Fri, Jun 09, 2017 at 12:35:59PM -0700, Clark Boylan wrote:
> On Fri, Jun 9, 2017, at 09:22 AM, Monty Taylor wrote:
> > Hey all!
> >
> > Tristan has recently pushed up some patches related to providing a Web
> > Dashboard for Zuul. We have a web app for nodepool. We already have the
> > Github webhook receiver which is inbound http. There have been folks who
> > have expressed interest in adding active-REST abilities for performing
> > actions. AND we have the new websocket-based log streaming.
> >
> > We're currently using Paste for HTTP serving (which is effectively
> > dead), autobahn for websockets and WebOB for request/response processing.
> >
> > This means that before we get too far down the road, it's probably time
> > to pick how we're going to do those things in general. There are 2
> > questions on the table:
> >
> > * HTTP serving
> > * REST framework
> >
> > They may or may not be related, and one of the options on the table
> > implies an answer for both. I'm going to start with the answer I think
> > we should pick:
> >
> > *** tl;dr ***
> >
> > We should use aiohttp with no extra REST framework.
> >
> > Meaning:
> >
> > - aiohttp serving REST and websocket streaming in a scale-out tier
> > - talking RPC to the scheduler over gear or zk
> > - possible in-process aiohttp endpoints for k8s style health endpoints
> >
> > Since we're talking about a web scale-out tier, I think we should just
> > have a single web tier for zuul and nodepool. This continues the
> > thinking that nodepool is a component of Zuul.
>
> I'm not sure that this is a great idea. We've already seen that people
> have wanted to use nodepool without a Zuul and even without performing
> CI. IIRC paul wanted to use it to keep a set of Asterisk servers floating
> around for example. We've also seen that people want to use
> subcomponents of nodepool to build and manage a set of images for clouds
> without making instances.

Ya, asterisk use case aside, I think image build as a service is a prime
example of something nodepool could be great at on its own. Especially
now that nodepool-builder is scaling out very well with zookeeper.
> In the past we have been careful to keep logical tools separate which
> has made it easy for us to add new tools and remove old ones.
> Operationally this may be perceived as making things more difficult to a
> newcomer, but it makes life much much better 3-6 months down the road.
>
> > In order to write zuul jobs, end-users must know what node labels are
> > available. A zuul client that says "please get me a list of available
> > node labels" could make sense to a user. As we get more non-OpenStack
> > users, those people may not have any concept that there is a separate
> > thing called "nodepool".
> >
> > *** The MUCH more verbose version ***
> >
> > I'm now going to outline all of the thoughts and options I've had or
> > have heard other people say. It's an extra-complete list - there are
> > ideas in here you might find silly/bad. But since we're picking a
> > direction, I think it's important we consider the options in front of us.
> >
> > This will cover 3 http serving options:
> >
> > - WSGI
> > - aiohttp
> > - gRPC
> >
> > and 3 REST framework options:
> >
> > - pecan
> > - flask-restplus
> > - apistar
> >
> > ** HTTP Serving **
> >
> > WSGI
> >
> > The WSGI approach is one we're all familiar with and it works with
> > pretty much every existing Python REST framework. For us I believe if we
> > go this route we'd want to serve it with something like uwsgi and
> > Apache. That adds the need for an Apache layer and/or managing a uwsgi
> > process. However, it means we can make use of normal tools we all likely
> > know at least to some degree.
>
> FWIW I don't think Apache would be required. uWSGI is a fairly capable
> http server aiui. You can also pip install uwsgi so the simple case
> remains fairly simple I think.
>
> > A downside is that we'll need to continue to handle our Websockets work
> > independently (which is what we're doing now).
> >
> > Because it's in a separate process, the API tier will need to make
> > requests of the scheduler over a bus, which could be either gearman or
> > zk.
>
> Note that OpenStack has decided that this is a better solution than
> using web servers in the python process. That doesn't necessarily mean
> it is the best choice for Zuul, but it seems like there is a lot we can
> learn from the choice to switch to WSGI in OpenStack.
>
> > aiohttp
> >
> > Zuul v3 is Python3, which means we can use aiohttp. aiohttp isn't
> > particularly compatible with the REST frameworks, but it has built-in
> > route support and helpers for receiving and returning JSON. We don't
> > need ORM mapping support, so the only thing we'd really be MISSING from
> > REST frameworks is auto-generated documentation.
> >
> > aiohttp also supports websockets directly, so we could port the autobahn
> > work to use aiohttp.
> >
> > aiohttp can be run in-process in a thread. However, websocket
> > log-streaming is already a separate process for scaling purposes, so if
> > we decide that one impl backend is of value, it probably makes sense to
> > just stick the web tier in the websocket scaleout process anyway.
> >
> > However, we could probably write a facade layer with a gear backend and
> > an in-memory backend so that simple users could just run the in-process
> > version while scale-out remains possible for larger installs (like us).
> >
> > Since aiohttp can be in-process, it also allows us to easily add some
> > '/health' endpoints to all of our services directly, even if they aren't
> > intended to be publicly consumable. That's important for running nicely
> > inside of things like kubernetes that like to check in on health status
> > of services to know about rescheduling them. This way we could add a
> > simple thread to the scheduler and the executors and the mergers and the
> > nodepool launchers and builders that adds a '/health' endpoint.
>
> See above. OpenStack has decided this is the wrong route to take
> (granted with eventlet and python2.7 not asyncio and python3.5). There
> are scaling and debugging challenges faced when you try to run an in
> process web server.
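
For what it's worth, the aiohttp path really isn't much code. Here's a
rough sketch of the scale-out web tier idea - the handler names and the
hard-coded response data are made up, and a real version would go ask
the scheduler over gear or zk instead:

    # Sketch only - routes and data are invented for illustration.
    from aiohttp import web, WSMsgType


    async def status(request):
        # A real web tier would fetch this from the scheduler over the
        # RPC bus; hard-coded so the sketch stands on its own.
        return web.json_response({'pipelines': []})


    async def health(request):
        # Simple k8s-style liveness endpoint.
        return web.Response(text='OK')


    async def console_stream(request):
        # Websocket handler living in the same process as the REST
        # routes; real code would subscribe to the log stream for the
        # requested build instead of playing ping-pong.
        ws = web.WebSocketResponse()
        await ws.prepare(request)
        async for msg in ws:
            if msg.type == WSMsgType.TEXT and msg.data == 'ping':
                await ws.send_str('pong')
        return ws


    app = web.Application()
    app.router.add_get('/status.json', status)
    app.router.add_get('/health', health)
    app.router.add_get('/console-stream', console_stream)

    if __name__ == '__main__':
        web.run_app(app, port=9000)

The same Application could be stood up in a thread inside each daemon
for just the '/health' route, which is the in-process piece Monty
mentions above.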
> > gRPC / gRPC-REST gateway
> >
> > This is a curve-ball. We could define our APIs using gRPC. That gets us
> > a story for an API that is easily consumable by all sorts of clients,
> > and that supports exciting things like bi-directional streaming
> > channels. gRPC isn't (yet) consumable directly in browsers, nor does
> > Github send gRPC webhooks. BUT - there is a REST Gateway for gRPC:
> >
> > https://github.com/grpc-ecosystem/grpc-gateway
> >
> > that generates HTTP/1.1+JSON interfaces from the gRPC descriptions and
> > translates between protobuf and json automatically. The "REST" interface
> > it produces does not support url-based parameters, so everything is done
> > in payload bodies, so it's:
> >
> > GET /nodes
> > {
> >   'id': 1234
> > }
> >
> > rather than
> >
> > GET /nodes/1234
> >
> > but that's still totally fine - and totally works for both status.json
> > and GH webhooks.
> >
> > The catch is - grpc-gateway is a grpc compiler plugin that generates
> > golang code. So we'd either have to write our own plugin that does the
> > same thing but for generating python code, or we'd have to write our
> > gRPC/REST layer in go. I betcha folks would appreciate it if we
> > implemented the plugin for python, but that's a long tent-pole for this
> > purpose so I don't honestly think we should consider it. Therefore, we
> > should consider that using gRPC + gRPC-REST implies writing the web-tier
> > in go. That obviously implies an additional process that needs to talk
> > over an RPC bus.
> >
> > There are clear complexity costs involved with adding a second language
> > component, especially WRT deployment (pip install zuul would not be
> > sufficient). OTOH - it would open the door to using protobuf-based
> > objects for internal communication, and would open the door for rich
> > client apps without REST polling and also potentially nice Android apps
> > (gRPC works great for mobile apps). Still, I think that makes it a hard
> > sell.
> >
> > THAT SAID - there are only 2 things that explicitly need REST over HTTP
> > 1.1 - that's the github webhooks and status.json. We could write
> > everything in gRPC except those two. Browser support for gRPC is coming
> > soon (they've moved from "someone is working on it" to "contact us about
> > early access") so status.json could move to being pure gRPC as well ...
> > and the webhook endpoint is pretty simple, so just having it be an
> > in-process aiohttp handler isn't a terrible cost. So if we thought
> > "screw it, let's just gRPC and not have an HTTP/1.1 REST interface at
> > all" - we can stay all in python and gRPC isn't a huge cost at that
> > point.
> >
> > gRPC doesn't handle websockets - but we could still run the gRPC serving
> > and the websocket serving out of the same scale-out web tier.
>
> Another data point for gRPC is that the etcd3 work in OpenStack found
> that the existing python lib(s) for grpc don't play nice with eventlet
> or asyncio or anything that isn't Thread()
> (https://github.com/grpc/grpc/issues/6046 is the bug tracking that, I
> think). This would potentially make the use of asyncio elsewhere
> (websockets) more complicated.
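
Right - and even in the best case we'd probably end up wrapping every
blocking grpc call in a thread executor on the asyncio side. Purely as
an illustration of the kind of plumbing that implies (blocking_rpc here
is just a stand-in for a call into a thread-only client library):

    import asyncio
    import concurrent.futures
    import time


    def blocking_rpc():
        # Stand-in for a client library call that only works from a
        # plain thread.
        time.sleep(0.5)
        return {'nodes': []}


    async def handler():
        # Push the blocking call onto a worker thread so the event
        # loop (and the websocket handlers on it) keeps running.
        loop = asyncio.get_event_loop()
        with concurrent.futures.ThreadPoolExecutor() as pool:
            return await loop.run_in_executor(pool, blocking_rpc)


    if __name__ == '__main__':
        loop = asyncio.get_event_loop()
        print(loop.run_until_complete(handler()))

Not the end of the world, but it's extra machinery on every call path.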
> > ** Summary
> >
> > Based on the three above, it seems like we need to think about a
> > separate web tier regardless of choice. The one option that doesn't
> > strictly require a separate tier is the one that lets us align on
> > websockets, so it seems that co-location there would be simple.
> >
> > aiohttp seems like the cleanest forward path. It'll require reworking
> > the autobahn code (sorry Shrews) - but is nicely aligned with our
> > Python3 state. It's new - but it's not as totally new as gRPC is. And
> > since we'll already have some websockets stuff, we could also write
> > streaming websockets APIs for the things where we'd want that from gRPC.
> >
> > * REST Framework
> >
> > If we decide to go the WSGI route, then we need to talk REST frameworks
> > (and it's possible we decide to go WSGI because we want to use a REST
> > framework).
>
> I'm not sure I understand why the WSGI and REST frameworks are being
> conflated. You can do one or the other or both and whichever you choose
> shouldn't affect the other too much aiui. There is even a flask-aiohttp
> lib.
>
> > The assumption in this case is that the websocket layer is a separate
> > entity.
> >
> > There are three 'reasonable' options available:
> >
> > - pecan
> > - flask-restplus
> > - apistar
> >
> > pecan
> >
> > pecan is used in a lot of OpenStack services and is also used by
> > Storyboard, so it's well known. Tristan's patches so far use Pecan, so
> > we've got example code.
> >
> > On the other hand, Pecan seems to be mostly only used in OpenStack land
> > and hasn't gotten much adoption elsewhere.
> >
> > flask-restplus
> >
> > https://flask-restplus.readthedocs.io/en/stable/
> >
> > flask is extremely popular for folks doing REST in Python.
> > flask-restplus is a flask extension that also produces Swagger Docs for
> > the REST api, and provides for serving an interactive swagger-ui based
> > browseable interface to the API. You can also define models using
> > JSONSchema. Those are not needed for simple cases like status.json, but
> > for a fuller REST API they might be nice.
> >
> > Of course, in all cases we could simply document our API using swagger
> > and get the same thing - but that does involve maintaining model/api
> > descriptions and documentation separately.
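
To make that concrete, here's a tiny flask-restplus sketch - the
/labels endpoint and the Label model are invented for the example, and
a real API tier would ask nodepool over the RPC bus instead of
hard-coding a list:

    from flask import Flask
    from flask_restplus import Api, Resource, fields

    app = Flask(__name__)
    api = Api(app, title='Zuul API (sketch)', doc='/doc/')

    # Models like this drive both the generated swagger description and
    # the browseable UI served at /doc/.
    label = api.model('Label', {
        'name': fields.String(description='Node label name'),
    })


    @api.route('/labels')
    class Labels(Resource):
        @api.marshal_list_with(label)
        def get(self):
            # Hard-coded so the sketch runs on its own; a real
            # implementation would query nodepool over zk or gear.
            return [{'name': 'ubuntu-xenial'}]


    if __name__ == '__main__':
        app.run(port=9000)

The model/route declarations are the part we'd otherwise be maintaining
as separate swagger documents, which is the trade-off Monty describes
above.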
> > apistar
> >
> > https://github.com/tomchristie/apistar
> >
> > apistar is BRAND NEW and was announced at this year's PyCon. It's from
> > the Django folks and is aimed at writing REST separate from Django.
> >
> > It's python3 from scratch - although it's SO python3 focused that it
> > requires python 3.6. This is because it makes use of type annotations:
>
> Type hinting is in python 3.5 and apistar's trove identifier things
> mention 3.5 support (not sure if actually the case though). But if so
> 3.5 is far easier to use since it is available in more distros than just
> Arch and Tumbleweed (which is the case for 3.6).
>
> >     def show_request(request: http.Request):
> >         return {
> >             'method': request.method,
> >             'url': request.url,
> >             'headers': dict(request.headers)
> >         }
> >
> >     def create_project() -> Response:
> >         data = {'name': 'new project', 'id': 123}
> >         headers = {'Location': 'http://example.com/project/123/'}
> >         return Response(data, status=201, headers=headers)
> >
> > and f'' strings:
> >
> >     def echo_username(username):
> >         return {'message': f'Welcome, {username}!'}
> >
> > Python folks seem to be excited about apistar so far - but I think
> > python 3.6 is a bridge too far - it honestly introduces more deployment
> > issues than doing a golang-gRPC layer would.
> >
> > ** Summary
> >
> > I don't think the REST frameworks offer enough benefit to justify their
> > use and adopting WSGI as our path forward.
>
> Yesterday SpamapS mentioned wanting to be able to grow the Zuul
> community. Just based on looking at the choices OpenStack is making
> (moving TO wsgi) and the general popularity of Flask in the python
> community I think that you may want to consider both wsgi and flask
> simply because they are tools that are known to scale reasonably well
> and many people are familiar with them.
>
> > ** Thoughts on RPC Bus **
> >
> > gearman is a simple way to add RPC calls between an API tier and the
> > scheduler. However, we got rid of gear from nodepool already, and we
> > intend on getting rid of gearman in v4 anyway.
> >
> > If we use zk, we'll have to do a little bit more thinking about how to
> > do the RPC calls which will make this take more work. BUT - it means we
> > can define one API that covers both Zuul and Nodepool and will be
> > forward compatible with a v4 no-gearman world.
> >
> > We *could* use gearman in zuul and run an API in-process in nodepool.
> > Then we could take a page out of early Nova and do a proxy-layer in zuul
> > that makes requests of nodepool's API.
> >
> > We could just assume that there's gonna be an Apache fronting this stuff
> > and suggest deployment with routing to zuul and nodepool apis with
> > mod_proxy rules.
> >
> > Finally, as clarkb pointed out in response to the ingestors spec, we
> > could introduce MQTT and use it. I'm wary of doing that for this because
> > it introduces a totally new required tech stack at a late stage.
>
> Mostly I was just pointing out that I think the vast majority of the
> infrastructure work to have something like a zuul ingestor is done. You
> just have to read from an mqtt connection instead of a gerrit ssh
> connection. Granted this does require running more services (mqtt server
> and the event stream handler) and doesn't handle entities like Github.
>
> That said MQTT unlike Gearman seems to be seeing quite a bit of
> development activity due to the popularity of IoT. Gearman has worked
> reasonably well for us though so I don't think we need to just replace
> it to get in on the IoT bandwagon.
>
> > Since we're starting fresh, I like the idea of a single API service that
> > RPCs to zuul and nodepool, so I like the idea of using ZK for the RPC
> > layer. BUT - using gear and adding just gear worker threads back to
> > nodepool wouldn't be super-terrible maybe.
>
> Nodepool hasn't had a Gearman-less release yet so you don't have to
> worry about backward compat at least.
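
For comparison, the gearman side of that really is small. A sketch of
what the API tier's client half could look like - the 'zuul:status_get'
function name and its payload are invented here, and the scheduler
would need to register a matching worker function for whatever we
actually define:

    import json
    import time

    import gear


    def rpc_call(name, payload, server='127.0.0.1'):
        client = gear.Client()
        client.addServer(server)
        client.waitForServer()
        job = gear.Job(name.encode('utf8'),
                       json.dumps(payload).encode('utf8'))
        client.submitJob(job)
        # gear fills in job.data and job.complete as WORK_DATA and
        # WORK_COMPLETE packets come back from the worker.
        while not job.complete:
            time.sleep(0.1)
        if job.failure:
            raise RuntimeError('RPC %s failed' % name)
        return json.loads(b''.join(job.data).decode('utf8'))


    if __name__ == '__main__':
        print(rpc_call('zuul:status_get', {'tenant': 'example'}))

The zk version needs us to invent a request/response convention on top
of znodes, which is the extra thinking Monty mentions, but it's the
option that covers nodepool too without adding gear back.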
> > ** Final Summary **
> >
> > As I tl;dr'd earlier, I think aiohttp co-located with the scale-out
> > websocket tier talking to the scheduler over zk is the best bet for us.
> > I think it's both simple enough to adopt and gets us a rich set of
> > features. It also lets us implement simple in-process health endpoints
> > on each service with the same tech stack.
>
> I'm wary of this simply because it looks a lot like repeating
> OpenStack's (now failed) decision to stick web servers in a bunch of
> python processes then do cooperative multithreading with them along with
> all your application logic. It just gets complicated. I also think this
> underestimates the value of using tools people are familiar with (wsgi
> and flask) particularly if making it easy to jump in and building
> community is a goal.
>
> Clark
