Re: [OpenStack-Infra] On the subject of HTTP interfaces and Zuul

Monty Taylor Tue, 13 Jun 2017 07:10:37 -0700

On 06/10/2017 08:38 PM, Tristan Cacqueray wrote:

Hello folks,


Regardless of the HTTP interfaces architecture, I proposed this
zuul_dashboard thing to help Jenkins users migrate to
zuul-launcher/executor. As far as I can tell, we need a comprehensive
view of job runs where users can quickly check the results of critical
jobs such as the periodic, post and tag jobs.

I think this is a really neat idea, but I have two different opinions on how we should do it - those are the "optional dashboard" and "required dashboard" routes. I think "optional" is a consideration because as written currently, this depends on the SQL Reporter which is an optional reporter plugin.


** Optional **

If we decide that we also want the dashboard to be an optional zuul feature - because maybe for small installations we don't want to grow a MySQL dependency - I think we should add the dashboard to the SQL Reporter plugin. So if you enable the SQL reporter, you get the dashboard.

In fact, that might be the right first step, since it's just a small change to what's there now.


** Required **

I do think it's a valuable feature though, so we could decide we want it to always be present, similar to the current status page.

If we do that, I think we should define a "History" interface for reading the data that the dashboard needs, and the SQL Reporter should implement that interface.

Then we should implement a ZK-based plugin that implements Reporter and History. That way a smaller user can simply configure Zuul to use the ZK reporter and history, and a larger user can configure it to use the SQL reporter and history.
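As a rough sketch of that split (not Zuul code: `BuildRecord`, `Reporter`, `History` and `InMemoryHistory` are made-up names for illustration, and the in-memory class only stands in for a real SQL or ZK backend), it could look something like this:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BuildRecord:
    """One finished build as a dashboard might show it (hypothetical shape)."""
    job_name: str
    pipeline: str
    result: str


class Reporter(ABC):
    """Write side: called when a build finishes."""

    @abstractmethod
    def report(self, build: BuildRecord) -> None:
        ...


class History(ABC):
    """Read side: the only thing the dashboard would depend on."""

    @abstractmethod
    def builds(self, pipeline: Optional[str] = None) -> List[BuildRecord]:
        ...


class InMemoryHistory(Reporter, History):
    """Stand-in backend; a SQL or ZK plugin would implement the same methods."""

    def __init__(self) -> None:
        self._builds: List[BuildRecord] = []

    def report(self, build: BuildRecord) -> None:
        self._builds.append(build)

    def builds(self, pipeline: Optional[str] = None) -> List[BuildRecord]:
        # Newest first, optionally filtered to a single pipeline.
        newest_first = list(reversed(self._builds))
        if pipeline is None:
            return newest_first
        return [b for b in newest_first if b.pipeline == pipeline]


store = InMemoryHistory()
store.report(BuildRecord("tox-py35", "periodic", "SUCCESS"))
store.report(BuildRecord("publish-docs", "post", "FAILURE"))
periodic_results = [b.result for b in store.builds("periodic")]
```

The dashboard would only ever see `History`, so swapping the small-install backend for the SQL one would be a configuration choice rather than a code change.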

Another feature to consider is job trigger/enqueuing as well as
job termination. Though this needs some sort of authentication and
policy...

Yah. That one is a bit harder and we'll need to consider how that works. Would likely be a good topic for the next time we meet in person. Maybe if we don't have a story for that by Denver it would be a good Denver topic?

There's also a third thing - which is halfway in between - and that's an API for getting current and not historical information but that doesn't need auth. Examples:


- What are the valid node labels?
- What is the current job config for project X?
- What optional features does this Zuul have enabled?

The last one is something I want because I've needed it for OpenStack and it wasn't there. It could contain things like "this zuul has a dashboard and it's here" and "this zuul supports GH triggers via the OpenStack Zuul App, Gerrit triggers for review.openstack.org, and has the mqtt reporter enabled which reports to firehose.openstack.org" *hand wave*
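To make the hand wave slightly more concrete, here is a minimal sketch of the kind of read-only document such an endpoint might return. The function name, the field names and the example URL are all assumptions for illustration; nothing here is an existing Zuul schema.

```python
import json


def describe_zuul(dashboard_url, triggers, reporters, node_labels):
    """Build a read-only description of one Zuul install (hypothetical schema)."""
    return {
        "capabilities": {
            "dashboard": {
                "enabled": dashboard_url is not None,
                "url": dashboard_url,
            },
            "triggers": sorted(triggers),
            "reporters": sorted(reporters),
        },
        "node_labels": sorted(node_labels),
    }


info = describe_zuul(
    dashboard_url="https://zuul.example.org/dashboard",  # placeholder URL
    triggers={"github", "gerrit"},
    reporters={"mqtt", "sql"},
    node_labels={"ubuntu-xenial", "centos-7"},
)
print(json.dumps(info, indent=2, sort_keys=True))
```

Since nothing in the document is secret or mutable, it could be served without auth and cached freely.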

Well these features can be implemented independently with the current
architecture, but it sounds better to have them baked in Zuul and
readily available to the end users. So I'm top-posting here to make
sure this will be part of the master plan :-)

Yes - I think it's the most important thing to make sure we're on the same overall page about direction so that you (and others) can continue to explore this. We're cranking currently on getting v3 out the door and are trying to be really careful about last-minute scope creep, so we might not have immediate bandwidth to review/land. BUT - the plugin interface is there for a reason and I want you to be able to work on these without being blocked so that when we've got bandwidth to start adding stuff you're ready to go - but with enough baseline agreement so that step one isn't "oh, let's rewrite everything now" :)

This makes me think it might be useful to have a document with a backlog list of "these are all good ideas that we all agree should be here, but also that we're purposely not doing today". I'll send a follow up email about that.

On June 9, 2017 4:22 pm, Monty Taylor wrote:
Hey all!

Tristan has recently pushed up some patches related to providing a Web Dashboard for Zuul. We have a web app for nodepool. We already have the Github webhook receiver which is inbound http. There have been folks who have expressed interest in adding active-REST abilities for performing actions. AND we have the new websocket-based log streaming.

We're currently using Paste for HTTP serving (which is effectively dead), autobahn for websockets and WebOb for request/response processing.

This means that before we get too far down the road, it's probably time to pick how we're going to do those things in general. There are 2 questions on the table:

* HTTP serving
* REST framework

They may or may not be related, and one of the options on the table implies an answer for both. I'm going to start with the answer I think we should pick:

*** tl;dr ***

We should use aiohttp with no extra REST framework.

Meaning:

- aiohttp serving REST and websocket streaming in a scale-out tier
- talking RPC to the scheduler over gear or zk
- possible in-process aiohttp endpoints for k8s style health endpoints

Since we're talking about a web scale-out tier, we should just have a single web tier for zuul and nodepool. This continues the thinking that nodepool is a component of Zuul.

In order to write zuul jobs, end-users must know what node labels are available. A zuul client that says "please get me a list of available node labels" could make sense to a user. As we get more non-OpenStack users, those people may not have any concept that there is a separate thing called "nodepool".

*** The MUCH more verbose version ***

I'm now going to outline all of the thoughts and options I've had or have heard other people say. It's an extra complete list - there are ideas in here you might find silly/bad. But since we're picking a direction, I think it's important we consider the options in front of us.

This will cover 3 http serving options:

- WSGI
- aiohttp
- gRPC

and 3 REST framework options:

- pecan
- flask-restplus
- apistar

** HTTP Serving **

WSGI
The WSGI approach is one we're all familiar with and it works with pretty much every existing Python REST framework. For us I believe if we go this route we'd want to serve it with something like uwsgi and Apache. That adds the need for an Apache layer and/or a uwsgi management process. However, it means we can make use of normal tools we all likely know at least to some degree.

A downside is that we'll need to continue to handle our Websockets work independently (which is what we're doing now).

Because it's in a separate process, the API tier will need to make requests of the scheduler over a bus, which could be either gearman or zk.

aiohttp
Zuul v3 is Python3, which means we can use aiohttp. aiohttp isn't particularly compatible with the REST frameworks, but it has built-in route support and helpers for receiving and returning JSON. We don't need ORM mapping support, so the only thing we'd really be MISSING from REST frameworks is auto-generated documentation.

aiohttp also supports websockets directly, so we could port the autobahn work to use aiohttp.

aiohttp can be run in-process in a thread. However, websocket log-streaming is already a separate process for scaling purposes, so if we decide that one impl backend is a value, it probably makes sense to just stick the web tier in the websocket scaleout process anyway.

However, we could probably write a facade layer with a gear backend and an in-memory backend so that simple users could just run the in-process version but scale-out was possible for larger installs (like us).
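A minimal sketch of what such a facade could look like, assuming invented names throughout (`SchedulerBackend`, `InMemoryBackend`, `RemoteBackend`, `WebFacade`); the "remote" backend takes a caller-supplied `send` function as a placeholder for whatever gear or zk transport would really be used:

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class SchedulerBackend(ABC):
    """How the web tier reaches the scheduler (hypothetical interface)."""

    @abstractmethod
    def call(self, method: str, **params) -> object:
        ...


class InMemoryBackend(SchedulerBackend):
    """Small installs: dispatch to functions living in the same process."""

    def __init__(self, handlers: Dict[str, Callable[..., object]]) -> None:
        self._handlers = handlers

    def call(self, method: str, **params) -> object:
        return self._handlers[method](**params)


class RemoteBackend(SchedulerBackend):
    """Scale-out installs: hand a request to some transport (gear, zk, ...)."""

    def __init__(self, send: Callable[[dict], object]) -> None:
        self._send = send  # placeholder for the real RPC client

    def call(self, method: str, **params) -> object:
        return self._send({"method": method, "params": params})


class WebFacade:
    """What the HTTP handlers would talk to, whichever backend is configured."""

    def __init__(self, backend: SchedulerBackend) -> None:
        self._backend = backend

    def node_labels(self) -> object:
        return self._backend.call("node_labels")


def node_labels():
    return ["centos-7", "ubuntu-xenial"]


handlers = {"node_labels": node_labels}


def loopback(request: dict) -> object:
    # Fake transport for the example: dispatches locally instead of over a bus.
    return handlers[request["method"]](**request["params"])


local = WebFacade(InMemoryBackend(handlers))
remote = WebFacade(RemoteBackend(loopback))
```

The HTTP handlers would be written once against `WebFacade`, and only the backend passed to it would differ between an in-process install and a scale-out one.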
Since aiohttp can be in-process, it also allows us to easily add some '/health' endpoints to all of our services directly, even if they aren't intended to be publicly consumable. That's important for running richly inside of things like kubernetes that like to check in on health status of services to know about rescheduling them. This way we could add a simple thread to the scheduler and the executors and the mergers and the nodepool launchers and builders that adds a '/health' endpoint.
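To illustrate the "simple thread that adds a '/health' endpoint" idea, here is a small sketch that deliberately uses the standard library's `http.server` as a stand-in for aiohttp, so that it runs without extra dependencies. The handler name, the response body and the `start_health_server` helper are assumptions for illustration; an aiohttp version would follow the same shape.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # stay quiet; the host service has its own logging


def start_health_server(host="127.0.0.1", port=0):
    """Serve /health from a daemon thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Usage: start it next to the service's real work, then probe it once.
server = start_health_server()
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
conn.request("GET", "/health")
response = conn.getresponse()
status = response.status
health = json.loads(response.read().decode("utf-8"))
conn.close()
server.shutdown()
server.server_close()
```

Binding to localhost by default keeps the endpoint from being publicly consumable, while still letting something like a kubernetes probe running alongside the service reach it.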
gRPC / gRPC-REST gateway
This is a curve-ball. We could define our APIs using gRPC. That gets us a story for an API that is easily consumable by all sorts of clients, and that supports exciting things like bi-directional streaming channels. gRPC isn't (yet) consumable directly in browsers, nor does Github send gRPC webhooks. BUT - there is a REST Gateway for gRPC:

https://github.com/grpc-ecosystem/grpc-gateway

that generates HTTP/1.1+JSON interfaces from the gRPC descriptions and translates between protobuf and json automatically. The "REST" interface it produces does not support url-based parameters, so everything is done in payload bodies, so it's:

   GET /nodes
   {
     "id": 1234
   }

rather than

   GET /nodes/1234

but that's still totally fine - and totally works for both status.json and GH webhooks.
The catch is - grpc-gateway is a grpc compiler plugin that generates golang code. So we'd either have to write our own plugin that does the same thing but for generating python code, or we'd have to write our gRPC/REST layer in go. I betcha folks would appreciate if we implemented the plugin for python, but that's a long tent-pole for this purpose so I don't honestly think we should consider it. Therefore, we should consider that using gRPC + gRPC-REST implies writing the web-tier in go. That obviously implies an additional process that needs to talk over an RPC bus.

There are clear complexity costs involved with adding a second language component, especially WRT deployment. (pip install zuul would not be sufficient) OTOH - it would open the door to using protobuf-based objects for internal communication, and would open the door for rich client apps without REST polling and also potentially nice Android apps (gRPC works great for mobile apps). Even so, I think the added complexity makes it a hard sell.

THAT SAID - there are only 2 things that explicitly need REST over HTTP 1.1 - that's the github webhooks and status.json. We could write everything in gRPC except those two. Browser support for gRPC is coming soon (they've moved from "someone is working on it" to "contact us about early access") so status.json could move to being pure gRPC as well ... and the webhook endpoint is pretty simple, so just having it be an in-process aiohttp handler isn't a terrible cost. So if we thought "screw it, let's just gRPC and not have an HTTP/1.1 REST interface at all" - we can stay all in python and gRPC isn't a huge cost at that point.

gRPC doesn't handle websockets - but we could still run the gRPC serving and the websocket serving out of the same scale-out web tier.

** Summary **
Based on the three above, it seems like we need to think about a separate web-tier regardless of choice. The one option that doesn't strictly require a separate tier is the one that lets us align on websockets, so it seems that co-location there would be simple.

aiohttp seems like the cleanest forward path. It'll require reworking the autobahn code (sorry Shrews) - but is nicely aligned with our Python3 state. It's new - but it's not as totally new as gRPC is. And since we'll already have some websockets stuff, we could also write streaming websockets APIs for the things where we'd want that from gRPC.

** REST Framework **

If we decide to go the WSGI route, then we need to talk REST frameworks (and it's possible we decide to go WSGI because we want to use a REST framework).

The assumption in this case is that the websocket layer is a separate entity.

There are three 'reasonable' options available:

- pecan
- flask-restplus
- apistar

pecan
pecan is used in a lot of OpenStack services and is also used by Storyboard, so it's well known. Tristan's patches so far use Pecan, so we've got example code.

On the other hand, Pecan seems to be mostly only used in OpenStack land and hasn't gotten much adoption elsewhere.

flask-restplus

https://flask-restplus.readthedocs.io/en/stable/
flask is extremely popular for folks doing REST in Python. flask-restplus is a flask extension that also produces Swagger Docs for the REST api, and provides for serving an interactive swagger-ui based browseable interface to the API. You can also define models using JSONSchema. Those are not needed for simple cases like status.json, but for a fuller REST API they might be nice.

Of course, in all cases we could simply document our API using swagger and get the same thing - but that does involve maintaining model/api descriptions and documentation separately.

apistar

https://github.com/tomchristie/apistar
apistar is BRAND NEW and was announced at this year's PyCon. It's from the Django folks and is aimed at writing REST separate from Django.

It's python3 from scratch - although it's SO python3 focused that it requires python 3.6. This is because it makes use of type annotations:
   def show_request(request: http.Request):
       return {
           'method': request.method,
           'url': request.url,
           'headers': dict(request.headers)
       }

   def create_project() -> Response:
       data = {'name': 'new project', 'id': 123}
       headers = {'Location': 'http://example.com/project/123/'}
       return Response(data, status=201, headers=headers)

and f'' strings:

   def echo_username(username):
       return {'message': f'Welcome, {username}!'}

Python folks seem to be excited about apistar so far - but I think python 3.6 is a bridge too far - it honestly introduces more deployment issues than doing a golang-gRPC layer.

** Summary **

I don't think the REST frameworks offer enough benefit to justify their use and adopting WSGI as our path forward.

** Thoughts on RPC Bus **
gearman is a simple way to add RPC calls between an API tier and the scheduler. However, we got rid of gear from nodepool already, and we intend on getting rid of gearman in v4 anyway.

If we use zk, we'll have to do a little bit more thinking about how to do the RPC calls which will make this take more work. BUT - it means we can define one API that covers both Zuul and Nodepool and will be forward compatible with a v4 no-gearman world.

We *could* use gearman in zuul and run an API in-process in nodepool. Then we could take a page out of early Nova and do a proxy-layer in zuul that makes requests of nodepool's API.

We could just assume that there's gonna be an Apache fronting this stuff and suggest deployment with routing to zuul and nodepool apis with mod_proxy rules.

Finally, as clarkb pointed out in response to the ingestors spec, we could introduce MQTT and use it. I'm wary of doing that for this because it introduces a totally required new tech stack at a late stage.

Since we're starting fresh, I like the idea of a single API service that RPCs to zuul and nodepool, so I like the idea of using ZK for the RPC layer. BUT - using gear and adding just gear worker threads back to nodepool wouldn't be super-terrible maybe.

** Final Summary **

As I tl;dr'd earlier, I think aiohttp co-located with the scale-out websocket tier talking to the scheduler over zk is the best bet for us. I think it's both simple enough to adopt and gets us a rich set of features. It also lets us implement in-process simple health endpoints on each service with the same tech stack.
_______________________________________________
OpenStack-Infra mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra


