I'd take it further, though in my opinion it shouldn't be done until we're ready to do away with Perl entirely. I've been working on this draft for a while now, and I sort of wanted to wait until I made a wiki page for a proposed API v2 spec before sending it, but what the heck. The conversation's here now.
I don't like the way our API is versioned. Let me be clear: it _should_ be versioned, and the versions _should_ be semantic, but it is my opinion that the API should be versioned alongside Traffic Ops rather than at its own pace. I've written essentially an article on why this should be the case, and it follows:

** TIMING **

The first issue I should address is timing - this needn't happen immediately. For the time being, I think a radical change would be much more harmful than continuing along our current versioning scheme. This change should be implemented with the advent of what we currently refer to as "API v2". Presumably, at that time all of the Perl code will have been removed and we'll be looking to re-tool the API to be more sensible and standards-conforming. It's a time when we'll already be making big, breaking changes. I personally would love for this to be ATC 4.0, but that may not be a realistic goal.

** REDUNDANCY **

The easiest to see - albeit simultaneously most trivial - issue with the current API versioning scheme is how every request path must start with '/api/1.x/'. This is annoying, to be frank, especially when one considers that the current plan for Traffic Ops is to reduce it entirely to an API, so literally every endpoint will have '/api/' in common. We _know_ the endpoint we are trying to query is an API endpoint because _all_ endpoints are API endpoints. When we reach "API v2", the '/api/' part of request paths will have lost all value entirely. Even with that gone, we are left with '/1.' (or '/2.', as the case may become) as a common prefix - again annoying, although not totally useless in this case. However, the vast majority of API endpoints see no changes between minor versions, so really '/1.x' just becomes a static, constant prefix where 'x' is the latest version of the API. In any case, versioning the API alongside Traffic Ops solves this problem, because our Traffic Ops server(s) can emit HTTP headers that name the server.
Once Perl is gone, we'll be free to use the HTTP `Server:` header to name the server, e.g. `Server: Traffic Ops/3.2.1`. At that point, we could either implement an `OPTIONS` handler at the server's root that would just return some headers - including `Server:` - or just let people do a `GET` (or better yet, `HEAD`) request to `/ping` and pull the server version out of the headers. The client then has all the information it needs to communicate effectively with the server. The alternative within our current versioning scheme is to implement an unversioned API endpoint, such that we have a hundred `/api/1.x` endpoints and one with some other - or possibly no - prefix, or to have a versioned API-version API endpoint, which is confusing even just to say.

** TRAFFIC OPS _IS_ THE API **

As mentioned previously, the endgame of our transition from Perl to Go is that Traffic Portal is the only UI for ATC - and in fact, that's already somewhat true. That leaves Traffic Ops as a database and a REST API for interacting with said database. In fact, the database is often referred to as its own service: "ToDb", "Traffic Ops DB", etc. That means we have the single most important Traffic Control component split in two - half of its functionality is versioned sanely with Traffic Control, while the other half is versioned separately from anything else in the world. That's crazy, because if you have a program that only does two things, then surely a breaking change to one of those things means increasing its major version? And if that's the case, then why are we not versioning the API alongside Traffic Ops? It may be argued (incorrectly, in my opinion) that Traffic Ops does more than serve an API to interact with a database: it generates configuration files and system images, and it combines data and does heavy processing on it.
But really, those things shouldn't be taken into account in a versioning scheme except insofar as they affect the experience of some user, administrator, or application interfacing with Traffic Ops. If the API responses don't change in form or content, and if the process of setting up or maintaining the application hasn't changed, then any code changes you've made are a patch, not a version change. Traffic Ops does big things, but at the end of the day it all just boils down to API inputs and outputs as far as anything and anyone else is concerned.

** CONFUSION **

We currently live in a world where I can run a script using the Traffic Ops API that works perfectly fine against Traffic Ops version 3.2.1, but breaks against a later Traffic Ops release because breaking changes were made in the API - with nothing in the Traffic Ops version number to warn me. It's easy to say, "oh, that just means that when we make breaking API changes we should increment the version appropriately," but realize that this _is versioning the API alongside Traffic Ops_. If we're not doing that, we're saying there is unpredictability in the behavior of our system within releases, and if we _are_ doing that, then the only difference between the API version and the Traffic Ops version is that the API version is confusingly behind by about 2 major revisions. It should just be the same, for simplicity's sake.

** THE API "PROMISE" **

The single most common argument I hear in favor of our current API versioning scheme is "well, we've said an API version 1.x behaves in this way, and so we must uphold that promise to the user base". Not only do we routinely break that promise already (e.g. PR #3110 [https://github.com/apache/trafficcontrol/pull/3110]), but I'm certainly not suggesting that we don't uphold this promise. Yes, this does mean that making breaking API changes in TO would require a new major release, and adding features/fields to the API would require a new minor release.
I don't see that as a big deal, especially if implemented at the time I'm suggesting, when we'd be re-designing the API - and it does sorely need to be redesigned.

* The API Needs to be Redesigned *

I'm going to go down this rabbit hole for a second; if you're already convinced the TO API needs a re-design, then feel free to skip this section. I'm not going to touch on any problems caused in the API as currently implemented by the use of a standalone API version - that's what the rest of this article is for. Our API currently has three huge problems:

1. Rampant oversaturation of endpoints

We have a systemic issue of re-implementing behaviour in multiple endpoints. This is due in part to a lack of good documentation - so developers aren't aware of the endpoints available to them - and in part to the "Mojolicious Mindset" that plagues our oldest endpoints. The "Mojolicious Mindset" refers to the use of URL path fragments as request parameters, e.g. '/users/{{ID}}' instead of (or in addition to) '/users?id={{ID}}'. From the perspective of someone who is just writing the back-end for these endpoints, there's no clear advantage to one over the other, except that the former seems to more clearly reflect the intent of requesting a specific object, whereas the latter could be seen as more of a "filter" on a larger collection. That's not incorrect, necessarily, but the two are totally separate request paths, so having '/users' and '/users/{{ID}}' means documenting two endpoints instead of one, two lines in the route definitions instead of one, and two separate handlers instead of one (albeit a more complex one).
Consider also that we have all of the following endpoints for manipulating Cache Groups:

* /api/1.x/cachegroup/{{parameter ID}}/parameter
* /api/1.x/cachegroup_fallbacks
* /api/1.x/cachegroupparameters
* /api/1.x/cachegroupparameters/{{ID}}/{{parameter ID}}
* /api/1.x/cachegroups
* /api/1.x/cachegroups/{{ID}}
* /api/1.x/cachegroups/{{ID}}/deliveryservices
* /api/1.x/cachegroups/{{ID}}/parameters
* /api/1.x/cachegroups/{{ID}}/queue_updates
* /api/1.x/cachegroups/{{ID}}/unassigned_parameters
* /api/1.x/cachegroups/{{parameter ID}}/parameter_available
* /api/1.x/cachegroups_trimmed

These could all be collapsed neatly into one or two endpoints, but were instead implemented separately for whatever reason(s) (see Issue #2934 [https://github.com/apache/trafficcontrol/issues/2934] for details).

2. Improper/Non-existent standards conformity

We have requests that should be read-only making server-side state changes (Issue #3054 [https://github.com/apache/trafficcontrol/issues/3054]), we have endpoints returning success responses on failure (Issue #3003 [https://github.com/apache/trafficcontrol/issues/3003]), and our "CRUDER" has failed us. `PUT` should be used to _create_ objects (or possibly update them, by creating an alternative representation with the same identifying information), but is instead used as the primary "edit this" method. `POST` is for processing entities in a data payload, but is instead used for object creation. `PATCH` languishes, totally unimplemented. These problems are systemic, and stem partially from the "Mojolicious Mindset", whereby new functionality is introduced into the API by first considering what request method is appropriate and then deciding on a request path that names the operation being done. Request methods are meant to be the different ways in which a client interacts with a resource on the server, and thus the resources themselves should be considered primary.
The "CRUDER" hampers this mindset, because it forces generic treatment of payloads and query parameters and isn't receptive to the injection of new behaviour. The "CRUDER" is a great innovation, to be sure, as it saves us quite a bit of time and prevents duplicated code, but the problem it solves is an emergent property of API problem #1 above. With fewer endpoints we'd have much more specific handling, and the need for - and advantages of - the "CRUDER" would vanish.

3. Needing multiple queries to obtain a single piece of information

This issue is pretty deeply rooted, and is related to the way our database is structured. But that's part of the problem - an API needn't replicate the database, and is therefore free from some of the constraints that bind a database. The way things are now, in order to e.g. create a server definition, I must make several preliminary requests to determine the integral, unique, non-deterministic identifiers of other objects. I think we all agree that ideally these integral IDs would have no part in the output of the API, and many people I've spoken to would support wiping them from the database entirely, given enough time and manpower.
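To make that last problem concrete - with entirely made-up names, and in-memory maps standing in for the preliminary `GET` requests a real client would have to make - creating a server today means first trading human-readable names for opaque integral IDs:

```go
package main

import (
	"errors"
	"fmt"
)

// These maps stand in for the extra API round-trips a client must make
// today (e.g. a GET to look up a Cache Group, another to look up a
// Profile) just to learn the non-deterministic integral IDs the server
// demands. The names and values here are hypothetical.
var cachegroupIDs = map[string]int{"edge-east": 42}
var profileIDs = map[string]int{"EDGE_ATS": 7}

// serverCreateRequest mimics the shape of a creation payload that
// wants integral IDs rather than the names the operator actually knows.
type serverCreateRequest struct {
	HostName     string
	CachegroupID int // why not just the Cache Group's unique name?
	ProfileID    int
}

// buildCreateRequest shows the resolution step itself: two lookups
// (i.e. two extra HTTP requests in the real API) before the one POST
// the client actually cares about.
func buildCreateRequest(host, cachegroup, profile string) (serverCreateRequest, error) {
	cgID, ok := cachegroupIDs[cachegroup]
	if !ok {
		return serverCreateRequest{}, errors.New("no such cachegroup: " + cachegroup)
	}
	pID, ok := profileIDs[profile]
	if !ok {
		return serverCreateRequest{}, errors.New("no such profile: " + profile)
	}
	return serverCreateRequest{HostName: host, CachegroupID: cgID, ProfileID: pID}, nil
}

func main() {
	req, err := buildCreateRequest("cache-01", "edge-east", "EDGE_ATS")
	fmt.Println(req, err)
}
```

If the API accepted the unique, human-readable names directly, this whole resolution step - and the extra round trips it stands in for - would disappear.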
