Re: [EXTERNAL] Re: Traffic Ops API versioning issues

Moltzau, Matthew Thu, 14 Feb 2019 08:39:26 -0800

> The "CRUDER" is a great innovation, to be sure, as it saves us quite a bit of 
> time and prevents
        duplicating code, but the problem it solves is an emergent property of 
of API problem #1 above.
> With fewer endpoints we'd have much more specific handling, and the need for 
> and advantages of
        the "CRUDER" will vanish.


FYI: a restructure of our CRUDer interfaces currently has a PR here that is in 
review:

https://github.com/apache/trafficcontrol/pull/2886

It allows the user to implement a single C, R, U, or D operation for better 
flexibility.

On 2/14/19, 8:21 AM, "Fieck, Brennan" <[email protected]> wrote:

    I'd take it further, though in my opinion it shouldn't be done until we're 
ready to do away with Perl entirely.
    I've been working on this draft for a while now, and I sort of wanted to 
wait until I made a wiki page for a proposed API v2 spec before sending it, but 
what the heck. The conversation's here now.
    
    I don't like the way our API is versioned. Let me be clear, it _should_ be 
versioned, and the
    versions _should_ be semantic, but it is my opinion that the API should be 
versioned alongside
    Traffic Ops rather than at its own pace. I've written essentially an 
article on why this should be
    the case, and it follows:
    
    ** TIMING **
    The first issue I should address is timing - this needn't happen 
immediately. For the time being, I
    think a radical change would be much more harmful than continuing along our 
current versioning
    scheme. This change should be implemented with the advent of what we 
currently refer to as "API v2".
    Presumably, at this time all of the Perl code has been removed and we are 
looking to re-tool the API
    to be more sensible and standards-conforming. It's a time when we'll 
already be making big, breaking
    changes. I personally would love for this to be ATC 4.0, but that may not 
be a realistic goal.
    
    ** REDUNDANCY **
    The easiest to see - albiet simultaneously most trivial - issue with the 
current API versioning
    scheme is how every request path must start with '/api/1.x/'. This is 
annoying, to be frank;
    especially when one considers that the current plan for Traffic Ops is to 
reduce it entirely to an
    API, so literally every endpoint will have '/api/' in common. We _know_ the 
endpoint we are trying
    to query is an API endpoint because _all_ endpoints are API endpoints. When 
we reach "API v2" the
    '/api/' part of request paths will have lost all value entirely.
    Even with that gone we are left with '/1.' (or '/2.' as the case may 
become) as a common prefix,
    again annoying although not totally useless in this case. However, the vast 
majority of API
    endpoints see no changes between minor versions, so really '/1.x' just 
becomes a static, constant
    prefix where 'x' is the latest version of the API.
    In any case, versioning the API alongside Traffic Ops solves this problem 
because our Traffic Ops
    server(s) emit HTTP headers that name the server. Once Perl is gone, we'll 
be free to use the HTTP
    `Server:` header to name the server e.g. `Server: Traffic Ops/3.2.1`. At 
this point, we could
    either implement an `OPTIONS` method request at the server's root that 
would just return some
    headers - including `Server:` or just let people to a `GET` (or better yet 
`HEAD`) request to
    `/ping` and pull the server version out of the headers. The client then has 
all the information it
    needs to communicate effectively with the server. The alternative to this 
within our current
    versioning schema is to implement an unversioned API endpoint such that we 
have a hundred `/api/1.x`
    endpoints and one that has some other - or possibly no - prefix, or have a 
versioned API version API
    endpoint, which is confusing even just to say.
    
    ** TRAFFIC OPS _IS_ THE API **
    As mentioned previously, the endgame of our transition from Perl to Go is 
that Traffic Portal is the
    only UI for ATC - and in fact that's already somewhat true. That leaves 
Traffic Ops as a database
    and a REST API for interacting with said database. In fact, the database is 
often referred to as its
    own service: "ToDb", "Traffic Ops DB" etc. That means that we have the 
single most important Traffic
    Control component split in two - half of its functionality is versioned 
sanely with Traffic Control
    while the other half is versioned separately from anything else in the 
world. That's crazy, because
    if you have a program that only does two things, then surely a breaking 
change to one of those
    things means increasing its major version? If that's the case, then why are 
we not versioning the
    API alongside Traffic Ops?
    It may be argued (incorrectly, in my opinion) that Traffic Ops does more 
than serve an API to
    interact with a database. It generates configuration files and system 
images, it combines data and
    does heavy processing on it. But really those things shouldn't be taken 
into account in a versioning
    scheme except insofar as they affect the experience of some user, 
administrator, or application
    interfacing with Traffic Ops. If the API responses don't change in form or 
content, and if the
    process of setting up or maintaining the application haven't changed, then 
any code changes you've
    made are a patch, not a version change. Traffic Ops does big things, but at 
the end of the day it
    all just boils down to API inputs and outputs as far as anything and anyone 
else is concerned.
    
    ** CONFUSION **
    We currently live in a world where I can run a script using the Traffic Ops 
API that works perfectly
    fine against Traffic Ops version 3.2.1, but then if I again test it against 
version 3.2.1 at some
    point in the future it breaks because breaking changes were made in the 
API. It's easy to say, "oh,
    that just means that when we make breaking API changes we should increment 
the version
    appropriately," but realize that this _is versioning the API alongside 
Traffic Ops_. If we're not
    doing that, we're saying there is unpredictability with the behavior of our 
system within releases,
    and if we _are_ doing that then the only difference between the API version 
and the Traffic Ops
    version is that the API version is confusingly behind by about 2 major 
revisions. It should just be
    the same for simplicity's sake.
    
    ** THE API "PROMISE" **
    The single most common argument I hear in favor of our current API 
versioning scheme is "well we've
    said an API version 1.x behaves in this way, and so we must uphold that 
promise to the user base".
    Not only do we routinely break that promise already,
    (e.g. PR #3110 [https://github.com/apache/trafficcontrol/pull/3110]) but 
I'm certainly not
    suggesting that we don't uphold this promise. Yes, this does mean that 
making breaking API changes
    in TO would require a new major release and adding features/fields to the 
API would require a new
    minor release. I don't see that as a big deal, especially if implemented at 
the time I'm suggesting
    when we'd be re-designing the API - and it does sorely need to be 
redesigned.
    
    * The API Needs to be Redesigned *
    I'm going to go down this rabbit hole for a second, if you're already 
convinced the TO API needs a
    re-design then feel free to skip this section. I'm not going to touch on 
any problems caused in the
    API as currently implemented by the use of a standalone API version - 
that's what the entire article
    is for.
    Currently, our API currently has three huge problems:
    
    1. Rampant oversaturation of endpoints
        We have a systemic issue of re-implementing behaviour in multiple 
endpoints. This is due in part
        to a lack of good documentation - so developers aren't aware of the 
endpoints available to them
        - and partly because of the "Mojolicious Mindset" that plagues our 
oldest endpoints. The
        "Mojolicious Mindset" refers to the use of URL path fragments as 
request parameters, e.g.
        '/users/{{ID}}' instead of/in addition to '/users?id={{ID}}'. From the 
perspective of someone
        who is just writing the back-end for these endpoints, there's no clear 
advantage to one over the
        other except that the former seems to more clearly reflect the intent 
of requesting a specific
        object whereas the latter could be seen as more of a "filter" on a 
larger collection. That's not
        incorrect, necessarily, but the two are totally separate request paths, 
so having '/users' and
        '/users/{{ID}}' means documenting two endpoints instead of one, and it 
means two lines in the
        route definitions, and it means two seperate handlers instead of one 
(albiet a more complex
        one).
        Consider also that we have all of the following endpoints for 
manipulating Cache Groups:
    
                * /api/1.x/cachegroup/{{parameter ID}}/parameter
                * /api/1.x/cachegroup_fallbacks
                * /api/1.x/cachegroupparameters
                * /api/1.x/cachegroupparameters/{{ID}}/{{parameter ID}}
                * /api/1.x/cachegroups
                * /api/1.x/cachegroups/{{ID}}
                * /api/1.x/cachegroups/{{ID}}/deliveryservices
                * /api/1.x/cachegroups/{{ID}}/parameters
                * /api/1.x/cachegroups/{{ID}}/queue_updates
                * /api/1.x/cachegroups/{{ID}}/unassigned_parameters
                * /api/1.x/cachegroups/{{parameter ID}}/parameter_available
                * /api/1.x/cachegroups_trimmed
    
        These could all be collapsed neatly into one or two endpoints, but were 
instead implemented
        separately for whatever reason(s)
        (see Issue #2934 [https://github.com/apache/trafficcontrol/issues/2934] 
for details).
    
    2. Improper/Non-existent standards conformity
        We have requests that should be read-only that make server-side state 
changes (Issue #3054
        [https://github.com/apache/trafficcontrol/issues/3054]), we have 
endpoints returning success
        responses on failure (Issue #3003 
[https://github.com/apache/trafficcontrol/issues/3003]) and
        our "CRUDER" has failed us. `PUT` should be used to _create_ objects 
(or possibly update them by
        creating an alternative representation with the same identifiying 
information) but is instead
        used as the primary "edit this" method. `POST` is for processing 
entities in a data payload, but
        is instead used for object creation. `PATCH` languishes, totally 
unimplemented. These problems
        are systemic and stem partially from the "Mojolicious Mindset" whereby 
new functionality is
        introduced into the API by first considering what request method is 
appropriate and then
        deciding on a request path that names the operation being done. Request 
methods are meant to be
        the different ways in which a client interacts with a resource on the 
server, and thus the
        resources themselves should be considered primary. The "CRUDER" hampers 
this mindset, because it
        makes treating payloads and query parameters generic and isn't 
receptive to injection of new
        behaviour.
        The "CRUDER" is a great innovation, to be sure, as it saves us quite a 
bit of time and prevents
        duplicating code, but the problem it solves is an emergent property of 
of API problem #1 above.
        With fewer endpoints we'd have much more specific handling, and the 
need for and advantages of
        the "CRUDER" will vanish.
    
    3. Needing multiple queries to obtain a single piece of information
        This issue is pretty deeply rooted, and is related to the way our 
database is structured. But
        that's part of the problem - an API needn't replicate the database, and 
is therefore free from
        some of the constraints that bind a database.
        The way things are now, in order to e.g. create a server definition I 
must make several
        preliminary requests to determine the integral, unique, 
non-deterministic identifiers of other
        objects. I think we all agree that ideally these integral IDs would 
have no part in the output
        of the API, and many people I've spoken to would support wiping them 
from the database entirely
        given enough time and manpower.

Re: [EXTERNAL] Re: Traffic Ops API versioning issues

Reply via email to