assuming that everyone is ok with ditching minor version support for the
api (not really sure if rob is yet), what about this release plan?

TC 3.0 / supports api 1.3, 1.2, 1.1 (this is already out the door)
TC 4.0 / supports api 1.4, 1.3, 1.2, 1.1 (cut this branch asap)
TC 5.0 / supports api v2, treat all calls to 1.x as 1.4 (this will be
master after the 4.0 branch is cut)

note: this leaves the existing data loss (
https://github.com/apache/trafficcontrol/issues/3497) problem in 3.0 and
4.0 that needs to be addressed.

On Fri, Apr 19, 2019 at 3:39 PM Gray, Jonathan <[email protected]>
wrote:

> Also to be sure I'm tracking properly the complex subject here:
>
> 1. Today there are two promises:
>   1a. One major rev back at the project level for all code/api/etc
>   1b. API versions independently (but probably not stable enough on
> whatever the latest is due to self-discipline/definition)
>
> 2.  Primary issues around API stability are the addition and removal of
> routes/fields
>   2a.  This relates to the discussion around 1b
>   2b.  A significant portion of this could be mitigated through the
> support of a PATCH operation as opposed to passing full objects around
> which already has inherent race conditions in a multi-user system.
>   2c.  This is where maintenance costs are potentially very high, but
> might be mitigated somewhat through rob's apiver library.
>
> 3.  Secondary issues are around strong versus weak typing
>   3a. This is the primary source or breakage lately and is the result of
> conversions from perl to go
>   3b.  This breaks not only promise 1b, but also promise 1a.
>   3c.  The solution to this is simply to be more diligent in the use of
> existing or new datatypes until such time that a new major revision of
> either 1a or 1b is made.  Example use case issue #3304.
>
> 4.  Better support for discoverability and compatibility by and between
> API components/consumers
>   4a.  At the moment there is no way as a client to know which API
> versions are supported by an arbitrary TO instance.  Documented in issue
> #2872
>   4b. This leads to failure scenarios wherein TO isn't upgraded first or
> newer clients exist.  I stopped counting when 5 different people internally
> ran into this with the addition of the 1.4 API for us.
>
> 5.  There are additional concerns with how we handle 1b with regard to
> master as opposed to OSS releases and promise 1a
>   5a. There is a lack of formality when it comes to does component X on
> master changeset A work with component Y on master changeset B before it
> lands in OSS Release Q.
>   5b.  Presently this entails SME reviewing changelogs of component X & Y.
>   5c.  This is the question SemVer helps with by better defining that
> formality at the API layer.
>     5c.i.  It's not a total solution because unversioned
> payloads/workflows such as the CRConfig can still cause additional issues.
>   5d.  This is where today we're relying on our existing monolithic
> repository and one version to rule them all stance so that in theory we
> never ask 5a
>     5d.i.  This is backed by our current testing procedures.  When running
> TO API tests today it's presumed to use one version inside the CIAB
> environment.
>
> I think if we can clarify and agree on this, the questsions around how to
> version technically and ATC-based clients versus 3rd-party clients is
> mostly mitigated.
>
> Jonathan G
>
>
> On 4/19/19, 1:26 PM, "Rawlin Peters" <[email protected]> wrote:
>
>     How about this:
>     1. At the Go struct level, every struct for an API endpoint gets a
>     "Version" field.
>     2. Every time a backwards-compatible field is added to an API
>     endpoint, the Version is incremented for that struct (even if a TC
>     release hasn't been made between increments).
>     3. This Version field is read-only and included in all GET/PUT/POST
>     JSON responses.
>
>     For the Go TO client, this would allow the client to GET a resource
>     and check if the resource's version matches the version that the
>     client is currently using. If the client is just reading certain
>     fields and only cares to read the fields it already knows about, it
>     wouldn't have to check the version field at all. If the client wants
>     to update a resource, it could GET the resource, compare the resource
>     versions, and make a determination about whether or not the update
>     would be safe. If the versions match, it can safely update the
>     resource without risk of data loss. If the versions do NOT match, the
>     client might choose to error out instead and/or send an email to the
>     maintainer that it should be recompiled for safety.
>
>     This would allow classes of clients that handle unknown fields
>     properly (e.g. Traffic Portal, Python+Java TO clients) to continue
>     working as they do today, only having to worry about the API major
>     version.
>
>     For clients that might be more susceptible to data loss due to
>     addition of unknown fields (e.g. Go TO client), they would have enough
>     information returned to them by the API in order to know if they can
>     safely make updates to resources. If a Go TO client is only reading
>     certain fields and not making updates, it would probably never need to
>     be recompiled for the entire life of the API major version. If a Go TO
>     client is updating resources, it only needs recompiled as often as the
>     APIs its actually using have been updated.
>
>     This would allow random, non-standard TO clients to be written and
>     used for as long as possible before recompilation is required, and
>     allow the client to build in proper safeguards only where necessary.
>
>     - Rawlin
>
>     On Fri, Apr 19, 2019 at 12:01 PM Robert Butts <[email protected]> wrote:
>     >
>     > I'm trying really hard to come up with a solution that addresses
> everyone's
>     > major concerns. I think we'll have a better product, that everyone
> can live
>     > with, if we all try to think of solutions and are willing to
> compromise,
>     > rather than take hard-line approach and refuse to compromise, and
> argue
>     > until we're all just unhappy, and whatever gets pushed through meets
> a few
>     > people's concerns and nobody else's. I'd definitely appreciate any
> help in
>     > that regard.
>     >
>     > Sometimes there really are only two options, A or B. But this
> particular
>     > issue has countless possibilities. We're all smart people, we can
> figure
>     > something out that addresses everyone's needs and concerns.
>     >
>     > What about this idea:
>     >
>     > Along the lines of @hbeatty 's suggestion, what if we:
>     >
>     > 1. Make the API version match the TC version.
>     > 2. Always release new TC major versions, never do TC minor version
> releases.
>     > 3. Support one major version back, in the API and clients.
>     > 4. New backward-compatible changes require a TC=API major version
> increase.
>     > 5. OPTIONAL: There should be an API route to get the exact TC
> version (e.g.
>     > https://.../api/v3/version). (This isn't strictly necessary, but
> it's
>     > on @hbeatty
>     > 's list, and I know it's on @jonathan_gray 's, and it's super-easy
> and
>     > there's no reason not to.)
>     >
>     > This:
>     > 1. Addresses the client version bugs concern: older clients simply
> don't
>     > work because we don't support them, and newer clients will get the
> "please
>     > downgrade" response.
>     > 2. Addresses the code ease-of-writing concern: We only ever have to
>     > maintain 1 older version in the API, which will typically only be a
> few
>     > fields on a few endpoints.
>     > 3. Partially addresses the ease-of-use concern (Ops/@jonathan_gray).
> It
>     > addresses the scripts-breaking-things problem, but it does make user
>     > scripts have a hard upgrade deadline. I see this as the biggest
> weakness of
>     > this idea, and unfortunately I don't see a remedy; if the user-side
> people
>     > are willing to live with that?
>     > 4. Patch versions are still ok. This doesn't prevent e.g. 4.0.1 when
> we
>     > find a major bug in a release; just adding new things that would be a
>     > SemVer Minor Version.
>     >
>     > Some points:
>     > 1. Only doing major versions, we'll obviously quickly reach Traffic
> Control
>     > Version 47. I think that's ok. There's precedent for this, Chrome and
>     > Firefox both do this, Chrome's latest version is 68 and Firefox is
> 66. It
>     > might seem odd, but I don't think there are any big downsides.
>     > 2. This will make @jonathan_gray 's (/ Ops) life slightly harder,
> having to
>     > upgrade script clients more frequently. But it prevents the data
> loss risks
>     > (which I know everyone here doesn't agree with, but some of us do,
> so bear
>     > with me), and upgrading our maintained clients should be relatively
> simple.
>     > 2.1 As @hbeatty points out, if we release at a 6-month interval,
> this would
>     > mean scripts using old clients would be supported for 1 year. We
> could
>     > optionally support 2 major versions back, if we were willing to live
> with a
>     > little more server work, to support 2-year-old clients.
>     >
>     > Just to be clear, I personally don't like making the API version
> match the
>     > TC version, for reasons I won't get into here. I also loathe
> Reflection.
>     > But I can live with those things, if it addresses everyone else's
> concerns.
>     > This proposal isn't perfect; there is no perfect solution that will
> fulfill
>     > everyone's ideal. But, is this something we could live with? If not,
> is
>     > there a way to modify it to address whatever is unacceptable, while
> still
>     > addressing the major concerns others have? Or is this just right out?
>     >
>     >
>     > On Fri, Apr 19, 2019 at 10:43 AM Jeff Elsloo <[email protected]>
> wrote:
>     >
>     > > Without actually seeing how that would look across the code base,
> the
>     > > best I can say is maybe. On the surface your proposal seems to
> improve
>     > > the areas I'm concerned about, but we still have this implicit
> model
>     > > where the server is responsible for dealing with older clients that
>     > > might not submit all data as expected. This implicitly requires us
> to
>     > > handle the absence of that data in future APIs and think about how
> any
>     > > change might impact all client versions across versions of ATC.
>     > >
>     > > My concern really amounts to the investment of time required to
> think
>     > > through and implement changes that may affect the myriad of
> different
>     > > client/server version combinations. If we remove that from the
>     > > equation entirely, we have a much simpler API that has a 1:1
>     > > correspondence with the route and function, and only one way to
>     > > create/update a $thing (i.e.: a delivery service). I think having
> only
>     > > one way to create/update a $thing is a much safer way of doing
>     > > business than continuing to support multiple versions of clients,
>     > > regardless of how easy that might be with this proposed approach.
>     > > Unless I'm missing something, the implementation might be
> simplified
>     > > using this approach but the complexity of solving for the
> combination
>     > > of client versions still exists which makes it harder to do
> anything
>     > > when writing API code.
>     > >
>     > > So, it isn't a matter of whether this approach is simple enough
> for us
>     > > to continue with semantic versioning. It's a matter of whether we
> want
>     > > to have to continue to deal with older clients that prevent us from
>     > > making certain changes in the API because we are afraid of breaking
>     > > that client. I think that's a lot of burden for our small
> development
>     > > team to shoulder for questionable utility. Viewed from another
> lens,
>     > > with the semantic versioning approach we are enabling clients to be
>     > > lazy about updating their _unknown and custom_ client code at the
> cost
>     > > of developer productivity and progress on our project.
>     > >
>     > > I'm not saying that semantic versioning is solely to blame for our
>     > > lack of progress on our migration to Golang, but it's one more
> thing
>     > > that is slowing us down and definitely hasn't helped improve
> progress.
>     > > --
>     > > Thanks,
>     > > Jeff
>     > >
>     > > On Thu, Apr 18, 2019 at 3:23 PM Robert Butts <[email protected]>
> wrote:
>     > > >
>     > > > >This is about simplifying our code in the API
>     > > >
>     > > > @jeff.elsloo That's what the tag solution I proposed does. The
> only
>     > > > difference from not versioning, is that fields will have a new
> tag,
>     > > > "NewField *int `json:"newField, db:"new_field", api:"1.5"`, and
> endpoints
>     > > > will have an extra line, "json := api.NewJSON("1.4")". That's
> it. That
>     > > > would be the entirety of the API code (or very nearly, Rawlin is
> right, I
>     > > > haven't implemented it to be 100% sure). The library itself is
> also tiny,
>     > > > it's ~250 lines of logic in a single file. 500 lines including
> comments
>     > > and
>     > > > boilerplate.
>     > > >
>     > > > How do you feel about that? Would that be simple enough?
>     > > >
>     > > >
>     > > > On Thu, Apr 18, 2019 at 3:15 PM Fieck, Brennan <
>     > > [email protected]>
>     > > > wrote:
>     > > >
>     > > > > >If you're deploying the head of master, API minor versioning
> doesn't
>     > > > > really solve that consistent API problem unless we start
> saying that
>     > > > > every single new field added to an API endpoint is a new minor
> version
>     > > > > instead of just incrementing an API's version once per TC
> release.
>     > > > >
>     > > > > Yeah, you shouldn't expect an active development branch to be
> stable -
>     > > > > it's the whole reason we have releases at all. We can't support
>     > > something
>     > > > > that changes potentially a dozen times in a day.
>     > > > >
>     > > > > >If someone goes
>     > > > > to the trouble to understand how our APIs work and develops
> their own
>     > > > > client code, why is it so unreasonable to expect them to also
>     > > > > understand how an update of Traffic Ops could impact their
> _custom_
>     > > > > tooling?
>     > > > >
>     > > > > I agree with this so hard. I'd love to just say "TO vX uses
> the vX API,
>     > > > > major changes to the biggest TC component are a major change
> to TC",
>     > > but at
>     > > > > any given time we support and provide bug/security fixes for
> versions
>     > > X and
>     > > > > X-1. I'll settle for eliminating minor API versions, though.
>     > > Developers can
>     > > > > be expected to understand that changing versions of a thing
> can change
>     > > > > aspects of the ways in which you can interact with said thing.
> A major
>     > > > > version change means major changes and a minor version change
> means
>     > > minor
>     > > > > changes.
>     > > > > ________________________________________
>     > > > > From: Jeff Elsloo <[email protected]>
>     > > > > Sent: Thursday, April 18, 2019 2:59 PM
>     > > > > To: [email protected]
>     > > > > Subject: Re: [EXTERNAL] Re: Traffic Ops API versioning issues
>     > > > >
>     > > > > > Maybe I'm the only one, and everyone else can vote me out,
> but I
>     > > don't
>     > > > > see
>     > > > > that as acceptable. It's our responsibility as developers to
> create a
>     > > safe
>     > > > > user experience, and unacceptable to declare real bugs to be
> the user's
>     > > > > fault for not using it right. When our Production CDN goes down
>     > > because an
>     > > > > Ops person used an old client and didn't "just recompile,"
> it's not
>     > > that
>     > > > > Ops person's fault, it's our fault as Developers, for
> designing a
>     > > dangerous
>     > > > > system. Our job is to prevent the CDN from going down, not to
> shift the
>     > > > > blame when it does.
>     > > > >
>     > > > > I don't think this is about shifting blame, safety, or the
> potential
>     > > > > to crash a CDN. This is about simplifying our code in the API
> and
>     > > > > making it more maintainable. If we simplify the API, we can
> accelerate
>     > > > > development and get more things done, and maybe even complete
> this
>     > > > > Golang migration. Another plus is simplification of routes by
>     > > > > eliminating versioning means less code and likely more
> stability and
>     > > > > safety, easier testing, and less developer confusion, in the
> long run.
>     > > > >
>     > > > > I think it's unreasonable for us to shoulder the burden and
> cost to
>     > > > > maintain various API versions because we're afraid we might
> break some
>     > > > > client out in the wild that might or might not exist. If
> someone goes
>     > > > > to the trouble to understand how our APIs work and develops
> their own
>     > > > > client code, why is it so unreasonable to expect them to also
>     > > > > understand how an update of Traffic Ops could impact their
> _custom_
>     > > > > tooling?
>     > > > >
>     > > > > Obviously we have to hold up our end of the deal and have good
> API
>     > > > > documentation and change logs. I think the cost of maintaining
> that is
>     > > > > much less than API versioning given our experience, especially
> after
>     > > > > we simplify the APIs. We're already doing much of that today.
>     > > > > --
>     > > > > Thanks,
>     > > > > Jeff
>     > > > >
>     > > > > On Thu, Apr 18, 2019 at 11:37 AM Robert Butts <[email protected]>
> wrote:
>     > > > > >
>     > > > > > >Without minor versions, #3497 would not even an issue. It's
> only an
>     > > > > issue
>     > > > > > because of the attempt to support minor versioning.
>     > > > > >
>     > > > > > That's simply not true. It's exactly the same issue.
> Removing minor
>     > > > > > versioning just hides the issue. You have declared:
>     > > > > >
>     > > > > > >only certain clients that don't handle new unknown fields
> would
>     > > > > > potentially be broken
>     > > > > >
>     > > > > > >all the client has to do is recompile
>     > > > > >
>     > > > > > Something doesn't cease to be an issue, because you redefine
> it to
>     > > be the
>     > > > > > user's fault. It's exactly the same issue, removing minor
> versions
>     > > just
>     > > > > > makes it much more difficult to debug.
>     > > > > >
>     > > > > > You're proposing not only removing minor versions, but
> creating data
>     > > loss
>     > > > > > and version mismatch bugs, and declaring them to be the
> user's fault.
>     > > > > >
>     > > > > > Maybe I'm the only one, and everyone else can vote me out,
> but I
>     > > don't
>     > > > > see
>     > > > > > that as acceptable. It's our responsibility as developers to
> create a
>     > > > > safe
>     > > > > > user experience, and unacceptable to declare real bugs to be
> the
>     > > user's
>     > > > > > fault for not using it right. When our Production CDN goes
> down
>     > > because
>     > > > > an
>     > > > > > Ops person used an old client and didn't "just recompile,"
> it's not
>     > > that
>     > > > > > Ops person's fault, it's our fault as Developers, for
> designing a
>     > > > > dangerous
>     > > > > > system. Our job is to prevent the CDN from going down, not
> to shift
>     > > the
>     > > > > > blame when it does.
>     > > > > >
>     > > > > > >Switching all the endpoints over to your "apiver" library
> would not
>     > > be
>     > > > > as
>     > > > > > trivial to implement or remove as you make it sound.
>     > > > > >
>     > > > > > Maybe. I'm offering to do it. If you're sure, why don't you
> let me
>     > > > > > demonstrate, and prove myself wrong?
>     > > > > >
>     > > > > > >It would require lots of added API test coverage
>     > > > > >
>     > > > > > Require? That would be ideal, but we have supported minor
> versions
>     > > for
>     > > > > the
>     > > > > > history of Traffic Ops, and never had extensive version
> tests. I
>     > > agree we
>     > > > > > should, but you're adding additional requirements to further
> your
>     > > > > position,
>     > > > > > which doesn't seem fair. Notwithstanding, the tag library
> already
>     > > has 90%
>     > > > > > test coverage and 3x as many lines of test code as logic;
> and the API
>     > > > > Tests
>     > > > > > are actually pretty easy, I just added one in the
> old-version-update
>     > > fix,
>     > > > > > and it was much easier than I expected:
>     > > > > >
>     > > > >
>     > >
> https://github.com/apache/trafficcontrol/pull/3500/commits/16f2c96f086836f1d655fd62e673ee0a5e95e785
>     > > > > > .
>     > > > > >
>     > > > > > >Certain UPDATE queries might be easy to generate from a
> given
>     > > struct if
>     > > > > > the struct only uses a single table, but I don't think
> something like
>     > > > > that
>     > > > > > would work for a field like `cachegroup.LocalizationMethods`
> which
>     > > > > doesn't
>     > > > > > come from the cachegroups table
>     > > > > >
>     > > > > > I believe it is easy. The function to parse tags can use the
> tags in
>     > > the
>     > > > > > primary object (Cachegroup), and the sub-objects
>     > > (LocalizationMethods)
>     > > > > will
>     > > > > > have their own version tags. I could be mistaken, I haven't
> actually
>     > > > > > written the code yet, but I'm pretty sure sub-objects with
>     > > sub-updates
>     > > > > > won't be any more difficult or require much if any special
> logic.
>     > > > > >
>     > > > > >
>     > > > > > On Thu, Apr 18, 2019 at 10:37 AM Gray, Jonathan <
>     > > > > [email protected]>
>     > > > > > wrote:
>     > > > > >
>     > > > > > > At the end of the day, what I want is a consistent API
> that I can
>     > > code
>     > > > > > > against in the head of master that's treated like a
> contract.  As
>     > > an
>     > > > > API
>     > > > > > > user outside of the ATC repo it's incredibly frustrating
> to have my
>     > > > > stuff
>     > > > > > > break all the time.  It basically encourages never
> developing
>     > > using the
>     > > > > > > latest API versions (regardless of how they're defined and
> even
>     > > then
>     > > > > things
>     > > > > > > still break retroactively) or a non-official OSS release
>     > > alltogether.
>     > > > > It's
>     > > > > > > a catch22 to be forced to either not vendor the
> go/python/bash
>     > > > > libraries
>     > > > > > > which leads to constant develop/recompile/deploys in
> lockstep with
>     > > ATC
>     > > > > or
>     > > > > > > vendor and still have to do these things when stuff breaks
> anyway
>     > > in
>     > > > > the
>     > > > > > > API.  Really debating the native client libraries at all
> is just a
>     > > red
>     > > > > > > herring because the root issue is the HTTP API itself
> which is the
>     > > real
>     > > > > > > thing to care about since not all integrations use one of
> the
>     > > client
>     > > > > > > libraries, nor can be forced to do so, and may require a
> rigid API
>     > > > > > > definition.
>     > > > > > >
>     > > > > > > Jonathan G
>     > > > > > >
>     > > > > > >
>     > > > > > > On 4/18/19, 10:12 AM, "Rawlin Peters" <
> [email protected]>
>     > > > > wrote:
>     > > > > > >
>     > > > > > >     > The UPDATE statements need modified to fix #3497
> even if we
>     > > get
>     > > > > rid
>     > > > > > > of
>     > > > > > >     > versioning. Unless we decide to permanently break
> all clients
>     > > > > older
>     > > > > > > than
>     > > > > > >     > the newest server field, with every new server
> upgrade. The
>     > > only
>     > > > > > > other
>     > > > > > >     > option is to fix the updates. Unless you know of a
> way to fix
>     > > > > missing
>     > > > > > >     > fields without changing the update statements, that
> I'm not
>     > > > > seeing?
>     > > > > > >
>     > > > > > >     By removing minor versioning, only certain clients
> that don't
>     > > > > handle
>     > > > > > >     new unknown fields would potentially be broken, and I
> believe
>     > > only
>     > > > > the
>     > > > > > >     TO Go client has that problem in our repo. However,
> the TO Go
>     > > > > client
>     > > > > > >     happens to use the same Go structs as
> traffic_ops_golang, so
>     > > > > whenever
>     > > > > > >     new fields are added to the API, all the client has to
> do is
>     > > > > recompile
>     > > > > > >     with the up-to-date structs. Unless we made breaking
> changes
>     > > to the
>     > > > > > >     client, in most cases all that would be needed for
> those
>     > > clients
>     > > > > is a
>     > > > > > >     recompile. Traffic Portal, the Python TO client, and
> I'm pretty
>     > > > > sure
>     > > > > > >     the Java TO client all handle unknown fields properly.
>     > > > > > >
>     > > > > > >     Without minor versions, #3497 would not even an issue.
> It's
>     > > only an
>     > > > > > >     issue because of the attempt to support minor
> versioning. If we
>     > > > > just
>     > > > > > >     support the major version, all client requests would be
>     > > treated as
>     > > > > v1,
>     > > > > > >     and there would only ever be one SQL UPDATE statement
> per major
>     > > > > > >     version. We wouldn't need to "upgrade" 1.2 requests
> into a 1.4
>     > > > > struct
>     > > > > > >     (thus preventing the bug in #3497) by selecting and
> inserting
>     > > all
>     > > > > 1.4
>     > > > > > >     values from the DB into the struct before handling the
> request
>     > > or
>     > > > > > >     dynamically generating the SQL UPDATE statement to use
> based
>     > > on the
>     > > > > > >     requested minor version.
>     > > > > > >
>     > > > > > >     > So, this solution actually gives us
>     > > > > > >     > this bug fix almost for free. All that's required is
> another
>     > > > > small
>     > > > > > > function
>     > > > > > >     > to iterate over the object fields to create the
> update query.
>     > > > > It's
>     > > > > > > by far
>     > > > > > >     > the easiest and simplest fix for #3497; unless we
> also
>     > > > > permanently
>     > > > > > > break
>     > > > > > >     > all older clients on every server upgrade along with
> the
>     > > minor
>     > > > > > > version
>     > > > > > >     > removal.
>     > > > > > >
>     > > > > > >     Switching all the endpoints over to your "apiver"
> library would
>     > > > > not be
>     > > > > > >     as trivial to implement or remove as you make it
> sound. It
>     > > would
>     > > > > > >     require lots of added API test coverage and a
> non-trivial
>     > > amount of
>     > > > > > >     code modifications to all API endpoints. Certain
> UPDATE queries
>     > > > > might
>     > > > > > >     be easy to generate from a given struct if the struct
> only
>     > > uses a
>     > > > > > >     single table, but I don't think something like that
> would work
>     > > for
>     > > > > a
>     > > > > > >     field like `cachegroup.LocalizationMethods` which
> doesn't come
>     > > from
>     > > > > > >     the cachegroups table and is updated separately from
> the rest
>     > > of
>     > > > > the
>     > > > > > >     cachegroup fields.
>     > > > > > >
>     > > > > > >     - Rawlin
>     > > > > > >
>     > > > > > >
>     > > > > > >
>     > > > >
>     > >
>
>
>

Reply via email to