@Rawlin, more like this email. Sorry I didn't make it up this far before I replied.
-Dew

On Tue, Oct 23, 2018 at 10:27 AM Dave Neuman <[email protected]> wrote:

> Since this topic is now 16 emails long, I'll make it 17 to try to do some
> summarization, address concerns, and see if we can get some consensus.
> Lately our email discussions get to this point where there are many, many
> emails back and forth, but we aren't really getting anywhere (either for or
> against).
>
> I think what Rawlin is proposing is as follows:
> - Create a Traffic Ops database backup as part of the Traffic Ops upgrade
>   process
> - Perform the upgrade as usual using `db/admin.pl upgrade` and `goose up`
> - Keep the ability to do `goose down` as-is for whatever one-off reason we
>   may have
> - Add functionality to Traffic Ops such that if a `yum downgrade` is
>   performed, the user also has the ability to run a command like
>   `db/admin.pl restart` or something like that to restore the database to
>   the post upgraded version
> - How exactly we do the backup/restore is a different issue for a different
>   topic, we are just trying to get basic consensus that this idea makes the
>   product better than it is today.
>
> So far, the concerns have been:
> - What if data has changed since the upgrade
> - We still need the ability to do a goose down
> - Backups/Restores should only be done by a DBA
> - We should test out the downgrade scenarios.
>
> To address these concerns:
>
> >> What if data has changed since the upgrade
> It is pretty reasonable to assume that if you are doing a complete
> downgrade it is because you found a serious issue within a short time
> period of doing your upgrade. It is pretty unlikely that someone is going
> to upgrade and then find such a serious issue a week after upgrading. It
> is my opinion that we should be writing software to support the 80-90% of
> scenarios and not the 10-20% of scenarios. Yeah, it could happen, but that
> doesn't mean it will. Also, if someone is concerned with losing data they
> can perform a downgrade in the same way we have today, by running goose X
> amount of times (one for each migration added).
>
> >> We still need the ability to do a goose down
> Rawlin has already stated that we will not be losing this ability. We will
> just be adding the ability to do a wholesale downgrade of the database.
>
> >> Backups/Restores should only be done by a DBA
> Not all of our users are DBAs and I think it is completely reasonable to
> provide the ability to do backup and restores for the purpose of
> downgrading to a known good state. If we are going to break someone's
> software, we should do whatever we can to help them fix it. We are not
> requiring that users use this process and they are more than welcome to
> perform their own backups if they prefer.
>
> >> We should test out the downgrade scenarios.
> Yeah sure, but we know that we have a less than ideal solution today with
> just goose down. I think what Rawlin proposes is in addition to `goose
> down` and will be used only in situations where a full downgrade of Traffic
> Ops is required. I don't think we need to test every single scenario to
> know that this is an area we can improve upon.
>
> As for the replacing goose conversation, we should not hijack this thread
> to discuss that. If we want to propose changing it then I would A) go dig
> up the old thread we had on it and address the concerns from that thread in
> a new proposal and B) submit a new mailing list topic.
>
> So, can we please get some consensus on this topic (either for or
> against)? Basically are we for or against the idea? We don't need to
> bring up every single possible design decision or edge case now, just agree
> that the idea is a good one or isn't.
>
> You already know where I stand, but for the sake of clarity, I think this
> is a good idea and greatly helps our ability to do downgrades.
>
> Thanks,
> Dave
>
>
> On Mon, Oct 22, 2018 at 8:59 AM Rawlin Peters <[email protected]> wrote:
>
> > On Fri, Oct 19, 2018 at 12:14 PM Dewayne Richardson <[email protected]>
> > wrote:
> > >
> > > I'm -1 until someone tests out the downgrade scenarios. My vote would be
> > > to keep the goose-like downgrade options (and potentially improve
> > > db/admin.pl if needed to allow more rollback options if needed).
> >
> > Can you elaborate on what kind of testing of downgrade scenarios you'd
> > like to see?
> >
> > Ideally, I think we need to at least run SQL upgrade migrations on
> > every PR submitted (if the PR doesn't have an upgrade migration, the
> > test would still just run all the pre-existing ones), followed by N
> > SQL downgrade migrations (N=number of migrations added in the PR).
> > This should be a jenkins job that spins up a postgres docker container
> > with the "Kabletown" data then runs the upgrade+downgrade migrations.
> > If the migrations fail, the PR tests fail. This would give us somewhat
> > of a guarantee that the SQL migrations actually run.
> >
> > - Rawlin
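
For reference, the per-PR check Rawlin describes above (one upgrade pass, then N downgrades, where N is the number of migrations added in the PR) could be sketched roughly as follows. This is only an illustration under assumptions: `migration_check_plan`, `run_migration_check`, and the `run_step` callback are hypothetical names, and the actual commands behind each step (through `db/admin.pl` or `goose` directly) would be whatever the Jenkins job runs against the postgres container.

```python
from typing import Callable, List


def migration_check_plan(migrations_added: int) -> List[str]:
    """Return the ordered steps: one upgrade pass, then one downgrade
    for each migration the PR added (N may be zero)."""
    if migrations_added < 0:
        raise ValueError("migrations_added must be non-negative")
    return ["up"] + ["down"] * migrations_added


def run_migration_check(migrations_added: int,
                        run_step: Callable[[str], bool]) -> bool:
    """Run each step in order through the caller-supplied run_step
    (hypothetical hook that executes the real migration command and
    reports success). Stop and fail at the first failing step."""
    for step in migration_check_plan(migrations_added):
        if not run_step(step):
            return False
    return True
```

A PR that adds two migrations would yield the plan `up`, `down`, `down`; a PR with none still runs the single `up` pass over the pre-existing migrations.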
