The reason the migrations manage "structure" is because of rollbacks with "goose Up and Down" (not saying "data" can't be rolled back, but it gets trickier, what if I screw up the "where" clause and inadvertently remove other data outside of the scope of the migration unintentionally?).
Migrations managing only structure also helps with "separation". I can apply "test" or "integration" data to the same structure, without having to run the migrations. Test data doesn't always align with seed data because I might need to artificially manipulate the data differently depending on the test scenario I want to achieve. If migrations have placed data in the database when I need to test, now I have to remove that static data before I even start. Yes, we do that now with the test data but the test data is preditable. We have this problem today with the migrations (just managing the structure). If I'm working in a branch and commit a migration with a migration file timestamp of yesterday, then you are also working on a different migration with a migration timestamp of today and commit and push before I do, then the timestamp sequencing gets misaligned. Once goose runs it "bookmarks" your migration in the goose_db_version table in front of mine, which causes goose to ignore my migration, which forces me to rename that file to a later timestamp just so goose will see it. So, if we "interleave" seed data (over time), managing that data order on top of the structure management in goose will add yet another level of complexity. This is the reason the db/admin.pl was created to help with this workflow. I know it's not optimal, but these are the hurdles we hit as we were figuring this stuff out. -Dewayne On Mon, Jun 12, 2017 at 10:14 PM, Naama Shoresh <[email protected]> wrote: > Hi, > > I want to suggest a slightly different approach. > Goose is the brain managing the DB upgrade, right? > The data patches are part of the DB evolution, but today we can't use goose > to run them because we have seeds.sql in the middle. > What I suggest is turning seeds.sql into another migration script, > resulting in the following procedure: > 0) (In clean installations) Tables creation > 1) Goose migrations: > 1a) Schema changes > 1b) Data seeding (seeds.sql) > 1c) Data changes > > Going forward, I believe a data change migration script will be attached to > most schema changes, instead of populating the DB in the seeds.sql. > > The benefits come from the fact that for future changes, the order > presented above (1a, 1b, 1c) is not strict. > Future schema/data changes are expressed in a single migration script, > containing all relevant operations. > This ensures that whatever change is needed (schema/data/both), and > whatever the change depends on, it can be handled by a single Goose > migration script. > > What do you think? > > > > On Fri, Jun 9, 2017 at 6:35 PM, Dewayne Richardson <[email protected]> > wrote: > > > Yea, it's just a new feature to admin.pl to support data conversions, to > > keep the migrations clean. Derek and I have been working through it. > > > > -Dew > > > > On Thu, Jun 8, 2017 at 7:40 AM, Jeremy Mitchell <[email protected]> > > wrote: > > > > > This seems to make sense to me but honestly, I'd probably defer to > > Dewayne. > > > > > > In theory, it would be nice if migrations only included "structural" > > > changes (new tables, columns, changing column types or not null, etc) > > and > > > seeds focused on the "base" (or the minimum required) static data > > required > > > of TO (types, statuses, roles, etc) and then yea, putting data fixing > or > > > data massaging as the last step makes sense to me. But you know what > they > > > say about theory... > > > > > > +1 > > > > > > Jeremy > > > > > > On Wed, Jun 7, 2017 at 8:41 AM, Gelinas, Derek < > > [email protected]> > > > wrote: > > > > > > > I'm adding a feature to traffic ops that creates a new column in > > > > steering_target called type, that is populated with type ids from the > > > type > > > > table. Using admin.pl upgrade, the column is created in migrations, > > and > > > > the two types for this table are populated by seeds.sql. None of > this > > is > > > > out of the ordinary. Unfortunately I also need to populate the type > > > column > > > > based on data that isn't in there until after seeds.sql is run, so I > > > can't > > > > place this into the migration. Seeds.sql needs to run after the > > > migration > > > > due to any structural changes that happen there. > > > > > > > > Dewayne and I have discussed this a bit this morning, and we're > > thinking > > > > the best solution might be a third step, run after seeds.sql, called > > > > patches.sql. This would be specifically for data fixes like in this > > use > > > > case. The order would be as follows: > > > > > > > > migration - structure > > > > seeds - static data > > > > patches - data fixes > > > > > > > > Thoughts? > > > > > > > > Derek > > > > > > > > > -- > *Naama Shoresh* > Qwilt | Work: +972-72-2221706 | Mobile: +972-52-3401999 | > [email protected] >
