Re: Data patches during upgrade

Naama Shoresh Tue, 20 Jun 2017 20:41:57 -0700

I understand.
Thanks.

On Tue, Jun 13, 2017 at 4:08 PM, Dewayne Richardson <[email protected]>
wrote:


> The reason the migrations manage "structure" is because of rollbacks with
> "goose Up and Down" (not saying "data" can't be rolled back, but it gets
> trickier, what if I screw up the "where" clause and inadvertently remove
> other data outside of the scope of the migration unintentionally?).
>
> Migrations managing only structure also helps with "separation".  I can
> apply "test" or "integration" data to the same structure, without having to
> run the migrations.  Test data doesn't always align with seed data because
> I might need to artificially manipulate the data differently depending on
> the test scenario I want to achieve.  If migrations have placed data in the
> database when I need to test, now I have to remove that static data before
> I even start.  Yes, we do that now with the test data but the test data is
> predictable.
>
> We have this problem today with the migrations (just managing the
> structure).  If I'm working in a branch and commit a migration with a
> migration file timestamp of yesterday, then you are also working on a
> different migration with a migration timestamp of today and commit and push
> before I do, then the timestamp sequencing gets misaligned.  Once goose
> runs it "bookmarks" your migration in the goose_db_version table in front
> of mine, which causes goose to ignore my migration, which forces me to
> rename that file to a later timestamp just so goose will see it.
>
> So, if we "interleave" seed data (over time), managing that data order on
> top of the structure management in goose will add yet another level of
> complexity.  This is the reason the db/admin.pl was created to help with
> this workflow.  I know it's not optimal, but these are the hurdles we hit
> as we were figuring this stuff out.
>
> -Dewayne
>
> On Mon, Jun 12, 2017 at 10:14 PM, Naama Shoresh <[email protected]> wrote:
>
> > Hi,
> >
> > I want to suggest a slightly different approach.
> > Goose is the brain managing the DB upgrade, right?
> > The data patches are part of the DB evolution, but today we can't use goose
> > to run them because we have seeds.sql in the middle.
> > What I suggest is turning seeds.sql into another migration script,
> > resulting in the following procedure:
> > 0) (In clean installations) Tables creation
> > 1) Goose migrations:
> > 1a) Schema changes
> > 1b) Data seeding (seeds.sql)
> > 1c) Data changes
> >
> > Going forward, I believe a data change migration script will be attached to
> > most schema changes, instead of populating the DB in the seeds.sql.
> >
> > The benefits come from the fact that for future changes, the order
> > presented above (1a, 1b, 1c) is not strict.
> > Future schema/data changes are expressed in a single migration script,
> > containing all relevant operations.
> > This ensures that whatever change is needed (schema/data/both), and
> > whatever the change depends on, it can be handled by a single Goose
> > migration script.
> >
> > What do you think?
> >
> >
> >
> > On Fri, Jun 9, 2017 at 6:35 PM, Dewayne Richardson <[email protected]>
> > wrote:
> >
> > > Yea, it's just a new feature to admin.pl to support data conversions, to
> > > keep the migrations clean.  Derek and I have been working through it.
> > >
> > > -Dew
> > >
> > > On Thu, Jun 8, 2017 at 7:40 AM, Jeremy Mitchell <[email protected]>
> > > wrote:
> > >
> > > > This seems to make sense to me but honestly, I'd probably defer to Dewayne.
> > > >
> > > > In theory, it would be nice if migrations only included "structural"
> > > > changes (new tables, columns, changing column types or not null, etc) and
> > > > seeds focused on the "base" (or the minimum required) static data required
> > > > of TO (types, statuses, roles, etc) and then yea, putting data fixing or
> > > > data massaging as the last step makes sense to me. But you know what they
> > > > say about theory...
> > > >
> > > > +1
> > > >
> > > > Jeremy
> > > >
> > > > On Wed, Jun 7, 2017 at 8:41 AM, Gelinas, Derek <[email protected]>
> > > > wrote:
> > > >
> > > > > I'm adding a feature to traffic ops that creates a new column in
> > > > > steering_target called type, that is populated with type ids from the type
> > > > > table.  Using admin.pl upgrade, the column is created in migrations, and
> > > > > the two types for this table are populated by seeds.sql.  None of this is
> > > > > out of the ordinary.  Unfortunately I also need to populate the type column
> > > > > based on data that isn't in there until after seeds.sql is run, so I can't
> > > > > place this into the migration.  Seeds.sql needs to run after the migration
> > > > > due to any structural changes that happen there.
> > > > >
> > > > > Dewayne and I have discussed this a bit this morning, and we're thinking
> > > > > the best solution might be a third step, run after seeds.sql, called
> > > > > patches.sql.  This would be specifically for data fixes like in this use
> > > > > case.  The order would be as follows:
> > > > >
> > > > > migration - structure
> > > > > seeds - static data
> > > > > patches - data fixes
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Derek
> > > >
> > >
> >
> >
> >
> > --
> > *Naama Shoresh*
> > Qwilt | Work: +972-72-2221706 | Mobile: +972-52-3401999 |
> > [email protected]
> >
>



-- 
*Naama Shoresh*
Qwilt | Work: +972-72-2221706 | Mobile: +972-52-3401999 | [email protected]
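
For anyone following the timestamp sequencing problem Dewayne describes
above, here is a rough sketch of how a "highest applied version" bookmark
can hide a migration that was committed later but carries an older
timestamp. This is only a simplified model of the behaviour described in
the thread, not goose's actual code; the function names and the timestamps
are made up for illustration.

```python
def pending_migrations(available, current_version):
    """Versions that would still run if only versions newer than the
    bookmarked one are considered (simplified model, not goose itself)."""
    return sorted(v for v in available if v > current_version)


def apply_all(available, current_version):
    """Pretend to run every pending migration and return the new bookmark."""
    pending = pending_migrations(available, current_version)
    return pending[-1] if pending else current_version


# Hypothetical timestamps: "mine" is dated yesterday, "yours" is dated today.
MINE = 20170612000000
YOURS = 20170613000000

# Your migration is pushed and applied first, so the bookmark moves to YOURS.
bookmark = apply_all([YOURS], current_version=0)

# My migration lands afterwards with the older timestamp and is never selected.
skipped = pending_migrations([MINE, YOURS], bookmark)

# Renaming my file to a later (hypothetical) timestamp makes it visible again.
RENAMED = 20170614000000
visible = pending_migrations([RENAMED, YOURS], bookmark)

print(bookmark, skipped, visible)
```

Interleaving seed data into the same sequence would put data changes
under the same ordering constraint, which seems to be the extra complexity
Dewayne is pointing at.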
