Re: Data patches during upgrade

2017-06-20 Thread Naama Shoresh
I understand.
Thanks.

On Tue, Jun 13, 2017 at 4:08 PM, Dewayne Richardson 
wrote:

> The reason the migrations manage "structure" is because of rollbacks with
> "goose Up and Down" (not saying "data" can't be rolled back, but it gets
> trickier, what if I screw up the "where" clause and inadvertently remove
> other data outside of the scope of the migration unintentionally?).
>
> Migrations managing only structure also helps with "separation".  I can
> apply "test" or "integration" data to the same structure, without having to
> run the migrations.  Test data doesn't always align with seed data because
> I might need to artificially manipulate the data differently depending on
> the test scenario I want to achieve.  If migrations have placed data in the
> database when I need to test, now I have to remove that static data before
> I even start.  Yes, we do that now with the test data but the test data is
> preditable.
>
> We have this problem today with the migrations (just managing the
> structure).  If I'm working in a branch and commit a migration with a
> migration file timestamp of yesterday, then you are also working on a
> different migration with a migration timestamp of today and commit and push
> before I do, then the timestamp sequencing gets misaligned.  Once goose
> runs it "bookmarks" your migration in the goose_db_version table in front
> of mine, which causes goose to ignore my migration, which forces me to
> rename that file to a later timestamp just so goose will see it.
>
> So, if we "interleave" seed data (over time), managing that data order on
> top of the structure management in goose will add yet another level of
> complexity.  This is the reason the db/admin.pl was created to help with
> this workflow.  I know it's not optimal, but these are the hurdles we hit
> as we were figuring this stuff out.
>
> -Dewayne
>
> On Mon, Jun 12, 2017 at 10:14 PM, Naama Shoresh  wrote:
>
> > Hi,
> >
> > I want to suggest a slightly different approach.
> > Goose is the brain managing the DB upgrade, right?
> > The data patches are part of the DB evolution, but today we can't use
> goose
> > to run them because we have seeds.sql in the middle.
> > What I suggest is turning seeds.sql into another migration script,
> > resulting in the following procedure:
> > 0) (In clean installations) Tables creation
> > 1) Goose migrations:
> > 1a) Schema changes
> > 1b) Data seeding (seeds.sql)
> > 1c) Data changes
> >
> > Going forward, I believe a data change migration script will be attached
> to
> > most schema changes, instead of populating the DB in the seeds.sql.
> >
> > The benefits come from the fact that for future changes, the order
> > presented above (1a, 1b, 1c) is not strict.
> > Future schema/data changes are expressed in a single migration script,
> > containing all relevant operations.
> > This ensures that whatever change is needed (schema/data/both), and
> > whatever the change depends on, it can be handled by a single Goose
> > migration script.
> >
> > What do you think?
> >
> >
> >
> > On Fri, Jun 9, 2017 at 6:35 PM, Dewayne Richardson 
> > wrote:
> >
> > > Yea, it's just a new feature to admin.pl to support data conversions,
> to
> > > keep the migrations clean.  Derek and I have been working through it.
> > >
> > > -Dew
> > >
> > > On Thu, Jun 8, 2017 at 7:40 AM, Jeremy Mitchell  >
> > > wrote:
> > >
> > > > This seems to make sense to me but honestly, I'd probably defer to
> > > Dewayne.
> > > >
> > > > In theory, it would be nice if migrations only included "structural"
> > > > changes (new tables, columns, changing column types or not  null,
> etc)
> > > and
> > > > seeds focused on the "base" (or the minimum required) static data
> > > required
> > > > of TO (types, statuses, roles, etc) and then yea, putting data fixing
> > or
> > > > data massaging as the last step makes sense to me. But you know what
> > they
> > > > say about theory...
> > > >
> > > > +1
> > > >
> > > > Jeremy
> > > >
> > > > On Wed, Jun 7, 2017 at 8:41 AM, Gelinas, Derek <
> > > derek_geli...@comcast.com>
> > > > wrote:
> > > >
> > > > > I'm adding a feature to traffic ops that creates a new column in
> > > > > steering_target called type, that is populated with type ids from
> the
> > > > type
> > > > > table.  Using admin.pl upgrade, the column is created in
> migrations,
> > > and
> > > > > the two types for this table are populated by seeds.sql.  None of
> > this
> > > is
> > > > > out of the ordinary.  Unfortunately I also need to populate the
> type
> > > > column
> > > > > based on data that isn't in there until after seeds.sql is run, so
> I
> > > > can't
> > > > > place this into the migration.  Seeds.sql needs to run after the
> > > > migration
> > > > > due to any structural changes that happen there.
> > > > >
> > > > > Dewayne and I have discussed this a bit this morning, and we're
> > > thinking
> > > > > the best 

Re: Data patches during upgrade

2017-06-09 Thread Dewayne Richardson
Yea, it's just a new feature to admin.pl to support data conversions, to
keep the migrations clean.  Derek and I have been working through it.

-Dew

On Thu, Jun 8, 2017 at 7:40 AM, Jeremy Mitchell 
wrote:

> This seems to make sense to me but honestly, I'd probably defer to Dewayne.
>
> In theory, it would be nice if migrations only included "structural"
> changes (new tables, columns, changing column types or not  null, etc) and
> seeds focused on the "base" (or the minimum required) static data required
> of TO (types, statuses, roles, etc) and then yea, putting data fixing or
> data massaging as the last step makes sense to me. But you know what they
> say about theory...
>
> +1
>
> Jeremy
>
> On Wed, Jun 7, 2017 at 8:41 AM, Gelinas, Derek 
> wrote:
>
> > I'm adding a feature to traffic ops that creates a new column in
> > steering_target called type, that is populated with type ids from the
> type
> > table.  Using admin.pl upgrade, the column is created in migrations, and
> > the two types for this table are populated by seeds.sql.  None of this is
> > out of the ordinary.  Unfortunately I also need to populate the type
> column
> > based on data that isn't in there until after seeds.sql is run, so I
> can't
> > place this into the migration.  Seeds.sql needs to run after the
> migration
> > due to any structural changes that happen there.
> >
> > Dewayne and I have discussed this a bit this morning, and we're thinking
> > the best solution might be a third step, run after seeds.sql, called
> > patches.sql.  This would be specifically for data fixes like in this use
> > case.  The order would be as follows:
> >
> > migration - structure
> > seeds - static data
> > patches - data fixes
> >
> > Thoughts?
> >
> > Derek
>


Re: Data patches during upgrade

2017-06-08 Thread Jeremy Mitchell
This seems to make sense to me but honestly, I'd probably defer to Dewayne.

In theory, it would be nice if migrations only included "structural"
changes (new tables, columns, changing column types or not  null, etc) and
seeds focused on the "base" (or the minimum required) static data required
of TO (types, statuses, roles, etc) and then yea, putting data fixing or
data massaging as the last step makes sense to me. But you know what they
say about theory...

+1

Jeremy

On Wed, Jun 7, 2017 at 8:41 AM, Gelinas, Derek 
wrote:

> I'm adding a feature to traffic ops that creates a new column in
> steering_target called type, that is populated with type ids from the type
> table.  Using admin.pl upgrade, the column is created in migrations, and
> the two types for this table are populated by seeds.sql.  None of this is
> out of the ordinary.  Unfortunately I also need to populate the type column
> based on data that isn't in there until after seeds.sql is run, so I can't
> place this into the migration.  Seeds.sql needs to run after the migration
> due to any structural changes that happen there.
>
> Dewayne and I have discussed this a bit this morning, and we're thinking
> the best solution might be a third step, run after seeds.sql, called
> patches.sql.  This would be specifically for data fixes like in this use
> case.  The order would be as follows:
>
> migration - structure
> seeds - static data
> patches - data fixes
>
> Thoughts?
>
> Derek


Data patches during upgrade

2017-06-07 Thread Gelinas, Derek
I'm adding a feature to traffic ops that creates a new column in 
steering_target called type, that is populated with type ids from the type 
table.  Using admin.pl upgrade, the column is created in migrations, and the 
two types for this table are populated by seeds.sql.  None of this is out of 
the ordinary.  Unfortunately I also need to populate the type column based on 
data that isn't in there until after seeds.sql is run, so I can't place this 
into the migration.  Seeds.sql needs to run after the migration due to any 
structural changes that happen there.

Dewayne and I have discussed this a bit this morning, and we're thinking the 
best solution might be a third step, run after seeds.sql, called patches.sql.  
This would be specifically for data fixes like in this use case.  The order 
would be as follows:

migration - structure
seeds - static data
patches - data fixes

Thoughts?

Derek