Re: Migrations: A bug or a feature needing documentation?

Barry Johnson Thu, 08 Aug 2019 04:34:55 -0700


On Wednesday, August 7, 2019 at 9:22:27 PM UTC-5, Markus Holtermann wrote:
>
> Let's look at the last question first, regarding duplicate entries in the 
> django_migrations table: Yes, this seems to be a bug. At least how it's 
> currently used.
>


Agreed.
 

> Let's say you have migration foo.0001_initial and apply it. You then have 
> (foo, 
> 0001_initial) in the django_migrations table. You now create migration 
> foo.0002_something and also add a squashed migration 
> foo.0001_initial_squashed_0002_something. When you now run migrate, 
> Django will apply foo.0002_something and your database will have (foo, 
> 0001_initial), (foo, 0002_something) as well as (foo, 
> 0001_initial_squashed_0002_something).
> So far so good. That's all as expected. If you now remove foo.0001_initial 
> and foo.0002_something from your filesystem and remove the replaces 
> section in foo.0001_initial_squashed_0002_something it is as if Django 
> never knew about foo.0001_initial or foo.0002_something. You can add new 
> migrations, everything works the way it should. However, if you were to add 
> e.g. foo.0002_something again, Django would treat it as already applied, 
> despite it being somewhere later in your migration graph.
> At this point, I don't think this is the intended behavior. That said, I'm 
> inclined to say that applying a squashed migration should "unrecord" all 
> migrations it replaces. I've not yet thought too much about the "fallout" 
> (backwards compatibility, rollback of migrations, ...). But at least with 
> regards to migrating forwards, this seems to be the right behavior.
>

To date, the migration system has been remarkably tolerant of spurious 
entries in the django_migrations table, and the squashing process does 
indeed leave breadcrumbs of since-deleted migrations behind.  I agree 
completely with your point about adding a subsequent migration with a name 
that exactly matches a previously created migration that has been removed. 
 For all practical purposes, the migration names should be unique over 
time, or the developer should ensure that the second incarnation of that 
name is a functional replacement.
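
The name-reuse hazard can be sketched with a toy model. This is an illustration under stated assumptions, not Django's implementation: a plain set stands in for the django_migrations table, and the helpers migrate() and record_squash() are hypothetical names invented for this sketch. The unrecord flag models the "unrecord what it replaces" idea raised above.

```python
# Toy model only: a set of (app, name) pairs stands in for the rows of the
# django_migrations table. This is not Django's actual implementation.

def migrate(applied, on_disk):
    """Run each on-disk migration that is not yet recorded; return those run."""
    ran = []
    for key in on_disk:
        if key not in applied:
            applied.add(key)
            ran.append(key)
    return ran

def record_squash(applied, squash, replaces, unrecord=False):
    """Record a squashed migration; optionally drop the rows it replaces."""
    applied.add(squash)
    if unrecord:
        applied.difference_update(replaces)

OLD = [("foo", "0001_initial"), ("foo", "0002_something")]
SQUASH = ("foo", "0001_initial_squashed_0002_something")
# Files on disk after the originals are deleted and the old name is reused.
LATER = [SQUASH, ("foo", "0002_something")]

# Behaviour described in the thread: the old rows stay behind.
today = set()
migrate(today, OLD)
record_squash(today, SQUASH, OLD)
ran_today = migrate(today, LATER)

# Proposed behaviour: recording the squash unrecords what it replaces.
proposed = set()
migrate(proposed, OLD)
record_squash(proposed, SQUASH, OLD, unrecord=True)
ran_proposed = migrate(proposed, LATER)

print(ran_today)     # the reused name is skipped as already applied
print(ran_proposed)  # the reused name is run
```

In the first case the leftover row makes the new 0002_something look applied; in the second it runs, which is the forward-migration behavior argued for above.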

We have, in the past, replaced the contents of the "0001_initial" migration 
in most of our apps, but it is incumbent on us to make sure that's correct.

> Regarding your second point around "replaces" and merging migrations: I 
> think this will lead to inconsistencies in your migration order, thus 
> potentially causing trouble down the line. [...]  I suspect that two data 
> migrations could easily conflict or result in inconsistent data if applied 
> in the wrong order. For example, one data migration adding new records to a 
> table, and another one ensuring that all values in a column are in upper 
> case. If you apply both migrations in that order (insert and then ensure 
> uppercase) you can be certain that all values will be uppercase. If you, 
> however, first ensure uppercase and then insert additional values, you need 
> to make sure that the data in the second migration is properly formatted.
>

Oh, certainly agree, Markus, about the dangers of replacing migrations. 
 That's true across the board, even in the documented use cases.  The 
replacement steps MUST properly account for elidable data migrations.  

In the normal use case (the typical migration squashing process), the 
replacement step truly replaces the original step, and is executed at that 
original spot within the dependency graph.  (It can seem as if that 
replacement is moved "back in time", as if history were rewritten.)  That 
means that any migrations subsequent to the original step will remain 
subsequent to the replacement, which works.

In the use case where the original does not exist, the replacement step 
remains where it is in the dependency graph.  It must, because there's 
nowhere earlier that it could be placed.  Could this introduce an 
inconsistency, especially with data migrations?  Yes, if the developer is 
not careful.  Using your example of an insertion of lowercase data followed 
by a conversion to uppercase data:  in one branch the insertion happens 
first; in a different branch (where the insertion is run as a replacement 
step) they could be run in a different order.  That would indeed appear to 
be one of the dangers of moving a database from one branch of code to 
another; as things are today, the developer must understand the 
migration paths.
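
A minimal sketch of that ordering hazard, using plain lists in place of a table. The two functions are hypothetical stand-ins for the data migrations in the example (one inserts a lowercase record, one uppercases every value); the row values are made up for illustration.

```python
def insert_rows(rows):
    # Hypothetical data migration: adds a new record in lowercase.
    return rows + ["carol"]

def uppercase_all(rows):
    # Hypothetical data migration: forces every value to upper case.
    return [value.upper() for value in rows]

start = ["alice", "bob"]

# One branch: the insertion runs first, then the conversion.
insert_then_upper = uppercase_all(insert_rows(start))

# Another branch: the insertion runs later, as a replacement step.
upper_then_insert = insert_rows(uppercase_all(start))

print(insert_then_upper)
print(upper_then_insert)
```

The same two steps leave a lowercase value behind when the insertion runs second, so a database moved between branches can end up with different data than one that stayed on a single branch.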

Hmmm.

Keeping all this in mind, I'm beginning to believe there is a procedural 
method to solve the parallel-development problem.  Just before creating a 
long-lived branch (such as a production release that may get patches), 
create an empty migration step in all apps.  Later, if patch migrations 
become necessary in that branch, the mainline trunk can "replace" the empty 
step with an equivalent migration.  At least, that will 
give us an order of execution that more closely resembles what the 
production tenants had been through, even though the mainline development 
tenants may have run things in a different order.

Thank you, Markus, for your insights.

baj
