[ TL;DR: A migration may use a “replaces” list pointing to migrations that 
don’t actually exist.  This undocumented technique cleanly solves a recurring 
difficult migration problem.  We seek consensus on whether this should become a 
documented feature or is an unexpected side effect subject to change in the 
future. ]


We have found an undocumented behavior in the migration system that gracefully 
solves the troublesome problem of merging migrations created in parallel 
development branches.  If this behavior should survive, we’ll enter a 
documentation ticket – but if it’s considered a bug, we’ll need to stay away 
from it and fall back to the more difficult manual editing approaches we’ve 
used in the past.

The Use Case
------------------
We’re rapidly developing a large multi-tenant application (hundreds of ORM 
models, thousands of migrations and hundreds of thousands of lines of code so 
far, with quite a bit of work remaining) punctuated by periodic production 
releases.  We create a source code branch from our mainline development trunk 
for each production release, just in case we must rapidly issue patches to 
those production releases.  On rare occasions, we’ve had to make a schema 
change (such as adding a new field) as a patch to a production release, and 
make a parallel schema change in the mainline development trunk.

Of course, this normally causes a migration failure when migrating a production 
tenant from the patch release up to a later version of the mainline release – 
since the mainline release has a subsequent migration that adds the same field. 
We’ve solved this in the past by manually rearranging the dependency order of 
the mainline trunk migrations (moving the replacement step before other new 
migrations for this later release), and editing the contents of the 
django_migrations table so that the mainline step appears to have already been 
applied before we run the migrations.  We’re unhappy with that approach – it’s 
both time-consuming and error-prone.

This problem is similar to, but not identical to, that of squashing migrations.

(And yes, we do periodically squash our migrations.  We have about 600 
migration steps at the moment, left over from more than 2,000 originally 
created.  We’ve got another round of squashing coming up soon that should take 
us to less than 100 migrations – but we have more than a dozen developers 
adding more migrations every week.)

The Discovery
-------------------
Through trial and error, we found that our mainline migration step may declare 
itself as a replacement for the patch step (using the “replaces” attribute) – 
even if the patch migration itself doesn’t exist in the list of mainline 
migrations.

And if we do this, the migration engine simply works as hoped and our problem 
vanishes.  It’s absolutely wonderful; simple to implement and effective.  We 
love it.  New tenants run only the replacement step; tenants migrating from the 
patch release to the trunk release merely record the replacement step as having 
been completed without actually executing it; development tenants that never 
saw the original patch step simply record both the patch step and the 
replacement as having been completed.  It’s great.

The Worry
--------------
This approach seems undocumented in three different ways:

* The replacement migration is pointing at an original migration that doesn’t 
exist in the trunk’s migration files. (We created it in the patch branch and we 
know the migration name from that branch, but we never added the patch 
migration to the mainline trunk.)  The current documentation[1] describes 
keeping both the original and the replacement in place until all databases have 
migrated past the replacement step (and then deleting the original and removing 
the “replaces” attribute from the replacement).  The documentation implies, but 
does not explicitly state, that the original step should exist in the list.  
Our testing shows that the original need not exist (and we like it this way!).
* If we go ahead and add a copy of the patch release’s migration step to the 
mainline trunk, we introduce a “multiple leaf nodes” graph, since none of the 
mainline migrations depend upon this “side patch”.  However, apparently because 
there is a declared replacement for this patch step, the migration engine 
doesn’t raise the “multiple leaf nodes” exception.  This seems to be an 
oversight unless the replacement step is somehow acting as a merge (as if it 
had a dependency on the patch step) …  but if it ever became necessary to 
include the original step in the mainline migration list, we would like it to 
keep working the way it does now.
* We have found that we can have multiple replacement steps all claiming to 
replace the same original step number. (This conveniently handled a case where 
multiple migrations were originally created in the trunk, then backported as a 
single migration into a patch to an earlier production release.)  But this 
results in the patch migration’s app and name being inserted into the 
django_migrations table more than once.  These duplicate entries haven’t 
appeared to cause a problem, but they were unexpected.  It seems that the app 
and migration name ought to be “unique together” but aren’t – perhaps for 
performance reasons, since the contents of this table are normally managed 
solely by the migrations system.
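
To check a database for the duplicate rows mentioned in the last point, a 
grouped query over the app and name columns of django_migrations is enough. 
The sketch below runs that query against an in-memory SQLite stand-in rather 
than a real tenant database; the table layout mirrors django_migrations (id, 
app, name, applied), and the sample rows are invented.

```python
import sqlite3

# Stand-in for a tenant database: same column names as django_migrations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_migrations ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, app TEXT, name TEXT, applied TEXT)"
)

# Invented sample rows; the patch migration has been recorded twice.
rows = [
    ("billing", "0001_initial"),
    ("billing", "0030_patch_add_region_code"),
    ("billing", "0030_patch_add_region_code"),
    ("billing", "0042_add_region_code"),
]
conn.executemany(
    "INSERT INTO django_migrations (app, name, applied) "
    "VALUES (?, ?, datetime('now'))",
    rows,
)

# Report every (app, name) pair that appears more than once.
DUPLICATES_SQL = """
    SELECT app, name, COUNT(*) AS copies
    FROM django_migrations
    GROUP BY app, name
    HAVING COUNT(*) > 1
    ORDER BY app, name
"""
duplicates = conn.execute(DUPLICATES_SQL).fetchall()
print(duplicates)
```

The SELECT statement on its own can be run against a real database with any 
SQL client.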

The Question
-------------------
Would the core team consider the ability to “replace” a non-existent migration 
step to be a feature or a bug?  We prefer to think of this as a desirable 
feature, since it solves what seems to be a fairly common use case.  We haven’t 
seen any other documented approaches to solving the problem of migrations 
created in parallel branches – most published advice boils down to either 
“don’t do it”, “roll back your migrations then apply the new ones”, or “good 
luck on manually repairing things.”

If this IS considered a bug, we certainly could add the original migration from 
the patch release, but then we’ve added a migration “to the side” of the 
original dependency tree, introducing another leaf node.  We’d hate for that to 
be considered a problem in the future, because the replacement step doesn’t 
look like it should act as a merge node (it doesn’t depend upon the original, 
just replaces it).
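
The leaf-node concern can be illustrated with a toy dependency graph.  This is 
not Django’s graph code, and the migration names are hypothetical.  It only 
shows that a side patch adds a second leaf, and that the second leaf disappears 
if replaced migrations are dropped from the graph before leaves are counted. 
That last step is our guess at why no exception is raised, not something we 
have confirmed in the source.

```python
def leaf_nodes(dependencies):
    """Return the migrations that no other migration depends on."""
    depended_on = {
        parent for parents in dependencies.values() for parent in parents
    }
    return sorted(name for name in dependencies if name not in depended_on)


def drop_replaced(dependencies, replaced):
    """Remove replaced migrations, and references to them, from the graph."""
    return {
        name: [parent for parent in parents if parent not in replaced]
        for name, parents in dependencies.items()
        if name not in replaced
    }


# Hypothetical trunk chain: each migration lists the ones it depends on.
trunk = {
    "0001_initial": [],
    "0002_trunk_step": ["0001_initial"],
    "0003_add_field": ["0002_trunk_step"],
}

# Copying the patch migration into trunk puts it "to the side" of the chain.
with_patch = dict(trunk)
with_patch["0002_patch_add_field"] = ["0001_initial"]

# Assumed behavior: the replaced patch step is dropped before counting leaves.
after_replacement = drop_replaced(with_patch, {"0002_patch_add_field"})

print(leaf_nodes(trunk))
print(leaf_nodes(with_patch))
print(leaf_nodes(after_replacement))
```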

The third point, the insertion of duplicate records into django_migrations, 
does smell like a defect.

If people like this “feature” and believe it should be supported, we’d be happy 
to create a documentation PR.

Barry Johnson
Epicor Software Corporation

[1]: https://docs.djangoproject.com/en/2.2/topics/migrations/#migration-squashing
