On 07/08/2013 08:15 AM, Nikola Đipanov wrote:

This is only true if you have one table with no relations that need to
be considered.

Here is an example of when it gets tricky - Say you have a table T1
with a FK that points to table T2, and a migration that adds a column c1
to T1 whose value depends on data in T2. And say, for the sake of
argument, that the objects represented by rows in T1 and T2 have
different lifetimes in the system (think instances and devices, groups,
quotas, networks... this is common in our data model).

In order to properly migrate and assign values to the newly created c1
you will need to:

* Add the column c1 to the live T1
* Join on live T2 *and* shadow T2 to get the data needed and populate
the new column
* Add the column c1 to the shadow T1
* Join on live T2 *and* shadow T2 again to populate the new column
there
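The four steps can be sketched with sqlite3. The table names (t1, t2,
shadow_t1, shadow_t2) and the column c1 are hypothetical stand-ins for
the pattern described above, not actual nova schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Live and shadow copies of both tables; t1.t2_id points at t2.id, but
# the referenced row may already have been archived into shadow_t2.
for prefix in ("", "shadow_"):
    cur.execute(f"CREATE TABLE {prefix}t2 (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute(f"CREATE TABLE {prefix}t1 (id INTEGER PRIMARY KEY, t2_id INTEGER)")

cur.execute("INSERT INTO t2 VALUES (1, 'live-device')")
cur.execute("INSERT INTO shadow_t2 VALUES (2, 'archived-device')")
cur.execute("INSERT INTO t1 VALUES (10, 1)")   # points at a live t2 row
cur.execute("INSERT INTO t1 VALUES (11, 2)")   # points at an archived t2 row
cur.execute("INSERT INTO shadow_t1 VALUES (12, 2)")

# Steps 1-4: for *each* copy of T1, add c1 and populate it by joining
# against *both* copies of T2 -- this is where the extra joins come from.
populate = """
    UPDATE {t1} SET c1 = (
        SELECT name FROM (
            SELECT id, name FROM t2
            UNION ALL
            SELECT id, name FROM shadow_t2
        ) AS both_t2 WHERE both_t2.id = {t1}.t2_id
    )
"""
for t1 in ("t1", "shadow_t1"):
    cur.execute(f"ALTER TABLE {t1} ADD COLUMN c1 TEXT")
    cur.execute(populate.format(t1=t1))

print(cur.execute("SELECT id, c1 FROM t1 ORDER BY id").fetchall())
# -> [(10, 'live-device'), (11, 'archived-device')]
```

Note that the populate query has to union live and shadow T2 because a
live T1 row may legitimately reference an already-archived T2 row.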

Hence - the joins multiply: every referenced table has to be consulted
in both its live and shadow form, so the number of join combinations
grows exponentially with the number of tables involved, as I stated in
my previous email.

Now - this was the *simplest* possible example - things get potentially
much more complicated if the new column relies on previous state of data
(say - counters of some sort), if you need to get data from a third
table (think many-to-many relationships) etc.

If you need a real example - take a look at migration 186 in the current
trunk.

As I said in the previous email, and based on the examples above - this
design decision (unconstrained rows) makes it difficult to reason about
data in the system!

I personally - as a developer working on the codebase - am not happy
making this trade-off in favour of archiving in this way. I would like
to see some design decisions changed, or at the very least a broader
consensus that the state as-is is actually OK and we don't need to
worry about it.

I agree that it's a mess. However, the current archiving code just does the simplest thing possible -- move what can be moved to shadow tables, and try again later if foreign key constraints prevent that. It's certainly possible to do something more clever, but that would require the archiving code to know more about all the other tables in the system, which sounds difficult to maintain.

Unfortunately, a soft-deleted row still satisfies FK constraints, so it can never be hard-deleted while other rows point to it, and junk is permanently left behind in some tables (like nova's instances). Soft-deletes let people get away with being sloppy, so people are sloppy.
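A small demonstration of why that junk is permanent. The schema here is
a hypothetical simplification (real nova tables are more involved): the
soft-deleted row is still physically present, so a live child row's FK
keeps it pinned forever:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, deleted INTEGER DEFAULT 0)")
conn.execute("CREATE TABLE fixed_ips (id INTEGER PRIMARY KEY, "
             "instance_id INTEGER REFERENCES instances(id))")

conn.execute("INSERT INTO instances VALUES (1, 1)")  # soft-deleted instance
conn.execute("INSERT INTO fixed_ips VALUES (5, 1)")  # live row still points at it

# The FK is satisfied -- the 'deleted' flag is invisible to the
# constraint -- so archiving can never hard-delete the instance row:
try:
    conn.execute("DELETE FROM instances WHERE deleted != 0")
except sqlite3.IntegrityError as exc:
    print("stuck:", exc)  # the soft-deleted row is junk that cannot be removed
```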

What I'd really like to see is for the entire soft-delete idea to go away, and just delete rows when it's time to delete them. Does anyone remember why soft-deletes got added in the first place?

--
David Ripton   Red Hat   drip...@redhat.com

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
