Re: Call for review and comment: #7539 (ForeignKey on_delete)

Carl Meyer Fri, 05 Nov 2010 12:12:35 -0700

On Nov 5, 11:17 am, Russell Keith-Magee <[email protected]>
wrote:
> Giving credit where it's due, I think Michael Glassford also deserves
> a shoutout here. He was the original driving force behind this ticket,
> as was actively involved during the 1.2 development cycle.


Indeed! My mistake. I'd made a mental note about this when originally
reading through the ticket history, but lost track of it sometime in
the last couple weeks (so much for mental notes). Thanks to Michael
for originally spearheading this patch.

While I'm covering credits I missed the first time around, thanks also
to Jacob Kaplan-Moss for playing around with the patch last week and
pointing out the need for SET() to accept a callable.

>  * A minor optimization in SET() -- the "if callable(value)" check can
> be done as part of the factory function, rather than being repeated
> every time the generated SET function is invoked (i.e., no need to
> check if None is callable every time). It isn't a big overhead, but it
> has the potential to be a hotspot because it will be called a lot, so
> every optimization is worth it.

Sure; done. Adds a bit of almost-code-duplication, but it's only a
couple lines. (I went for duplicating the inner function definition as
the most efficient approach; one could also set a flag in the outer
context and check it each time in the inner function, but that
introduces an extra scope-crawling for the flag variable).

>  * The documentation for queryset.delete() doesn't quite read right to
> me. It seems to imply that queryset.delete() implements ON DELETE
> CASCADE by default, when it's really foreign keys that have the
> default cascading behavior. I don't have any problem saying that
> CASCADE is the default, but I'd like the docs to make more of the fact
> that it's the foreign key that actually controls the behavior.

Ok, I made a small change to emphasize this; does that cover the
concern?

>  * By my reading, the documentation for the on_delete attribute has
> the relationship described the wrong way around. on_delete is really
> answer the question "What happens to this object when the object I'm
> pointing to is deleted?"; as currently written, it feels more like
> "What happens to the other object when this object is deleted". For
> example, the description of PROTECT is "Prevent deletion of the
> referenced object by raising IntegrityError" -- to me the 'referenced
> object' is the object I'm pointing to, not myself.

I think if you review again you'll find that all of the phrasing there
is correct. PROTECT is an oddball because it _does_ actually prevent
deletion of the referenced object; by raising IntegrityError it vetos
the entire deletion, rather than allowing its ForeignKey to become
invalid or changed.

If you have specific suggestions for where the wording could be
clearer, I'm quite open to changes.

>  * The fact that GenericForeignKeys *don't* have the ability to
> customize deletion behavior is probably worth a documentation note.
> It's the sort of thing that would be an obvious inclusion from a pure
> API perspective.

Ok; added a note about that.

Another documentation-related question I have; is it necessary to mark
each passing referencing (and link) to on_delete with the version-
added marker? I went back and forth, and eventually decided against
it, for these reasons: (1) The passing references don't provide enough
info to use the feature; anyone following the link to the actual
feature documentation will see the version marker, and (2) inserting
the version-added markers in places like the QuerySet.delete
documentation made it appear ambiguous to me exactly which paragraphs
the marker referred to; it would be a reasonable interpretation to
think it covered everything through the end of the section, which
would not be accurate. Open to feedback from the documentation experts
here.

>  * modeltests/delete/tests.py seems to have preserved the
> setup/teardown and utility functions, but no actual tests. As far as I
> can make out, modeltests/delete is entirely vestigial. I would prefer
> to see the on_delete tests moved into the delete directory, since
> they're now the core tests for deletion functionality in Django.

Ha, good catch. Done.

>  * I don't see a test case where deletion of inherited objects is
> tested. I can't see a test case in the existing test suite either,
> which is an interesting oversight. It's implicitly tested by virtue of
> the OneToOne test, but I would feel more comfortable seeing an
> explicit test case -- especially given that it's an specific piece of
> branch coverage in the collect() method.

Another good catch. There are actually (sort of) tests for this, but
not where you'd expect to find them; in regressiontests.admin_views,
for the admin deletion preview.

Odd, because I remember noting that test omission in working on the
#6191 patch, but apparently didn't correct it. Anyway, added several
inheritance-related tests in modeltests.delete.

>  * I found two other tickets in a review of ticket history: #6108,
> #6870. Both relate to the way signals are used in deletion (especially
> the pre-delete signal). Reading the tickets and the django-dev
> archive, my gut tells me that we should probably close these two
> tickets -- they both appear to be alternate ways of achieving ON
> DELETE behaviors, except that they solutions they propose would be
> much harder to integrate with admin, or otherwise audit for
> completeness. Are there any alternate opinions on this?

#6108 I fully agree should be closed as wontfix. Since pre_delete is
sent individually for every collected object (as it should be), the
semantics of sending a "list of related objects to be deleted" with
each and every one of those is pretty much impossible to get right,
especially if allowing that list to be modified.

I'm of two minds on #6870. I don't see any a priori reason why the
pre_delete signal can't or shouldn't be sent earlier in the process;
it looks like a fairly simple change, with only a gain in the
usefulness of the signal. It won't cause any problems for admin or
other code, as it would just allow you to make changes to your models
before the collector deals with them.

On the other hand, I agree with you that the use cases I can see for
it duplicate the same use cases that #7539 already covers, so I'm not
sure if we want to encourage two different ways of doing the same
things. And there may be subtle backwards-compatibility concerns with
the way some people are currently using pre_delete? Can't come up with
a case for that off-hand, but that doesn't mean it doesn't exist.

As a quick test, I moved the sending of pre_delete from
Collector.delete() to Collector.add(), which accomplishes what #6870
requests (this change is pushed in the t7539-t6870 branch of my
github). On SQLite this resulted in only a single test failure, which
is spurious; it's simply because the new delete.deletion_order test is
currently using pre_delete signal order as a way to test what order
things are deleted in.

I certainly wouldn't make this change in behavior in the fix for
#7539; no reason to combine changes that can be kept separate. On the
whole, my inclination is to close #6108 as wontfix when #7539 is
closed, but leave #6870 in DDN for a while longer while we see the
effects of #7539 shake out.

Another related ticket is #10829; I think that one should perhaps just
be closed as fixed once #7539 lands. Unmanaged doesn't automatically
imply "don't modify" (really it just means "don't try to create the
table"), so I don't think there should be a change in the default FK
behavior with an unmanaged model. For the use-case given in the
ticket, where the unmanaged model is actually a non-modifiable view,
we now have the DO_NOTHING behavior available for those FKs.

> The only other issue that occurred to me during my review is a huge
> topic, but one that should in no way hold up this patch landing in
> trunk. However, I think it deserves a quick cursory examination to
> make sure we're not painting ourselves into a corner.
>
> It's the question of native deletes. Postgres and Oracle (and I think
> SQLite in recent versions) allow for a database-native implementation
> of cascaded deletes, accompanied by database level definitions of the
> cascade. This has the potential to be a serious performance boost for
> large deletions.
>
> Looking at this patch, my gut tells me that implementing native
> deletes will actually be fairly simple. It's mostly just a matter of
> setting up the right database table definitions, and invoking a native
> deletion instead of calling delete() on the Collector instance.

I think it's actually simpler than that. Just set all your FKs to the
DO_NOTHING behavior, set up the right database table definitions (with
ON DELETE constraints and defaults as needed; initial SQL can be used
for this), and then you can just use the normal ORM deletion methods
and Django won't do any cascading or related-object-collecting on your
behalf. AFAICS that's the entire purpose of providing the DO_NOTHING
behavior. Is there something I'm missing?

> My concern is that I can see a potential complication with SET_DEFAULT
> and other SET() variants which require database-side interpretations.
> Django doesn't currently expose default values at the database level.
> Defaults are entirely handled on the Python side. This means that we
> will either need to provide a way to inject SQL-native defaults, or a
> way to perform the "collection" independently of the "update" (or
> possibly both).

I'm quite happy saying that if you want the database handling
deletion, you are responsible for adding the necessary constraints and
defaults to your database tables. ISTM that initial SQL provides a
perfectly adequate way to do this already, so I don't see a need for
either of those changes. This is the position I already take in the
documentation for DO_NOTHING.

> The reason I raise this is to give quick consideration to whether we
> should change the CASCADE et al functions into classes, so that they
> can hold richer representations if need be in the future.

If the contract (an internal contract, since we aren't documenting it,
at least not yet) is just that they are a callable that takes certain
arguments, there's nothing preventing us from making any of them into
classes in the future as needed. It certainly isn't needed now, so I
think we should leave them as functions.

Carl

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Re: Call for review and comment: #7539 (ForeignKey on_delete)

Reply via email to