#32244: ORM inefficiency: ModelFormSet executes a single-object SELECT query per
formset instance when saving/validating
-------------------------------------+-------------------------------------
     Reporter:  Lushen Wu            |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |
    Component:  Database layer       |                  Version:  3.1
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:
     Keywords:  formsets             |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Lushen Wu:

Old description:

> Conceptual summary of the issue:
> Let's say we have a Django app with {{{Author}}} and {{{Book}}} models,
> and use a {{{BookFormSet}}} to add / modify / delete books that are
> created by a given {{{Author}}}. The problem is when the
> {{{BookFormSet}}} is validated, {{{ModelChoiceField.to_python()}}} ends
> up calling {{{self.queryset.get(id=123)}}} which results in a single-
> object SELECT query for **each** book in the formset. That means if I
> want to update 15 books, Django performs 15 separate SELECT queries,
> which seems incredibly inefficient. (Our actual app is an editor that can
> update any number of objects in a single formset, e.g. 50+).
>
> My failed attempts to solve this:
> 1. First I tried passing a queryset to the {{{BookFormSet}}}, i.e.
> {{{formset = BookFormSet(data=request.POST,
> queryset=Book.objects.filter(author=1))}}}, but the {{{ModelChoiceField}}}
> still does its single-object SELECT queries.
> 2. Then I tried to see where the {{{ModelChoiceField}}} defines its
> queryset, which seems to be in {{{BaseModelFormSet.add_fields()}}}. I
> tried instantiating the {{{ModelChoiceField}}} with the same queryset that I
> passed to the formset, e.g. {{{Book.objects.filter(author=1)}}} instead
> of the original code which would be
> {{{Book._default_manager.get_queryset()}}}. But this doesn't help because
> I guess the new queryset I defined isn't actually linked to what was
> passed to the formset (and we don't have a cache running). So the
> multiple SELECT queries still happen. (Note: I realize
> {{{_default_manager.get_queryset}}} is necessary for use cases where you
> would want to switch one Model instance for another one which might not
> be in the original queryset passed to the {{{BaseModelFormSet}}}, but
> this is not our use case)
> 3. I noticed that {{{BaseFormSet._existing_object()}}} provides a way to
> check whether an object exists in the queryset that was given to the
> FormSet constructor, which means that queryset is evaluated at most once
> and the results stored in {{{BaseFormSet._object_dict}}}. I thought there
> might be some way to have {{{ModelChoiceField.to_python()}}} do something
> similar before calling {{{self.queryset.get(id=123)}}}, but I don't think
> {{{ModelChoiceField}}} is aware of {{{BaseFormSet}}}, and it would seem
> an anti-pattern to reach up the hierarchy like this.
>
> The easiest solution seems to me to pass {{{BaseFormSet._object_dict}}}
> in some way to each {{{ModelForm}}} that's created, and then allow the
> {{{ModelChoiceField}}} to check {{{_object_dict}}} before making another
> SELECT query.

New description:

 Conceptual summary of the issue:
 Let's say we have a Django app with {{{Author}}} and {{{Book}}} models,
 and use a {{{BookFormSet}}} to add / modify / delete books that belong to
 a given {{{Author}}}. The problem is that when the {{{BookFormSet}}} is
 validated, {{{ModelChoiceField.to_python()}}} ends up calling
 {{{self.queryset.get(id=123)}}} which results in a single-object SELECT
 query for **each** book in the formset. That means if I want to update 15
 books, Django performs 15 separate SELECT queries, which seems incredibly
 inefficient. (Our actual app is an editor that can update any number of
 objects in a single formset, e.g. 50+).

 My failed attempts to solve this:
 1. First I tried passing a queryset to the {{{BookFormSet}}}, i.e.
 {{{formset = BookFormSet(data=request.POST,
 queryset=Book.objects.filter(author=1))}}}, but the {{{ModelChoiceField}}}
 still does its single-object SELECT queries.
 2. Then I tried to see where the {{{ModelChoiceField}}} defines its
 queryset, which seems to be in {{{BaseModelFormSet.add_fields()}}}. I
 tried instantiating the {{{ModelChoiceField}}} with the same queryset that I
 passed to the formset, e.g. {{{Book.objects.filter(author=1)}}} instead of
 the original code which would be
 {{{Book._default_manager.get_queryset()}}}. But this doesn't help because
 I guess the new queryset I defined isn't actually linked to what was
 passed to the formset (and we don't have a cache running). So the multiple
 SELECT queries still happen. (Note: I realize
 {{{_default_manager.get_queryset()}}} might be necessary in cases where
 the formset can be used to switch one Model instance to another instance
 which might not be in the original queryset passed to the
 {{{BaseModelFormSet}}}, but this is not our use case)
 3. I noticed that {{{BaseModelFormSet._existing_object()}}} provides a way
 to check whether an object exists in the queryset that was given to the
 FormSet constructor, which means that queryset is evaluated at most once
 and the results are stored in {{{BaseModelFormSet._object_dict}}}. I thought there
 might be some way to have {{{ModelChoiceField.to_python()}}} do something
 similar before calling {{{self.queryset.get(id=123)}}}, but I don't think
 {{{ModelChoiceField}}} is aware of {{{BaseFormSet}}}, and it would seem an
 anti-pattern to reach up the hierarchy like this.

 The easiest solution, it seems to me, is to pass
 {{{BaseModelFormSet._object_dict}}} in some way to each {{{ModelForm}}}
 that's created, and then allow the {{{ModelChoiceField}}} to check this
 {{{_object_dict}}} before making another SELECT query.
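
 The idea can be illustrated with a small, framework-free sketch. It is a
 toy model, not Django code: {{{FakeQuerySet}}}, {{{CachedChoiceField}}} and
 {{{validate_all}}} are hypothetical names invented for this example, and
 the "queries" are just a counter. It only shows how consulting a shared
 pk-to-object dict before falling back to a single-object lookup could
 reduce the number of SELECTs from one per form to one in total.

```python
class FakeQuerySet:
    """Toy stand-in for a queryset; counts each simulated SELECT."""

    def __init__(self, rows):
        self._rows = dict(rows)  # pk -> object
        self.query_count = 0

    def get(self, pk):
        # Simulates one single-object SELECT per call.
        self.query_count += 1
        return self._rows[pk]

    def all(self):
        # Simulates one SELECT that fetches every row at once.
        self.query_count += 1
        return list(self._rows.items())


class CachedChoiceField:
    """Hypothetical field that checks a shared pk -> object dict first."""

    def __init__(self, queryset, object_dict=None):
        self.queryset = queryset
        self.object_dict = object_dict if object_dict is not None else {}

    def to_python(self, pk):
        try:
            # Cache hit: no query is issued.
            return self.object_dict[pk]
        except KeyError:
            # Cache miss: fall back to the per-object lookup.
            return self.queryset.get(pk)


def validate_all(pks, queryset, prefetch):
    # With prefetch=True the queryset is evaluated once and indexed by pk,
    # analogous in spirit to the formset's _object_dict.
    object_dict = dict(queryset.all()) if prefetch else {}
    field = CachedChoiceField(queryset, object_dict)
    return [field.to_python(pk) for pk in pks]


books = {pk: f"Book {pk}" for pk in range(1, 16)}

naive_qs = FakeQuerySet(books)
validate_all(list(books), naive_qs, prefetch=False)

cached_qs = FakeQuerySet(books)
validate_all(list(books), cached_qs, prefetch=True)

print(naive_qs.query_count, cached_qs.query_count)
```

 In this toy setup, the fallback to {{{get()}}} is kept so that a submitted
 pk outside the prefetched set would still be resolved with an individual
 lookup rather than rejected.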

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/32244#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
