#33749: Queries take exponential time when filtering in a loop
-------------------------------------+-------------------------------------
               Reporter:  Enhaloed   |          Owner:  nobody
                   Type:             |         Status:  new
  Uncategorized                      |
              Component:  Database   |        Version:  3.2
  layer (models, ORM)                |
               Severity:  Normal     |       Keywords:
           Triage Stage:             |      Has patch:  0
  Unreviewed                         |
    Needs documentation:  0          |    Needs tests:  0
Patch needs improvement:  0          |  Easy pickings:  0
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
 I have a loop that builds up various queries, then applies them with a
 filter.

 I was having some performance issues and started looking into it more
 closely and it seemed to be exponential time instead of something
 relatively linear.

 I made the query execute on each loop by printing the results and used
 django-debug-toolbar to time each query.

 Here is the important loop:

 {{{#!python
 text = 'Choose one, if there are ten or more fallen'
 words = text.split(' ')
 for word in words:
     word_query = Q()
     for search_field in search_fields:
         word_query |= Q(**{search_field + '__icontains': word})

         if search_field == 'name':
             # Also check the alternative name
             word_query |= Q(**{'name_without_punctuation__icontains':
 word})

     if exactness_option == CONS.TEXT_CONTAINS_ALL:
         cards = cards.filter(word_query)
         print(word_query)
         print(cards)
   }}}

 The word_query object scome out as expected, essentially checking if each
 word exists in any of the relationships, one word at a time
 {{{
 (OR: ('name__icontains', 'Choose'),
 ('name_without_punctuation__icontains', 'Choose'),
 ('name_without_punctuation__icontains', 'Choose'),
 ('ability_texts__text__icontains', 'Choose'), ('races__name__icontains',
 'Choose'))
 (OR: ('name__icontains', 'one,'), ('name_without_punctuation__icontains',
 'one,'), ('name_without_punctuation__icontains', 'one,'),
 ('ability_texts__text__icontains', 'one,'), ('races__name__icontains',
 'one,'))
 (OR: ('name__icontains', 'if'), ('name_without_punctuation__icontains',
 'if'), ('name_without_punctuation__icontains', 'if'),
 ('ability_texts__text__icontains', 'if'), ('races__name__icontains',
 'if'))
 (OR: ('name__icontains', 'there'), ('name_without_punctuation__icontains',
 'there'), ('name_without_punctuation__icontains', 'there'),
 ('ability_texts__text__icontains', 'there'), ('races__name__icontains',
 'there'))
 (OR: ('name__icontains', 'are'), ('name_without_punctuation__icontains',
 'are'), ('name_without_punctuation__icontains', 'are'),
 ('ability_texts__text__icontains', 'are'), ('races__name__icontains',
 'are'))
 (OR: ('name__icontains', 'ten'), ('name_without_punctuation__icontains',
 'ten'), ('name_without_punctuation__icontains', 'ten'),
 ('ability_texts__text__icontains', 'ten'), ('races__name__icontains',
 'ten'))
 (OR: ('name__icontains', 'or'), ('name_without_punctuation__icontains',
 'or'), ('name_without_punctuation__icontains', 'or'),
 ('ability_texts__text__icontains', 'or'), ('races__name__icontains',
 'or'))
 (OR: ('name__icontains', 'more'), ('name_without_punctuation__icontains',
 'more'), ('name_without_punctuation__icontains', 'more'),
 ('ability_texts__text__icontains', 'more'), ('races__name__icontains',
 'more'))
 (OR: ('name__icontains', 'fallen'),
 ('name_without_punctuation__icontains', 'fallen'),
 ('name_without_punctuation__icontains', 'fallen'),
 ('ability_texts__text__icontains', 'fallen'), ('races__name__icontains',
 'fallen'))
 }}}

 Using django-debug-toolbar to check the times for these queries however go
 as follows:
 {{{
 0.47s
 0.83s
 1.04s
 2.43s
 4.74s
 26.27s
 60.18s
 124.46s
 175.75s
 }}}

 As far as I can see these values should be somewhat similar since each
 query is checking the same 4 columns for a string of comparable length.
 Instead, they are going somewhat exponential.

 This is using PostgreSQL 13.4 with the engine
 django.db.backends.postgresql, Django v3.2.6

 Let me know if there's any other details I can provide to help with this.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/33749>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/010701810fea5fd7-89c0ff32-8709-426f-8019-3a46b265ae24-000000%40eu-central-1.amazonses.com.

Reply via email to