#16937: Prefetch related objects
-------------------------------------+-------------------------------------
     Reporter:  lukeplant            |                    Owner:  nobody
         Type:  New feature          |                   Status:  new
    Component:  Database layer       |                  Version:  1.3
  (models, ORM)                      |               Resolution:
     Severity:  Normal               |             Triage Stage:  Accepted
     Keywords:                       |      Needs documentation:  0
    Has patch:  0                    |  Patch needs improvement:  1
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by anonymous):

 Here is the hash-join (dict-join) patch. The test is a little different
 this time (I lost my original database...)

 Profile without dict-join:

 {{{
 Loop 0, used 0:00:21.332133
 Loop 1, used 0:00:21.286314
 Total: 0:00:42.618382
          11312670 function calls (11220378 primitive calls) in 43.177 CPU seconds

    Ordered by: cumulative time

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.002    0.002   43.178   43.178 speedtester.py:1(<module>)
 68813/17921    0.114    0.000   42.637    0.002 {len}
         1    0.011    0.011   42.636   42.636 transaction.py:206(inner)
         1    0.012    0.012   42.623   42.623 speedtester.py:25(managed)
       4/2    0.000    0.000   42.605   21.303 query.py:90(__iter__)
       6/4    0.004    0.001   42.605   10.651 query.py:75(__len__)
          2    0.000    0.000   42.357   21.178 query.py:545(_prefetch_related_objects)
          2    0.005    0.002   42.357   21.178 query.py:1534(prefetch_related_objects)
          2   19.538    9.769   42.270   21.135 query.py:1627(_prefetch_one_level)
 10110574/10107823   17.424    0.000   17.629    0.000 {getattr}
      2008    0.042    0.000    4.144    0.002 related.py:449(all)
      2008    0.010    0.000    4.095    0.002 manager.py:115(all)
       2008    0.056    0.000    4.086    0.002 related.py:434(get_query_set)
      4026    0.054    0.000    2.797    0.001 query.py:837(_clone)
      4026    0.223    0.000    2.713    0.001 query.py:233(clone)
      2010    0.011    0.000    2.370    0.001 query.py:596(filter)
       2012    0.033    0.000    2.360    0.001 query.py:610(_filter_or_exclude)
 48312/16104    0.737    0.000    2.292    0.000 copy.py:145(deepcopy)
 }}}

 Profile with dict-join:

 {{{
 Loop 0, used 0:00:02.813404
 Loop 1, used 0:00:02.763317
 Total: 0:00:05.576659
          1254598 function calls (1162306 primitive calls) in 6.158 CPU seconds

    Ordered by: cumulative time

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.002    0.002    6.159    6.159 speedtester.py:1(<module>)
 68813/17921    0.114    0.000    5.595    0.000 {len}
         1    0.012    0.012    5.594    5.594 transaction.py:206(inner)
         1    0.012    0.012    5.582    5.582 speedtester.py:25(managed)
       4/2    0.000    0.000    5.563    2.782 query.py:90(__iter__)
       6/4    0.004    0.001    5.563    1.391 query.py:75(__len__)
          2    0.000    0.000    5.314    2.657 query.py:545(_prefetch_related_objects)
          2    0.005    0.003    5.314    2.657 query.py:1534(prefetch_related_objects)
          2    0.087    0.044    5.225    2.613 query.py:1627(_prefetch_one_level)
      2008    0.025    0.000    3.925    0.002 related.py:449(all)
      2008    0.009    0.000    3.897    0.002 manager.py:115(all)
       2008    0.048    0.000    3.888    0.002 related.py:434(get_query_set)
      4026    0.053    0.000    2.855    0.001 query.py:837(_clone)
      4026    0.211    0.000    2.774    0.001 query.py:233(clone)
 48312/16104    0.739    0.000    2.362    0.000 copy.py:145(deepcopy)
 }}}

 Note that roughly 2/3 of the time is now spent in all() (3.925 s of the
 5.563 s cumulative).
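
 For readers who have not seen the patch, the difference between the two
 join styles can be sketched as below. This is only an illustration, not the
 code in query.py: Author, Book, nested_loop_join and dict_join are
 hypothetical names, and the real _prefetch_one_level does more work. The
 roughly 10 million getattr calls in the first profile are at least
 consistent with this kind of every-parent-scans-every-child matching.

```python
from collections import defaultdict


class Author:
    # Hypothetical stand-in for a parent model instance.
    def __init__(self, pk):
        self.pk = pk


class Book:
    # Hypothetical stand-in for a related model instance.
    def __init__(self, pk, author_id):
        self.pk = pk
        self.author_id = author_id


def nested_loop_join(parents, children, fk_attr):
    """Match children to parents by scanning all children for every parent.

    Cost grows with len(parents) * len(children).
    """
    matched = {}
    for parent in parents:
        matched[parent.pk] = [
            child for child in children
            if getattr(child, fk_attr) == parent.pk
        ]
    return matched


def dict_join(parents, children, fk_attr):
    """Match children to parents with one pass over each list.

    Cost grows with len(parents) + len(children).
    """
    by_fk = defaultdict(list)
    for child in children:
        by_fk[getattr(child, fk_attr)].append(child)
    return {parent.pk: by_fk.get(parent.pk, []) for parent in parents}


authors = [Author(pk) for pk in range(3)]
books = [Book(pk, author_id=pk % 2) for pk in range(6)]

# Both styles produce the same grouping; only the amount of work differs.
assert nested_loop_join(authors, books, "author_id") == dict_join(
    authors, books, "author_id"
)
```

 Both functions return the same mapping, so the change is purely about how
 many attribute lookups are needed to build it.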

 Here is yet another profile. This time the related objects are stored on
 each object under attname + '_prefetched' (via setattr) instead of in the
 related manager.

 {{{
 Loop 0, used 0:00:00.811442
 Loop 1, used 0:00:00.756192
 Total: 0:00:01.567570
          425294 function calls (397258 primitive calls) in 2.121 CPU seconds

    Ordered by: cumulative time

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.002    0.002    2.122    2.122 speedtester.py:1(<module>)
 42709/17921    0.069    0.000    1.593    0.000 {len}
         1    0.005    0.005    1.578    1.578 transaction.py:206(inner)
         1    0.005    0.005    1.573    1.573 speedtester.py:25(managed)
       4/2    0.000    0.000    1.561    0.781 query.py:90(__iter__)
       6/4    0.004    0.001    1.561    0.390 query.py:75(__len__)
          2    0.000    0.000    1.315    0.657 query.py:545(_prefetch_related_objects)
          2    0.005    0.003    1.314    0.657 query.py:1534(prefetch_related_objects)
     12052    0.092    0.000    1.285    0.000 query.py:229(iterator)
 }}}

 Last, memory usage per style:

 {{{
 original (nested-loop style join): 39023 KB
 dict-join: 39043 KB
 no related manager: 27811 KB
 }}}

 These figures are total memory use per process. This means that fetching a
 list of 1000 objects costs about 11 MB of additional memory (39023 versus
 27811 above) for the queryset caching. I am still of the opinion that
 saving the prefetched objects in obj.books_prefetched instead of behind
 obj.books.all() would be a good move, all the more so because you can't
 actually do anything to the .all() queryset (filter, order, etc.) without
 losing the benefit of the prefetch.
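
 A minimal sketch of what I mean by storing the result in a plain
 attribute. The names Author, Book and attach_prefetched are hypothetical
 and this is not the patch itself; it only shows the shape of the idea,
 assuming the related objects have already been fetched in one query.

```python
from collections import defaultdict


class Author:
    # Hypothetical stand-in for a parent model instance.
    def __init__(self, pk):
        self.pk = pk


class Book:
    # Hypothetical stand-in for a related model instance.
    def __init__(self, pk, author_id):
        self.pk = pk
        self.author_id = author_id


def attach_prefetched(parents, children, fk_attr, attname):
    """Store each parent's related objects as a plain list attribute."""
    by_fk = defaultdict(list)
    for child in children:
        by_fk[getattr(child, fk_attr)].append(child)
    for parent in parents:
        # No manager or queryset is created; the list is the whole cache.
        setattr(parent, attname + "_prefetched", by_fk.get(parent.pk, []))


authors = [Author(1), Author(2)]
books = [Book(10, 1), Book(11, 1), Book(12, 2)]
attach_prefetched(authors, books, "author_id", "books")

print([book.pk for book in authors[0].books_prefetched])  # prints [10, 11]
```

 Callers would then read obj.books_prefetched directly, and any filtering
 or ordering would be ordinary list operations rather than new queries.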

-- 
Ticket URL: <https://code.djangoproject.com/ticket/16937#comment:18>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
