#16937: Prefetch related objects
-------------------------------------+-------------------------------------
Reporter: lukeplant | Owner: nobody
Type: New feature | Status: new
Component: Database layer (models, ORM) | Version: 1.3
 | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Comment (by anonymous):
Here is the hash-join patch. The test is a little different this time
(I lost my original database...).
Profile without dict-join:
{{{
Loop 0, used 0:00:21.332133
Loop 1, used 0:00:21.286314
Total: 0:00:42.618382
11312670 function calls (11220378 primitive calls) in 43.177 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 43.178 43.178 speedtester.py:1(<module>)
68813/17921 0.114 0.000 42.637 0.002 {len}
1 0.011 0.011 42.636 42.636 transaction.py:206(inner)
1 0.012 0.012 42.623 42.623 speedtester.py:25(managed)
4/2 0.000 0.000 42.605 21.303 query.py:90(__iter__)
6/4 0.004 0.001 42.605 10.651 query.py:75(__len__)
2 0.000 0.000 42.357 21.178 query.py:545(_prefetch_related_objects)
2 0.005 0.002 42.357 21.178 query.py:1534(prefetch_related_objects)
2 19.538 9.769 42.270 21.135 query.py:1627(_prefetch_one_level)
10110574/10107823 17.424 0.000 17.629 0.000 {getattr}
2008 0.042 0.000 4.144 0.002 related.py:449(all)
2008 0.010 0.000 4.095 0.002 manager.py:115(all)
2008 0.056 0.000 4.086 0.002 related.py:434(get_query_set)
4026 0.054 0.000 2.797 0.001 query.py:837(_clone)
4026 0.223 0.000 2.713 0.001 query.py:233(clone)
2010 0.011 0.000 2.370 0.001 query.py:596(filter)
2012 0.033 0.000 2.360 0.001 query.py:610(_filter_or_exclude)
48312/16104 0.737 0.000 2.292 0.000 copy.py:145(deepcopy)
}}}
With dict join:
{{{
Loop 0, used 0:00:02.813404
Loop 1, used 0:00:02.763317
Total: 0:00:05.576659
1254598 function calls (1162306 primitive calls) in 6.158 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 6.159 6.159 speedtester.py:1(<module>)
68813/17921 0.114 0.000 5.595 0.000 {len}
1 0.012 0.012 5.594 5.594 transaction.py:206(inner)
1 0.012 0.012 5.582 5.582 speedtester.py:25(managed)
4/2 0.000 0.000 5.563 2.782 query.py:90(__iter__)
6/4 0.004 0.001 5.563 1.391 query.py:75(__len__)
2 0.000 0.000 5.314 2.657 query.py:545(_prefetch_related_objects)
2 0.005 0.003 5.314 2.657 query.py:1534(prefetch_related_objects)
2 0.087 0.044 5.225 2.613 query.py:1627(_prefetch_one_level)
2008 0.025 0.000 3.925 0.002 related.py:449(all)
2008 0.009 0.000 3.897 0.002 manager.py:115(all)
2008 0.048 0.000 3.888 0.002 related.py:434(get_query_set)
4026 0.053 0.000 2.855 0.001 query.py:837(_clone)
4026 0.211 0.000 2.774 0.001 query.py:233(clone)
48312/16104 0.739 0.000 2.362 0.000 copy.py:145(deepcopy)
}}}
Note that about 2/3 of the time is now spent in all().
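
To make the difference between the two profiles above concrete, here is a
minimal sketch of a nested-loop join versus a dict (hash) join. It assumes
plain Python objects rather than Django models; Author, Book,
nested_loop_join and dict_join are hypothetical names used for illustration
and are not taken from the patch. The roughly 10 million getattr calls in
the first profile are consistent with the nested-loop shape, where every
related object is examined once per parent.

```python
from collections import defaultdict


class Author:
    def __init__(self, pk):
        self.pk = pk


class Book:
    def __init__(self, pk, author_id):
        self.pk = pk
        self.author_id = author_id


def nested_loop_join(parents, children, fk_attr):
    # O(len(parents) * len(children)): every child is checked for every parent.
    result = {}
    for parent in parents:
        result[parent.pk] = [
            child for child in children if getattr(child, fk_attr) == parent.pk
        ]
    return result


def dict_join(parents, children, fk_attr):
    # O(len(parents) + len(children)): bucket the children by foreign key
    # once, then look each parent up in the dict.
    by_fk = defaultdict(list)
    for child in children:
        by_fk[getattr(child, fk_attr)].append(child)
    return {parent.pk: by_fk.get(parent.pk, []) for parent in parents}


authors = [Author(pk) for pk in range(3)]
books = [Book(pk, author_id=pk % 2) for pk in range(6)]

nested = nested_loop_join(authors, books, "author_id")
hashed = dict_join(authors, books, "author_id")
```

Both functions return the same mapping; only the amount of work differs,
which matters once there are thousands of rows on each side.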
Here is yet another profile. This time the related objects are stored
directly on each object with setattr, under the attribute name
attname + '_prefetched', instead of going through the related manager.
{{{
Loop 0, used 0:00:00.811442
Loop 1, used 0:00:00.756192
Total: 0:00:01.567570
425294 function calls (397258 primitive calls) in 2.121 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 2.122 2.122 speedtester.py:1(<module>)
42709/17921 0.069 0.000 1.593 0.000 {len}
1 0.005 0.005 1.578 1.578 transaction.py:206(inner)
1 0.005 0.005 1.573 1.573 speedtester.py:25(managed)
4/2 0.000 0.000 1.561 0.781 query.py:90(__iter__)
6/4 0.004 0.001 1.561 0.390 query.py:75(__len__)
2 0.000 0.000 1.315 0.657 query.py:545(_prefetch_related_objects)
2 0.005 0.003 1.314 0.657 query.py:1534(prefetch_related_objects)
12052 0.092 0.000 1.285 0.000 query.py:229(iterator)
}}}
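
The attribute-based variant profiled above could look roughly like the
following sketch. Again this uses plain Python objects; attach_prefetched,
Author and Book are hypothetical names, and the real patch may differ. The
point is only that the related objects end up in a plain list on the
instance, so no manager or queryset is cloned per object.

```python
class Author:
    def __init__(self, pk):
        self.pk = pk


class Book:
    def __init__(self, pk, author_id):
        self.pk = pk
        self.author_id = author_id


def attach_prefetched(parents, children, fk_attr, attname):
    # Bucket the children by foreign key (the dict join step).
    by_fk = {}
    for child in children:
        by_fk.setdefault(getattr(child, fk_attr), []).append(child)
    # Store each bucket as a plain list attribute on the parent.
    for parent in parents:
        setattr(parent, attname + "_prefetched", by_fk.get(parent.pk, []))


authors = [Author(pk) for pk in range(3)]
books = [Book(pk, author_id=pk % 2) for pk in range(6)]
attach_prefetched(authors, books, "author_id", "books")

first_author_book_pks = [book.pk for book in authors[0].books_prefetched]
```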
Last, memory usage per style:
{{{
original (nest-loop style join): 39023Kb
dict-join: 39043Kb
no related manager: 27811Kb
}}}
These figures are total memory use per process. The gap between the first
and last rows means that fetching a list of 1000 objects costs over 10 MB
of additional memory for the queryset caching. I am still of the opinion
that saving the prefetched objects in obj.books_prefetched instead of
obj.books.all() would be a good move, all the more so because you can't
actually do anything to the .all() queryset (filter, order, etc.) without
losing the benefit of the prefetch.
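
For reference, the arithmetic behind the "over 10 MB" figure, using only
the numbers from the memory table above. The per-object value assumes the
1000 objects mentioned in the text and is an average, not a measurement.

```python
# Total process memory in KB, copied from the table above.
original_kb = 39023      # nest-loop style join, related manager
dict_join_kb = 39043     # dict join, related manager
no_manager_kb = 27811    # no related manager

object_count = 1000      # assumed from the text

# Extra memory attributable to keeping a manager/queryset per object.
manager_overhead_kb = original_kb - no_manager_kb
per_object_kb = manager_overhead_kb / object_count

# The join strategy itself barely changes memory use.
dict_join_extra_kb = dict_join_kb - original_kb

print(manager_overhead_kb, per_object_kb, dict_join_extra_kb)
```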
--
Ticket URL: <https://code.djangoproject.com/ticket/16937#comment:18>