[Django] #16679: Speed up signals by caching the receivers per sender

Django Mon, 22 Aug 2011 15:46:57 -0700

#16679: Speed up signals by caching the receivers per sender
----------------------------+----------------------------------------------
 Reporter:  akaariai        |          Owner:  nobody
     Type:                  |         Status:  new
  Cleanup/optimization      |      Component:  Database layer (models, ORM)
Milestone:                  |       Severity:  Normal
  Version:  1.3             |   Triage Stage:  Unreviewed
 Keywords:                  |  Easy pickings:  0
Has patch:  1               |
    UI/UX:  0               |
----------------------------+----------------------------------------------
 This is related to ticket #16639, where I tried to speed up
 db.models.Model `__init__` by some kludges in pre_init and post_init
 signals. Here is another try, this time with a more generic approach.


 I have two patches. The first speeds up only the case where there are no
 receivers for the current sender. It does this by caching "no receivers"
 whenever send is called and the sender has no receivers. The cache is
 cleared whenever connect or disconnect is called, or whenever a weakly
 referenced receiver is removed by garbage collection. Performance is
 roughly the same as trunk when there are receivers for the current
 sender. When there are receivers, but none for the current sender,
 performance is much better: there is almost no overhead.
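
 As a rough illustration of the first patch's idea (not the patch itself),
 here is a minimal sketch. It assumes strong references instead of
 weakrefs and matches senders by identity; the class and the
 `_no_receivers` attribute are hypothetical names.

```python
class Signal:
    """Toy signal with a per-sender "no receivers" cache (not Django's code)."""

    def __init__(self):
        self.receivers = []          # (sender, receiver) pairs; sender None means "any"
        self._no_receivers = set()   # senders known to have nothing to call

    def connect(self, receiver, sender=None):
        self.receivers.append((sender, receiver))
        self._no_receivers.clear()   # any change invalidates the cache

    def disconnect(self, receiver, sender=None):
        self.receivers.remove((sender, receiver))
        self._no_receivers.clear()

    def send(self, sender, **named):
        if sender in self._no_receivers:
            return []                # fast path: skip the scan entirely
        live = [r for s, r in self.receivers if s is None or s is sender]
        if not live:
            self._no_receivers.add(sender)   # remember the negative result
            return []
        return [(r, r(signal=self, sender=sender, **named)) for r in live]
```

 A real implementation would also have to clear the cache from the weakref
 callback that fires when a receiver is garbage collected.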

 The second patch optimizes things a bit further by remembering not only
 which senders have receivers, but also the list of receivers for each
 sender. This requires more memory, but I believe this optimization might
 be needed for further work (for example, deferred models do not currently
 send signals correctly, and correcting this might mean more work in
 signal.send()).
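
 A sketch of the second approach, under the same simplifying assumptions
 (no weakrefs, identity matching, hypothetical names): the cache maps each
 sender to the list of receivers to call, and an empty list doubles as the
 "no receivers" marker.

```python
class CachingSignal:
    """Toy signal that memoizes the receiver list per sender (not Django's code)."""

    def __init__(self):
        self.receivers = []   # (sender, receiver) pairs; sender None means "any"
        self._cache = {}      # sender -> receivers to call, possibly an empty list

    def connect(self, receiver, sender=None):
        self.receivers.append((sender, receiver))
        self._cache.clear()

    def disconnect(self, receiver, sender=None):
        self.receivers.remove((sender, receiver))
        self._cache.clear()

    def _receivers_for(self, sender):
        try:
            return self._cache[sender]      # requires a hashable sender
        except KeyError:
            found = [r for s, r in self.receivers if s is None or s is sender]
            self._cache[sender] = found     # one entry per distinct sender
            return found

    def send(self, sender, **named):
        return [(r, r(signal=self, sender=sender, **named))
                for r in self._receivers_for(sender)]
```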

 There are some problems:
   1. The sender object must be hashable. In the trunk code only `__eq__`
 is assumed.
   2. The caches can leak memory. For example the template_rendered signal
 has that property.
   3. Code complexity.
   4. More work if receivers are connected and disconnected constantly,
 since every change clears the cache.
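
 The first two problems can be reproduced in isolation. This is a sketch
 with hypothetical names, run under Python 3 semantics, where a class that
 defines `__eq__` without `__hash__` is unhashable; a plain dict stands in
 for the cache.

```python
class EqOnlySender:
    """Defines __eq__ but not __hash__, so Python 3 makes it unhashable."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, EqOnlySender) and self.name == other.name


def can_be_cache_key(obj):
    """Return True if obj can be used as a dict key."""
    try:
        {obj: []}
        return True
    except TypeError:
        return False


# Problem 1: a linear scan only needs ==, so this sender works without a cache...
registered = [(EqOnlySender("a"), "handler")]
scan_hits = [r for s, r in registered if s == EqOnlySender("a")]
# ...but it cannot be used as a cache key.
eq_only_is_hashable = can_be_cache_key(EqOnlySender("a"))

# Problem 2: every distinct sender leaves a cache entry behind.
cache = {}
for _ in range(1000):
    sender = object()    # stands in for a fresh sender object on every send
    cache[sender] = []   # the "no receivers" result is kept and never evicted
```

 After the loop the cache holds 1000 entries, and because the dict keeps a
 strong reference to each sender, none of them can be garbage collected.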

 The first two problems can be solved by adding a use_caching kwarg to
 Signal.`__init__` (default False). Some signals will benefit from caching
 (model signals at least) and some would leak too much memory with caching
 (template_rendered, notably). The use_caching kwarg is included in the
 second patch.
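
 A sketch of how such an opt-in flag could look. Only the kwarg name comes
 from this ticket; the class body and the two module-level declarations are
 assumptions for illustration, not the patch.

```python
class Signal:
    """Toy signal with opt-in caching controlled by a use_caching kwarg."""

    def __init__(self, use_caching=False):
        self.use_caching = use_caching
        self.receivers = []   # (sender, receiver) pairs; sender None means "any"
        self._cache = {}      # only touched when use_caching is True

    def connect(self, receiver, sender=None):
        self.receivers.append((sender, receiver))
        self._cache.clear()

    def _receivers_for(self, sender):
        # With caching off the sender is never hashed or stored.
        if self.use_caching and sender in self._cache:
            return self._cache[sender]
        found = [r for s, r in self.receivers if s is None or s is sender]
        if self.use_caching:
            self._cache[sender] = found
        return found

    def send(self, sender, **named):
        return [(r, r(signal=self, sender=sender, **named))
                for r in self._receivers_for(sender)]


# Hypothetical declarations: model signals opt in, template_rendered does not.
pre_init = Signal(use_caching=True)
template_rendered = Signal()
```

 With the default of False, existing signals keep accepting unhashable
 senders and never accumulate cache entries.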

 Rough numbers before and after the second patch, for 10000 `__init__`
 calls on a simple model (only id):
 pre patch:
  - no signals at all: 108ms
  - one pre_init signal to other model: 0.171ms
  - one pre_init and one post_init signal to other model: 0.220ms
  - one pre_init signal to initialized model: 0.215ms
 post patch 2:
  - no signals at all: 103ms
  - one pre_init signal to other model: 0.115ms
  - one pre_init and one post_init signal to other model: 0.115ms
  - one pre_init signal to initialized model: 0.173ms

 When the project has one pre_init and one post_init signal connected, the
 saving is around 50% in the best case.
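
 The relative savings follow directly from the timings listed above. A
 small sketch of the arithmetic, using only those figures (the ratio does
 not depend on the unit):

```python
# Timings copied from the lists above as (pre patch, post patch 2).
timings = {
    "one pre_init signal to other model": (0.171, 0.115),
    "one pre_init and one post_init signal to other model": (0.220, 0.115),
    "one pre_init signal to initialized model": (0.215, 0.173),
}


def saving(before, after):
    """Fraction of the original time that the patch removes."""
    return (before - after) / before


savings = {case: saving(b, a) for case, (b, a) in timings.items()}
for case, fraction in savings.items():
    print(f"{case}: {fraction:.0%}")
```

 This gives roughly 33%, 48% and 20% for the three cases, so the "around
 50%" figure corresponds to the pre_init plus post_init case.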

 The numbers are nice, but IMHO the real benefit is that with the second
 patch, having inherited model signals would be possible without paying a
 huge performance penalty. Fixing deferred objects not firing signals would
 also be easier without performance problems.

 The patches are still somewhat WIP, but I'd like to open discussion
 before I polish and test them further.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/16679>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

