#33532: Micro-optimisation: Special-case dictionaries for CaseInsensitiveMapping.
------------------------------------------------+------------------------------
               Reporter:  Keryn Knight          |          Owner:  Keryn Knight
                   Type:  Cleanup/optimization  |         Status:  assigned
              Component:  HTTP handling         |        Version:  dev
               Severity:  Normal                |       Keywords:
           Triage Stage:  Unreviewed            |      Has patch:  0
    Needs documentation:  0                     |    Needs tests:  0
Patch needs improvement:  0                     |  Easy pickings:  0
                  UI/UX:  0                     |
------------------------------------------------+------------------------------
 Currently, all data given to `HttpHeaders` and `ResponseHeaders` goes
 through `CaseInsensitiveMapping._unpack_items`, which unpacks it in case
 it's ''not'' a dictionary.

 In the case of `HttpHeaders`, though, it's ''always'' a dictionary (and
 I have a plan for `ResponseHeaders` in general, as a separate patch,
 should this one go OK as a prerequisite). So for the common uses of
 `CaseInsensitiveMapping`, we can optimise for the expected use case by
 **introducing** an additional `isinstance` check, which avoids going
 through `_unpack_items`, `__instancecheck__` and `_abc._abc_instancecheck`.

 The change itself (I'll add a PR after I get a ticket number) is thus:
 {{{
 -        self._store = {k.lower(): (k, v) for k, v in self._unpack_items(data)}
 +        data = data.items() if isinstance(data, dict) else self._unpack_items(data)
 +        self._store = {k.lower(): (k, v) for k, v in data}
 }}}
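
 To show the shape of the change outside of Django, here is a minimal,
 self-contained sketch. `SketchCaseInsensitiveMapping` is a hypothetical,
 simplified stand-in (it is not Django's class, and its `_unpack_items` is
 only an assumed approximation of the fallback path); only the two lines in
 `__init__` mirror the patch above.

```python
from collections.abc import Mapping


class SketchCaseInsensitiveMapping(Mapping):
    """Hypothetical, simplified stand-in; not Django's actual class."""

    def __init__(self, data):
        # Fast path from the patch: a plain dict skips _unpack_items entirely.
        data = data.items() if isinstance(data, dict) else self._unpack_items(data)
        self._store = {k.lower(): (k, v) for k, v in data}

    def __getitem__(self, key):
        # Look up by lower-cased key, return the stored value.
        return self._store[key.lower()][1]

    def __len__(self):
        return len(self._store)

    def __iter__(self):
        # Iterate over the keys in their original casing.
        return (original_key for original_key, _ in self._store.values())

    @staticmethod
    def _unpack_items(data):
        # Assumed fallback: accept any Mapping, or an iterable of 2-item pairs.
        if isinstance(data, Mapping):
            yield from data.items()
            return
        for elem in data:
            if len(elem) != 2:
                raise ValueError("expected (key, value) pairs")
            yield elem


from_dict = SketchCaseInsensitiveMapping({"Name": "Jane"})      # fast path
from_pairs = SketchCaseInsensitiveMapping([("Name", "Jane")])   # fallback path
```

 Both constructions should behave identically from the caller's point of
 view; the only intended difference is how much work `__init__` does.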

 The `isinstance` check will **add** roughly `70ns` (for me) to non-dict
 usages of `CaseInsensitiveMapping` (and its subclasses), but will **save**
 roughly `500ns` (for me) when using dictionaries.
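
 A rough way to see where part of that difference comes from, using only
 the standard library: checking against the concrete `dict` type is a
 direct type check, whereas checking against `collections.abc.Mapping`
 goes through the ABC machinery. This is a sketch, not the benchmark used
 for the figures in this ticket; the loop count is arbitrary, the lambda
 adds its own overhead to both measurements, and absolute numbers will
 vary by machine and Python version.

```python
import timeit
from collections.abc import Mapping

data = {"Name": "Jane"}
loops = 200_000  # arbitrary; chosen only to keep the run short

# isinstance against a concrete builtin type.
concrete = timeit.timeit(lambda: isinstance(data, dict), number=loops)

# isinstance against an ABC, which is routed through ABCMeta.__instancecheck__.
abstract = timeit.timeit(lambda: isinstance(data, Mapping), number=loops)

print(f"isinstance(data, dict):    {concrete / loops * 1e9:.0f} ns per call")
print(f"isinstance(data, Mapping): {abstract / loops * 1e9:.0f} ns per call")
```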

 Baseline, main as of `b626c5a9798b045b655d085d59efdd60b5d7a0e3`:
 {{{
 In [1]: from django.http.request import HttpHeaders

 In [2]: from django.utils.datastructures import CaseInsensitiveMapping

 In [3]: %timeit CaseInsensitiveMapping({'Name': 'Jane'})
 1.4 µs ± 7.94 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

 In [4]: %timeit HttpHeaders({"HTTP_EXAMPLE": 1})
 2.79 µs ± 20.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

 In [5]: %timeit HttpHeaders({"CONTENT_LENGTH": 1})
 2.56 µs ± 40 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

 In [6]: %prun for _ in range(100000): CaseInsensitiveMapping({"Name": "Jane"})
    900009 function calls (900007 primitive calls) in 0.356 seconds
    Ordered by: internal time
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    100000    0.073    0.000    0.231    0.000 datastructures.py:306(<dictcomp>)
    200000    0.066    0.000    0.147    0.000 datastructures.py:328(_unpack_items)
         1    0.063    0.063    0.356    0.356 <string>:1(<module>)
    100000    0.063    0.000    0.293    0.000 datastructures.py:305(__init__)
    100000    0.030    0.000    0.072    0.000 {built-in method builtins.isinstance}
    100000    0.022    0.000    0.041    0.000 abc.py:117(__instancecheck__)
    100000    0.019    0.000    0.020    0.000 {built-in method _abc._abc_instancecheck}
    100000    0.011    0.000    0.011    0.000 {method 'lower' of 'str' objects}
    100000    0.010    0.000    0.010    0.000 {method 'items' of 'dict' objects}
         1    0.000    0.000    0.356    0.356 {built-in method builtins.exec}
       2/1    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
       2/1    0.000    0.000    0.000    0.000 abc.py:121(__subclasscheck__)
         2    0.000    0.000    0.000    0.000 _collections_abc.py:409(__subclasshook__)
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 }}}

 After adding the patch suggested above:
 {{{
 In [1]: from django.http.request import HttpHeaders

 In [2]: from django.utils.datastructures import CaseInsensitiveMapping

 In [3]: %timeit CaseInsensitiveMapping({'Name': 'Jane'})
 862 ns ± 9.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

 In [4]: %timeit HttpHeaders({"HTTP_EXAMPLE": 1})
 2.05 µs ± 31 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

 In [5]: %timeit HttpHeaders({"CONTENT_LENGTH": 1})
 1.98 µs ± 18.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

 In [6]: %prun for _ in range(100000): CaseInsensitiveMapping({"Name": "Jane"})
    500003 function calls in 0.204 seconds
    Ordered by: internal time
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    100000    0.084    0.000    0.148    0.000 datastructures.py:305(__init__)
         1    0.056    0.056    0.204    0.204 <string>:1(<module>)
    100000    0.038    0.000    0.047    0.000 datastructures.py:307(<dictcomp>)
    100000    0.009    0.000    0.009    0.000 {method 'lower' of 'str' objects}
    100000    0.009    0.000    0.009    0.000 {method 'items' of 'dict' objects}
    100000    0.008    0.000    0.008    0.000 {built-in method builtins.isinstance}
         1    0.000    0.000    0.204    0.204 {built-in method builtins.exec}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 }}}

 As the `cProfile` output shows, the patch cuts the number of function
 calls from 900009 to 500003 and the total time from 0.356 to 0.204 seconds
 for 100,000 constructions. The `%timeit` saving also looks consistent
 across both classes and both likely header paths (the `HTTP_`-prefixed
 and `CONTENT_LENGTH` cases).
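
 For anyone wanting to collect similar call counts without IPython's
 `%prun`, something along these lines should work with the standard
 library's `cProfile` and `pstats`. The `build` function is a hypothetical
 stand-in workload (not Django code), and the loop count is arbitrary; to
 profile the real class, call `CaseInsensitiveMapping` in the loop instead.

```python
import cProfile
import io
import pstats
from collections.abc import Mapping


def build(data):
    # Hypothetical stand-in workload: an ABC isinstance check followed by
    # a dict comprehension, loosely resembling the unpatched code path.
    items = data.items() if isinstance(data, Mapping) else data
    return {k.lower(): (k, v) for k, v in items}


profiler = cProfile.Profile()
profiler.enable()
for _ in range(10_000):
    build({"Name": "Jane"})
profiler.disable()

# Sort by internal time, the same ordering shown in the %prun output.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("tottime")
stats.print_stats(10)
report = buffer.getvalue()
print(report)
```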

 Tests all pass for me locally...

-- 
Ticket URL: <https://code.djangoproject.com/ticket/33532>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
