#33532: Micro-optimisation: Special-case dictionaries for
CaseInsensitiveMapping.
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn Knight
Type: Cleanup/optimization | Status: assigned
Component: HTTP handling | Version: dev
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Currently, all data given to `HttpHeaders` and `ResponseHeaders` goes
through `CaseInsensitiveMapping._unpack_items`, which unwinds it in case
it's ''not'' a dictionary.
In the case of `HttpHeaders`, though, it's ''always'' a dictionary (and
I've got a plan for `ResponseHeaders` in general, as a separate patch,
should this one go OK as a prerequisite). So for the common uses of
`CaseInsensitiveMapping`, we can optimise for the expected use case by
**introducing** an additional `isinstance` check that avoids going
through `_unpack_items`, `__instancecheck__` and `_abc._abc_instancecheck`.
The change itself (I'll add a PR after I get a ticket number) is thus:
{{{
- self._store = {k.lower(): (k, v) for k, v in self._unpack_items(data)}
+ data = data.items() if isinstance(data, dict) else self._unpack_items(data)
+ self._store = {k.lower(): (k, v) for k, v in data}
}}}
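To make the two code paths concrete, here is a minimal sketch of the idea in isolation. `SketchCaseInsensitiveMapping` is a hypothetical, simplified stand-in written for illustration; it is not Django's actual class and omits most of its validation and methods.

```python
from collections.abc import Mapping


class SketchCaseInsensitiveMapping(Mapping):
    """Simplified, hypothetical stand-in; not Django's implementation."""

    def __init__(self, data):
        # Fast path: a real dict is detected with a cheap isinstance check
        # and its items are used directly, skipping the generator below.
        items = data.items() if isinstance(data, dict) else self._unpack_items(data)
        self._store = {k.lower(): (k, v) for k, v in items}

    def __getitem__(self, key):
        return self._store[key.lower()][1]

    def __len__(self):
        return len(self._store)

    def __iter__(self):
        return (original_key for original_key, _ in self._store.values())

    @staticmethod
    def _unpack_items(data):
        # Slow path: checks against the Mapping ABC, then falls back to
        # treating the input as an iterable of (key, value) pairs.
        if isinstance(data, Mapping):
            yield from data.items()
            return
        for elem in data:
            if len(elem) != 2:
                raise ValueError("expected (key, value) pairs")
            yield elem


from_dict = SketchCaseInsensitiveMapping({"Name": "Jane"})     # fast path
from_pairs = SketchCaseInsensitiveMapping([("Name", "Jane")])  # slow path
```

Both inputs should end up with the same contents; only the route taken through `__init__` differs.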
The `isinstance` check will **add** roughly `70ns` (for me) to non-dict
usages of `CaseInsensitiveMapping` (and its subclasses), but will
**save** roughly `500ns` (for me) when using dictionaries.
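As a rough way to see the cost difference between the two kinds of check outside IPython, the following sketch times `isinstance` against `dict` and against the `Mapping` ABC using the standard `timeit` module. The helper names and the loop count are assumptions chosen for illustration, and the absolute numbers will vary by machine and Python version.

```python
import timeit
from collections.abc import Mapping

data = {"Name": "Jane"}


def check_dict():
    # Concrete type check, as added by the proposed patch.
    return isinstance(data, dict)


def check_abc():
    # ABC check, routed through abc.__instancecheck__.
    return isinstance(data, Mapping)


loops = 200_000  # arbitrary count for this sketch
dict_ns = timeit.timeit(check_dict, number=loops) / loops * 1e9
abc_ns = timeit.timeit(check_abc, number=loops) / loops * 1e9
print(f"isinstance(data, dict):    {dict_ns:.0f} ns per call")
print(f"isinstance(data, Mapping): {abc_ns:.0f} ns per call")
```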
Baseline, main as of `b626c5a9798b045b655d085d59efdd60b5d7a0e3`:
{{{
In [1]: from django.http.request import HttpHeaders
In [2]: from django.utils.datastructures import CaseInsensitiveMapping
In [3]: %timeit CaseInsensitiveMapping({'Name': 'Jane'})
1.4 µs ± 7.94 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [4]: %timeit HttpHeaders({"HTTP_EXAMPLE": 1})
2.79 µs ± 20.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [5]: %timeit HttpHeaders({"CONTENT_LENGTH": 1})
2.56 µs ± 40 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [6]: %prun for _ in range(100000): CaseInsensitiveMapping({"Name": "Jane"})
         900009 function calls (900007 primitive calls) in 0.356 seconds
   Ordered by: internal time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100000    0.073    0.000    0.231    0.000 datastructures.py:306(<dictcomp>)
   200000    0.066    0.000    0.147    0.000 datastructures.py:328(_unpack_items)
        1    0.063    0.063    0.356    0.356 <string>:1(<module>)
   100000    0.063    0.000    0.293    0.000 datastructures.py:305(__init__)
   100000    0.030    0.000    0.072    0.000 {built-in method builtins.isinstance}
   100000    0.022    0.000    0.041    0.000 abc.py:117(__instancecheck__)
   100000    0.019    0.000    0.020    0.000 {built-in method _abc._abc_instancecheck}
   100000    0.011    0.000    0.011    0.000 {method 'lower' of 'str' objects}
   100000    0.010    0.000    0.010    0.000 {method 'items' of 'dict' objects}
        1    0.000    0.000    0.356    0.356 {built-in method builtins.exec}
      2/1    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
      2/1    0.000    0.000    0.000    0.000 abc.py:121(__subclasscheck__)
        2    0.000    0.000    0.000    0.000 _collections_abc.py:409(__subclasshook__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
}}}
After adding the patch suggested above:
{{{
In [1]: from django.http.request import HttpHeaders
In [2]: from django.utils.datastructures import CaseInsensitiveMapping
In [3]: %timeit CaseInsensitiveMapping({'Name': 'Jane'})
862 ns ± 9.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [4]: %timeit HttpHeaders({"HTTP_EXAMPLE": 1})
2.05 µs ± 31 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [5]: %timeit HttpHeaders({"CONTENT_LENGTH": 1})
1.98 µs ± 18.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [6]: %prun for _ in range(100000): CaseInsensitiveMapping({"Name": "Jane"})
         500003 function calls in 0.204 seconds
   Ordered by: internal time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100000    0.084    0.000    0.148    0.000 datastructures.py:305(__init__)
        1    0.056    0.056    0.204    0.204 <string>:1(<module>)
   100000    0.038    0.000    0.047    0.000 datastructures.py:307(<dictcomp>)
   100000    0.009    0.000    0.009    0.000 {method 'lower' of 'str' objects}
   100000    0.009    0.000    0.009    0.000 {method 'items' of 'dict' objects}
   100000    0.008    0.000    0.008    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.204    0.204 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
}}}
As the `cProfile` timings and function-call counts show (900009 calls
before, 500003 after, for 100,000 constructions), the patch avoids a
substantial amount of work, and the saving looks consistent across both
classes and both likely paths through an HTTP header.
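For anyone wanting to reproduce the call-count comparison without IPython's `%prun`, a sketch along these lines may help. It uses the standard `cProfile` and `pstats` modules on two hypothetical free functions, `build_baseline` and `build_patched`, which only approximate the before and after versions of `__init__`; they are assumptions for illustration, not Django code.

```python
import cProfile
import pstats
from collections.abc import Mapping


def unpack_items(data):
    # Simplified stand-in for the generator-based unpacking step.
    if isinstance(data, Mapping):
        yield from data.items()
        return
    yield from data


def build_baseline(data):
    # Always goes through the generator and the ABC check.
    return {k.lower(): (k, v) for k, v in unpack_items(data)}


def build_patched(data):
    # Uses dict.items() directly when given a real dict.
    items = data.items() if isinstance(data, dict) else unpack_items(data)
    return {k.lower(): (k, v) for k, v in items}


def count_calls(func, data, repeats=1000):
    # Profile `repeats` invocations and return the total function-call count.
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(repeats):
        func(data)
    profiler.disable()
    return pstats.Stats(profiler).total_calls


baseline_calls = count_calls(build_baseline, {"Name": "Jane"})
patched_calls = count_calls(build_patched, {"Name": "Jane"})
print(f"baseline: {baseline_calls} calls, patched: {patched_calls} calls")
```

The exact counts depend on the Python version (newer interpreters inline comprehensions, for example), but the patched path should record fewer calls for dict input.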
Tests all pass for me locally...
--
Ticket URL: <https://code.djangoproject.com/ticket/33532>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.