#34681: Optimize memcache_key_warnings()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
Type: | Johnson
Cleanup/optimization | Status: assigned
Component: Core (Cache system) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Adam Johnson:
Old description:
> The cache functino memcache_key_warnings() iterates the key character-by-
> character, a slow operation in Python because it has to create and
> destroy many individual `str` objects. Instead, we can search the string
> with a regular expression for a 6x speedup.
>
> IPython session comparing old and new approaches:
>
> {{{
> In [1]: import re
> ...:
> ...: MEMCACHE_MAX_KEY_LENGTH = 250
> ...:
> ...:
> ...: def old(key):
> ...: if len(key) > MEMCACHE_MAX_KEY_LENGTH:
> ...: yield (
> ...: "Cache key will cause errors if used with memcached:
> %r "
> ...: "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
> ...: )
> ...: for char in key:
> ...: if ord(char) < 33 or ord(char) == 127:
> ...: yield (
> ...: "Cache key contains characters that will cause
> errors if "
> ...: "used with memcached: %r" % key
> ...: )
> ...: break
> ...:
> ...: memcached_error_chars_re = re.compile(r"[\x00-\x32\x127]")
> ...:
> ...:
> ...: def new(key):
> ...: if len(key) > MEMCACHE_MAX_KEY_LENGTH:
> ...: yield (
> ...: "Cache key will cause errors if used with memcached:
> %r "
> ...: "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
> ...: )
> ...: if memcached_error_chars_re.match(key):
> ...: yield (
> ...: "Cache key contains characters that will cause errors
> if "
> ...: "used with memcached: %r" % key
> ...: )
> ...:
>
> In [2]: %timeit list(old('acme-bookstore-user-1234567-book-78910391'))
> 1.35 µs ± 1.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
>
> In [3]: %timeit list(new('acme-bookstore-user-1234567-book-78910391'))
> 212 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
>
> In [4]: %timeit list(old('homepage\n'))
> 545 ns ± 1.17 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
>
> In [5]: %timeit list(new('homepage\n'))
> 209 ns ± 1.03 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
> }}}
New description:
The cache functino memcache_key_warnings() iterates the key character-by-
character, a slow operation in Python because it has to create and destroy
many individual `str` objects. Instead, we can search the string with a
regular expression for a ~3x speedup on a reasonably sized error-free
cache key.
IPython session comparing old and new approaches:
{{{
In [1]: import re
...:
...: MEMCACHE_MAX_KEY_LENGTH = 250
...:
...:
...: def old(key):
...: if len(key) > MEMCACHE_MAX_KEY_LENGTH:
...: yield (
...: "Cache key will cause errors if used with memcached:
%r "
...: "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
...: )
...: for char in key:
...: if ord(char) < 33 or ord(char) == 127:
...: yield (
...: "Cache key contains characters that will cause
errors if "
...: "used with memcached: %r" % key
...: )
...: break
...:
...: memcached_error_chars_re = re.compile(r"[\x00-\x20\x7f]")
...:
...:
...: def new(key):
...: if len(key) > MEMCACHE_MAX_KEY_LENGTH:
...: yield (
...: "Cache key will cause errors if used with memcached:
%r "
...: "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
...: )
...: if memcached_error_chars_re.search(key):
...: yield (
...: "Cache key contains characters that will cause errors
if "
...: "used with memcached: %r" % key
...: )
In [2]: %timeit list(old('acme-bookstore-user-1234567-book-78910391'))
1.35 µs ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
each)
In [3]: %timeit list(new('acme-bookstore-user-1234567-book-78910391'))
475 ns ± 1.37 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
each)
In [4]: %timeit list(old('homepage\n'))
546 ns ± 2.88 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
each)
In [5]: %timeit list(new('homepage\n'))
413 ns ± 1.35 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
each)
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/34681#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/01070188fbaf224a-f76950f1-fabe-464e-9613-9dae97163a2f-000000%40eu-central-1.amazonses.com.