#34681: Optimize memcache_key_warnings()
-------------------------------------+-------------------------------------
     Reporter:  Adam Johnson         |                    Owner:  Adam
         Type:                       |  Johnson
  Cleanup/optimization               |                   Status:  assigned
    Component:  Core (Cache system)  |                  Version:  dev
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Adam Johnson:

Old description:

> The cache functino memcache_key_warnings() iterates the key character-by-
> character, a slow operation in Python because it has to create and
> destroy many individual `str` objects. Instead, we can search the string
> with a regular expression for a 6x speedup.
>
> IPython session comparing old and new approaches:
>
> {{{
> In [1]: import re
>    ...:
>    ...: MEMCACHE_MAX_KEY_LENGTH = 250
>    ...:
>    ...:
>    ...: def old(key):
>    ...:     if len(key) > MEMCACHE_MAX_KEY_LENGTH:
>    ...:         yield (
>    ...:             "Cache key will cause errors if used with memcached:
> %r "
>    ...:             "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
>    ...:         )
>    ...:     for char in key:
>    ...:         if ord(char) < 33 or ord(char) == 127:
>    ...:             yield (
>    ...:                 "Cache key contains characters that will cause
> errors if "
>    ...:                 "used with memcached: %r" % key
>    ...:             )
>    ...:             break
>    ...:
>    ...: memcached_error_chars_re = re.compile(r"[\x00-\x32\x127]")
>    ...:
>    ...:
>    ...: def new(key):
>    ...:     if len(key) > MEMCACHE_MAX_KEY_LENGTH:
>    ...:         yield (
>    ...:             "Cache key will cause errors if used with memcached:
> %r "
>    ...:             "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
>    ...:         )
>    ...:     if memcached_error_chars_re.match(key):
>    ...:         yield (
>    ...:             "Cache key contains characters that will cause errors
> if "
>    ...:             "used with memcached: %r" % key
>    ...:         )
>    ...:
>
> In [2]: %timeit list(old('acme-bookstore-user-1234567-book-78910391'))
> 1.35 µs ± 1.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
>
> In [3]: %timeit list(new('acme-bookstore-user-1234567-book-78910391'))
> 212 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
>
> In [4]: %timeit list(old('homepage\n'))
> 545 ns ± 1.17 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
>
> In [5]: %timeit list(new('homepage\n'))
> 209 ns ± 1.03 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
> each)
> }}}

New description:

 The cache functino memcache_key_warnings() iterates the key character-by-
 character, a slow operation in Python because it has to create and destroy
 many individual `str` objects. Instead, we can search the string with a
 regular expression for a ~3x speedup on a reasonably sized error-free
 cache key.

 IPython session comparing old and new approaches:

 {{{
 In [1]: import re
    ...:
    ...: MEMCACHE_MAX_KEY_LENGTH = 250
    ...:
    ...:
    ...: def old(key):
    ...:     if len(key) > MEMCACHE_MAX_KEY_LENGTH:
    ...:         yield (
    ...:             "Cache key will cause errors if used with memcached:
 %r "
    ...:             "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
    ...:         )
    ...:     for char in key:
    ...:         if ord(char) < 33 or ord(char) == 127:
    ...:             yield (
    ...:                 "Cache key contains characters that will cause
 errors if "
    ...:                 "used with memcached: %r" % key
    ...:             )
    ...:             break
    ...:
    ...: memcached_error_chars_re = re.compile(r"[\x00-\x20\x7f]")
    ...:
    ...:
    ...: def new(key):
    ...:     if len(key) > MEMCACHE_MAX_KEY_LENGTH:
    ...:         yield (
    ...:             "Cache key will cause errors if used with memcached:
 %r "
    ...:             "(longer than %s)" % (key, MEMCACHE_MAX_KEY_LENGTH)
    ...:         )
    ...:     if memcached_error_chars_re.search(key):
    ...:         yield (
    ...:             "Cache key contains characters that will cause errors
 if "
    ...:             "used with memcached: %r" % key
    ...:         )

 In [2]: %timeit list(old('acme-bookstore-user-1234567-book-78910391'))
 1.35 µs ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
 each)

 In [3]: %timeit list(new('acme-bookstore-user-1234567-book-78910391'))
 475 ns ± 1.37 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
 each)

 In [4]: %timeit list(old('homepage\n'))
 546 ns ± 2.88 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
 each)

 In [5]: %timeit list(new('homepage\n'))
 413 ns ± 1.35 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
 each)
 }}}

--

-- 
Ticket URL: <https://code.djangoproject.com/ticket/34681#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/01070188fbaf224a-f76950f1-fabe-464e-9613-9dae97163a2f-000000%40eu-central-1.amazonses.com.

Reply via email to