#35440: Update parse_header_parameters to leverage the parsing logic from
(stdlib) email Message.
--------------------------------------+------------------------------------
     Reporter:  Natalia Bidart        |                    Owner:  (none)
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  HTTP handling         |                  Version:  dev
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  0                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  0
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------
Comment (by Pravin):

 I tried using a simple string split for normal headers, falling back to
 {{{email.message}}} only when the header contains complex symbols like " or *.


 {{{
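 from email.message import Message
 from email.utils import collapse_rfc2231_value

 # MAX_HEADER_LENGTH is assumed to be the module-level constant already
 # defined alongside parse_header_parameters in django/utils/http.py.


 def _parse_header_params(line):
     """Slow path: parse with the stdlib email parser (quotes, RFC 2231)."""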
     m = Message()
     m["content-type"] = line
     params = m.get_params()
     if not params:
         return "", {}

     pdict = {}

     key = params.pop(0)[0].lower()

     for name, value in params:
         if not name:
             continue
         if isinstance(value, tuple):
             value = collapse_rfc2231_value(value)
         pdict[name] = value
     return key, pdict

 def parse_header_parameters(line, max_length=MAX_HEADER_LENGTH):
     """
     Parse a Content-type like header.
     Return the main content-type and a dictionary of options.

     If `line` is longer than `max_length`, `ValueError` is raised.
     """
     if not line:
         return "", {}

     if max_length is not None and len(line) > max_length:
         raise ValueError("Unable to parse header parameters (value too long).")

     if ";" not in line:
         return line.lower().strip(), {}

     if '"' not in line and "*" not in line:
         parts = line.split(";")
         key = parts[0].lower().strip()
         pdict = {}
         for p in parts[1:]:
             if "=" in p:
                 name, value = p.split("=", 1)
                 pdict[name.lower().strip()] = value.strip()
         return key, pdict

     return _parse_header_params(line)
 }}}
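
 To illustrate why the fast path is skipped when a header contains " or *,
 here is a minimal standalone sketch. The helper name {{{email_params}}} and
 the two sample headers are my own assumptions for illustration and are not
 part of the patch. It shows that a plain split on ";" breaks a quoted value
 apart, and that an RFC 2231 extended parameter needs decoding, both of which
 the stdlib email parser handles.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value


def email_params(line):
    """Hypothetical helper: return {name: value} as parsed by the stdlib."""
    m = Message()
    m["content-type"] = line
    result = {}
    # The first entry is the main value (e.g. "form-data"); skip it.
    for name, value in m.get_params()[1:]:
        if isinstance(value, tuple):
            # RFC 2231 values come back as (charset, language, text).
            value = collapse_rfc2231_value(value)
        result[name] = value
    return result


# A quoted value containing ";" is cut in two by a plain split.
quoted = 'form-data; name="a;b"'
print(quoted.split(";"))      # ['form-data', ' name="a', 'b"']
print(email_params(quoted))   # {'name': 'a;b'}

# An RFC 2231 extended parameter (note the "*") must be percent-decoded.
extended = "attachment; filename*=UTF-8''hello%20world.txt"
print(email_params(extended))  # {'filename': 'hello world.txt'}
```

 This is why the split-based branch is only taken when neither character
 appears in the header.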


 I ran a benchmark with 50,000 iterations. Here are my results:

 {{{
 Test Case               Existing        New implementation
 Simple (no params)      0.050s          0.016s
 Standard (charset)      0.109s          0.044s
 Complex (quotes)        0.214s          0.681s
 }}}
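
 For anyone who wants to run a similar comparison, below is a rough sketch of
 a {{{timeit}}} harness. It is not the exact script behind the numbers above:
 the function names, the sample headers, and the simplified parsers are
 assumptions for illustration, and it compares only a split-based parser with
 an email-based one rather than Django's current implementation.

```python
import timeit
from email.message import Message


def split_parse(line):
    """Simplified fast path: plain split, no quoting or RFC 2231 support."""
    parts = line.split(";")
    params = {}
    for part in parts[1:]:
        if "=" in part:
            name, value = part.split("=", 1)
            params[name.strip().lower()] = value.strip()
    return parts[0].strip().lower(), params


def email_parse(line):
    """Simplified slow path: delegate to the stdlib email parser."""
    m = Message()
    m["content-type"] = line
    params = m.get_params()
    key = params.pop(0)[0].lower()
    return key, {name: value for name, value in params if name}


# Sample headers are assumptions; swap in whatever is representative.
CASES = {
    "Simple (no params)": "text/plain",
    "Standard (charset)": "text/html; charset=utf-8",
}


def bench(func, line, number=50_000):
    """Return the total seconds taken by `number` calls of func(line)."""
    return timeit.timeit(lambda: func(line), number=number)


if __name__ == "__main__":
    for label, line in CASES.items():
        fast = bench(split_parse, line)
        slow = bench(email_parse, line)
        print(f"{label:<24}{slow:.3f}s (email)   {fast:.3f}s (split)")
```

 Absolute timings will vary by machine and Python version, so only the
 relative difference between the columns is meaningful.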

 This trade-off makes sense to me: the complex case gets slower, but headers
 that need complex parsing should be rare.
-- 
Ticket URL: <https://code.djangoproject.com/ticket/35440#comment:20>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
