#36896: Optimize TruncateCharsHTMLParser.process() to avoid redundant sum()
calculation
-------------------------------------+-------------------------------------
     Reporter:  Tarek Nakkouch       |                     Type:
                                     |  Cleanup/optimization
       Status:  new                  |                Component:  Utilities
      Version:  6.0                  |                 Severity:  Normal
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
 The `TruncateCharsHTMLParser.process()` method in `django/utils/text.py`
 recalculates `sum(len(p) for p in self.output)` every time it processes a
 text chunk. For HTML with multiple text nodes, this repeatedly iterates
 over the growing output list unnecessarily.

 {{{#!python
 def process(self, data):
     self.processed_chars += len(data)
     if (self.processed_chars == self.length) and (
         sum(len(p) for p in self.output) + len(data) == len(self.rawdata)
     ):
         self.output.append(data)
         raise self.TruncationCompleted
     output = escape("".join(data[: self.remaining]))
     return data, output
 }}}

 == Suggested optimization ==

 Cache the output length as `self.output_len` and increment it when
 appending to `self.output`:

  * Initialize `self.output_len = 0` in `TruncateHTMLParser.__init__()`
  * Increment it in `handle_starttag()`, `handle_endtag()`, `handle_data()`,
    `feed()`, and `process()`, wherever `self.output` grows
  * Replace `sum(len(p) for p in self.output)` with `self.output_len`

 This eliminates redundant iteration over already-processed output.
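
 As a rough illustration of the idea (not the actual patch), the sketch
 below keeps a running length next to an output list. `OutputBuffer` and
 its methods are hypothetical names used only for this example; in Django
 the counter would live on the parser itself as `self.output_len`.

```python
class OutputBuffer:
    """Hypothetical stand-in for the parser's ``output`` list and its cached length."""

    def __init__(self):
        self.output = []
        # Running total of len(p) for every p in self.output.
        self.output_len = 0

    def append(self, piece):
        # Update the list and the counter together so they never drift apart.
        self.output.append(piece)
        self.output_len += len(piece)

    def extend(self, pieces):
        # Mirrors the list.extend() call used when closing open tags.
        for piece in pieces:
            self.append(piece)


buf = OutputBuffer()
buf.append("<p>")
buf.append("Hello")
buf.extend(["</p>"])

# The cached counter agrees with the recomputed sum, without re-iterating.
assert buf.output_len == sum(len(p) for p in buf.output)
```

 The trade-off is that every site that appends to or extends the list
 must also update the counter, which is why the list above names each
 method that would need touching.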
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36896>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
