#36896: Optimize TruncateCharsHTMLParser.process() to avoid redundant sum()
calculation
-------------------------------------+-------------------------------------
     Reporter:  Tarek Nakkouch       |                    Owner:  absyol
         Type:                       |                   Status:  assigned
  Cleanup/optimization               |
    Component:  Utilities            |                  Version:  6.0
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Comment (by absyol):

 Hello, I am new here (and to open source contributions in general). Please
 feel free to let me know if there is anything I can improve.

 I agree that iterating through every item in `self.output` is redundant;
 we can track the length while adding to the list instead. My current
 solution adds a helper method that appends the items of a list to
 `self.output` and updates `self.output_length` on the fly. Every
 modification to `self.output` would go through this helper, so extending
 the class later will be less error-prone.

 {{{
     def update_output_fields(self, outputs):
         for output in outputs:
             self.output.append(output)
             self.output_length += len(output)
 }}}
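
 As a quick sanity check of the idea, here is a minimal standalone sketch.
 `OutputTracker` is a hypothetical stand-in class used only for
 illustration, not part of Django. It shows that the running
 `output_length` stays equal to what `sum()` would compute over
 `self.output`, as long as every append goes through the helper.

```python
class OutputTracker:
    """Hypothetical stand-in for the parser; only tracks output and its length."""

    def __init__(self):
        self.output = []
        self.output_length = 0

    def update_output_fields(self, outputs):
        # Append each chunk and keep the running total in sync.
        for output in outputs:
            self.output.append(output)
            self.output_length += len(output)


tracker = OutputTracker()
tracker.update_output_fields(["<p>"])
tracker.update_output_fields(["Hello", "</p>"])

# The running total matches the value the redundant sum() would produce.
assert tracker.output_length == sum(len(part) for part in tracker.output)
```

 The invariant only holds if nothing appends to `self.output` directly,
 which is why every call site would need to use the helper.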

 When modifying one of the other functions, we can call it like the
 following:

 {{{
     def handle_data(self, data):
         data, output = self.process(data)
         data_len = len(data)
         if self.remaining < data_len:
             self.remaining = 0

             # call here
             self.update_output_fields(
                 [add_truncation_text(output, self.replacement)]
             )
             raise self.TruncationCompleted
         self.remaining -= data_len

         # call here
         self.update_output_fields([output])
 }}}

 The helper takes a list so that `feed()` can append all closing tags in a
 single call:
 {{{
     def feed(self, data):
         try:
             super().feed(data)
         except self.TruncationCompleted:
             self.update_output_fields([f"</{tag}>" for tag in self.tags])
             self.tags.clear()
             self.reset()
         else:
             # No data was handled.
             self.reset()
 }}}
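
 To illustrate the closing-tags case end to end, the following is a rough
 sketch built on the standard library's `html.parser.HTMLParser`.
 `LengthTrackingParser` and `close_open_tags()` are hypothetical names,
 and the class does no truncation at all; it only assumes a `tags` deque
 and an `output` list similar to the ones in the snippets above.

```python
from collections import deque
from html.parser import HTMLParser


class LengthTrackingParser(HTMLParser):
    """Hypothetical toy parser; not Django's TruncateHTMLParser."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = deque()
        self.output = []
        self.output_length = 0

    def update_output_fields(self, outputs):
        for output in outputs:
            self.output.append(output)
            self.output_length += len(output)

    def handle_starttag(self, tag, attrs):
        self.update_output_fields([self.get_starttag_text()])
        self.tags.appendleft(tag)  # most recently opened tag first

    def handle_endtag(self, tag):
        self.update_output_fields([f"</{tag}>"])
        try:
            self.tags.remove(tag)
        except ValueError:
            pass

    def handle_data(self, data):
        self.update_output_fields([data])

    def close_open_tags(self):
        # One call appends every pending closing tag, as in feed() above.
        self.update_output_fields([f"</{tag}>" for tag in self.tags])
        self.tags.clear()


parser = LengthTrackingParser()
parser.feed("<p>Hello <b>world")
parser.close()  # flush any buffered text
parser.close_open_tags()
html = "".join(parser.output)

assert parser.output_length == sum(len(part) for part in parser.output)
```

 Passing a list keeps the helper's signature the same for the
 single-item call sites and for the multi-item one in `feed()`.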

 Would this be sufficient for this optimization?
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36896#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
