[allura:tickets] #6734 Research using re2 as a replacement for re module

Tim Van Steenburgh Thu, 03 Oct 2013 11:51:19 -0700

allura:tv/6734

TL;DR: `re2` turns out to be 20x slower than `re` for our use cases.


Added `md_perf.py` script for timing/profiling discussion thread rendering.

re2 can't be used as a drop-in replacement in Markdown because Markdown relies 
heavily on regexes with back-references, which re2 does not support.
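
To illustrate the kind of pattern involved, here is a minimal sketch of a back-referencing regex using the stdlib `re` module. The pattern and the `paired_tags` helper are made up for illustration and are not taken from the Markdown source; the point is only that `\1` must repeat whatever the first group captured, which is the feature re2 lacks.

```python
import re

# Illustrative pattern only (not copied from Markdown): the \1 back-reference
# forces the closing tag name to equal the captured opening tag name.
PAIRED_TAG = re.compile(r"<(\w+)>(.*?)</\1>")


def paired_tags(text):
    """Return (tag, body) tuples for simple matching open/close tag pairs."""
    return PAIRED_TAG.findall(text)


if __name__ == "__main__":
    # The mismatched <i>...</b> pair is skipped because \1 cannot match.
    print(paired_tags("<b>bold</b> and <i>x</b>"))
```

Any pattern of this shape would have to be rewritten without the back-reference before it could be compiled by re2.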

Profiling showed that for the slow-to-render posts, all the time was being 
spent regex matching in the HtmlPattern class in markdown.inlinepatterns. I 
tried to use re2 just in this class, but the resulting performance was 20x 
slower than with re.
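
For reference, a rough sketch of how a single render call can be profiled with the stdlib `cProfile` module. This is not the contents of `md_perf.py`; `profile_render` and its parameters are hypothetical names, and `render` stands for whatever callable converts the post text.

```python
import cProfile
import io
import pstats


def profile_render(render, text, top=10):
    """Profile one call to render(text).

    Returns (result, report), where report is the text of the `top`
    entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(render, text)

    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


if __name__ == "__main__":
    # Stand-in renderer; swap in the real Markdown convert callable.
    _, report = profile_render(lambda s: s.upper(), "some post text")
    print(report)
```

With the Markdown package installed, something like `profile_render(markdown.Markdown().convert, post_text)` should show where the time goes for a given post.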

Conclusion: Speeding up our MD-rendering won't be as easy as just dropping in 
re2. Rendering with ForgeExtensions is about 50% slower than with a vanilla 
Markdown() instance, so perf gains there may be possible. Best option may be to 
simply cache the MD-converted values instead of rendering them on-the-fly.
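
A minimal sketch of the caching idea, assuming an in-memory dict keyed by a hash of the source text. `RenderCache` is a hypothetical name, not an existing Allura class; a real implementation would presumably store the rendered value alongside the post and invalidate it on edit.

```python
import hashlib


class RenderCache:
    """Cache rendered output keyed by a hash of the source text."""

    def __init__(self, render):
        self._render = render   # callable: source text -> rendered HTML
        self._cache = {}        # hex digest -> rendered HTML
        self.misses = 0         # number of times render was actually called

    def convert(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._render(text)
        return self._cache[key]


if __name__ == "__main__":
    cache = RenderCache(lambda s: "<p>%s</p>" % s)  # stand-in renderer
    cache.convert("hello")
    cache.convert("hello")  # served from the cache, no second render
    print(cache.misses)
```

Keying on the text hash means an edited post renders once more and is then cached again, at the cost of keeping stale entries around until they are evicted.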


---

**[tickets:#6734] Research using re2 as a replacement for re module**

**Status:** closed
**Created:** Thu Oct 03, 2013 06:37 PM UTC by Tim Van Steenburgh
**Last Updated:** Thu Oct 03, 2013 06:37 PM UTC
**Owner:** Tim Van Steenburgh

In researching slow-rendering discussion threads, it was discovered that some 
individual posts are taking a long time to render. The posts in question are 
sizable (23k) chunks of plain HTML. See if re2 can be used to speed up the 
rendering.




