Good news: I've got this working using html5lib, which is already an implicit dependency of Mezzanine.
Will push soon.

On Wed, Jun 4, 2014 at 7:59 AM, Stephen McDonald <[email protected]> wrote:

> Although we do want to extract width and height too, so I suspect regex
> won't be the right approach.
>
> I've also realised HTMLParser is probably insufficient without a lot of
> work, since we want to modify HTML, not simply parse it.
>
> Seems like the final result might be using BeautifulSoup with a definition
> in the RICHTEXT_FILTERS setting.
>
> On Wed, Jun 4, 2014 at 7:56 AM, Stephen McDonald <[email protected]> wrote:
>
>> That'd be awesome if possible. My understanding is you'll hit an edge
>> case eventually where regex won't be capable of parsing what you want.
>> Perhaps what we're doing is simple enough to work with regex. There's a
>> hilariously famous stackoverflow answer on this:
>> http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
>>
>> On Tue, Jun 3, 2014 at 10:18 PM, Ahmad Khayyat <[email protected]> wrote:
>>
>>> Wouldn't a simple regular expression work here?

--
Stephen McDonald
http://jupo.org

--
You received this message because you are subscribed to the Google Groups "Mezzanine Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
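For anyone following the regex question in the quoted messages, here is a minimal sketch of the kind of edge case being discussed. It is not the code being pushed: the pattern `IMG_RE`, the class `ImgCollector`, and both helper functions are hypothetical names made up for illustration, and it uses only the standard library's `html.parser` rather than html5lib. It also covers extraction only; rewriting the HTML is the harder part and is why the thread moved away from HTMLParser.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern: assumes double-quoted src, width and height
# attributes appearing in exactly this order.
IMG_RE = re.compile(r'<img\s+src="([^"]*)"\s+width="(\d+)"\s+height="(\d+)"')


class ImgCollector(HTMLParser):
    """Collects (src, width, height) for each <img>, in any attribute order."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            values = dict(attrs)
            self.images.append(
                (values.get("src"), values.get("width"), values.get("height"))
            )


def images_via_regex(html):
    """Return the (src, width, height) tuples the regex manages to match."""
    return IMG_RE.findall(html)


def images_via_parser(html):
    """Return (src, width, height) tuples found by the tolerant parser."""
    collector = ImgCollector()
    collector.feed(html)
    collector.close()
    return collector.images


html = (
    '<p><img src="a.jpg" width="100" height="50"></p>'
    "<p><img height='30' width=60 src='b.jpg'></p>"
)

print(images_via_regex(html))   # only the first image matches
print(images_via_parser(html))  # both images are found
```

The second image uses a different attribute order, single quotes, and an unquoted value, all of which are valid HTML that a rich text editor could plausibly emit. The regex misses it, while the parser does not care.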
