On Thu, Jun 5, 2008 at 1:01 PM, Ian Bicking <[EMAIL PROTECTED]> wrote:
>
> Mike Orr wrote:
>> On Thu, Jun 5, 2008 at 11:56 AM, TJ Ninneman <[EMAIL PROTECTED]> wrote:
>>> On Jun 5, 2008, at 12:59 PM, Matt Feifarek wrote:
>>>
>>> I'd like to use something like the "truncate" feature of webhelpers on html
>>> data that's being pulled in from an ATOM feed.
>>>
>>> If I just use a simple truncate, it might leave some html tags opened (like
>>> a <div> without a </div>) which is Bad.
>>>
>>> I figured that this was a common-enough task that I'd ask some experts
>>> before trying to roll my own solution. It seems like the kind of thing that
>>> might be hidden within the standard library somewhere, below my nose, but
>>> outside of my ability to discover.
>>>
>>> I've found this:
>>> http://code.djangoproject.com/browser/django/trunk/django/utils/text.py
>>>
>>> Looks to be about the right thing, but I'd rather not be dependent on all of
>>> Django to do this.
>>>
>>> Perhaps some ElementTree or LXML wizard knows a quick hack?
>>>
>>> Thanks!
>>>
>>>
>>> I've had excellent luck stripping HTML with the following:
>>> http://www.aminus.net/browser/cleanhtml.py
>>> I use it to strip out all the html leaving a nice plain string. It does the
>>> best job of any solutions I've seen.
>>>
>>> TJ
>>
>> I think he just wants to make sure the HTML is well-formed, not strip
>> the tags completely. However, strip_tags() is something WebHelpers
>> should provide. I've noticed the lack a couple times. However, I'm
>> not sure of the best implementation.
>
> strip_tags should be easy enough to implement with some regexes -- you
> just have to remove <.*?>, then resolve any entities.
>
> This code does some fairly simplistic rendering of HTML (but better than
> what strip_tags would likely do), and might have a better home in
> WebHelpers:
> http://svn.w4py.org/ZPTKit/trunk/ZPTKit/htmlrender.py

Put in the WebHelpers "unfinished" directory and opened ticket #458 to integrate it.

-- 
Mike Orr <[EMAIL PROTECTED]>
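
For illustration, here is a rough sketch of the two ideas discussed in this thread: a regex-based strip_tags (remove <.*?>, then resolve entities) and a truncate that closes whatever tags are still open at the cut. It assumes Python 3 and only the standard library. The names strip_tags, truncate_html and _Truncator are hypothetical and are not the WebHelpers, Django, or ZPTKit implementations. The regex approach is deliberately naive: it will misfire on a ">" inside an attribute value and does nothing special for script or style content.

```python
import html
import re
from html.parser import HTMLParser

# Naive tag pattern, as suggested in the thread: anything between < and >.
_TAG_RE = re.compile(r"<.*?>", re.DOTALL)

# Elements that never take a closing tag, so they must not be tracked as open.
_VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
              "link", "meta", "source", "track", "wbr"}


def strip_tags(markup):
    """Remove everything that looks like a tag, then resolve entities."""
    return html.unescape(_TAG_RE.sub("", markup))


class _Truncator(HTMLParser):
    """Copies markup to self.out until `limit` text characters are emitted."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.remaining = limit
        self.out = []
        self.open_tags = []   # stack of tags opened but not yet closed
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())
        if tag not in _VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit it, nothing to track.
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return  # past the cut, or a stray closing tag
        # Close inner tags too, so the output stays properly nested.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append("</%s>" % current)
            if current == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        # Entities arrive already resolved, so re-escape on the way out.
        self.out.append(html.escape(data[:self.remaining], quote=False))
        self.remaining -= len(data)
        if self.remaining <= 0:
            self.done = True


def truncate_html(markup, limit):
    """Keep at most `limit` characters of text and close any open tags.

    Comments, doctypes and processing instructions are dropped in this sketch.
    """
    parser = _Truncator(limit)
    parser.feed(markup)
    parser.close()
    closing = ["</%s>" % tag for tag in reversed(parser.open_tags)]
    return "".join(parser.out + closing)
```

Something like truncate_html(entry_html, 200) would then be called on each feed entry before display; the limit counts text characters only, not markup. For feeds with badly broken HTML, a real parser such as lxml is likely a safer base than this sketch.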
