Mike Orr wrote:
> On Thu, Jun 5, 2008 at 11:56 AM, TJ Ninneman <[EMAIL PROTECTED]> wrote:
>> On Jun 5, 2008, at 12:59 PM, Matt Feifarek wrote:
>>
>> I'd like to use something like the "truncate" feature of webhelpers on HTML
>> data that's being pulled in from an Atom feed.
>>
>> If I just use a simple truncate, it might leave some HTML tags open (like
>> a <div> without a </div>), which is Bad.
>>
>> I figured that this was a common-enough task that I'd ask some experts
>> before trying to roll my own solution. It seems like the kind of thing that
>> might be hidden within the standard library somewhere, below my nose, but
>> outside of my ability to discover.
>>
>> I've found this:
>> http://code.djangoproject.com/browser/django/trunk/django/utils/text.py
>>
>> Looks to be about the right thing, but I'd rather not be dependent on all of
>> Django to do this.
>>
>> Perhaps some ElementTree or LXML wizard knows a quick hack?
>>
>> Thanks!
>>
>>
>>
>>
>> I've had excellent luck stripping HTML with the following:
>> http://www.aminus.net/browser/cleanhtml.py
>> I use it to strip out all the HTML, leaving a nice plain string.  It does the
>> best job of any solution I've seen.
>>
>> TJ
> 
> I think he just wants to make sure the HTML is well-formed, not strip
> the tags completely.  Still, strip_tags() is something WebHelpers
> should provide.  I've noticed its absence a couple of times.  I'm just
> not sure of the best implementation.

strip_tags should be easy enough to implement with a regex -- you
just have to remove everything matching <.*?>, then resolve any entities.
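
Something along these lines is what I have in mind. It's only a rough
sketch, not anything that exists in WebHelpers: the names TAG_RE and
strip_tags are made up here, and I'm assuming the standard library's
html.unescape is acceptable for the entity step. Tags are removed first
and entities resolved second, so that an escaped &lt;b&gt; in the text
survives as literal characters instead of being mistaken for a tag.

```python
import html
import re

# Non-greedy match of anything that looks like a tag.
# DOTALL lets a single tag span more than one line.
TAG_RE = re.compile(r"<.*?>", re.DOTALL)


def strip_tags(text):
    """Remove tag-like spans, then resolve character entities.

    Hypothetical helper for illustration only. It does no real parsing,
    so the contents of <script>/<style> elements are kept, and a '>'
    inside an attribute value will end the match early.
    """
    without_tags = TAG_RE.sub("", text)
    return html.unescape(without_tags)


if __name__ == "__main__":
    print(strip_tags("<p>Fish &amp; <b>chips</b></p>"))
```

For input that might be hostile or badly broken, a real parser is
probably the safer choice; the regex is just the quick version.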

This code does some fairly simplistic rendering of HTML (but better than 
what strip_tags would likely do), and might have a better home in 
WebHelpers:
http://svn.w4py.org/ZPTKit/trunk/ZPTKit/htmlrender.py

-- 
Ian Bicking : [EMAIL PROTECTED] : http://blog.ianbicking.org

