On Thu, Jun 5, 2008 at 1:01 PM, Ian Bicking <[EMAIL PROTECTED]> wrote:
>
> Mike Orr wrote:
>> On Thu, Jun 5, 2008 at 11:56 AM, TJ Ninneman <[EMAIL PROTECTED]> wrote:
>>> On Jun 5, 2008, at 12:59 PM, Matt Feifarek wrote:
>>>
>>> I'd like to use something like the "truncate" feature of WebHelpers on HTML
>>> data that's being pulled in from an Atom feed.
>>>
>>> If I just use a simple truncate, it might leave some HTML tags open (like
>>> a <div> without a </div>), which is Bad.
>>>
>>> I figured that this was a common-enough task that I'd ask some experts
>>> before trying to roll my own solution. It seems like the kind of thing that
>>> might be hidden within the standard library somewhere, below my nose, but
>>> outside of my ability to discover.
>>>
>>> I've found this:
>>> http://code.djangoproject.com/browser/django/trunk/django/utils/text.py
>>>
>>> Looks to be about the right thing, but I'd rather not be dependent on all of
>>> Django to do this.
>>>
>>> Perhaps some ElementTree or LXML wizard knows a quick hack?
>>>
>>> Thanks!
>>>
>>> I've had excellent luck stripping HTML with the following:
>>> http://www.aminus.net/browser/cleanhtml.py
>>> I use it to strip out all the HTML, leaving a nice plain string.  It does the
>>> best job of any solution I've seen.
>>>
>>> TJ
>>
>> I think he just wants to make sure the HTML is well-formed, not strip
>> the tags completely.  That said, strip_tags() is something WebHelpers
>> should provide; I've noticed its absence a couple of times.  I'm just
>> not sure of the best implementation.
>
> strip_tags should be easy enough to implement with some regexes -- you
> just have to remove <.*?>, then resolve any entities.
>
> This code does some fairly simplistic rendering of HTML (but better than
> what strip_tags would likely do), and might have a better home in
> WebHelpers:
> http://svn.w4py.org/ZPTKit/trunk/ZPTKit/htmlrender.py
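
For reference, the regex approach described above could look roughly like
this. It is only a sketch under assumptions: it targets modern Python 3 and
uses the standard library's html.unescape to resolve entities, and the name
strip_tags is used for illustration, not an existing WebHelpers function.

```python
import html
import re

# Non-greedy match of anything between angle brackets, across newlines.
TAG_RE = re.compile(r"<.*?>", re.DOTALL)


def strip_tags(markup):
    """Remove tags with a regex, then resolve entities.

    Hypothetical helper for illustration only.
    """
    text = TAG_RE.sub("", markup)
    # Entities are resolved after tag removal, so an escaped "&lt;"
    # in the text is not mistaken for the start of a tag.
    return html.unescape(text)


print(strip_tags("<p>Fish &amp; <b>chips</b></p>"))
```

Being regex-based, this is easily fooled: a ">" inside an attribute value,
a comment, or the body of a <script> element will all leak through, so it
is probably only suitable for simple, trusted markup.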

I've put it in the WebHelpers "unfinished" directory and opened ticket #458
to integrate it.
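
As for the original question (truncating HTML without leaving tags open),
here is a rough sketch of one way it might be done with only the standard
library. It assumes modern Python 3 and html.parser; the names truncate_html
and _Truncator are made up for this example, and this is not the Django or
WebHelpers implementation. The idea is to count only text characters, stop
emitting once the limit is reached, and then close whatever tags are still
open.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _Truncator(HTMLParser):
    """Copies markup to self.out until `limit` text characters are emitted."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.remaining = limit
        self.out = []
        self.open_tags = []   # stack of tags still awaiting a close
        self.done = False     # True once text has been cut short

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br/> are copied and never pushed.
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close any tags left open inside this one, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        if len(data) > self.remaining:
            data = data[:self.remaining]
            self.done = True
        self.remaining -= len(data)
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def truncate_html(markup, limit, suffix="..."):
    parser = _Truncator(limit)
    parser.feed(markup)
    parser.close()
    result = "".join(parser.out)
    if parser.done:
        result += suffix
    # Close whatever is still open, innermost first.
    for tag in reversed(parser.open_tags):
        result += "</%s>" % tag
    return result


print(truncate_html("<div><p>Hello <b>world</b> and more</p></div>", 8))
# prints: <div><p>Hello <b>wo...</b></p></div>
```

Comments are dropped and <script>/<style> contents are treated as ordinary
text in this sketch, so feed content from an untrusted source would still
want a proper sanitizing pass first.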

-- 
Mike Orr <[EMAIL PROTECTED]>
