wow, this lib is great. thanks.
heh, love that feeling that i've been wasting my life coding scrapers
with regexes up until now... :)
patrick k. wrote:
> it's easy to write a customized sanitizer using beautifulsoup.
> http://www.crummy.com/software/BeautifulSoup/
>
> 1) place beautifulsoup.p
well, but sometimes you want them to be able to enter HTML: styling,
simple links, etc.
[EMAIL PROTECTED] wrote:
> Yes, it is much safer to reject rather than sanitize. If bad tags are
> detected, then reject the input out of hand. If you don't, your
> sanitizer could be turned against you
Yes, it is much safer to reject rather than sanitize. If bad tags are
detected, then reject the input out of hand. If you don't, your
sanitizer could be turned against you and end up changing slightly
dangerous tags into really dangerous tags. What happens here
<scr<script>ipt> when a sanitizer is set to remov
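To make that failure mode concrete, here is a minimal Python sketch. It assumes the example above was a nested tag along the lines of <scr<script>ipt> (the archive appears to have eaten part of it). The names naive_strip and reject_if_script are made up for illustration and do not come from any library.

```python
import re

def naive_strip(html):
    # Single-pass removal of literal script tags: the kind of
    # sanitizer that can be turned against you.
    return html.replace("<script>", "").replace("</script>", "")

def reject_if_script(html):
    # Reject-mode alternative: refuse the input outright if anything
    # resembling a script tag appears, instead of trying to repair it.
    if re.search(r"<\s*/?\s*script", html, re.IGNORECASE):
        raise ValueError("input rejected: script tag found")
    return html

# Hypothetical payload: removing the inner tags glues the outer
# fragments together into a working script tag.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
cleaned = naive_strip(payload)
```

Here cleaned ends up as "<script>alert(1)</script>", so the sanitizer has turned a broken tag into a live one, while reject_if_script(payload) raises ValueError. The check only looks for script tags; a real one would also have to cover event-handler attributes, javascript: URLs and so on.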
Brett Parker wrote:
> On Fri, Jul 13, 2007 at 11:48:50AM +0100, Nic James Ferrier wrote:
>> Brett Parker <[EMAIL PROTECTED]> writes:
>>
>>> On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
Derek Anderson <[EMAIL PROTECTED]> writes:
> hey all,
>
> could anyon
On Fri, Jul 13, 2007 at 11:48:50AM +0100, Nic James Ferrier wrote:
>
> Brett Parker <[EMAIL PROTECTED]> writes:
>
> > On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
> >>
> >> Derek Anderson <[EMAIL PROTECTED]> writes:
> >>
> >> > hey all,
> >> >
> >> > could anyone point me
Brett Parker <[EMAIL PROTECTED]> writes:
> On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>>
>> Derek Anderson <[EMAIL PROTECTED]> writes:
>>
>> > hey all,
>> >
>> > could anyone point me to a python html sanitizer implementation (or
>> > example)? i don't mean to strip al
On Fri, Jul 13, 2007 at 11:18:18AM +0100, Nic James Ferrier wrote:
>
> Derek Anderson <[EMAIL PROTECTED]> writes:
>
> > hey all,
> >
> > could anyone point me to a python html sanitizer implementation (or
> > example)? i don't mean to strip all html, just tags and attributes not
> > on a whit
Derek Anderson <[EMAIL PROTECTED]> writes:
> hey all,
>
> could anyone point me to a python html sanitizer implementation (or
> example)? i don't mean to strip all html, just tags and attributes not
> on a whitelist, such as I/B/A href/U/etc.
I use libxml2/libxslt, something like:
doc = li
it's easy to write a customized sanitizer using beautifulsoup.
http://www.crummy.com/software/BeautifulSoup/
1) place beautifulsoup.py somewhere in your pythonpath
2) build your sanitizer and save it somewhere on your pythonpath
in my case it's called eatMe and looks like this:
http://dpaste.com/
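The paste link above is cut off, so the following is not the eatMe code. It is only a rough sketch of the whitelist idea from this thread, written against the standard library's html.parser instead of BeautifulSoup so that it runs without installing anything. The ALLOWED table, the WhitelistSanitizer class, the sanitize function and the rule about which href values are acceptable are all assumptions made for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> set of allowed attribute names.
ALLOWED = {"i": set(), "b": set(), "u": set(), "a": {"href"}}

class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself, keep any text inside it
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            # Assumption: only http(s) and site-relative links are wanted.
            if name == "href" and not value.lower().startswith(
                    ("http://", "https://", "/")):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is re-escaped so stray angle brackets cannot form new tags.
        self.out.append(escape(data, quote=False))

def sanitize(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With this sketch, sanitize('<b onclick="x()">hi</b>') returns '<b>hi</b>', and a link whose href starts with javascript: keeps its tag but loses the attribute. Because the output is rebuilt from parsed events rather than by deleting substrings, removing one tag cannot splice the surrounding text into a new one.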
hey all,
could anyone point me to a python html sanitizer implementation (or
example)? i don't mean to strip all html, just tags and attributes not
on a whitelist, such as I/B/A href/U/etc.
thanks,
derek