-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Reedick, Andrew wrote:

>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:python-
>> [EMAIL PROTECTED] On Behalf Of Michel Bouwmans
>> Sent: Wednesday, April 09, 2008 5:44 PM
>> To: python-list@python.org
>> Subject: RE: Stripping scripts from HTML with regular expressions
>> 
>> 
>> Thanks! That did the trick. :) I was trying to use HTMLParser but that
>> choked on the script-blocks that didn't contain comment-indicators.
>> Guess I can now move on with this script, thank you.
>> 
> 
> 
> Soooo.... you asked for help with a regex workaround, but didn't ask for
> help with the original problem, namely HTMLParser?  ;-)

I don't think HTMLParser was doing anything wrong here. I needed to parse an
HTML document, but it contained script blocks with document.write calls in
them. I only care about the content outside these blocks, but HTMLParser
chokes on such a block when it isn't wrapped in HTML comment markers,
because it then tries to parse the markup inside the document.write calls. ;)
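
For the archives, here is a rough sketch of the regex workaround: strip the
script blocks first, then hand what is left to the parser. It assumes
Python 3's html.parser module (not the older HTMLParser module discussed in
this thread) and well-formed <script>...</script> pairs. The names
strip_scripts, TextCollector and visible_text are made up for this example.

```python
import re
from html.parser import HTMLParser

# Assumption: every script block is a well-formed <script ...>...</script> pair.
# DOTALL lets "." span newlines; the non-greedy ".*?" stops at the first close tag.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html):
    """Return html with all script blocks removed."""
    return SCRIPT_RE.sub("", html)


class TextCollector(HTMLParser):
    """Hypothetical helper: collects non-empty text chunks from the document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(html):
    """Strip script blocks, then parse the remainder and return its text chunks."""
    parser = TextCollector()
    parser.feed(strip_scripts(html))
    parser.close()
    return parser.chunks


sample = (
    "<html><body><p>Before</p>"
    "<script type=\"text/javascript\">document.write('<b>hi</b>');</script>"
    "<p>After</p></body></html>"
)
print(visible_text(sample))
```

Stripping before parsing means the parser never sees the markup inside the
document.write strings. A script whose string literal itself contains
"</script>" would end the match early, so treat this as a workaround rather
than a general solution.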

MFB
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFH/kvEDpaqHmOKFdQRAgHgAJ4s2YUN6yynUS+8aunhVUR94rs2yQCgrn94
tAFx/dylzEI0TclRDSTRbJI=
=k8SN
-----END PGP SIGNATURE-----
-- 
http://mail.python.org/mailman/listinfo/python-list