Reedick, Andrew wrote:
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:python-
>> [EMAIL PROTECTED] On Behalf Of Michel Bouwmans
>> Sent: Wednesday, April 09, 2008 5:44 PM
>> To: python-list@python.org
>> Subject: RE: Stripping scripts from HTML with regular expressions
>>
>>
>> Thanks! That did the trick. :) I was trying to use HTMLParser but that
>> choked on the script-blocks that didn't contain comment-indicators.
>> Guess I can now move on with this script, thank you.
>>
>
> Soooo.... you asked for help with a regex workaround, but didn't ask for
> help with the original problem, namely HTMLParser? ;-)

I don't think HTMLParser was doing anything wrong here. I needed to parse
an HTML document, but it contained script blocks with document.write calls
in them. I only care about the content outside these blocks, but HTMLParser
chokes on such a block when it isn't wrapped in HTML comment markers,
because it then tries to parse the contents of the document.write calls
as markup. ;)

MFB
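
P.S. For the archives, here is a minimal sketch of the kind of workaround
described above: remove the script blocks with a regular expression first,
then hand what is left to the parser. This is not the exact expression from
earlier in the thread. The names SCRIPT_RE, strip_scripts and TextCollector
are made up for illustration, the sample page is invented, and the code
assumes Python 3, where the module lives at html.parser. It also assumes
that no script body contains a literal "</script" of its own.

```python
import re
from html.parser import HTMLParser

# Matches one whole <script ...>...</script> block, ignoring case and
# spanning newlines. Assumption: the script body never contains "</script".
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>",
                       re.IGNORECASE | re.DOTALL)


def strip_scripts(html):
    """Return html with every <script> block removed."""
    return SCRIPT_RE.sub("", html)


class TextCollector(HTMLParser):
    """Hypothetical helper that collects the non-blank text nodes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Invented sample input with an uncommented document.write block.
page = """<html><body>
<p>Before</p>
<script type="text/javascript">
document.write('<div class="x">generated</div>');
</script>
<p>After</p>
</body></html>"""

cleaned = strip_scripts(page)   # script block is gone before parsing
collector = TextCollector()
collector.feed(cleaned)
collector.close()
print(collector.chunks)
```

Stripping before parsing means the parser never sees the markup inside the
document.write strings, which is all I needed since I only want the content
outside the script blocks.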