Re: Deleting HTML code from a text file.

$Bill Luebkert Thu, 04 Sep 2003 20:38:22 -0700

Sara wrote:

> I have a couple of text files with HTML code in them, e.g.
>  
> ---------- Text File --------------
> <html>
>     <head>
>         <title>This is Test File</title>
>     </head>
> <body>
> <font size=2 face=arial>This is the test file contents<br>
> <p>
> blah blah blah.........
> </body>
> </html>
>  
> -----------------------------------------
>  
> What I want to do is to remove/delete HTML code from the text file from
> a certain tag up to a certain tag.
>  
> For example, I want to delete everything that comes between
> <head> and </head> (including any style tags, embedded JavaScript, etc.)
>  
> Any ideas?


Several.  I would try this one first (untested):

        Assuming you read the entire file into $content:

        $content =~ s/\s*<head>.*?<\/head>\s*//is;
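A fuller sketch of the same idea, in case it helps. The sub names strip_head and slurp and the script name are only illustrative, and the pattern is loosened slightly (an assumption beyond the one-liner above) so that an opening <head> tag carrying attributes also matches:

```perl
use strict;
use warnings;

# Hypothetical helper: remove the first <head>...</head> block from a string.
sub strip_head {
    my ($content) = @_;
    # /i ignores case, /s lets . match newlines, and .*? stops at the
    # first </head>. <head\b[^>]*> also accepts attributes on the opening
    # tag, while \b keeps it from matching <header>.
    $content =~ s/\s*<head\b[^>]*>.*?<\/head>\s*//is;
    return $content;
}

# Hypothetical helper: read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                 # undef input record separator: read it all
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Usage: perl strip_head.pl input.html > output.html
if (@ARGV) {
    print strip_head(slurp($ARGV[0]));
}
```

Note that the \s* on each side also swallows the surrounding whitespace, so in the sample file <html> and <body> end up on the same line. A regex like this is fine for simple, well-formed files; for arbitrary HTML a real parser is the safer choice.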

-- 
  ,-/-  __      _  _         $Bill Luebkert    Mailto:[EMAIL PROTECTED]
 (_/   /  )    // //       DBE Collectibles    Mailto:[EMAIL PROTECTED]
  / ) /--<  o // //      Castle of Medieval Myth & Magic http://www.todbe.com/
-/-' /___/_<_</_</_    http://dbecoll.tripod.com/ (My Perl/Lakers stuff)

_______________________________________________
ActivePerl mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
