> Using the excellent example in an earlier post from David:
> RE: Removing HTML Tags
> 
> I came up with this slightly modified version based on the 
> post and some cpan documentation and it works. 
> It just brought up a few more questions.
> Basically I'm just trying to grab the body contents without 
> comments or script stuff.
> 
> So far this module is really cool and handy!!
> 
> #!/usr/bin/perl
> 
> use HTML::Parser;
> 
> my $text = <<HTML;
> 
> <html><head>
> <title> HI Title </title>
> heaD STUFF
> </head>
> <body bodytag=attributes>
> hI HERE'S CONTENT i WANT
> <!-- i WANT TO STRIP COMMENTS OUT -->
> <SCRIPT>
> 
> i DON'T WANT THIS SCRIPT EITHER
> 
> </SCRIPT>
> 
> </BODY>
> </HTMl>
> 
> HTML
> 
> my $html = HTML::Parser->new(
>                 api_version => 3,
>                 text_h      => [sub{ print shift;}, 'dtext'],
>                 start_h     => [sub{ print shift;}, 'text'],
>                 end_h       => [sub{ print shift;}, 'text']);


OK, I see why it's printing: I tell it to, right here in the handlers!
Instead of print shift; I do $temp .= shift; and now $temp holds that data.
One down, two to go!
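
For the record, here is a minimal sketch of that change applied to the script quoted above. The variable name $temp comes from my note; the trimmed-down sample HTML and the blank-line cleanup regex at the end are just my assumptions about what "have my way with it" might look like, not anything the module requires.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $text = <<'HTML';
<html><head>
<title> HI Title </title>
</head>
<body bodytag=attributes>
hI HERE'S CONTENT i WANT
<!-- i WANT TO STRIP COMMENTS OUT -->
<SCRIPT>
i DON'T WANT THIS SCRIPT EITHER
</SCRIPT>
</BODY>
</HTMl>
HTML

my $temp = '';    # collects what the handlers used to print

my $html = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { $temp .= shift }, 'dtext' ],
    start_h     => [ sub { $temp .= shift }, 'text' ],
    end_h       => [ sub { $temp .= shift }, 'text' ],
);

$html->ignore_elements(qw(head script));    # skip these elements and their content
$html->ignore_tags(qw(html body));          # skip only the tags themselves

$html->parse($text);
$html->eof;

$temp =~ s/^\s*\n//mg;    # assumption: drop empty and whitespace-only lines
print $temp;
```

Nothing is printed until the final print, so $temp can be massaged however you like first.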


> 
> #Q) Before I kill the head section or body tags below, how do I grab these parts of it?
> #     1 - my $title = ????  i.e. the text between the title tags
> #     2 - get body tag attributes: my $body_attributes = ????  i.e. in this example it'd be 'bodytag=attributes'
> 
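
For those two, one approach that might work is a separate pass with its own parser that only collects the title and the body attributes, run before the pass that ignores head and body. This is only a sketch: the names $meta, $in_title and %body_attr are made up for illustration, and it assumes the 'tagname' and 'attr' argspecs, which hand the handler a lowercased tag name and a hash reference of attributes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $text = <<'HTML';
<html><head>
<title> HI Title </title>
</head>
<body bodytag=attributes>
hI HERE'S CONTENT i WANT
</BODY>
</HTMl>
HTML

my $title    = '';
my %body_attr;
my $in_title = 0;    # true while we are between <title> and </title>

my $meta = HTML::Parser->new(
    api_version => 3,
    start_h => [ sub {
        my ($tag, $attr) = @_;
        $in_title  = 1      if $tag eq 'title';
        %body_attr = %$attr if $tag eq 'body';
    }, 'tagname, attr' ],
    text_h  => [ sub { $title .= shift if $in_title }, 'dtext' ],
    end_h   => [ sub { $in_title = 0 if shift eq 'title' }, 'tagname' ],
);

$meta->parse($text);
$meta->eof;

$title =~ s/^\s+|\s+$//g;    # trim surrounding whitespace

# Rebuild a name=value string from the attribute hash
my $body_attributes = join ' ', map { "$_=$body_attr{$_}" } sort keys %body_attr;

print "title: $title\n";
print "body attributes: $body_attributes\n";
```

Keeping this in its own parser means the ignore_elements/ignore_tags calls below can stay exactly as they are.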
> $html->ignore_elements(qw(head script)); 
> $html->ignore_tags(qw(html body));
> 
> $html->parse($text);
> $html->eof;
> 
> ####
> 
> It automatically prints the modified version of $text without 
> any print statement.
> Q) Why is that? 
> Q) How can I save the new version of $text to a new variable 
> instead of automatically printing it to the screen? 
>       ( so I can remove empty lines and have my way with it )
> Q) I wanted any comments removed too but I didn't do anything 
> special to it and they are gone anyway, are comments removed 
> automatically then?
> 
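
As far as I can tell from the docs, the comments are not being stripped on purpose: the parser only calls handlers that have been registered, and the script has no comment_h (and no default_h), so comment events simply go nowhere. A small sketch to show the difference, with made-up names ($snippet, $without, $with) and a made-up one-line input:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $snippet = "before<!-- a comment -->after";
my ($without, $with) = ('', '');

# Only a text handler: comment events have no handler, so they are dropped.
my $p1 = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { $without .= shift }, 'dtext' ],
);
$p1->parse($snippet);
$p1->eof;

# Same thing plus a comment handler: the comment's source text is kept.
my $p2 = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { $with .= shift }, 'dtext' ],
    comment_h   => [ sub { $with .= shift }, 'text' ],
);
$p2->parse($snippet);
$p2->eof;

print "without comment_h: $without\n";
print "with comment_h:    $with\n";
```

So if the comments are ever wanted back, adding a comment_h like the second parser's should do it.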
> OUTPUT ::
> (dmuey@q42(~):21)$ ./html.pl 
> 
> 
> 
> hI HERE'S CONTENT i WANT
> 
> 
> 
> 
> 
> 
> (dmuey@q42(~):22)$ 
> 
> 
> Thanks
> 
> Dan
> 
> -- 
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 

