> Using the excellent example in an earlier post from david:
> RE: Removing HTML Tags
>
> I came up with this slightly modified version based on the
> post and some CPAN documentation and it works.
> It just brought up a few more questions.
> Basically I'm just trying to grab the body contents without
> comments or script stuff.
>
> So far this module is really cool and handy!!
>
> #!/usr/bin/perl
>
> use HTML::Parser;
>
> my $text = <<HTML;
>
> <html><head>
> <title> HI Title </title>
> heaD STUFF
> </head>
> <body bodytag=attributes>
> hI HERE'S CONTENT i WANT
> <!-- i WANT TO STRIP COMMENTS OUT -->
> <SCRIPT>
>
> i DON'T WANT THIS SCRIPT EITHER
>
> </SCRIPT>
>
> </BODY>
> </HTMl>
>
> HTML
>
> my $html = HTML::Parser->new(
> api_version => 3,
> text_h => [sub{ print shift;}, 'dtext'],
> start_h => [sub{ print shift;}, 'text'],
> end_h => [sub{ print shift;}, 'text']);

OK, I see why it's printing: I tell it to, right here! Instead of
print shift; I now do $temp .= shift; and $temp holds that data.
One down, two to go!

>
> #Q) Before I kill the head section or body tags below how do
> I grab these parts of it?
> # 1 - my $title = ???? IE the text between title tags
> # 2 - get body tag attributes my $body_attributes = ????
> IE in this example it'd be 'bodytag=attributes'
>
> $html->ignore_elements(qw(head script));
> $html->ignore_tags(qw(html body));
>
> $html->parse($text);
> $html->eof;
>
> ####
>
> It automatically prints the modified version of $text without
> any print statement.
> Q) Why is that?
> Q) How can I save the new version of $text to a new variable
> instead of automatically printing it to the screen?
> ( so I can remove empty lines and have my way with it )
> Q) I wanted any comments removed too but I didn't do anything
> special to it and they are gone anyway, are comments removed
> automatically then?
>
> OUTPUT ::
> (dmuey@q42(~):21)$ ./html.pl
>
>
>
> hI HERE'S CONTENT i WANT
>
>
>
>
>
>
> (dmuey@q42(~):22)$
>
>
> Thanks
>
> Dan
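
For the two remaining questions (the title text and the body tag
attributes), here is a rough sketch of one way it might be done. It is
not necessarily the best approach. It assumes head is no longer passed to
ignore_elements, since an ignored element never reaches the handlers,
and it tracks "inside title" / "inside body" with two flags instead.
The names $title, $body, %body_attr, $in_title and $in_body are my own
for illustration. It relies on the 'tagname' and 'attr' argspecs, which
hand the handler the lowercased tag name and a hash ref of attributes.
Comments drop out simply because no comment_h or default_h handler is
registered for them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my $text = <<'HTML';
<html><head>
<title> HI Title </title>
heaD STUFF
</head>
<body bodytag=attributes>
hI HERE'S CONTENT i WANT
<!-- i WANT TO STRIP COMMENTS OUT -->
<SCRIPT>

i DON'T WANT THIS SCRIPT EITHER

</SCRIPT>

</BODY>
</HTMl>
HTML

my $title = '';          # text between <title> and </title>
my $body  = '';          # everything inside <body>, minus script/comments
my %body_attr;           # attributes of the <body> tag
my ($in_title, $in_body) = (0, 0);

my $html = HTML::Parser->new(
    api_version => 3,
    start_h => [sub {
        my ($tag, $attr, $orig) = @_;
        if    ($tag eq 'title') { $in_title = 1 }
        elsif ($tag eq 'body')  { $in_body = 1; %body_attr = %$attr }
        elsif ($in_body)        { $body .= $orig }   # keep inner tags as-is
    }, 'tagname, attr, text'],
    end_h => [sub {
        my ($tag, $orig) = @_;
        if    ($tag eq 'title') { $in_title = 0 }
        elsif ($tag eq 'body')  { $in_body = 0 }
        elsif ($in_body)        { $body .= $orig }
    }, 'tagname, text'],
    text_h => [sub {
        my ($chunk) = @_;
        if    ($in_title) { $title .= $chunk }
        elsif ($in_body)  { $body  .= $chunk }
    }, 'dtext'],
);

# Only script is skipped wholesale; head must stay visible to see <title>.
$html->ignore_elements(qw(script));

$html->parse($text);
$html->eof;

# Trim leading/trailing whitespace left over from the removed pieces.
s/^\s+|\s+$//g for ($title, $body);

print "title: $title\n";
print "body attribute: $_=$body_attr{$_}\n" for sort keys %body_attr;
print "body: $body\n";
```

Since everything ends up in plain variables, $body can be cleaned up
further (squeezing empty lines and so on) before it is printed or stored.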