Sweet! Thanks. I'll give her a try and study it to understand it better. Thanks!
Dan
> Dan Muey wrote:
>
> >
> > #Q) Before I kill the head section or body tags below how do I grab
> > #these parts of it?
> > #1 - my $title = ???? IE the text between title tags
> > #2 - get body tag attributes my $body_attributes = ???? IE in this
> > #example it'd be 'bodytag=attributes'
> >
>
> grabs the title and body text and attributes:
>
> #!/usr/bin/perl -w
> use strict;
>
> use HTML::Parser;
>
> my $text = <<HTML;
> <html><head>
> <title> HI Title </title>
> heaD STUFF
> </head>
> <body bodytag=attributes>
> hI HERE'S CONTENT i WANT
> <!-- i WANT TO STRIP COMMENTS OUT -->
> <SCRIPT>
>
> i DON'T WANT THIS SCRIPT EITHER
>
> </SCRIPT>
>
> </BODY>
> </HTMl>
> HTML
>
> my $body  = 0;   # true while the parser is inside <body>
> my $title = 0;   # true while the parser is inside <title>
> my @body;
> my @title;
>
> my $html = HTML::Parser->new(api_version => 3,
>                              text_h  => [\&text, 'dtext'],
>                              start_h => [\&open_tag, 'tagname,attr'],
>                              end_h   => [\&close_tag, 'tagname']);
> $html->ignore_elements(qw(script));
> $html->parse($text);
> $html->eof;
>
> print "TITLE @title\n";
> print "BODY @body\n";
>
> sub text{
>
>     my $text = shift;
>
>     # skip chunks that are only whitespace
>     return unless($text =~ /\w/);
>
>     if($title){
>         push(@title,$text);
>     }elsif($body){
>         push(@body,$text);
>     }
> }
>
> sub open_tag{
>
>     my $tagname = shift;
>     my $attr = shift;
>
>     $title = 1 if($tagname eq 'title');
>
>     if($tagname eq 'body'){
>         $body = 1;
>         # one name=value pair per attribute, so more than one stays readable
>         push(@body,join(' ',map { "$_=$attr->{$_}" } sort keys %{$attr}));
>     }
> }
>
> sub close_tag{
>
>     my $tagname = shift;
>
>     $title = 0 if($tagname eq 'title');
>     $body = 0 if($tagname eq 'body');
> }
>
> __END__
>
> prints:
>
> TITLE HI Title
> BODY bodytag=attributes
> hI HERE'S CONTENT i WANT
>
> there are many ways of doing the same thing.
>
> >
> > It automatically prints the modified version of $text without any
> > print statement. Q) Why is that?
>
> no, it doesn't print automatically. i have print statements in the
> script that produce the output.
>
> > Q) How can I save the new version of $text to a new variable instead
> > of automatically printing it to the screen? ( so I can remove empty
> > lines and have my way with it )
> > Q) I wanted any comments removed too but I didn't do anything special
> > to it and they are gone anyway, are comments removed automatically
> > then?
>
> just remove the print statement and store it as you want.
> comments are never passed to the text handler, so they drop out
> unless you register a comment handler.
>
> david
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
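
On the last two questions (saving to a variable and where the comments go), here is a minimal sketch, not the script from the thread. It assumes a cut-down sample document and hypothetical names ($clean, @comments, $in_body): the text handler appends to a scalar instead of printing, and a comment handler shows that comments only turn up if you ask for them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input, modeled on the document in the thread.
my $html = <<'HTML';
<html><head><title> HI Title </title></head>
<body bodytag=attributes>
hI HERE'S CONTENT i WANT
<!-- i WANT TO STRIP COMMENTS OUT -->

</body></html>
HTML

my $in_body  = 0;    # true while the parser is inside <body>
my $clean    = '';   # collected body text (instead of printing it)
my @comments;        # filled only because comment_h is registered

my $p = HTML::Parser->new(
    api_version => 3,
    start_h   => [ sub { $in_body = 1 if $_[0] eq 'body' }, 'tagname' ],
    end_h     => [ sub { $in_body = 0 if $_[0] eq 'body' }, 'tagname' ],
    # Append body text to a scalar; nothing is printed here.
    text_h    => [ sub { $clean .= $_[0] if $in_body }, 'dtext' ],
    # Comments are reported through this handler, never through text_h.
    comment_h => [ sub { push @comments, $_[0] }, 'text' ],
);
$p->parse($html);
$p->eof;

# With the text in a variable, dropping empty lines is one line of code.
$clean = join "\n", grep { /\S/ } split /\n/, $clean;

print "CLEAN: $clean\n";
print "COMMENTS: @comments\n";
```

Leave out the comment_h line and the comments simply never reach your code, which matches what Dan saw. After the run $clean holds only the non-empty body lines, ready for whatever further cleanup you want.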