On Sat, 1 Jun 2002, Malcolm Debono wrote:

> I am trying (not an expert) to write a script that will, to begin with,
> write the meta tags to a file from an html file.
> I have managed so far (after many hours) but when the tags are written in
> the html on seperate lines it doesn't write all of the tag to the file. Help
> please.
>
> Malcolm
>
> EXAMPLE:-
> <meta NAME="keywords" CONTENT="
> bread,rolls,flour,dough,baking,crust,crusty,and,on,and,on,and,on,and,cheese,
> , home business, market,farmers,eggs,rice,and all,things,bright,and
> beautiful">
> <meta NAME="description" CONTENT="The Baker of the United Kingdom">
>
> It prints all of the html to the file if it is on the same line but if it is
> not it doesn't print all the tag probably because there is a break or
> newline ................
>
> <meta NAME="keywords" CONTENT="
> <meta NAME="description" CONTENT="The Baker of the United Kingdom">
>
>
> print "Content-type:text/html\n\n";
> #THIS OPENS THE HTML FILE
> open(HTML,"testhtml.htm") || die"Can't open testhtml.html: $!";
> @lines = <HTML>;
> close(HTML);
> foreach $lines (@lines)
> {
>     chomp($line);
>     # THE PRINT PRINTS HTML TO SCREEN / BROWSER
>     print "$lines\n";
>
> }
>
> #THIS OPENS A FILE TO WRITE THE HTML TO
> open (RECORD, ">testhtml3.dat") || die "Error - cannot open testhtml3.dat:
> $!";
> if($lockit){flock(RECORD,2);}
> foreach $line(@lines)
> {
>     chomp($line);
>     $match = META;
>     $match2 = meta;
>
>     # THIS PRINT PRINTS THE HTML TO A FILE
>     if ($line =~ $match){
>         print RECORD "$line\n";
>
>     }
>     if ($line =~ $match2){
>         print RECORD "$line\n";
>
>     }
>
>
>
> } # End foreach
>
> if($lockit){flock(RECORD,8);}
> close (RECORD);

You need to look for the > that closes the meta tag. If it is not on the
same line as the opening <meta, you need to keep printing the following
lines until you find it. The loops below replace your second foreach and
assume the lines in @lines still carry their newlines (no chomp), e.g.

    $i = 0;
    while ($i < @lines) {                  # avoid foreach in this case
        $line = $lines[$i];
        if ($line =~ /^\s*<\s*META /i) {   # matches META, meta, Meta, MEta, MetA, etc.
            print RECORD $line;
            # if $line doesn't have the closing >, print the next line;
            # stop at the last line so an unclosed tag cannot loop forever
            while ($line !~ />\s*$/ && $i < $#lines) {
                $line = $lines[++$i];
                print RECORD $line;
            }
        }
        ++$i;
    }

If you don't like the idea of iterating over @lines with the $i variable,
here's an alternative:

    # defined() keeps a line consisting of just "0" from ending the loop
    while (defined($line = shift(@lines))) {
        if ($line =~ /^\s*<\s*META /i) {
            print RECORD $line;
            # stop when @lines runs out, even if the tag never closes
            while ($line !~ />\s*$/ && @lines) {
                $line = shift(@lines);
                print RECORD $line;
            }
        }
    }

The only consideration is that after this loop is done, the @lines array
will be empty. If that is a problem, copy the @lines array to another
array prior to this loop, e.g. @metalines, and then use @metalines in the
above loop instead of @lines. To copy @lines to @metalines all you have
to do is:

    @metalines = @lines;

Of course, everything depends on an HTML file that has the < on the same
line as the meta keyword, and eventually a line ending with the >
character to close each meta tag. If the HTML file is malformed, the
result will be equally unsatisfactory: an unclosed tag makes the loop
copy everything up to the end of the file.

BTW, you could have written both the RECORD file and STDOUT with just one
loop, without reading all the lines into the @lines array.

**** [EMAIL PROTECTED] <Carl Jolley>
**** All opinions are my own and not necessarily those of my employer ****

_______________________________________________
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
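
P.S. To illustrate the one-loop remark above, here is a minimal sketch rather than a definitive version. It reads the input one line at a time, echoes every line, and copies each meta tag (with its continuation lines) to a second handle. The helper name copy_meta_tags, the lexical filehandles, and the in-memory sample are my own assumptions for illustration; in the real script you would open testhtml.htm and testhtml3.dat instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): read $in line by line,
# echo every line to $echo, and copy each meta tag, including any
# continuation lines up to the closing >, to $record.
sub copy_meta_tags {
    my ($in, $echo, $record) = @_;
    my $in_meta = 0;                        # true while a meta tag is still open
    while (defined(my $line = <$in>)) {
        print {$echo} $line;                # every line goes to the screen/browser
        $in_meta = 1 if !$in_meta && $line =~ /^\s*<\s*meta\s/i;
        next unless $in_meta;
        print {$record} $line;              # part of a meta tag: record it
        $in_meta = 0 if $line =~ />\s*$/;   # the tag closes on this line
    }
    return;
}

# Demo on an in-memory sample so the sketch runs without any files.
my $sample = <<'HTML';
<html><head>
<meta NAME="keywords" CONTENT="
bread,rolls,flour,
things,bright,and beautiful">
<meta NAME="description" CONTENT="The Baker of the United Kingdom">
<title>Bakery</title>
</head></html>
HTML

my $recorded = '';
open(my $in,  '<', \$sample)   or die "Can't read sample: $!";
open(my $rec, '>', \$recorded) or die "Can't write to buffer: $!";
copy_meta_tags($in, \*STDOUT, $rec);
close($rec);
close($in);
```

Because nothing is slurped into an array, there is no @lines to copy or empty, and an unclosed tag simply runs to end of file, just as in the loops above.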