Re: Why is first line always missing ?

John W. Krahn Fri, 25 Jan 2008 04:29:31 -0800

[EMAIL PROTECTED] wrote:

I am a total newbie in perl


Hello and welcome.

I have an HTML file with some junk after </html>.

So I am trying to clean it.


You might want the htmlclean program:

http://search.cpan.org/~lindner/HTML-Clean-0.8/bin/htmlclean

Or the HTML::Clean module:

http://search.cpan.org/~lindner/HTML-Clean-0.8/lib/HTML/Clean.pm

This is how I started out. It's inside a Unix shell script, so I must
test on a command line like this:

% cat file.html | perl -ne '{$/="</HTML>" ; if ($_ =~ m#</html>#i)
{ print $_ } }'

$ perl -MO=Deparse -ne '{$/="</HTML>" ; if ($_ =~ m#</html>#i) { print $_ } }'

LINE: while (defined($_ = <ARGV>)) {
    {
        $/ = '</HTML>';
        if ($_ =~ m[</html>]i) {
            print $_;
        }
    }
}
-e syntax OK

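You are setting the Input Record Separator ($/) to "</HTML>" only after the first line has been read, so the first record is just the first line, and it is printed only if it happens to contain </html>. Also, $/ is a literal, case-sensitive string, so if the tag in the file is not exactly '</HTML>' the separator never matches and the rest of the file is read as one record. And you are using 'cat' when you don't need to. You probably want something like this: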


perl -ne'print if 1 .. m[</html>]i' file.html

OK I wrote it by imitating other examples.

I don't know why I use the -n switch. The switches are not described in man perl.
It only lists them all in the syntax line.


The command line switches are listed in perlrun:

perldoc perlrun

Or:

man perlrun



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.                            -- Larry Wall

