Re: [regexp] Multi-level pattern matching

Adam W Fri, 21 Apr 2006 17:18:52 -0700

Chad Perrin wrote:

Y'know, I was playing around with this to see if I could come up with a
reasonably elegant solution, and I noticed a problem:


Your code doesn't seem to actually work.  Am I missing something?

I went back to check, and it seems to work only in very specific
cases. For example, it changed <META NAME="... to <meta name="... and
left the contents of the "..." part alone. However, it did not change
<TITLE> to <title>, presumably because there were no '""' to match. I
tweaked it a little, to make sure that it picked up everything beginning
with a '<'. It now will catch things like <TITLE> and <HTML>, but I had
to cheat (as you can see in the code below).
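
To make that concrete, here is a minimal sketch of the behaviour described
above. It assumes the earlier snippet relied on a substitution like the
quoted-attribute one that is still in the code below; the sub name and the
sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the quoted-attribute substitution:
# lowercase whatever sits between a '<' and the next "..." pair.
sub lc_before_quotes {
    my ($line) = @_;
    $line =~ s/<(.*?)"(.*?)"/<\L$1\E"$2"/g;
    return $line;
}

# Made-up sample input, not taken from the real file.
print lc_before_quotes('<META NAME="Author">'), "\n";    # <meta name="Author">
print lc_before_quotes('<TITLE>My Page</TITLE>'), "\n";  # unchanged: no quotes
```

The second line comes back untouched because the pattern cannot match
without a quoted string somewhere after the '<'.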

As far as I can tell, this will work for the cases mentioned above, as
well as cases that have multiple tags on one line (something that the
previous snippet didn't do). Now see if it works for you.


Adam

Here's the full code:

    while (<$fh>) {
        if ($. == 1) {
            print "$new_head\n" unless /^<\!/;    # "$new_head" contains
                                                  # DOCTYPE info
            s/<\!(.*?)>/$new_head/;               # puts in new DOCTYPE
            }                                     # info

        if (!/<(img|meta|link|(h|b)r)(.*?)\/>/) {       # finds
            s/<(img|meta|link|(h|b)r)(.*?)>/<$1$3 \/>/; # self-closing
                                                        # tags and adds
                                                        # a '/' if
                                                        # it's lacking
        }
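        if (/<(.*?)>/ && !/<\!/) {   # Should find all tags unless they
                                     # are <! DOCTYPE... or comments
                                     # (is this even necessary?).
                                     # No /g on these two tests: in
                                     # scalar context /g would make the
                                     # second match resume after the
                                     # first tag and miss an earlier <!.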
            s/<(HTML|HEAD|BODY|TITLE)>/<\L$1>/ig; # Cheating to get easy
                                                  # ones.
            s/<(.*?)"(.*?)"/<\L$1\E"$2"/g;        # Replace anything
                                                  # outside of ' "..." '
                                                  # with lowercase.
        }
        print;
    }
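
For what it's worth, the "cheating" might be avoidable by matching in two
levels: an outer substitution that grabs one tag at a time, and an inner
step that lowercases everything in that tag except the "..." parts. This
is only a sketch under some assumptions: the sub names are made up,
attribute values are assumed to be double-quoted, and no '<' or '>' is
assumed to appear inside a tag. It also lowercases closing tags and every
attribute name in a tag, not just the part before the first quoted value.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase a single tag, leaving "..." values alone.
sub lc_tag {
    my ($tag) = @_;
    # The capturing group makes split keep the quoted pieces in the list.
    my @parts = split /("[^"]*")/, $tag;
    for my $part (@parts) {
        $part = lc $part unless $part =~ /^"/;   # skip quoted values
    }
    return join '', @parts;
}

# Hypothetical helper: apply lc_tag to every tag on a line.
sub lc_tags_in_line {
    my ($line) = @_;
    # Outer level: one tag per match; (?!!) skips <!DOCTYPE ...> and <!-- ...
    $line =~ s/(<(?!!)[^<>]*>)/lc_tag($1)/ge;
    return $line;
}

# Made-up sample line with several tags on it.
print lc_tags_in_line('<META NAME="Author" CONTENT="Adam W"><TITLE>My Page</TITLE>'), "\n";
# prints: <meta name="Author" content="Adam W"><title>My Page</title>
```

Inside the while loop this would stand in for the two lowercase
substitutions, e.g. $_ = lc_tags_in_line($_); before the print.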
