OK, I can get it to work by putting a space after the final ]:

        m/href\s*=\s*['|"](.*)['|"] /i

instead of 

        m/href\s*=\s*['|"](.*)['|"]/i

But when I try either

        m/href\s*=\s*['|"](.*)['|"]\s*/i

or

        m/href\s*=\s*['|"](.*)['|"] */i

it doesn't work. Does anyone know why?
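
A possible explanation, offered as a sketch rather than a definitive answer: \s* and " *" are both allowed to match zero characters, so they add no constraint after the closing quote, and the greedy (.*) still runs to the last quote on the line. The literal space only helps because, in this particular string, the first closing quote happens to be the only quote followed by a space. The snippet below compares the two behaviours; first_capture and the labels are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Sample string based on the original message (assumed to be on one line).
my $description = 'Some text <a href = "http://www.google.com" target="_blank">www.google.com</a>';

# first_capture is a hypothetical helper: returns $1 for a pattern, or undef.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

# Closing quote must be followed by a literal space.
my $with_space    = qr/href\s*=\s*['|"](.*)['|"] /i;
# Closing quote may be followed by zero or more whitespace characters.
my $with_optional = qr/href\s*=\s*['|"](.*)['|"]\s*/i;

for my $case ( [ 'literal space', $with_space ], [ 'optional \s*', $with_optional ] ) {
    my ($label, $re) = @$case;
    my $href = first_capture($re, $description);
    print "$label: ", ( defined $href ? $href : 'No find' ), "\n";
}
```

With this input the first pattern should print http://www.google.com, while the second should print http://www.google.com" target="_blank, because nothing forces the match to end at the first closing quote.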

Ade

-----Original Message-----
From: Adrian Lynch [mailto:[EMAIL PROTECTED]
Sent: 12 June 2003 16:39
To: Dev (E-mail)
Subject: [ cf-dev ] OT - Perl RegEx


I'm trying to extract the href from an <a> tag. From the one below I want
http://www.google.com

        my $description = 'Some text <a href = "http://www.google.com" target="_blank">www.google.com</a>';

        my $href = "";

        if ( $description =~ m/href\s*=\s*['|"](.*)['|"]/i ) {
                $href = $1;
        } else {
                $href = "No find";
        }

        print "href: $href\n";

This works if the <a> tag only has the href attribute, but when another is
added, like the target above, it pulls out:

        http://www.google.com" target="_blank

which isn't what I want.

        m/href\s*=\s*['|"](.*)['|"]/i

My understanding of the above line is,

        m/href          match starting from "href",
        \s*             then zero or more white space,
        =               then "=",
        \s*             then zero or more white space,
        ['|"]           then either a single or double quote,
        (.*)            then anything (this being the bit I want to pull out),
        ['|"]           then either a single or double quote,
        /i              case-insensitive.

How do I tell it to stop at the first single or double quote after the
opening one?
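
One way this could be approached, as a minimal sketch and not the only option: make the capture non-greedy with (.*?) and require the closing quote to be the same character as the opening one via a backreference. Note also that | is a literal character inside [], so ['|"] matches a pipe as well as the two quotes; ['"] is enough. extract_href is a hypothetical helper name, not something from the original code.

```perl
use strict;
use warnings;

# extract_href is a hypothetical helper name used for illustration only.
sub extract_href {
    my ($html) = @_;
    # (['"])  capture whichever quote opens the value (no "|" needed in [])
    # (.*?)   non-greedy: match as little as possible
    # \1      end at the first quote that matches the opening one
    if ( $html =~ m/href\s*=\s*(['"])(.*?)\1/i ) {
        return $2;
    }
    return;    # undef in scalar context when nothing matched
}

my $description = 'Some text <a href = "http://www.google.com" target="_blank">www.google.com</a>';
my $href = extract_href($description);
print "href: ", ( defined $href ? $href : 'No find' ), "\n";
```

An alternative with the same effect for double-quoted values would be a negated character class such as "([^"]*)", which cannot run past the next double quote at all.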

Also, I'll go on to try to allow for no quotes around the href attribute, since
a client will be supplying the input for this. Can anyone think of other
ways of improving this?
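
For the unquoted case, one possible sketch is an alternation that tries a double-quoted value, then a single-quoted one, then a bare value that ends at whitespace or ">". The helper name extract_href_loose and the sample strings are assumptions made for illustration. For arbitrary client-supplied HTML, a real parser such as the CPAN module HTML::LinkExtor is likely to be more robust than any single regex.

```perl
use strict;
use warnings;

# extract_href_loose is a hypothetical helper name used for illustration only.
sub extract_href_loose {
    my ($html) = @_;
    if ( $html =~ m/href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i ) {
        # Exactly one of the three groups took part in the match.
        my ($value) = grep { defined } ( $1, $2, $3 );
        return $value;
    }
    return;    # undef in scalar context when nothing matched
}

# Hypothetical sample inputs covering the three quoting styles.
my @samples = (
    '<a href = "http://www.google.com" target="_blank">',
    "<a href='http://www.google.com' target='_blank'>",
    '<a href=http://www.google.com target=_blank>',
);

for my $html (@samples) {
    my $href = extract_href_loose($html);
    print "href: ", ( defined $href ? $href : 'No find' ), "\n";
}
```

Each of the three sample strings should yield http://www.google.com.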

Thanks

------------------------------------------------------------------ 
Adrian Lynch 
Web Application Developer 
Thoughtbubble Ltd 
Full Service Agency
------------------------------------------------------------------ 
<http://www.thoughtbubble.com>
Tel: +44 (0) 20 7387 8890 (ex. 23)
Fax: +44 (0) 20 7383 2220
------------------------------------------------------------------ 


-- 
** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
For human help, e-mail: [EMAIL PROTECTED]
