On Jun 7, Adrian Pang said:
>I'm trying to write a regex expression so it will extract the attribute
>names from a tag. For example,
>
><P attr1="hello world" attr2 attr3="hi" attr4>
>
>The regex should return attr1, attr2, attr3 and attr4
>Is there any way to write these into one regex expression?
You should really be using a real HTML parser, but for this type of thing
you can use the following code:
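
  # match an HTML attribute
  my $attr = qr{
    \G                # resume where the previous match left off
    \s*
    (\w+)             # attribute name
    (?:               # optional value
      \s* = \s*
      (?:
        " [^"]* " |   # double-quoted
        ' [^']* ' |   # single-quoted
        [^\s>]+       # unquoted
      )
    )?
  }x;

  my $TAG = q{<img border=0 ismap src='/foo.gif' alt="FOO!">};
  $TAG =~ /<\w+/g;   # position the \G anchor after the "<img"
  my @attrs = $TAG =~ /$attr/g;   # ('border', 'ismap', 'src', 'alt')
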
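
If you want to reuse this, one option is to wrap the two steps in a sub. What follows is only a sketch: attr_names is a hypothetical helper name of my own choosing, it assumes the string holds exactly one tag, and it keeps the same \w+ definition of an attribute name as above (so names containing "-" or ":" would be cut short).

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not a library function):
# returns the attribute names of a single tag, in order of appearance.
sub attr_names {
    my ($tag) = @_;

    my $attr = qr{
        \G              # resume where the previous match left off
        \s*
        (\w+)           # attribute name
        (?:             # optional value
            \s* = \s*
            (?: " [^"]* " | ' [^']* ' | [^\s>]+ )
        )?
    }x;

    # Scalar-context /g leaves pos($tag) just past the tag name.
    $tag =~ /<\w+/g or return;

    # List-context /g starts at pos($tag) and collects every capture.
    my @names = $tag =~ /$attr/g;
    return @names;
}

print join(", ", attr_names(q{<P attr1="hello world" attr2 attr3="hi" attr4>})), "\n";
# prints: attr1, attr2, attr3, attr4
```

The first match has to be in scalar context so that it only sets pos(); the second is in list context so that it walks the rest of the tag and returns all the names at once.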
--
Jeff "japhy" Pinyan [EMAIL PROTECTED] http://www.pobox.com/~japhy/
I am Marillion, the wielder of Ringril, known as Hesinaur, the Winter-Sun.
Are you a Monk? http://www.perlmonks.com/ http://forums.perlguru.com/
Perl Programmer at RiskMetrics Group, Inc. http://www.riskmetrics.com/
Acacia Fraternity, Rensselaer Chapter. Brother #734
** Manning Publications, Co, is publishing my Perl Regex book **