On May 7, Johan Groth said:
>I want to strip a variable of all characters including a token, i.e.
>aaa_bbb_ccc_ddd would become bbb_ccc_ddd. As you can see I want to get rid of
>aaa_. Does anyone know how to accomplish this in Perl?
>
>I've tried:
>$tmp = "aaa_bbb_ccc_ddd"
>$tmp =~ /^\w+_/
>$tmp = $';
>
>but that results in $tmp eq "ddd" instead of "bbb_ccc_ddd".
Please do not use $`, $&, and $'. Once any of them appears anywhere in
your program, it slows down every other regular expression in that
program ($& not as badly, but still). Use instead either the $1, $2, ...
capture variables or, in your specific case, the s/// operator.
$tmp =~ s/^[^\W_]+_//;
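To make the two suggestions concrete, here is a minimal sketch of both, using the sample string from the question. The variable names $rest and $stripped are only there for illustration; neither comes from the original post.

```perl
use strict;
use warnings;

my $tmp = "aaa_bbb_ccc_ddd";

# Option 1: capture the part to keep in $1 instead of reading $'.
# [^\W_]+ matches word characters other than '_', so it stops at the
# first underscore.
my $rest = $tmp;
if ($tmp =~ /^[^\W_]+_(.*)/s) {
    $rest = $1;
}
print "$rest\n";        # prints bbb_ccc_ddd

# Option 2: copy the string, then delete the prefix in place with s///.
(my $stripped = $tmp) =~ s/^[^\W_]+_//;
print "$stripped\n";    # prints bbb_ccc_ddd
```

If the pattern does not match, both variables simply keep the original string.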
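As for why your regex fails: \w matches the '_' character too, and + is
greedy, so \w+_ matches everything up to and including the LAST
underscore, which leaves only "ddd" in $'. Here's output from 'explain':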
======================================================================
The regular expression:

(?-imsx:^\w+_)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  _                        '_'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
======================================================================
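If you want to see the greedy match for yourself, a small sketch like the following should do it. It captures what ^\w+_ consumed and works out the remainder with substr, so it does not need $' at all. The variable name $greedy is just an illustrative choice.

```perl
use strict;
use warnings;

my $tmp = "aaa_bbb_ccc_ddd";

# \w includes '_', and + is greedy, so \w+ first runs to the end of the
# string and then backs off only far enough for the literal '_' to match.
my ($greedy) = $tmp =~ /^(\w+_)/;

print "matched: $greedy\n";                            # matched: aaa_bbb_ccc_
print "left:    ", substr($tmp, length $greedy), "\n"; # left:    ddd
```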
The modified regex presented above uses [^\W_], which means "any
character EXCEPT non-word characters and _", a nifty way of saying
"all word characters except _". Since it cannot match '_', the match
stops at the first underscore.
======================================================================
The regular expression:

(?-imsx:^[^\W_]+_)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  [^\W_]+                  any character except: non-word characters
                           (all but a-z, A-Z, 0-9, _), '_' (1 or more
                           times (matching the most amount possible))
----------------------------------------------------------------------
  _                        '_'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
======================================================================
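As a quick check of the corrected pattern, here is a rough sketch that runs it over a few inputs. The helper name strip_first_token and the extra test strings are my own assumptions for illustration; only the first string comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper, named only for this example: returns a copy of the
# string with the leading token and its '_' removed.
sub strip_first_token {
    my ($s) = @_;
    $s =~ s/^[^\W_]+_//;
    return $s;
}

for my $in ("aaa_bbb_ccc_ddd", "abc", "_abc", "a1_b2") {
    printf "%-16s -> %s\n", $in, strip_first_token($in);
}
# aaa_bbb_ccc_ddd  -> bbb_ccc_ddd
# abc              -> abc
# _abc             -> _abc
# a1_b2            -> b2
```

Note that "abc" and "_abc" come back unchanged: the first has no underscore to anchor on, and the second has no token before the underscore, so [^\W_]+ has nothing to match.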
--
Jeff "japhy" Pinyan [EMAIL PROTECTED] http://www.pobox.com/~japhy/
Are you a Monk? http://www.perlmonks.com/ http://forums.perlguru.com/
Perl Programmer at RiskMetrics Group, Inc. http://www.riskmetrics.com/
Acacia Fraternity, Rensselaer Chapter. Brother #734