Brilliant, thanks Peter. One day I will "click" with regex!

Cheers,

Dave

2008/11/26 Peter Boughton <[EMAIL PROTECTED]>:
>> If you have time could you explain what each part of the regex is
>> doing, so that I can learn for future?
>
> Sure. :)
>
>
> Starting with the original PHP one:
> /(\s)(\w+)$/i
>
> the /.../i means "a regular expression, using the i flag"
> the i flag is "case-Insensitive"
>
> the (\s) is "a group consisting of a single whiteSpace character
> (e.g. space, tab, newline)"
>
> the (\w+) is "a group consisting of one or more Word characters"
> (letters, digits and underscore)
> the + part of that means "one or more"
>
> the $ at the end means "end of line" or "end of the input string",
> depending on whether multi-line mode is on. It matches a position,
> not a character (it is a zero-width match).
>
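To see the pieces described above in action, here is a minimal sketch. It is in Python rather than PHP or CFML (an assumption on my part, chosen only because it is easy to run), and the sample string is made up. It shows what each group of (\s)(\w+)$ ends up holding.

```python
import re

# Hypothetical sample input, not taken from the original thread.
sample = "apples, pears bananas"

# Same pattern as the PHP one, minus the /.../i delimiters and flag.
match = re.search(r"(\s)(\w+)$", sample)

whitespace = match.group(1)  # group 1: the single whitespace character
last_word = match.group(2)   # group 2: one or more word characters
start = match.start()        # index where the whole match begins
end = match.end()            # $ is zero-width, so the match ends at len(sample)
```

The match can only start at the last space, because $ requires the word characters to run right up to the end of the string.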
>
> The slashes are PHP's pattern delimiters: the preg functions need a
> delimiter around every pattern, and any flags go after the closing one.
> Some languages (JavaScript) have regex literals written between
> unquoted slashes, whilst others (CFML) do not use the slash convention
> at all.
>
> In PHP and CFML, there is an alternative way to specify flags, using
> the construct (?i) at the start of the pattern - note that anything
> (?...) is a special group that acts differently to other groups.
> (Note: this construct can be used for a lot more than just flags -
> primarily lookarounds, but other stuff too.)
>
> In CFML, you can use reReplaceNoCase instead of the i flag, which is
> more readable for people that don't know regex.
>
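A rough illustration of the two ways of asking for case-insensitivity mentioned above, again sketched in Python as an assumption (re.IGNORECASE stands in for the trailing i flag or reReplaceNoCase; the sample text is made up):

```python
import re

text = "Say HELLO"  # hypothetical sample input

# Flag given outside the pattern, like the trailing i in /.../i.
with_flag = re.search(r"hello$", text, re.IGNORECASE)

# Same flag given inside the pattern with the (?i) construct.
inline_flag = re.search(r"(?i)hello$", text)

# No flag at all: the lower-case pattern does not match upper-case text.
no_flag = re.search(r"hello$", text)
```

Both flagged searches find "HELLO"; the unflagged one finds nothing.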
>
> In the replace string of " and $2" or " and \2" in the PHP/CFML ones,
> the $2 or \2 is a backreference referring to group 2 - i.e. the second
> pair of parentheses, containing the \w+ ("one or more word
> characters")
>
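The backreference in the replace string can be tried out with a small sketch like this one (Python is my assumption here, not the PHP/CFML from the thread; its replace strings accept \2 in the same way, and the sample string is made up):

```python
import re

# Hypothetical sample input, not taken from the original thread.
sample = "apples, pears bananas"

# Group 1 is the whitespace, group 2 is the last word.
# The replace string puts a literal " and " in, then re-inserts group 2.
result = re.sub(r"(\s)(\w+)$", r" and \2", sample)
```

The space matched by group 1 is thrown away and a fresh one is written by the replace string, which is why result reads "apples, pears and bananas".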
>
> In the simplified version, I removed the flag, removed the groups,
> removed the space, and changed the replace string, so we ended up
> with:
>
> reReplace( wkstr , '\w+$','and \0' )
>
> The flag was unnecessary - the \w means "any word character", which
> already covers both upper- and lower-case letters.
>
> The \s was unnecessary because the \w+ will match all of the
> characters up to the space, and we were just putting the space back in,
> so I took out both the \s and the space before the "and".
>
> With the \s gone, there was no need to group the characters and use \2
> (or \1 as it would have become) - instead I used \0 in the replace
> string, which means "the entire content of the match", rather than one
> of the groups within it.
>
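As a quick check that the simplified version does the same job as the original, here is a hedged sketch in Python (an assumption; it is not the CFML from the thread). Note that Python spells "the entire match" as \g<0> in a replace string, where the reReplace call above uses \0. The sample string is made up.

```python
import re

# Hypothetical sample input, not taken from the original thread.
sample = "apples, pears bananas"

# Original form: two groups, backreference to the second one.
original = re.sub(r"(\s)(\w+)$", r" and \2", sample)

# Simplified form: no groups, no \s, whole match re-inserted after "and ".
simplified = re.sub(r"\w+$", r"and \g<0>", sample)
```

Both calls give the same string, because the simplified pattern never touches the space before the last word.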
>
> So, I think that covers everything - hopefully it all makes sense, and
> wasn't information overload.
> Let me know if you'd like any part clarified/re-worded. :)
>
> 

Archive: http://www.houseoffusion.com/groups/regex/message.cfm/messageid:1202