> -----Original Message-----
> From: Howie Hamlin [mailto:[EMAIL PROTECTED]
> Sent: Sunday, March 20, 2005 4:03 PM
> To: CF-Talk
> Subject: Regex to find CF vars
> 
> First, Regex is definitely not my bag :)
> 
> I have a string that contains CF vars and I need to get a list of these
> vars.  For example:
> 
> The name of item ###id# is #somevalue#.
> 
> What I would want is a return of "id" and "somevalue".
> 
> I tried to match #*# but that doesn't seem to work.

At the very least you'd have to do "##.+?##":

+) The doubled pounds are to escape them in CF (otherwise CF will see them
as vars).

+) The period is the "any character" wild card in RegEx (not the asterisk).

+) The plus says "find one or more occurrences of the previous set".

+) The question mark (used as it is here) makes the expression "non-greedy"
- in other words it will stop at the first "end pound" it sees rather than
the last.  (Note that versions of CF prior to MX don't support this - but
lordy do it make life easier.)
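
A quick sketch of the greedy/non-greedy difference, assuming CFMX or later.
The sample string and the variable names (sample, greedy, lazy) are made up
for illustration:

```cfml
<!--- Every literal pound sign is doubled so CF doesn't evaluate it.
      The actual string is: Hello #first# and #last#. --->
<cfset sample = "Hello ##first## and ##last##.">

<!--- The fourth argument (true) asks REFind for a struct of pos/len arrays
      instead of just the match position. --->
<cfset greedy = REFind("##.+##", sample, 1, true)>
<cfset lazy = REFind("##.+?##", sample, 1, true)>

<cfoutput>
greedy: #HTMLEditFormat(Mid(sample, greedy.pos[1], greedy.len[1]))#<br>
lazy: #HTMLEditFormat(Mid(sample, lazy.pos[1], lazy.len[1]))#
</cfoutput>
```

The greedy version should run all the way from the first pound to the last
one ("#first# and #last#"), while the non-greedy version should stop at
"#first#".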

Still that one's not quite right anyway... CF variable names can only start
with a currency symbol, an underscore or a letter.  Then they can only have
letters, numbers, underscores and currency symbols in them.

So a snippet to find a CF var name looks like this (this is from a custom
type validator I have).  First I set a variable to a list of Unicode
currency symbols:

<cfset CurSyms = Chr(36) & Chr(162) & Chr(163) & Chr(164) & Chr(165) &
Chr(2546) & Chr(2547) & Chr(8352) & Chr(8353) & Chr(8354) & Chr(8355) &
Chr(8356) & Chr(8357) & Chr(8358) & Chr(8359) & Chr(8360) & Chr(8361) &
Chr(8362) & Chr(8363) & Chr(8364) & Chr(8365) & Chr(8366) & Chr(8367) &
Chr(8368) & Chr(8369) & Chr(3647) & Chr(6107) />

Then the regex to determine a good CF variable name would be:

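"^[[:alpha:]_#CurSyms#][[:alnum:]_#CurSyms#]*$"

+) The caret ("^") in this case "pins" the regex to the beginning of the
search and the dollar sign ("$") pins it to the end (this regex looks at a
single value and determines if it's a valid variable name, not across a
whole document).  Without the "$" a value like "some-value" would pass,
since only the leading "some" has to match.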

So this one is basically saying "The first character must be a letter, an
underscore or a currency symbol followed by any number of letters, numbers,
underscores or currency symbols".

Even that's not exactly right since CF vars can't really be of any length...
but since I don't know what the upper limit is it works for now.
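
As a rough sketch of how that check might be used on single values: the
currency list is cut down to three symbols to keep the example short, the
candidate names are invented, and the pattern is pinned at both ends ("^"
and "$") so the whole value has to match, not just the start of it.

```cfml
<!--- Shortened currency list for the example only (dollar, pound, euro);
      the full list would go here in practice. --->
<cfset CurSyms = Chr(36) & Chr(163) & Chr(8364)>

<!--- #CurSyms# is interpolated into both character classes. --->
<cfset VarNameRegex = "^[[:alpha:]_#CurSyms#][[:alnum:]_#CurSyms#]*$">

<!--- Hypothetical values to test. --->
<cfset Candidates = "id,_temp,$price,9lives,some-value">

<cfoutput>
<cfloop list="#Candidates#" index="Candidate">
  <!--- REFind returns 0 when there is no match, which YesNoFormat
        shows as "No". --->
  #Candidate#: #YesNoFormat(REFind(VarNameRegex, Candidate))#<br>
</cfloop>
</cfoutput>
```

Here id, _temp and $price should pass, while 9lives (starts with a digit)
and some-value (contains a hyphen) should not.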

So - trying to put them together might yield this (this also assumes that
the currency symbols have been set):

"##[[:alpha:]_#CurSyms#][[:alnum:]_#CurSyms#]*?##"

All told I think that will work... I'm not sure tho - give her a try and let
us know how it works out!
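
To actually pull the names out, something along these lines might do as a
starting point.  It's only a sketch: the parentheses around the name are an
addition so REFind can hand back just that part, the currency list is
shortened again, and Source, Found, StartAt and Match are names picked for
the example.

```cfml
<!--- Shortened currency list for the example only. --->
<cfset CurSyms = Chr(36) & Chr(163) & Chr(8364)>

<!--- The sample text from the question, with every literal pound doubled.
      The actual string is: The name of item ###id# is #somevalue#. --->
<cfset Source = "The name of item ######id## is ##somevalue##.">

<!--- Same pattern as above, with a capturing group around the name. --->
<cfset VarRegex = "##([[:alpha:]_#CurSyms#][[:alnum:]_#CurSyms#]*?)##">

<cfset Found = "">
<cfset StartAt = 1>
<cfloop condition="StartAt LTE Len(Source)">
  <cfset Match = REFind(VarRegex, Source, StartAt, true)>
  <!--- pos[1] is 0 when nothing more is found. --->
  <cfif Match.pos[1] EQ 0><cfbreak></cfif>
  <!--- Element 2 of pos/len describes the first parenthesized group. --->
  <cfset Found = ListAppend(Found, Mid(Source, Match.pos[2], Match.len[2]))>
  <!--- Resume the search just past the whole match. --->
  <cfset StartAt = Match.pos[1] + Match.len[1]>
</cfloop>

<cfoutput>#Found#</cfoutput>
```

On the sample text this should give "id,somevalue": the first two pounds
are skipped because neither is followed by a valid first character.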

The main problem I can see is that there's no way for the RegEx to know
"where" in the document you are.  You'll almost definitely pick up false
positives from this when dealing with pound signs for inner-page anchors and
the like.

In short there's really no way for a single regex to ensure that you're in a
CF tag when it checks.  You might be able to pull it off with a bunch of
tags but recursive parsing is really the only way to determine the document
structure enough to figure it out (and even then things go screwy sometimes
with badly formed code).

Jim Davis



