And is there a way to make the regex more readable with line breaks that don't become part of the expression?
Sure.
m/
    \w+     # one or more word characters
    \s{2}   # two whitespace characters
    \d+     # one or more digit characters
/x;
The key is the /x modifier. It lets you put whitespace and comments inside the pattern, but in exchange you have to use something like my \s above when you really do want to match spaces.
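If you want to see that trade-off in action, here is a small sketch. The sample strings are made up purely for illustration. It shows that a bare space in an /x pattern is ignored, while an escaped space (or \s) still matches:

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only for illustration.
my $text = "abc  123";

# Under /x the whitespace and comments inside the pattern are ignored.
my $matched = $text =~ m/
    \w+     # one or more word characters
    \s{2}   # two whitespace characters
    \d+     # one or more digit characters
/x ? 1 : 0;

# A bare space under /x matches nothing, so this is really /ab/
# and "a b" does NOT match...
my $bare    = "a b" =~ m/a b/x  ? 1 : 0;

# ...but an escaped space (or \s) does match.
my $escaped = "a b" =~ m/a\ b/x ? 1 : 0;

print "matched=$matched bare=$bare escaped=$escaped\n";
```

On my reading this should print matched=1 bare=0 escaped=1.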
I'm trying to check some date input to a Web form. The formats I accept are 'mm/dd/yy, m/d/yy, mm/dd/yyyy, m/d/yyyy'.
Here's the regex I'm using to check:
( $date=~/^((\d\d?)\/(\d\d?)\/(((\d{2,2}))|(\d{4,4})))($|(,(\s*)((\d\d?)\/(\d\d?)\/(((\d{2,2}))|(\d{4,4}))))+$)/ )
Basic component is:
((\d\d?)\/(\d\d?)\/(((\d{2,2}))|(\d{4,4})))
Some thoughts.
First, that's a whole lot of parentheses. Do you really need to capture all of that? I assume you are mostly using them for clustering (grouping tokens), not to capture the values. There is a clustering operator, (?:TOKENS_GO_HERE). It works just like parentheses, but the value is not captured.
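As a rough illustration of the difference (the sample date and variable names are mine, not from your script), compare how many values each pattern hands back in list context:

```perl
use strict;
use warnings;

my $date = "6/3/1999";   # hypothetical sample input

# Capturing parentheses around the year alternation: three captures.
my @capturing = $date =~ /^(\d\d?)\/(\d\d?)\/(\d{2}|\d{4})$/;

# (?:...) still limits the scope of the | alternation,
# but the year is no longer captured: two captures.
my @clustered = $date =~ /^(\d\d?)\/(\d\d?)\/(?:\d{2}|\d{4})$/;

print scalar(@capturing), " captures vs ", scalar(@clustered), " captures\n";
```

Both patterns accept the same strings. The only change is which pieces end up in $1, $2, and so on.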
Okay, let's talk about your quantifiers. You have {2,2}, which means match at least 2 of these and no more than 2 of these. Better is {2}, which means match exactly 2 of these.
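A quick way to convince yourself the two forms are interchangeable is to run both against a few strings. This is only a sketch, and the test strings are arbitrary:

```perl
use strict;
use warnings;

# {2} and {2,2} should accept exactly the same strings.
my @rows;
for my $s ("7", "07", "007") {
    my $exact = $s =~ /^\d{2}$/   ? "yes" : "no";
    my $range = $s =~ /^\d{2,2}$/ ? "yes" : "no";
    push @rows, "$s exact=$exact range=$range";
}

print "$_\n" for @rows;
```

Only "07" should pass, and it should pass with both quantifiers.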
Finally, if you're going to have a bunch of / characters in your expression, take advantage of Perl's ability to change the delimiters. Your backslash key will be grateful.
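For example, here is the same simple check written both ways. This is a simplified sketch that only looks at a single m/d/yyyy date, and the sample value is made up:

```perl
use strict;
use warnings;

my $date = "6/3/1999";   # hypothetical sample input

# Default delimiters: every literal slash must be escaped.
my $slashes = $date =~ /^\d\d?\/\d\d?\/\d{4}$/ ? 1 : 0;

# With m!...! (m{...} works too) the slashes need no escaping.
my $bangs   = $date =~ m!^\d\d?/\d\d?/\d{4}$!  ? 1 : 0;

print "slashes=$slashes bangs=$bangs\n";
```

The two patterns are equivalent. The second is just easier on the eyes.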
Given those, here's one possible way to rewrite your basic component match:
m!
    ^(\d\d?)          # capture month
    /(\d\d?)          # capture day
    /(\d\d(?:\d\d)?)  # capture year
!x;
That puts the month in $1, the day in $2, and the year in $3. This doesn't match only the formats you listed, since it would also allow mixed forms like 3/01/03 or 06/3/1999. Your expression was the same though, as far as that goes.
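Here is a sketch of how you might use that pattern in practice. The helper name parse_date is hypothetical, and I have added a trailing $ anchor on the assumption that nothing should follow the year. Leave it off if you are matching the first date of a longer list:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (month, day, year) in list context,
# or an empty list if the string does not look like a date.
sub parse_date {
    my ($date) = @_;
    return $date =~ m!
        ^(\d\d?)          # capture month
        /(\d\d?)          # capture day
        /(\d\d(?:\d\d)?)  # capture year (2 or 4 digits)
        $                 # assumption, nothing may follow the year
    !x;
}

my @results;
for my $input ("3/01/03", "6/3/1999", "12/25/199") {
    my ($m, $d, $y) = parse_date($input);
    push @results, defined $m ? "$input -> m=$m d=$d y=$y"
                              : "$input -> no match";
}

print "$_\n" for @results;
```

Using the return value of the match in list context saves you from touching $1, $2, and $3 directly, which also avoids accidentally reading stale values after a failed match.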
Finally, as a different way of thinking about this problem, have you considered split()? It was the first thing I thought of. I would use it something like this:
my @dates = split /, ?/, $dates;    # separate dates
foreach (@dates) {
    my($m, $d, $y) = split /\//, $_;    # separate m/d/y
    # validate data, replace die with whatever is better for you
    die unless $m =~ /^\d\d?$/ && $d =~ /^\d\d?$/ && $y =~ /^\d\d(?:\d\d)?$/;
    # process m, d, y here...
}
I don't think this makes handling the individual dates any easier, since we have to add a step to validate them. I do believe it would make handling a lot of dates much easier though, and you seem to be doing just that.
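If you go the split() route, something along these lines might be a starting point. It is only a sketch: parse_dates is a hypothetical name, and the check on the number of fields is my own addition, there because split() would otherwise quietly accept something like 1/2/03/4 and throw away the extra piece.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [month, day, year] triples,
# or dies on the first date that fails validation.
sub parse_dates {
    my ($dates) = @_;
    my @parsed;
    for my $date (split /,\s*/, $dates) {      # separate dates
        my @parts = split m{/}, $date, -1;     # separate m/d/y, keep empty fields
        die "bad date '$date'\n"
            unless @parts == 3                 # assumption: exactly three fields
                && $parts[0] =~ /^\d\d?$/
                && $parts[1] =~ /^\d\d?$/
                && $parts[2] =~ /^\d\d(?:\d\d)?$/;
        push @parsed, [@parts];
    }
    return @parsed;
}

my @ok = parse_dates("1/2/03, 12/25/1999");
print scalar(@ok), " dates parsed\n";

# Replace the die/eval pair with whatever error handling suits your form.
my $err = eval { parse_dates("1/2/03/4"); 1 } ? "" : $@;
print "error: $err" if $err;
```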
Well, maybe that will help you along a little. Good luck.
James