Michael Holzt wrote:
> I'm still trying to decode that regexp (will have a look in the Camel book later)
Does this help?
$ perl -Mre=debug -e "qr'(?:[a-zA-Z0-9](?:[-a-zA-Z0-9]*[a-zA-Z0-9])?)';"
Freeing REx: `","'
Compiling REx `(?:[a-zA-Z0-9](?:[-a-zA-Z0-9]*[a-zA-Z0-9])?)'
size 39 Got 316 bytes for offset annotations.
first at 1
1: ANYOF[0-9A-Za-z](12)
12: CURLYX[0] {0,1}(38)
14: STAR(26)
15: ANYOF[\-0-9A-Za-z](0)
26: ANYOF[0-9A-Za-z](37)
37: WHILEM(0)
38: NOTHING(39)
39: END(0)
stclass `ANYOF[0-9A-Za-z]' minlen 1
How about this?

my $subdomain = qr'
    (?:                     # group but no backreferences
        [a-zA-Z0-9]         # match a single ALPHA / DIGIT
        (?:                 # group but no backreferences
            [-a-zA-Z0-9]*   # match zero or more ALPHA / DIGIT / HYPHEN
            [a-zA-Z0-9]     # followed by a single ALPHA / DIGIT
        )?                  # but only optionally match this group
    )'x;

However, now that I diagram this out, I think this is still too limiting. Here is the BNF notation for the subdomain term:
# sub-domain = Let-dig [Ldh-str]
# Let-dig = ALPHA / DIGIT
# Ldh-str = *( ALPHA / DIGIT / "-" ) Let-dig
so we have to match
u.domain.edu
u-u.domain.edu
university-something.domain.edu

which I don't think that regex manages. Isn't this better?
my $subdomain = qr'
    (?:                         # group but no backreferences
        [a-zA-Z0-9]             # match a single ALPHA / DIGIT
        (?:                     # group but no backreferences
            -(?=[a-zA-Z0-9])    # match HYPHEN when followed by ALPHA / DIGIT
        )?                      # but only optionally match this group
        [a-zA-Z0-9]*            # followed by zero or more ALPHA / DIGIT
    )'x;

Right???
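One way to settle it rather than eyeball it is to anchor both patterns and run them over the three labels. This is only a minimal sketch: it assumes just the leftmost label (the part before .domain.edu) is being tested, matches_label is a helper name made up for this example, and the /x comments are stripped from the patterns for brevity.

```perl
use strict;
use warnings;

# First proposal: Let-dig, optionally followed by Ldh-str.
my $first = qr'[a-zA-Z0-9](?:[-a-zA-Z0-9]*[a-zA-Z0-9])?';

# Second proposal: at most one hyphen, directly after the first character.
my $second = qr'[a-zA-Z0-9](?:-(?=[a-zA-Z0-9]))?[a-zA-Z0-9]*';

# Hypothetical helper: returns 1 only if the WHOLE label matches the pattern.
sub matches_label {
    my ($re, $label) = @_;
    return $label =~ /\A$re\z/ ? 1 : 0;
}

for my $label (qw(u u-u university-something)) {
    printf "%s first=%d second=%d\n", $label,
        matches_label($first, $label), matches_label($second, $label);
}
```

Traced through, the first pattern accepts all three labels. The second rejects university-something, because it only permits a hyphen immediately after the first character, so an anchored match cannot get past "university". Without the \A and \z anchors both patterns would still "match" by settling for a shorter prefix, which is easy to mistake for success.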
John
