Hi there,
This might be a good chance to plug my regular expression emulator,
search-text.r :)
http://www.rebol.org/utility/search-text.r
As far as I understand it, a direct translation into a PARSE rule of
the BNF rule for hostnames as given by /PeO :
let-char [*[let-digit-hyph-char] let-digit-char]
would be:
>> l: to-bits #!a-z ; TO-BITS is included in search-text.r
== make bitset! #{ ; ! means upper and lower case
0000000000000000FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> ld: to-bits #!a-z0-9
== make bitset! #{
000000000000FF03FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> ldh: to-bits #!a-z0-9\- ; \ escapes the hyphen
== make bitset! #{
000000000020FF03FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> name-rule: [l 0 1 [any ldh ld]]
== [l 0 1 [any ldh ld]]
Unfortunately, this fails on every valid name longer than a single
letter:
>> parse/all "abc-ef" name-rule
== false
>> parse/all "abcef" name-rule
== false
>> parse/all "a" name-rule
== true
This is very easy to do with SEARCH from search-text.r, and you don't
even have to prepare the bitsets beforehand:
>> search "abcdef" [head #!a-z maybe [any #!a-z0-9\- #!a-z] tail]
== [1 6 "abcdef"]
>> search "abc-def" [head #!a-z maybe [any #!a-z0-9\- #!a-z] tail]
== [1 7 "abc-def"]
>> search "abc--def" [head #!a-z maybe [any #!a-z0-9\- #!a-z] tail]
== [1 8 "abc--def"]
>> search "abcdef-" [head #!a-z maybe [any #!a-z0-9\- #!a-z] tail]
== none
>> search "0abcdef" [head #!a-z maybe [any #!a-z0-9\- #!a-z] tail]
== none
SEARCH emulates the backtracking behavior of regular expressions.
PARSE, on the other hand, does not backtrack: ANY LDH (the
LET-DIGIT-HYPH-CHAR part) matches all the way to the end of the
string, leaving nothing for the final LD (LET-DIGIT-CHAR) to match.
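The greedy behavior is easy to see in isolation. Here is a minimal
sketch, assuming REBOL 2 (where PARSE/ALL is available) and using the
built-in CHARSET instead of TO-BITS, so it does not depend on
search-text.r. The names ld and ldh simply mirror the ones above:

```rebol
ld:  charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]   ; letters and digits
ldh: union ld charset "-"                            ; ... plus the hyphen

parse/all "abc" [any ldh]       ; == true   ANY LDH alone consumes all of "abc"
parse/all "abc" [any ldh ld]    ; == false  nothing is left for the final LD
parse/all "ab-" [any ldh ld]    ; == false  same reason; PARSE never gives a character back
```

A backtracking matcher would retry ANY LDH with one character fewer
until the final LD could match; PARSE just reports failure.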
Actually, I have to admit this isn't so difficult with PARSE either.
You just have to look for repeated groups of zero or more hyphens
followed by one or more alphanumerics:
>> name-rule: [l any [any "-" some ld ]]
== [l any [any "-" some ld]]
>> parse/all "abcdef" name-rule
== true
>> parse/all "abc-def" name-rule
== true
>> parse/all "abc--def" name-rule
== true
>> parse/all "abcdef-" name-rule
== false
>> parse/all "0abcdef" name-rule
== false
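If a full hostname is taken to be a series of such names separated by
dots (an assumption here, not something taken from the BNF quoted
above), the rule can be reused as a building block. A minimal sketch,
again assuming REBOL 2 and the built-in CHARSET; host-rule is a
made-up name for illustration:

```rebol
l:  charset [#"a" - #"z" #"A" - #"Z"]        ; letters
ld: union l charset [#"0" - #"9"]            ; letters and digits
name-rule: [l any [any "-" some ld]]         ; same rule as above
host-rule: [name-rule any ["." name-rule]]   ; hypothetical: names joined by dots

parse/all "www.rebol.org" host-rule    ; == true
parse/all "www..rebol.org" host-rule   ; == false  (empty name between the dots)
parse/all "rebol.org." host-rule       ; == false  (trailing dot)
```

This works without backtracking for the same reason name-rule does:
each repeated group starts with something (a dot, or a hyphen or
alphanumeric) that cannot be mistaken for what has to follow it.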
See you,
Eric