Hi there,

This might be a good chance to plug my regular expression emulator,
search-text.r :)

    http://www.rebol.org/utility/search-text.r

As far as I understand it, a direct translation into PARSE of the BNF
rule for hostnames given by /PeO:

   let-char [*[let-digit-hyph-char] let-digit-char]

would be:

>> l: to-bits #!a-z            ; TO-BITS is included in search-text.r
== make bitset! #{             ; ! means upper and lower case
0000000000000000FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> ld: to-bits #!a-z0-9
== make bitset! #{
000000000000FF03FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> ldh: to-bits #!a-z0-9\-     ; \ escapes the hyphen
== make bitset! #{
000000000020FF03FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> name-rule: [l 0 1 [any ldh ld]]
== [l 0 1 [any ldh ld]]

Unfortunately this returns true only for a single-letter name and fails
for anything longer, even when the name is valid:

>> parse/all "abc-ef" name-rule
== false
>> parse/all "abcef" name-rule
== false
>> parse/all "a" name-rule
== true


This is very easy to do with SEARCH from search-text.r, and you don't
even have to prepare the bitsets beforehand:

>> search "abcdef" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== [1 6 "abcdef"]
>> search "abc-def" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== [1 7 "abc-def"]
>> search "abc--def" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== [1 8 "abc--def"]
>> search "abcdef-" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== none
>> search "0abcdef" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== none



SEARCH emulates the backtracking behavior of regular expressions.
PARSE, on the other hand, does not backtrack out of ANY: the
LET-DIGIT-HYPH-CHAR part (any ldh) matches all the way to the end of
the string, leaving nothing for the final LET-DIGIT-CHAR to match.


Actually, I have to admit this isn't so difficult with PARSE either.
You just have to look for zero or more groups, each consisting of any
number of hyphens followed by one or more alphanumerics:

>> name-rule: [l any [any "-" some ld ]]
== [l any [any "-" some ld]]
>> parse/all "abcdef" name-rule
== true
>> parse/all "abc-def" name-rule
== true
>> parse/all "abc--def" name-rule
== true
>> parse/all "abcdef-" name-rule
== false
>> parse/all "0abcdef" name-rule
== false
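
For comparison, the backtracking that SEARCH does can also be spelled
out by hand in PARSE, using alternation and a recursive rule. This is
only a minimal sketch, not part of search-text.r: it assumes the native
CHARSET function in place of TO-BITS, and TAIL-RULE is a name made up
for illustration.

```rebol
REBOL [Title: "Hand-written backtracking in PARSE (sketch)"]

; Native CHARSET stands in for TO-BITS from search-text.r (assumption).
l:   charset [#"a" - #"z" #"A" - #"Z"]
ld:  charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
ldh: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" #"-"]

; TAIL-RULE is a hypothetical helper. At each position it first tries
; "one LD, then the end of the string". If that alternative fails,
; PARSE resets to where the alternative started, consumes a single LDH
; and recurses, which is the backtracking step done by hand.
tail-rule: [ld end | ldh tail-rule]

; A lone letter is a valid name; otherwise the rest must satisfy TAIL-RULE.
name-rule: [l [end | tail-rule]]

foreach name ["a" "abc-ef" "abc--def" "abcdef-" "0abcdef"] [
    print [mold name parse/all name name-rule]
]
```

With these definitions the loop should report true for "a", "abc-ef"
and "abc--def", and false for "abcdef-" and "0abcdef". The grouped rule
above is shorter and avoids the recursion, so this version is mainly
useful for seeing where the first attempt went wrong.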


See you,
Eric
