Hi,
I really get sick of making bitsets with this kind of statement:
>> non-alpha: complement charset [#"a" - #"z" #"A" - #"Z"]
== make bitset! #{
FFFFFFFFFFFFFFFF010000F8010000F8FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
So I'm working on a shortcut that processes strings with
metacharacters. TO-BITS is what I've come up with so far.
In the first one or two characters:
~ means to get the complement
! means to include both upper and lower case
Thereafter:
a-z means all the characters from a to z
\ means quote the next character or, if digits follow, include
the character with that ASCII value (e.g. \128)
Examples:
>> to-bits "!a-z"
== make bitset! #{
0000000000000000FEFFFF07FEFFFF0700000000000000000000000000000000
}
>> to-bits "~!a-z"
== make bitset! #{
FFFFFFFFFFFFFFFF010000F8010000F8FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
>> to-bits "\128-\255"
== make bitset! #{
00000000000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
>> to-bits "~\128-\255"
== make bitset! #{
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF00000000000000000000000000000000
}
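For comparison, here is a rough sketch of the hand-written charsets that the four strings above stand for. The word names (letters, non-letters, high, low) are only illustrative and are not part of search-text.r.

```rebol
; "!a-z" : letters in both cases
letters: charset [#"a" - #"z" #"A" - #"Z"]

; "~!a-z" : everything except letters
non-letters: complement letters

; "\128-\255" : the upper half of the 8-bit range
high: charset [#"^(80)" - #"^(FF)"]

; "~\128-\255" : the 7-bit ASCII range
low: complement high
```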
Any suggestions? I'm also including a function to examine what's in a bitset:
>> unroll-bitset to-bits "!aeiou"
== "AEIOUaeiou"
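The same function is handy for checking how the escape forms are read. A small sketch, assuming both functions from this post are loaded. Going by the parse rule in to-bits, a backslash followed by digits is always taken as an ASCII value, so a literal digit is written without the backslash.

```rebol
; \- quotes the hyphen so it is not read as a range marker,
; \65 is read as ASCII value 65, the letter A
probe unroll-bitset to-bits "\-\65"    ; prints "-A"

; an unescaped digit is just a literal character
probe unroll-bitset to-bits "5\-"      ; prints "-5"
```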
Both functions are included in search-text.r on rebol.org. That version is a
little out of date, but I'm still working on some other things before I update
it again.
Hope it's of some use,
Eric
=========
to-bits: func [
    {convert a string to a bitset with:
    ~ for complement, ! for upper and lower case, - for character ranges,
    \ as escape character, or to convert following ASCII value}
    s [any-string!] "string to convert"
    /local r rr c comp ignore alpha digit
] compose [
    ; COMPOSE builds these two bitsets once, when the function is defined
    alpha: (make bitset! [#"A" - #"Z" #"a" - #"z"])
    digit: (make bitset! [#"0" - #"9"])
    r: copy []
    ; /all keeps PARSE from skipping whitespace in S
    parse/all s [
        0 2 [#"~" (comp: true) | #"!" (ignore: true)]
        some [
            #"\" [
                copy c some digit (append r to char! to integer! c) |
                copy c skip (append r to char! c)
            ] |
            #"-" (append r '-) |
            copy c skip (append r to char! c)
        ]
    ]
    either ignore [
        ; build a second spec with the case of every letter flipped
        rr: copy []
        foreach c r [
            append rr either all [ char? c find alpha c ]
                [either c > #"_" [c - 32] [c + 32] ] [c]
        ]
        r: union make bitset! r make bitset! rr
    ][ r: make bitset! r ]
    either comp [complement r][r]
]
unroll-bitset: func [
    {return string listing all characters in B}
    b [bitset!]
    /ul "return B with all characters set for upper and lower case"
    /local s i
][
    b: copy b
    s: copy ""
    i: 0
    ; collect every character from 0 to 255 that is set in B
    while [ i <= 255 ] [
        if find b to char! i [insert tail s to char! i]
        i: i + 1
    ]
    either ul [
        ; UPPERCASE and LOWERCASE modify S in place
        insert b uppercase s
        insert b lowercase s
        b
    ][s]
]
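
As a rough illustration of where these bitsets end up being used, here is a sketch of a parse rule that pulls the words out of a string. It assumes to-bits is loaded as above. The names letters, non-letters, words and w are made up for the example and are not taken from search-text.r.

```rebol
letters: to-bits "!a-z"
non-letters: to-bits "~!a-z"

words: copy []
parse/all "Hello, REBOL world!" [
    any [
        ; a run of letters is one word
        copy w some letters (append words w)
        ; anything else is skipped
        | some non-letters
    ]
]
probe words    ; prints ["Hello" "REBOL" "world"]
```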