On Thu, Apr 03, 2003 at 07:30:10AM -0700, Luke Palmer wrote:
> > just an aside, and a bit off-topic, but has anybody considered
> > hijacking the regular expression engine in perl6 and turning it into
> > its opposite, namely making *productions* of strings/sounds/whatever
> > that could possibly match the regular expression? ie:
> >
> >     a*
> >
> > producing
> >
> >     ''
> >     a
> >     aa
> >     aaa
> >     aaaa
> >
> > etc.
> >
> > I could think of lots of uses for this:
> >
> >     1) test data for databases and programs.
> >     2) input for genetic algorithms/code generators
> >     3) semantic rules/production of random sentences.
> >
> > In fact, I foresee a combo of 2,3 and some expert system somewhere
> > producing the first sentient perl program. ;-)
>
> Yeah, it seems like a neat idea. It is if you generate it
> right... but, fact is, you probably won't. For anything that's more
> complex than your /a*/ example, it breaks down (well, mostly):
>
>     /\w+: \d+/
>
> Would most likely generate:
>
>     a: 0
>     a: 00
>     a: 000
>     a: 0000
>
> Or:
>
>     a: 0
>     a: 1
>     ...
>     a: 9
>     a: 00
>     a: 01
>
> ad infinitum, never getting to even aa: .*

But that's the point - I don't want it to be able to generate just all
possibilities, I want it to be able to generate a subset of valid
possibilities. And have:

    a) a default heuristic for doing so, based on a regex
    b) user defined heuristics for doing so

Although I disagree with you on the idea that it has no uses as is -
generating all possible combinations. You could do:

    my @list is Regex::Generator(/([1-6])([1-6^\1])([1-6^\1\2])/);

to return a list of all combinations of numbers between 1 and 6 and:

    my @words = qw( word list number one );
    my @words2 = qw( word list number two );
    my @list is Regex::Generator(/ (@words) (@words2) /);

to generate all possible combinations of words. You could also test
hard-to-understand rexen by simplifying them and generating all possible
combinations:

    my $_doublestring = q$(?:\"(?>[^\\\"]+|\\\.)*\")$;

becomes

    my $_doublestring = q$(?:\"(?>[notdq]+|\\\")*\")$;

to generate:

    ""
    "n"
    "o"
    "t"
    ...
    "\""

> But I guess then you'd see a lot more quantifiers and such.
>
>     /\w+<8>: \d<4>/

or substituting something more manageable like [a-f] for \w, and [1-2]
for \d.

> Is finite (albeit there are 63**8 * 10**4 == 2,481,557,802,675,210,000
> combinations). References to the heat death of the universe, anyone?
>
> And then there's Unicode. %-/

> In reality, I don't think it would be that useful. Theoretically,
> though, you *can* look inside the regex parse tree and create a
> generator out of it... so, some module, somewhere.

Of course, it would need a little elbow grease to be truly useful. The
syntax for making heuristics in generating useful productions would take
some work. But I can think of a dozen uses for it.

Ex: Right now, I'm writing a generator to generate sample programming
problems - for a book I'm writing. It spits out both the problem and the
code to answer the problem. With a production engine like the one above,
this problem generator becomes 20 lines of code.

Ed
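
A rough sketch of the finite case discussed above, written in Python
because Regex::Generator is only a proposed interface and cannot be run.
It does not parse a regex: the pattern is assembled by hand from two
hypothetical helpers, `repeat` and `productions`, and the length bounds
play the role of the <8> and <4> quantifiers. Reading the [1-6] example
as "three distinct digits from 1 to 6", and joining the word lists with a
space, are both assumptions.

```python
from itertools import islice, permutations, product


def repeat(chars, lo, hi):
    """Yield every string over `chars` of length lo..hi, shortest first.

    Hypothetical helper: a bounded stand-in for a quantified character class.
    """
    for n in range(lo, hi + 1):
        for combo in product(chars, repeat=n):
            yield "".join(combo)


def productions(*parts):
    """Yield each concatenation of one string taken from every part.

    Hypothetical helper. Every part must be finite, because
    itertools.product reads each input completely before yielding.
    """
    for combo in product(*parts):
        yield "".join(combo)


# Bounded analogue of /\w+: \d+/, with [ab] standing in for \w and [12] for \d.
small = productions(repeat("ab", 1, 2), [": "], repeat("12", 1, 1))
print(list(islice(small, 5)))

# Assumed reading of the [1-6] example: three distinct digits from 1 to 6.
distinct = ["".join(p) for p in permutations("123456", 3)]
print(len(distinct))

# Cross product of the two word lists, joined with a space (an assumption).
words = ["word", "list", "number", "one"]
words2 = ["word", "list", "number", "two"]
pairs = list(productions(words, [" "], words2))
print(len(pairs))
```

Because every part is bounded, the enumeration finishes the one-letter
prefixes and moves on to "aa: 1" instead of looping on "a: " forever. A
user-defined heuristic could be layered on top by filtering or sampling
from `productions` rather than listing everything.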