"Paul McGuire" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
On Dec 22, 8:30 am, "BJörn Lindqvist" <[EMAIL PROTECTED]> wrote:
> With regexps you can search for strings matching it. For example,
> given the regexp: "foobar\d\d\d". "foobar123" would match. I want to
> do the reverse, from a regexp generate all strings that could match
> it.
>
> The regexp: "[A-Z]{3}\d{3}" should generate the strings "AAA000",
> "AAA001", "AAA002" ... "AAB000", "AAB001" ... "ZZZ999".
>
> Is this possible to do?

Here is a first cut at your problem
(http://pyparsing-public.wikispaces.com/space/showimage/invRegex.py).
I used pyparsing to identify repeatable ranges within a regex, then
attached generator-generating classes to parse actions for each type
of regex element. Some limitations:
- unbounded '*' and '+' repetition is not allowed
- only supports \d, \w, and \s macros

=====================

Download the latest version of this file. It is now importable as its
own module, with the invert method that takes a regexp string and
returns a generator that yields all the possible matching strings.
This file also includes a simple count method, which returns the
number of elements returned by a generator (as opposed to calling
len(list(invert("..."))), which generates an intermediate list just to
invoke len on it).

The reg exp features that have been added are:
- alternation using '|'
- min-max repetition using {min,max} format
- '.' wildcard character

Also fixed some repetition bugs, where "foobar{2}" was treated like
"(foobar){2}" - now both cases are handled correctly.

-- Paul
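For anyone who wants to see the core idea without pyparsing, here is a
minimal sketch. It is not the code from invRegex.py and it does no
regex parsing at all: it assumes the pattern "[A-Z]{3}\d{3}" has
already been expanded by hand into one character set per position.
The names invert_fixed and count are hypothetical, chosen for this
illustration; count only mirrors the idea described above of counting
a generator's items without building a list.

```python
import itertools
import string


def invert_fixed(char_sets):
    """Yield every string made by picking one character from each set, in order.

    Hypothetical helper: assumes the regex was already expanded by hand
    into a sequence of per-position character sets.
    """
    for combo in itertools.product(*char_sets):
        yield "".join(combo)


def count(gen):
    """Count the items a generator yields without building an intermediate list."""
    return sum(1 for _ in gen)


# Hand-expanded equivalent of "[A-Z]{3}\d{3}" (an assumption of this sketch)
pattern = [string.ascii_uppercase] * 3 + [string.digits] * 3

# Peek at the first few strings lazily; nothing else is generated
print(list(itertools.islice(invert_fixed(pattern), 3)))
# ['AAA000', 'AAA001', 'AAA002']

# Count a smaller pattern, the hand-expanded equivalent of "[AB]\d"
print(count(invert_fixed(["AB", string.digits])))  # 20
```

Because itertools.product varies the rightmost position fastest, the
strings come out in the order the original question asks for
("AAA000", "AAA001", ... "AAB000", ...).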