"Paul McGuire" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]

On Dec 22, 8:30 am, "BJörn Lindqvist" <[EMAIL PROTECTED]> wrote:
> With regexps you can search for strings matching it. For example,
> given the regexp: "foobar\d\d\d". "foobar123" would match. I want to
> do the reverse, from a regexp generate all strings that could match
> it.
>
> The regexp: "[A-Z]{3}\d{3}" should generate the strings "AAA000",
> "AAA001", "AAA002" ... "AAB000", "AAB001" ... "ZZZ999".
>
> Is this possible to do?

Here is a first cut at your problem
(http://pyparsing-public.wikispaces.com/space/showimage/invRegex.py).
I used pyparsing to identify repeatable ranges within a regex, then
attached generator-generating classes to parse actions for each type of
regex element.  Some limitations:
- unbounded '*' and '+' repetition is not allowed (the set of matching
  strings would be infinite)
- only supports \d, \w, and \s macros
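
To illustrate the general idea (this is not the code in invRegex.py, just a
hand-written sketch): if each regex element is reduced to the set of
characters it can match, the matching strings are the Cartesian product of
those sets.  The expand helper and the hand-built parts list below are
hypothetical names for this example only; no regex parsing is done here.

```python
import itertools
import string

def expand(parts):
    """Yield every string formed by picking one item from each part, in order."""
    for combo in itertools.product(*parts):
        yield "".join(combo)

# "[A-Z]{3}\d{3}" written out by hand as six character sets (an assumption:
# \d is taken to mean the ASCII digits 0-9 only).
parts = [string.ascii_uppercase] * 3 + [string.digits] * 3

# The full result has 26**3 * 10**3 strings, so only peek at the start.
first_three = list(itertools.islice(expand(parts), 3))
print(first_three)
```

Because itertools.product varies the rightmost position fastest, the strings
come out in the order given in the original question.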

=====================

Download the latest version of this file.  It is now importable as its own 
module, with an invert function that takes a regexp string and returns a 
generator that yields all the possible matching strings.  The file also 
includes a simple count function, which returns the number of items a 
generator yields.  This avoids calling len(list(invert("..."))), which 
builds an intermediate list just to invoke len on it.
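
A counting helper of this kind can be very small.  The version below is only
a sketch of the technique; the count in the downloadable file may be written
differently.

```python
def count(gen):
    """Return how many items an iterable yields, without storing them."""
    # sum() consumes the generator one item at a time, so memory use stays
    # constant no matter how many strings are produced.
    return sum(1 for _ in gen)

# Stand-in generator for demonstration; any generator works here.
n = count(str(i) for i in range(1000))
print(n)
```

Note that counting consumes the generator, so a fresh one is needed to
iterate over the strings afterwards.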

The regexp features that have been added are:
- alternation using '|'
- min-max repetition using {min,max} format
- '.' wildcard character
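
As a rough sketch of how alternation and {min,max} repetition can be
expressed with generators (the function names alternation and repetition are
hypothetical and are not taken from the module): alternation chains the
strings of each branch, and bounded repetition walks every allowed length.
A '.' wildcard could be handled the same way as a character set, given some
chosen alphabet; which alphabet to use is left open here.

```python
import itertools

def alternation(*branches):
    """'a|b': yield everything from each branch in turn."""
    return itertools.chain(*branches)

def repetition(chars, lo, hi):
    """'[chars]{lo,hi}': yield strings of lo..hi items drawn from chars."""
    for n in range(lo, hi + 1):
        for combo in itertools.product(chars, repeat=n):
            yield "".join(combo)

ab = list(repetition("ab", 1, 2))              # like "[ab]{1,2}"
either = list(alternation(["cat"], ["dog"]))   # like "cat|dog"
print(ab, either)
```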

Also fixed some repetition bugs, where "foobar{2}" was treated like 
"(foobar){2}".  Now the {2} applies only to the preceding "r" unless the 
group is parenthesized, so both cases are handled correctly.

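For reference, the difference between the two patterns can be checked with
the standard re module (this only shows what the patterns match; it does not
use invRegex.py):

```python
import re

# The quantifier binds to the single preceding element, here just "r".
single = re.fullmatch(r"foobar{2}", "foobarr") is not None

# With parentheses, the whole group "foobar" is repeated.
grouped = re.fullmatch(r"(foobar){2}", "foobarfoobar") is not None

# The ungrouped pattern must not match the doubled word.
cross = re.fullmatch(r"foobar{2}", "foobarfoobar") is not None

print(single, grouped, cross)
```
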
-- Paul


