From: Edward Peschko <[EMAIL PROTECTED]>
To: Jeff Clites <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Bcc: 
Subject: Re: S5 updated
Message-ID: <[EMAIL PROTECTED]>
Reply-To: 
In-Reply-To: <[EMAIL PROTECTED]>

ok,

I'm going to answer both you and Luke in the same message to save time. 
I'm also taking it to perl6-language as suggested.

First, Jeff -- 

On Thu, Sep 23, 2004 at 08:15:08AM -0700, Jeff Clites wrote:
> On Sep 22, 2004, at 5:06 PM, Edward Peschko wrote:
> 
> >>How do you do that?  Generation and matching are two different things
> >>algorithmically.
> >
> >yes, but they are intimately linked. just like the transformation of a 
> >string
> >into a number, and from a number to a string. Two algorithmically 
> >different
> >things as well, but they'd damn-well better be exact inverses of the 
> >other.
> 
> But they're not:
> 
>   "  3 foo" --> 3 --> "3"

I'd say that that's a caveat of implementation, sort of a side effect of handling 
an error condition. By your criteria there are very few inverses - you could 
say that multiplication isn't an inverse of division because of zero, for example.

If you add the further caveat that the string to be converted has to 
consist of nothing but an integer, then they *are* direct inverses.
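
To make that caveat concrete, here is a minimal sketch in Python (chosen only
for illustration). The helper names numify and is_canonical_integer are
hypothetical; numify merely imitates lenient Perl-style numification, and
"canonical" is an added assumption meaning no padding, no leading zeros and
no plus sign.

```python
import re

def numify(s: str) -> int:
    """Lenient string-to-integer conversion in the style of Perl:
    skip leading whitespace, read a leading integer, ignore the rest."""
    m = re.match(r"\s*([+-]?\d+)", s)
    return int(m.group(1)) if m else 0

def is_canonical_integer(s: str) -> bool:
    """True if s is exactly what str() would print for some integer."""
    return re.fullmatch(r"0|-?[1-9]\d*", s) is not None

# The counterexample from the quoted text: information is lost on the way in.
lossy = "  3 foo"
round_tripped = str(numify(lossy))      # "3", not the original string

# Restricted to canonical integer strings, the two conversions are inverses.
for s in ["0", "3", "-17", "1234567890"]:
    assert is_canonical_integer(s)
    assert str(numify(s)) == s
```

On that restricted domain the round trip holds in both directions, which is
the sense of "inverse" being argued for here.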

> >My point is that if inputting strings into grammars is low level
> >enough to be an op, why isn't generating strings *from* grammars?
> 
> Maybe, because it's a less common thing to want to do? (Which is a bit 
> ironic, since technically grammars are typically characterized as sets 
> of rules for how to generate all the acceptable strings of the language 
> they define, and parsing is sort of running that in reverse.)

Well, there are two responses to the "that's not a common thing to want to do":

    1) it's not a common thing to want to do because it's not a useful thing to do.
    2) it's not a common thing to want to do because it's too damn difficult to do.

I'd say that #2 is what holds. *Everybody* has difficulties with regular 
expressions - about a quarter of my job is simply looking at other 
people's regexes used in data transformations and deciding what small 
bug is causing them to fail given a certain input. 

There's simply no way to graphically show regexes now. Even the output of 
use re 'debug' is terribly cryptic. The best way to deal with them right now 
is to burn a regex parser into your brain.


Running a regular expression in reverse has IMO the best potential for making
regexes transparent - you graphically see how they work and what they match. 
So would this get used? Yes - far more IMO than *other* parts of the 
language that already are sanctified: continuations, for example.

And as you said, at the logical level, grammars are generators first,
*not* parsers.  So not only would generators be useful and be used heavily, 
but they are the basic framework in which automata are phrased.  Why shouldn't 
that be reflected in the language itself?


> But you seemed to be saying (to which Luke replied the "How do you do 
> that?" above) that they should somehow share an implementation, so that 
> they can't accidentally diverge. But algorithmically it seems they 
> can't share an implementation, so making them both fundamental ops 
> doesn't achieve the goal of ensuring parity.

But coupling generators with the regex engine *does* achieve parity.
You make the generator generate strings to test the regex engine,
and you make the regex engine have coverage testing to test the generator.

It's not the same algorithmically, true. But that doesn't mean that the two
couldn't be linked together in a significant way.
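
A minimal sketch of that coupling, in Python purely for illustration. The
tuple-based pattern tree and the functions to_regex and generate are
hypothetical, and only literals, character classes, alternation, sequences
and bounded repetition are covered. The generator feeds the matcher, and a
brute-force pass over the matcher checks the generator's coverage.

```python
import re
from itertools import product

# Hypothetical mini pattern tree: tuples tagged "lit", "cls", "alt", "seq", "rep".
def to_regex(node):
    """Render the pattern tree as a regex for the matching engine."""
    kind = node[0]
    if kind == "lit":
        return re.escape(node[1])
    if kind == "cls":
        return "[" + "".join(re.escape(c) for c in node[1]) + "]"
    if kind == "alt":
        return "(?:" + "|".join(to_regex(n) for n in node[1]) + ")"
    if kind == "seq":
        return "".join(to_regex(n) for n in node[1])
    if kind == "rep":
        _, inner, lo, hi = node
        return "(?:" + to_regex(inner) + "){%d,%d}" % (lo, hi)
    raise ValueError(kind)

def generate(node):
    """Enumerate every string the same pattern tree can produce."""
    kind = node[0]
    if kind == "lit":
        return {node[1]}
    if kind == "cls":
        return set(node[1])
    if kind == "alt":
        return set().union(*(generate(n) for n in node[1]))
    if kind == "seq":
        results = {""}
        for n in node[1]:
            results = {a + b for a in results for b in generate(n)}
        return results
    if kind == "rep":
        _, inner, lo, hi = node
        inner_strings = generate(inner)
        return {"".join(combo)
                for count in range(lo, hi + 1)
                for combo in product(inner_strings, repeat=count)}
    raise ValueError(kind)

pattern = ("seq", [("lit", "a"),
                   ("rep", ("cls", "bc"), 1, 2),
                   ("alt", [("lit", "d"), ("lit", "ee")])])
regex = re.compile(to_regex(pattern))
generated = generate(pattern)

# Direction 1: the generator tests the matcher.
unmatched = [s for s in generated if not regex.fullmatch(s)]

# Direction 2: the matcher tests the generator (brute force over a small alphabet).
max_len = max(len(s) for s in generated)
accepted = {"".join(chars)
            for n in range(max_len + 1)
            for chars in product("abcde", repeat=n)
            if regex.fullmatch("".join(chars))}
missing = accepted - generated

assert not unmatched and not missing
```

The two halves use different algorithms, but because both are driven from one
pattern description, a divergence in either one shows up as a failed check.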


Now, Luke, I'll start with your first principle --

"Because nobody's implemented it yet. [[ I'll refer to this later, let's call
  it [Because]"


Yes, it hasn't been implemented in perl5 as a standard feature. But, Brad Bowman 
pointed out to me that there *is* Regexp::Genex which is a good seed to grow from. 

I'd also say that having a tool to generate thousands of regression tests goes 
a long way toward solving your 'Not that we ever have enough regex tests' problem.


And as *you* implied - building such a generator was easy.  So it should get done 
some day - and a good way to *get* it done is to put it into a list of 'things 
to do' so that some enterprising person could pick it up as a project.


As for speed (what does coding it in IMCC give you?): it gives you space 
savings, and it gives you the ability to use said function in other ways - 
say for temporary file generation:

        my $tempfile = g" (/tmp/filename.[a-z]**{10}) ";

serving up temporary URLs:

        return( g"(http://mysite/myfolder/[a-z]**{10}.html)");

for combinations:

        my @diceroll = g" ([1-6])**{3} ";

random number generation:

        my $number = g" ( 0 | 3 | 5 | 7 ) ";

or even for shuffling cards:

        my @cards = (0..51);
        my @card_hand = g:permute/ ( @cards )**{5} /;

without too much overhead. There's a hell of a lot of stuff that you 
could use this for. I could even see it being used for simulated annealing or
genetic algorithms or bioinformatics. After all, there's a *reason* why 
computer scientists made automata in the first place.
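
For comparison, here is roughly what those one-liners amount to when spelled
out by hand. This is a sketch in Python for illustration only, not an
implementation of the proposed g"..." operator; the helper pick is
hypothetical, dice faces are assumed to run from 1 to 6, and the fixed seed is
there only to make the sketch repeatable.

```python
import random
import string

rng = random.Random(2004)  # seeded only so the sketch is repeatable

def pick(chars, count):
    """Return `count` independent random picks from `chars`, joined into one string."""
    return "".join(rng.choice(chars) for _ in range(count))

# Temporary file name and temporary URL: a fixed frame around 10 random letters.
temp_name = "/tmp/filename." + pick(string.ascii_lowercase, 10)
temp_url = "http://mysite/myfolder/" + pick(string.ascii_lowercase, 10) + ".html"

# Three dice rolls, each drawn independently (repeats allowed).
dice_roll = [rng.randint(1, 6) for _ in range(3)]

# One choice out of a fixed set of alternatives.
number = rng.choice([0, 3, 5, 7])

# A five-card hand: drawn without replacement, which is what the
# permute-style generation is after.
card_hand = rng.sample(range(52), 5)
```

Each case needs its own bit of plumbing here, which is the overhead a single
pattern-driven generator would fold away. Note that for real temporary files a
random name alone does not protect against collisions or races.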





As for your bit of unintended sarcasm (I'll alleviate their pain by bringing 
it upon my shoulders, perl -MCPAN -e 'install "Grammar::Generate"', etc.), 
how do you propose that the generator be used for testing perl6's regular 
expression engine (and vice versa) *unless* it is bundled with perl itself? 

At the risk of a bit of unintended sarcasm myself, do you expect perl6 
to bootstrap itself off the net and pick up pieces that are needed to do 
a 'make test'? If so, what about standalone installs?


Ed

(
    ps - Regexp::Genex is the perfect example of why it should be a 
    'standard' module or operator.  (thanks for the pointer btw, Brad). Look 
    at the limitations - no anchors, no lookahead, lookbehind, code elements
    or conditionals.

    If the generator were used as the primary way of testing the regex 
    engine, do you really think that any of these limitations would exist? 
)
