date:20020828

Re: Hypothetical synonyms

2002-08-28 Thread Trey Harris


In a message dated 27 Aug 2002, Uri Guttman writes:

> >>>>> "LW" == Larry Wall <[EMAIL PROTECTED]> writes:
>
>   LW> On 27 Aug 2002, Uri Guttman wrote:
>   LW> : and quotelike might even default to " for its delim, which would
>   LW> : make that line:
>   LW> :
>   LW> : my ($fields) = /(<quotelike>|\S+)/;
>
>   LW> That just looks like:
>
>   LW> my $field = /<shellword>/;
>
> where is the grabbing there? if there was more than just shellword would
> you have to () it for a grab? wouldn't that assign a boolean like perl5
> or is the boolean result only returned in a boolean context?

Note--no parens around $field.  We're not capturing here, not in the
Perl 5 sense, anyway.

When a pattern consisting of only a named rule invocation (possibly
quantified) matches, it returns the result object, which in boolean
context returns true, but in string context returns the entire captured
text from the named rule (so, one hopes that the <shellword> rule
captures only the quoted text, not the quotes surrounding it).

I think this is more generalizable.  I believe that if one matches an
arbitrary rule which does not contain capturing parentheses, it returns
the result object as well, which should contain the entire match (as if
one put parens around the entire thing).  Correct?

So:

   my $vers = _ /<after Perl \s*> 6/;

should cause $vers to contain either "6" or "".  A successful match object
is true in boolean context, so

   my $vers = /<after Perl \s*> \d/;

would cause $vers to be true, even if the digit matched was zero.
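
To make the semantics described above concrete, here is a minimal sketch in Python rather than Perl 6, whose syntax for this was not yet settled. The names `MatchResult` and `match_after` are hypothetical and exist only for illustration. The sketch assumes just what the text says: a successful result object is true in boolean context and yields the matched text in string context. Modeling a failed match as `None` is an additional assumption.

```python
import re


class MatchResult:
    """Hypothetical stand-in for the result object discussed above."""

    def __init__(self, text):
        self.text = text

    def __bool__(self):
        # A successful match is always true, even when the text is "0" or "".
        return True

    def __str__(self):
        # String context yields the matched text.
        return self.text


def match_after(prefix_pattern, body_pattern, subject):
    """Match body_pattern right after prefix_pattern; return MatchResult or None."""
    m = re.search("(?:" + prefix_pattern + ")(" + body_pattern + ")", subject)
    return MatchResult(m.group(1)) if m else None


vers = match_after(r"Perl\s*", r"\d", "Perl 0")
# The object is true in boolean context, yet stringifies to "0".
print(bool(vers), str(vers))
```

Keeping truth and string value on one object is what lets a caller tell a failed match apart from a match of "0" without a second string comparison.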

Here's an interesting one:

   my $vers = _ /<after Perl \s*> \d/; # Stringify...
   print "yes!" if $vers;  # ... and booleanize

If $vers contained "0", would it still be true?  That is, does the "is
true" property of the result object survive stringification?  It might be
useful if it did.  On the other hand, of course, one can also imagine:

   my $flag = _ (/<after ^^ debug \s* = \s*> [01]/
                 or die "No debug setting!");
   print "yes!" if $flag;

where one would want the truth value to follow old conventions.  Perhaps
you could write:

   my $flag = /<after ^^ debug \s* = \s*>
               [ 0 :: { $0 is false }
               | 1
               ]/;

But then you have no way short of another string comparison for teasing
out the difference between a failed match and a zero match, which is what
we were trying to get away from.

Maybe I'm just making this too complicated

> what happens to $field if no match was found? undef? the old boolean
> false of a null string wouldn't be good as that could be the result of a
> match. i assume undef could never be the result of a match unless some
> included perl code returned undef to the match object. then coder emptor
> would be the rule.

If the pattern doesn't match... will it return the undefined value, or
will it return a false (and stringwise empty) result object?  I could see
it going either way, but a failed pattern result object is fairly useless,
isn't it?

> this is gonna make all the groups that copied perl5 regexes blow their
> lids. just think about all the neat canned regexes that will be
> done. like Regex::Common but even more so. we will need a CPAN just for
> these alone. full blown *ML parsers, email verifiers, formatted data
> extractors, etc.

More and more lately, I've been finding myself getting syntax errors when
I've wishfully put Perl 6 into my code. :-)

Trey

Re: auto deserialization

2002-08-28 Thread david



> Will there be automatic calling of the deserialization method
> for objects, so that code like this DWIMs...
>
>   my Date $bday = 'June 25, 2002';
>
> Err... what do you mean it to do?

Wow, this is nice. He means (I think) that this will be translated into

my Date $bday = Date->new('June 25, 2002');

As far as I've understood during my hours of lurking, it has been decided that this 
will not happen, but now is also the first time I am even slightly convinced that it 
is a good idea. 

I think it is really pretty, although it could be argued that:

a) As typing the translated code yourself would be easy, the whole idea would be a 
useless complication.

  or 

b) The translation should only happen if the class defined the method 
NEW_FROM_STRING() or some such, and that method would be used instead of new(). I 
might be afraid that would also tend to bloat the class system, but I think there must 
be a way.

Maybe a class could define the method new_from($obj) which would be called if it 
existed, and whose return value would be what was assigned to the class-hinted 
variable.


Is this going to be still-born?


david
--
(unbalanced brackets are really annoying

Re: auto deserialization

2002-08-28 Thread [EMAIL PROTECTED]


From:  [EMAIL PROTECTED]
> Wow, this is nice. He means (I think) that this will be translated into
> my Date $bday = Date->new('June 25, 2002');


I rather like it too, but it hinges on how strictly typing is enforced.  If
typing is strictly enforced then it works because the VM can always know
that since Date isn't a String, it should call the FROM_STRING static
method if such a method is available.

However, it appears that typing won't be so strictly enforced, in which
case the intent becomes ambiguous.  Does the line mean to instantiate Date
using the string, or to just assign the string to $bday and just have the
wrong type?

Is there some kind of third option?  I have to admit I've always found Java
commands like Date bday = new Date('June 25, 2002') somehow redundant.

-Miko



Re: Hypothetical synonyms

2002-08-28 Thread Aaron Sherman


On Wed, 2002-08-28 at 03:23, Trey Harris wrote:

> Note--no parens around $field.  We're not capturing here, not in the
> Perl 5 sense, anyway.
>
> When a pattern consisting of only a named rule invocation (possibly
> quantified) matches, it returns the result object, which in boolean
> context returns true, but in string context returns the entire captured
> text from the named rule (so, one hopes that the <shellword> rule
> captures only the quoted text, not the quotes surrounding it).

Ok, just to be certain:

$_ = 0;
my $zilch = /0/ || 1;

Is $zilch C<0> or 8?

If C<0>, does it continue to be true? What about:

$_ = 0;
my $zilch = /0/ || 1;
die "Failed to match zero" unless $zilch;

Is that a bug?

Re: Hypothetical synonyms

2002-08-28 Thread Trey Harris


In a message dated 28 Aug 2002, Aaron Sherman writes:
> Ok, just to be certain:
>
>   $_ = 0;
>   my $zilch = /0/ || 1;
>
> Is $zilch C<0> or 8?

8?  How do you get 8?  You'd get a result object which stringified was "0"
and booleanfied was true.  So here, you'd get a result object vaguely
isomorphic to "0 but true".

> If C<0>, does it continue to be true? What about:
>
>   $_ = 0;
>   my $zilch = /0/ || 1;
>   die "Failed to match zero" unless $zilch;
>
> Is that a bug?

Yes, it's a bug, as I don't see any way to actually die there.  I don't
understand the presence of the C<|| 1> there.  I think you'd just write
C<my $zilch = /0/;>.  If you really truly wanted it to be one if it
failed, but you still wanted the die to work, you'd write:

  $_ = 0;
  my $zilch = /0/ || 1 but false;
  die "Failed to match zero" unless $zilch;

Or, more comprehensibly, just

  $_ = 0;
  my $zilch = /0/
   or die "Failed to match zero";

Trey

Re: Hypothetical synonyms

2002-08-28 Thread Steffen Mueller


Piers Cawley wrote:
> Uri Guttman <[EMAIL PROTECTED]> writes:
> [...]
> > couldn't that be reduced to:
> >
> > m{^\s* $stuff := [ (.*?) | (\S+) ] };
> >
> > the | will only return one of the grabbed chunks and the result of
> > the [] group would be assigned to $stuff.
>
> Hmm... is this the first Perl 6 golf post?

Well, no, for two reasons:
a) There's whitespace.
b) The time's not quite ready for Perl6 golf because Larry's the only one
who would qualify as a referee.

And we all know that's not a recreational task :)

Steffen
--
@n=(544290696690,305106661574,116357),$b=16,@c=' ,JPacehklnorstu'=~
/./g;for$n(@n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse @_}

Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine


I have no objection to pattern operators like ::: in principle, but I do
have a potential concern about them.

Given that the operators are actually defined in terms of backtracking 
within the RE engine, does this constrain the implementation such that it 
MUST be a backtracking implementation to behave correctly?

If these operators are purely efficiency optimization hints, that would be
one thing, but I get the sense that ignoring the hints might lead to
incorrect behavior.  (The <cut> operator might be a special concern.)

Suppose, for the sake of argument, that someone wanted to make a pattern 
engine implementation, compatible with Perl 6 patterns, which was highly 
optimized for speed at the expense of memory, by RE->NFA->DFA construction
for simultaneous evaluation of multiple alternatives without backtracking.

This might be extremely expensive in memory, but there may be some niche 
applications where run-time speed is paramount, and a pattern is used so 
heavily in such a critical way that the user might be willing to expend 
hundreds of megabytes of RAM to make the patterns execute several times 
faster than normal.  (Obviously, such a tradeoff would be unacceptable in
the general case!)

Would it be _possible_ to create a non-backtracking implementation of a 
Perl 6 pattern engine, or does the existence of backtracking-related 
operators preclude this possibility in advance?

I hope we're not constraining the implementation options by the language 
design, but I'm worried that this might be the case with these operators.

Shouldn't it be an implementation decision whether to use backtracking?

Deven

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Simon Cozens


[EMAIL PROTECTED] (Deven T. Corzine) writes:
> Would it be _possible_ to create a non-backtracking implementation of a
> Perl 6 pattern engine

I don't believe that it is, but not just because of : and friends.
Why does it matter?

-- 
Life sucks, but it's better than the alternative.
-- Peter da Silva

Re: auto deserialization

2002-08-28 Thread Erik Steven Harrison



From:  [EMAIL PROTECTED]
> Wow, this is nice. He means (I think) that this will be translated into
> my Date $bday = Date->new('June 25, 2002');

I don't think this is going to work. First off, there 
is no predefined constructor name in Perl. Secondly, 
you can have multiple constructors in the same class. 
And thirdly Date.new (for better or for worse) does 
not have to return a Date object.

Finally, if these problems could be surmounted (ie 
Perl 6 defines an implicit constructor), then we get 
very subtle bugs like this

my Dog $spot = Poodle.new;


$spot is typed to accept Dog subclasses, right? But 
what if Dog.new is typed to accept an object as its
first argument? Or, worse, has no argument list? Does 
this construct turn into


my Dog $spot = Dog.new( Poodle.new );


or


my $spot is 'Dog';

$spot = new Poodle.new

-Erik



Re: auto deserialization

2002-08-28 Thread David Wheeler


On Wednesday, August 28, 2002, at 06:11  AM, [EMAIL PROTECTED] 
wrote:

> Is there some kind of third option?  I have to admit I've always found
> Java commands like Date bday = new Date('June 25, 2002') somehow
> redundant.

I have to agree with this. Ideally, IMO, there'd be some magic going on 
behind the scenes (maybe with a pragma?) that automatically typed 
variables so we wouldn't have to be so redundant, the code would look 
more like (most) Perl 5 OO stuff, and I'd save my tendonitis. What I 
mean (ignoring for the moment the even simpler syntax suggested earlier 
in this thread) is this:

   my $date = Date.new('June 25, 2002');

Would automatically type C<$date> as a Date object.

Thoughts?

Regards,

David

-- 
David Wheeler AIM: dwTheory
[EMAIL PROTECTED] ICQ: 15726394
http://david.wheeler.net/  Yahoo!: dew7e
Jabber: [EMAIL PROTECTED]

Re: auto deserialization

2002-08-28 Thread Dan Sugalski


At 10:36 AM +0200 8/28/02, [EMAIL PROTECTED] wrote:
> > Will there be automatic calling of the deserialization method
> > for objects, so that code like this DWIMs...
> >
> >   my Date $bday = 'June 25, 2002';
> >
> > Err... what do you mean it to do?
>
> Wow, this is nice. He means (I think) that this will be translated into
>
> my Date $bday = Date->new('June 25, 2002');

That's really unlikely. More likely what'll happen is:

   my Date $bday;
   $bday = 'June 25, 2002';

and it'll be up to $bday's string assignment code to decide what to 
do when handed a string that looks like a date.

That should work OK for a variety of reasons. $bday is strongly typed 
since you told perl what type it was in the my declaration. Date can 
also override string assignment, thus Doing The Right Thing (pitching 
a fit or taking a date) when you assign to it.
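
The idea of a typed variable whose type overrides string assignment can be sketched in Python with a descriptor. This is only an illustration under stated assumptions, not the Perl 6 mechanism: `Typed`, `parse_date`, and `Person` are hypothetical names, and the date format ("June 25, 2002") is assumed from the example in this thread.

```python
from datetime import date, datetime


class Typed:
    """Hypothetical descriptor: a slot of a declared type that coerces on assignment."""

    def __init__(self, cls, from_string):
        self.cls = cls
        self.from_string = from_string  # the type's string-assignment hook

    def __set_name__(self, owner, name):
        self.name = "_" + name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if isinstance(value, str):
            # "Taking a date"; a ValueError here is the "pitching a fit" case.
            value = self.from_string(value)
        if not isinstance(value, self.cls):
            raise TypeError(
                f"expected {self.cls.__name__}, got {type(value).__name__}"
            )
        setattr(obj, self.name, value)


def parse_date(text):
    # Assumed format for this sketch only: "June 25, 2002".
    return datetime.strptime(text, "%B %d, %Y").date()


class Person:
    bday = Typed(date, parse_date)


p = Person()
p.bday = "June 25, 2002"  # coerced when the assignment runs
print(p.bday.isoformat())
```

As in the description above, the check happens when the assignment executes, so a string that does not parse is only detected at run time.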

I can see downsides to it, though--it means you lose the compile-time 
type checking, since just because we're getting the wrong type 
doesn't mean it's really an error. OTOH it's not like we have strong 
compile-time type checking now...
-- 
 Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Dan Sugalski


At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
> Would it be _possible_ to create a non-backtracking implementation of a
> Perl 6 pattern engine, or does the existence of backtracking-related
> operators preclude this possibility in advance?

In general, no, of course it's not possible to create a
non-backtracking perl regex engine. Far too much of perl's regexes 
requires backtracking.

That doesn't mean you can't write one for a specific subset of perl's 
regexes, though. A medium-term goal for the regex engine is to note 
where a DFA would give correct behaviour and use one, rather than 
going through the more expensive generalized regex engine we'd 
otherwise use.

If you want to head over to [EMAIL PROTECTED] and pitch in on 
the regex implementation (it's being worked on now) that'd be great.
-- 
 Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: auto deserialization

2002-08-28 Thread Nicholas Clark


On Wed, Aug 28, 2002 at 12:17:55PM -0400, Dan Sugalski wrote:
> At 10:36 AM +0200 8/28/02, [EMAIL PROTECTED] wrote:
> > > Will there be automatic calling of the deserialization method
> > > for objects, so that code like this DWIMs...
> > >
> > >   my Date $bday = 'June 25, 2002';
> > >
> > > Err... what do you mean it to do?
> >
> > Wow, this is nice. He means (I think) that this will be translated into
> >
> > my Date $bday = Date->new('June 25, 2002');
>
> That's really unlikely. More likely what'll happen is:
>
>    my Date $bday;
>    $bday = 'June 25, 2002';
>
> and it'll be up to $bday's string assignment code to decide what to
> do when handed a string that looks like a date.

op wise, how is that different from the original suggestion of

my Date $bday = 'June 25, 2002';

?

> That should work OK for a variety of reasons. $bday is strongly typed
> since you told perl what type it was in the my declaration. Date can
> also override string assignment, thus Doing The Right Thing (pitching
> a fit or taking a date) when you assign to it.
>
> I can see downsides to it, though--it means you lose the compile-time
> type checking, since just because we're getting the wrong type
> doesn't mean it's really an error. OTOH it's not like we have strong
> compile-time type checking now...

If the compiler were able to see that my Date $bday = 'June 25, 2002';
is one statement that both types $bday as Date, and then assigns a constant
to it, is it possible to do the conversion of that constant to a constant
$bday object at compile time? (and hence get compile time checking)
Without affecting general run time behaviour.

Nicholas Clark

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine


On 28 Aug 2002, Simon Cozens wrote:

> [EMAIL PROTECTED] (Deven T. Corzine) writes:
> > Would it be _possible_ to create a non-backtracking implementation of a
> > Perl 6 pattern engine
>
> I don't believe that it is, but not just because of : and friends.
> Why does it matter?

I'm not saying we should dump the operators -- if we get more power by 
assuming a backtracking implementation, maybe that's a worthwhile tradeoff.

On the other hand, if we can keep the implementation possibilities more 
open, that's always a worthwhile goal, even if we're not sure if or when 
we'll ever take advantage of those possibilities, or if we even could...

It seems like backtracking is a Bad Thing, in that it leads to reprocessing 
data that we've already looked at.  On the other hand, it seems to be a 
Necessary Evil because of the memory costs of avoiding backtracking, and 
because we might have to give up valuable features without backtracking.

It may be that backreferences already demand backtracking.  Or some other 
feature might.  I don't know; I haven't thought it through.

If we must have backtracking, so be it.  But if that's a tradeoff we're 
making for more expressive and powerful patterns, we should still at least 
make that tradeoff with our eyes open.  And if the tradeoff can be avoided, 
that's even better.

Deven

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine


On Wed, 28 Aug 2002, Dan Sugalski wrote:

> At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
> > Would it be _possible_ to create a non-backtracking implementation of a
> > Perl 6 pattern engine, or does the existence of backtracking-related
> > operators preclude this possibility in advance?
>
> In general, no of course it's not possible to create a
> non-backtracking perl regex engine. Far too much of perl's regexes
> requires backtracking.

Given that Perl 5 regexes are no longer regular (much less Perl 6), I'm 
sure this is probably true.  There may be a regular subset which could be 
implemented without backtracking if problematic features are avoided, but 
surely a complete non-backtracking implementation is beyond reach.

On the other hand, :, ::, ::: and <commit> don't necessarily need to be a
problem if they can be treated as hints that can be ignored.  If allowing
the normal engine to backtrack despite the hints would change the results,
that might be a problem.  I don't know; <cut> may pose special problems.

Even if the new operators can't work without backtracking, maybe it doesn't 
matter, since there's surely a few others inherited from Perl 5 as well...

> That doesn't mean you can't write one for a specific subset of perl's
> regexes, though. A medium-term goal for the regex engine is to note
> where a DFA would give correct behaviour and use one, rather than
> going through the more expensive generalized regex engine we'd
> otherwise use.

I think this is a more realistic goal, and more or less what I had in mind.

I believe there are many subpatterns which might be beneficial to compile 
to a DFA (or DFA-like) form, where runtime performance is important.  For 
example, if a pattern is matching dates, a (Jan|Feb|Mar|Apr|...) subpattern
would be more efficient to implement as a DFA than with backtracking.  With
a large amount of data to process, that could represent significant savings...
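
As a rough illustration of the month-name case, an alternation of literal strings can be compiled into a trie, which is a DFA that reads each input character once and never rescans. The sketch below is in Python and is not taken from any Perl implementation; `build_dfa`, `match_prefix`, and the full twelve-month list are assumptions made for the example. Note that this DFA returns the longest alternative rather than the leftmost one.

```python
def build_dfa(words):
    """Build a trie-shaped DFA: a list of {char: next_state} dicts plus accepting states."""
    transitions = [{}]
    accepting = set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting.add(state)
    return transitions, accepting


def match_prefix(dfa, text, start=0):
    """Return the longest alternative matching at text[start:], or None.

    Each input character is examined once; nothing is re-scanned.
    """
    transitions, accepting = dfa
    state, last_end = 0, None
    for pos in range(start, len(text)):
        state = transitions[state].get(text[pos])
        if state is None:
            break
        if state in accepting:
            last_end = pos + 1
    return None if last_end is None else text[start:last_end]


MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
month_dfa = build_dfa(MONTHS)
print(match_prefix(month_dfa, "Aug 28, 2002"))
```

The memory cost mentioned earlier shows up here as one state per distinct prefix. For a handful of literals that is small, but a general RE-to-DFA construction can grow much faster.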

> If you want to head over to [EMAIL PROTECTED] and pitch in on
> the regex implementation (it's being worked on now) that'd be great.

I'd like to do that, if I can find the time.  It would be interesting to 
make a small experimental prototype to see if DFA construction could really 
improve performance over backtracking, but it would probably need to be a 
very restricted subset of regex operations to test the idea...

However, while I'm still on perl6-language, I have two language issues to 
discuss first:

(1) Can we have a :study modifier in Perl 6 for patterns?

It could be a no-op if necessary, but it could take the place of Perl 5's 
study operator and indicate that the programmer WANTS the pattern 
optimized for maximum runtime speed, even at the cost of compile time or 
memory.  (Hey, how about a :cram modifier for extreme optimization? :-)

(2) Would simple alternation impede DFA-style optimizations?

Currently, a pattern like (Jun|June) would never match "June" because the
leftmost match "Jun" would always take precedence, despite the normal
longest-match behavior of regexes in general.  This example could be
implemented in a DFA; would that always be the case?

Would it be better for the matching of (Jun|June) to be undefined and 
implementation-dependent?  Or is it best to require leftmost semantics?
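
The leftmost-alternative behavior in question can be observed in any backtracking engine with the same rule. The sketch below uses Python's `re` module, which, like Perl 5, tries alternatives left to right. The helper name `first_alternative` is hypothetical.

```python
import re


def first_alternative(pattern, text):
    """Return what group 1 captured when pattern matches at the start of text."""
    m = re.match(pattern, text)
    return m.group(1) if m else None


# Alternatives are tried left to right: "Jun" succeeds first, so "June" is never tried.
print(first_alternative(r"(Jun|June)", "June 25"))   # Jun

# Putting the longer alternative first changes the result.
print(first_alternative(r"(June|Jun)", "June 25"))   # June

# If the rest of the pattern fails after "Jun", the engine backtracks into "June".
print(first_alternative(r"(Jun|June) ", "June 25"))  # June
```

The third case shows why the order is observable: a DFA-style longest match would give the same answer there, but not in the first case.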

Deven

Re: auto deserialization

2002-08-28 Thread Larry Wall


On Wed, 28 Aug 2002, David Wheeler wrote:
: I have to agree with this. Ideally, IMO, there'd be some magic going on 
: behind the scenes (maybe with a pragma?) that automatically typed 
: variables so we wouldn't have to be so redundant, the code would look 
: more like (most) Perl 5 OO stuff, and I'd save my tendonitis. What I 
: mean (ignoring for the moment the even simpler syntax suggested earlier 
: in this thread) is this:
: 
:    my $date = Date.new('June 25, 2002');
: 
: Would automatically type C<$date> as a Date object.

Assignment is wrong for conferring compile-time properties, I think.
Maybe something more like:

my Date $date is new('June 25, 2002');

except that this implies the constructor args would be evaluated at
compile time.  We need to suppress that somehow.  We almost need some
kind of topicalization:

my Date $date = .new('June 25, 2002');

but I think that's taking topicalization a bit too far.  The ordinary
way to suppress early evaluation is by defining a closure.  I've argued
before for something like a topicalized closure property:

my Date $date is first { .init 'June 25, 2002' };

though "first" might be too early.  The init should be inline with
the declaration, so maybe it's

my Date $date is now { .init 'June 25, 2002' };

That might be so common that we could make syntactic sugar for it:

my Date $date { .init 'June 25, 2002' };

That's evaluating the closure for a side effect.  Or we could evaluate
it for its return value, factoring the init out into the implementation
of "now", and just get:

my Date $date { 'June 25, 2002' };

Either way, this makes data declarations more like sub declarations
in syntax, though the semantics of what you do with the final closure
when are obviously different.  That is, for ordinary data a bare {...}
is equivalent to "is now", while for a subroutine definition it's more
like "is on_demand".

Whatever.  My coffee stream hasn't yet suppressed my stream of consciousness.

Larry

Re: auto deserialization

2002-08-28 Thread Dan Sugalski


At 5:29 PM +0100 8/28/02, Nicholas Clark wrote:
> On Wed, Aug 28, 2002 at 12:17:55PM -0400, Dan Sugalski wrote:
> > At 10:36 AM +0200 8/28/02, [EMAIL PROTECTED] wrote:
> > > > Will there be automatic calling of the deserialization method
> > > > for objects, so that code like this DWIMs...
> > > >
> > > >   my Date $bday = 'June 25, 2002';
> > > >
> > > > Err... what do you mean it to do?
> > >
> > > Wow, this is nice. He means (I think) that this will be translated into
> > >
> > > my Date $bday = Date->new('June 25, 2002');
> >
> > That's really unlikely. More likely what'll happen is:
> >
> >    my Date $bday;
> >    $bday = 'June 25, 2002';
> >
> > and it'll be up to $bday's string assignment code to decide what to
> > do when handed a string that looks like a date.
>
> op wise, how is that different from the original suggestion of
>
>   my Date $bday = 'June 25, 2002';

It isn't. It was mostly to stem the followup "eight zillion flavors
of new" cascade that was sure to follow. :)

> > That should work OK for a variety of reasons. $bday is strongly typed
> > since you told perl what type it was in the my declaration. Date can
> > also override string assignment, thus Doing The Right Thing (pitching
> > a fit or taking a date) when you assign to it.
> >
> > I can see downsides to it, though--it means you lose the compile-time
> > type checking, since just because we're getting the wrong type
> > doesn't mean it's really an error. OTOH it's not like we have strong
> > compile-time type checking now...
>
> If the compiler were able to see that my Date $bday = 'June 25, 2002';
> is one statement that both types $bday as Date, and then assigns a constant
> to it, is it possible to do the conversion of that constant to a constant
> $bday object at compile time? (and hence get compile time checking)
> Without affecting general run time behaviour.

That's possible, yes. We could construct the object at compiletime 
and store a real serialized version in the bytecode, and deserialize 
at execution time. We probably will do that, though maybe not for the 
first version of the compilers.
-- 
 Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Sean O'Rourke


On Wed, 28 Aug 2002, Deven T. Corzine wrote:
> Would it be better for the matching of (Jun|June) to be undefined and
> implementation-dependent?  Or is it best to require leftmost semantics?

For an alternation spelled out explicitly in the pattern, it seems like
undefined matching would be confusing.  I regularly order the branches of
regexes assuming they are tried left-to-right.

On the other hand, and on a related note of constrained implementation, do
we require leftmost matching for interpolated arrays of literals (e.g.
/x/)?  If, as with hyper-operators, we said the order of evaluation is
undefined, we could use a fast algorithm (Aho-Corasick?) that doesn't
preserve order.

/s

Re: auto deserialization

2002-08-28 Thread David Wheeler


On Wednesday, August 28, 2002, at 09:56  AM, Larry Wall wrote:

> my Date $date { 'June 25, 2002' };
>
> Either way, this makes data declarations more like sub declarations
> in syntax, though the semantics of what you do with the final closure
> when are obviously different.  That is, for ordinary data a bare {...}
> is equivalent to "is now", while for a subroutine definition it's more
> like "is on_demand".

I actually rather like that as a sort of compromise. Syntactic sugar, 
good.

I'm assuming, however, that the difference in syntax between the two 
different uses of {...} would be easily identifiable via the assignment 
operator, viz:

   my Date $date { 'June 25, 2002' };

vs.

   my $sub = { ... };

Correct?

Also, this leads me to wonder, is a closure actually a typed object?

   my Closure $sub = { ... };

And if so, does it matter?

> Whatever.  My coffee stream hasn't yet suppressed my stream of
> consciousness.

I think we're all the better for it! :-)

Regards,

David

-- 
David Wheeler AIM: dwTheory
[EMAIL PROTECTED] ICQ: 15726394
http://david.wheeler.net/  Yahoo!: dew7e
Jabber: [EMAIL PROTECTED]

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Larry Wall


On Wed, 28 Aug 2002, Deven T. Corzine wrote:
: I'm not saying we should dump the operators -- if we get more power by 
: assuming a backtracking implementation, maybe that's a worthwhile tradeoff.
: 
: On the other hand, if we can keep the implementation possibilities more 
: open, that's always a worthwhile goal, even if we're not sure if or when 
: we'll ever take advantage of those possibilities, or if we even could...

That is a worthy consideration, but expressiveness takes precedence
over it in this case.  DFAs are really only good for telling you
*whether* and *where* a pattern matches as a whole.  They are
relatively useless for telling you *how* a pattern matches.
For instance, a DFA can tell you that you have a valid computer
program, but can't hand you back the syntax tree, because it has
no way to decide between shifting and reducing.  It has to do both
simultaneously.

: It seems like backtracking is a Bad Thing, in that it leads to reprocessing 
: data that we've already looked at.  On the other hand, it seems to be a 
: Necessary Evil because of the memory costs of avoiding backtracking, and 
: because we might have to give up valuable features without backtracking.
: 
: It may be that backreferences already demand backtracking.  Or some other 
: feature might.  I don't know; I haven't thought it through.

I believe you are correct that backrefs require backtracking.  Maybe some smart
person will find a way to trace back through the states by which a DFA matched
to retrieve backref info, but that's probably worth a PhD or two.

Minimal matching is also difficult in a DFA, I suspect.

: If we must have backtracking, so be it.  But if that's a tradeoff we're 
: making for more expressive and powerful patterns, we should still at least 
: make that tradeoff with our eyes open.  And if the tradeoff can be avoided, 
: that's even better.

I refer you to http:://history.org/grandmother/teach/egg/suck.  :-)

That's a tradeoff I knowingly made in Perl 1.  I saw that awk had a
DFA implementation, and suffered for it as a useful tool.

And it's not just the backrefs.  An NFA can easily incorporate
strategies such as Boyer-Moore, which actually skips looking at many of
the characters, or the scream algorithm used by study, which can skip
even more.  All the DFAs I've seen have to look at every character,
albeit only once.  I suppose it's theoretically possible to compile
to a Boyer-Moore-ish state machine, but I've never seen it done.

Add to that the fact that most real-life patterns don't generally do
much backtracking, because they're written to succeed, not to fail.
This pattern never backtracks, for instance:

my ($num) = /^Items: (\d+)/;

I'm not against applying a DFA implementation where it's useful
and practical, but just because it's the best in some limited
theoretical framework doesn't cut it for me.  Humans do a great
deal of backtracking in real life, once the limits of their parallel
pattern matching circuits are exceeded.  Even in language we often
have to reparse sentences that are garden pathological.  Why should
computers be exempt? :-)

Larry

need help on perl scripts #1 newuser.pl

2002-08-28 Thread frank crowley


#!/usr/local/bin/perl
$mail_prog = '/usr/lib/sendmail' ;
# This script was generated automatically by Perl Builder(tm): http://www.solutionsoft.com

# ***ENDAUTOGEN:HEADER*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:INPUT*** Do NOT modify this line!! Do NOT enter custom code in this section.

&GetFormInput;

# The intermediate variables below make your script more readable
# but somewhat less efficient since they are not really necessary.
# If you do not want to use these variables, clear the
# Intermediate Variables checkbox in the Tools | Options dialog box, CGI Wizard tab.

$fmc = $field{'fmc'} ;   
$name = $field{'name'} ; 
$email = $field{'email'} ;   
$address1 = $field{'address1'} ; 
$address2 = $field{'address2'} ; 
$city = $field{'city'} ; 
$state = $field{'state'} ;   
$zip = $field{'zip'} ;   
$country = $field{'country'} ;   
$username = $field{'username'} ; 
$password = $field{'password'} ; 
$confpassword = $field{'confpassword'} ; 
$NewUser = $field{'NewUser'} ;   

$message = "" ;
$found_err = "" ;

# ***ENDAUTOGEN:INPUT*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:VALIDATE*** Do NOT modify this line!! Do NOT enter custom code in this section.

$errmsg = "<p></p>\n" ;

if (length($fmc) < 5) {
$message = $message.$errmsg ;
$found_err = 1 ; }

if (length($fmc) > 1952935525) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p>Please enter a valid email address</p>\n" ;

if ($email !~ /.+\@.+\..+/) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($address2) > 112076456) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($city) > 1830843236) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($state) > 112079212) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($zip) > 168650098) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($country) > 542395983) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($username) > 112001069) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($password) > 332) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($confpassword) > 50528256) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "<p></p>\n" ;

if (length($NewUser) > 112137472) {
$message = $message.$errmsg ;
$found_err = 1 ; }

if ($found_err) {
&PrintError; }


# ***ENDAUTOGEN:VALIDATE*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:LOGFILE*** Do NOT modify this line!! Do
NOT enter custom code in this section.

# ***ENDAUTOGEN:LOGFILE*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:EMAIL*** Do NOT modify this line!! Do NOT
enter custom code in this section.

# ***ENDAUTOGEN:EMAIL*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:HTML*** Do NOT modify this line!! Do NOT
enter custom code in this section.
print "Content-type: text/html\n\n";
print '?xml version=1.0 encoding=utf-8?'.\n ;
print !DOCTYPE html\n ;
print ' PUBLIC -//W3C//DTD XHTML Basic 1.0//EN'.\n
;
print '
http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd;'.\n
;
print 'html xmlns=http://www.w3.org/1999/xhtml;
lang=en-USheadtitlejovens mtg auction - New
User/title'.\n ;
print 'link rev=made
href=mailto:joven_20%40yahoo.com; /'.\n ;
print '/headbody text=#FF
bgcolor=#00form method=post
action=http://magicauction.netfirms.com/cgi-bin/newuser.cgi;
enctype=application/x-www-form-urlencoded'.\n ;
print CENTER\n ;
print '  IMG
SRC=http://www.mtgauction.com/lazarusMtGlogoHORIZ.GIF;
'.\n ;
print '   ALT=lazarus MtG auction HEIGHT=69
WIDTH=519 ALIGN=CENTER'.\n ;
print   BR\n ;
print 'input type=hidden name=fmc value=N
/table cellpadding=1 align=CENTER width=70%
cellspacing=1 border=0captionh2FONT
FACE=Arial, HelveticaNew User Entry
Form/FONT/h2 p align=CENTERFONT SIZE=-1Items
in Bbold/B are'.\n ;
print ' mandatory for participation in the
auction./FONT/p/caption trth width=25%
align=RIGHTFONT COLOR=#FFName:/FONT/thtd
align=LEFTinput type=text name=name  size=40
maxlength=40 //td/tr trth width=25%
align=RIGHTFONT COLOR=#FFE-mail
address:/FONT/thtd align=LEFTinput
type=text name=email  size=40 maxlength=50
//td/tr trth width=25% align=RIGHTFONT
COLOR=#FFAddress:/FONT/thtd
align=LEFTinput type=text name=address1 
size=40 maxlength=40 //td/tr trtd
width=25% align=RIGHTnbsp/tdtd
align=LEFTinput type=text name=address2 
size=40 maxlength=40 //td/tr trth
width=25% align=RIGHTFONT
COLOR=#FFCity:/FONT/thtd

#2 auction.pl

2002-08-28 Thread frank crowley


#!/usr/local/bin/perl
$mail_prog = '/usr/lib/sendmail' ;
# This script was generated automatically by Perl
Builder(tm): http://www.solutionsoft.com

# ***ENDAUTOGEN:HEADER*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:INPUT*** Do NOT modify this line!! Do NOT
enter custom code in this section.

GetFormInput;

# The intermediate variables below make your script
more readable
# but somewhat less efficient since they are not
really necessary.
# If you do not want to use these variables, clear the
# Intermediate Variables checkbox in the Tools |
Options dialog box, CGI Wizard tab.

$vcolor = $field{'vcolor'} ; 
$sc = $field{'sc'} ; 
$vsets = $field{'vsets'} ;   
$vlanguages = $field{'vlanguages'} ; 
$vrarities = $field{'vrarities'} ;   
$ChangeView = $field{'ChangeView'} ; 
$White = $field{'White'} ;   
$Blue = $field{'Blue'} ; 
$Black = $field{'Black'} ;   
$Red = $field{'Red'} ;   
$Green = $field{'Green'} ;   
$Gold = $field{'Gold'} ; 
$Artifact = $field{'Artifact'} ; 
$Land = $field{'Land'} ; 
$_cgifields = $field{'.cgifields'} ; 

$message = "" ;
$found_err = "" ;

# ***ENDAUTOGEN:INPUT*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:VALIDATE*** Do NOT modify this line!! Do
NOT enter custom code in this section.

$errmsg = p/p\n ;

if (length($vlanguages)  1092690721) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = p/p\n ;

if (length($Green)  20) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = p/p\n ;

if (length($Gold)  197379) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = p/p\n ;

if (length($Artifact)  23) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = p/p\n ;

if (length($Land)  1162158653) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = p/p\n ;

if (length($_cgifields)  537529662) {
$message = $message.$errmsg ;
$found_err = 1 ; }

if ($found_err) {
PrintError; }


# ***ENDAUTOGEN:VALIDATE*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:LOGFILE*** Do NOT modify this line!! Do
NOT enter custom code in this section.

# ***ENDAUTOGEN:LOGFILE*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:EMAIL*** Do NOT modify this line!! Do NOT
enter custom code in this section.

# ***ENDAUTOGEN:EMAIL*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:HTML*** Do NOT modify this line!! Do NOT
enter custom code in this section.

# ***ENDAUTOGEN:HTML*** Do NOT modify this line!!  You
may enter custom code after this line.


# ***AUTOGEN:ERRPRINT*** Do NOT modify this line!! Do
NOT enter custom code in this section.

sub PrintError { 
print "Content-type: text/html\n\n";
print $message ;

exit 0 ;
return 1 ; 
}

# ***ENDAUTOGEN:ERRPRINT*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:PARSE*** Do NOT modify this line!! Do NOT
enter custom code in this section.
sub GetFormInput {

    (*fval) = @_ if @_ ;

    local ($buf);
    if ($ENV{'REQUEST_METHOD'} eq 'POST') {
        read(STDIN,$buf,$ENV{'CONTENT_LENGTH'});
    }
    else {
        $buf=$ENV{'QUERY_STRING'};
    }
    if ($buf eq "") {
        return 0 ;
    }
    else {
        @fval=split(/&/,$buf);
        foreach $i (0 .. $#fval){
            ($name,$val)=split (/=/,$fval[$i],2);
            $val=~tr/+/ /;
            $val=~ s/%(..)/pack("c",hex($1))/ge;
            $name=~tr/+/ /;
            $name=~ s/%(..)/pack("c",hex($1))/ge;

            if (!defined($field{$name})) {
                $field{$name}=$val;
            }
            else {
                $field{$name} .= ",$val";

                #if you want multi-selects to goto into an array change to:
                #$field{$name} .= "\0$val";
            }
        }
    }
    return 1;
}


# ***ENDAUTOGEN:PARSE*** Do NOT modify this line!! 
You may enter custom code after this line.



=
frank crowley

__
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com

need help in getting the website to acknowledge cgi and perl script when clicking on link to go to new user signup html page, as well as auction.html page

2002-08-28 Thread frank crowley


and for them to interact. 
http://magicauction.netfirms.com/index.html
I am trying to get the "preview auction" link to go to
auction.cgi, and the "new user" link to go to
newuser.cgi; both scripts are in the cgi-bin.


=
frank crowley


Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Larry Wall


On Wed, 28 Aug 2002, Deven T. Corzine wrote:
: I'd like to do that, if I can find the time.  It would be interesting to 
: make a small experimental prototype to see if DFA construction could really 
: improve performance over backtracking, but it would probably need to be a 
: very restricted subset of regex operations to test the idea...

That'd be cool.

: However, while I'm still on perl6-language, I have two language issues to 
: discuss first:
: 
: (1) Can we have a :study modifier in Perl 6 for patterns?
: 
: It could be a no-op if necessary, but it could take the place of Perl 5's 
: study operator and indicate that the programmer WANTS the pattern 
: optimized for maximum runtime speed, even at the cost of compile time or 
: memory.  (Hey, how about a :cram modifier for extreme optimization? :-)

Well, "studied" isn't really a property of a pattern--it's a property of a
string that knows it will have multiple patterns matched against it.  One
could put a :study on the first pattern, but that's somewhat deceptive.

: (2) Would simple alternation impede DFA-style optimizations?
: 
: Currently, a pattern like (Jun|June) would never match "June" because the 
: leftmost match "Jun" would always take precedence, despite the normal 
: longest-match behavior of regexes in general.  This example could be 
: implemented in a DFA; would that always be the case?

Well, "June" can match if what follows fails to match after "Jun".
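
A small sketch of that point, using Python's re module as a stand-in because it is also a backtracking engine that tries alternatives left to right. It is only an illustration of the behavior described; the Perl 6 engine under discussion is not assumed to work identically, and the sample string is made up.

```python
import re

# On its own, the leftmost alternative wins: "Jun" is tried first and succeeds.
alone = re.match(r"(Jun|June)", "June 2002")

# When more pattern follows, "Jun" is still tried first, but " 2002" cannot
# match at "e 2002", so the engine backtracks and tries "June" instead.
followed = re.match(r"(Jun|June) 2002", "June 2002")

print(alone.group(1))
print(followed.group(1))
```

The first match captures "Jun" and the second captures "June", which is the leftmost-first behavior the question is about.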

: Would it be better for the matching of (Jun|June) to be undefined and 
: implementation-dependent?  Or is it best to require leftmost semantics?

Well, the semantics shouldn't generally wobble around like that, but it'd
be pretty easy to let them wobble on purpose via pragma (or via :modifier,
which are really just pragmas that scope to regex groups).

Larry

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Dan Sugalski


At 10:36 AM -0700 8/28/02, Larry Wall wrote:
 On Wed, 28 Aug 2002, Deven T. Corzine wrote:
 : I'm not saying we should dump the operators -- if we get more power by
 : assuming a backtracking implementation, maybe that's a worthwhile tradeoff.
 :
 : On the other hand, if we can keep the implementation possibilities more
 : open, that's always a worthwhile goal, even if we're not sure if or when
 : we'll ever take advantage of those possibilities, or if we even could...
 
 That is a worthy consideration, but expressiveness takes precedence
 over it in this case.  DFAs are really only good for telling you
 *whether* and *where* a pattern matches as a whole.  They are
 relatively useless for telling you *how* a pattern matches.
 For instance, a DFA can tell you that you have a valid computer
 program, but can't hand you back the syntax tree, because it has
 no way to decide between shifting and reducing.  It has to do both
 simultaneously.

While true, there are reasonably common cases where you don't care 
about what or where, just whether. For a set of mushed-together 
examples:

while (<>) {
last if /END_OF_DATA/;
$line .= $_ if /=$/;
next unless /$user_entered_string/;
}

Sure, it's a restricted subset of the stuff people do, and that's 
cool. I'd not even want to put in DFA-detecting code in the main 
regex compilation grammar. But in those cases where it is useful, a 
:dfa switch for regexes would be nifty.


(Though *please* don't yet--we've not gotten the current grammar 
fully implemented :)
-- 
 Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: need help on perl scripts #1 newuser.pl

2002-08-28 Thread Luke Palmer


This is really the wrong place to be sending this.   This is Perl 5 (or 
maybe even Perl 4, which I don't know) code, and this is a list for 
discussing the design of Perl 6.  A good place to send this would 
probably be [EMAIL PROTECTED]

Good Luck,
Luke


Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine


On Wed, 28 Aug 2002, Larry Wall wrote:

 : (1) Can we have a :study modifier in Perl 6 for patterns?
 : 
 : It could be a no-op if necessary, but it could take the place of Perl 5's 
 : study operator and indicate that the programmer WANTS the pattern 
 : optimized for maximum runtime speed, even at the cost of compile time or 
 : memory.  (Hey, how about a :cram modifier for extreme optimization? :-)
 
 Well, "studied" isn't really a property of a pattern--it's a property of a
 string that knows it will have multiple patterns matched against it.  One
 could put a :study on the first pattern, but that's somewhat deceptive.

Oh yeah.  I forgot it applied to the string, not the pattern!  I forgot 
since I never use it! :-)  Still, it could be considered a parallel...

Perhaps a better approach would be to allow the optimization priorities to 
be specified, perhaps even as numerical ranges for relative importance?  
The three obvious dimensions to quantify would be compile time, run-time 
speed, and memory usage.  There are often tradeoffs among these three, and 
letting the programmer specify his/her preferences could allow for 
aggressive optimizations that are normally inappropriate...

Of course, these would be useful not only as modifiers for compiling any 
regexes, but as general pragmas controlling optimizing behavior of the 
entire Perl 6 compiler/optimizer...

I'm not sure if it's good enough to just say "optimize for run-time speed 
at the expense of compile time and memory" (or variations for only having 
one of the two sacrificed) -- or if it's better to have a scale (say, 0-9) 
for how important each dimension is.

For the extreme case where long compile time and large memory usage are 
irrelevant, but extreme run-time speed is a must, the programmer might 
specify optimization priorities of compile=0, memory=0, speed=9.  I'm not 
sure what sort of syntax would be appropriate for such specifications...

 : (2) Would simple alternation impede DFA-style optimizations?
 : 
 : Currently, a pattern like (Jun|June) would never match "June" because the 
 : leftmost match "Jun" would always take precedence, despite the normal 
 : longest-match behavior of regexes in general.  This example could be 
 : implemented in a DFA; would that always be the case?
 
 Well, "June" can match if what follows fails to match after "Jun".

True enough.  Couldn't that still be implemented in a DFA?  (Possibly at 
the cost of doubling the size of the DFA for the later part of the regex!)

 : Would it be better for the matching of (Jun|June) to be undefined and 
 : implementation-dependent?  Or is it best to require leftmost semantics?
 
 Well, the semantics shouldn't generally wobble around like that, but it'd
 be pretty easy to let them wobble on purpose via pragma (or via :modifier,
 which are really just pragmas that scope to regex groups).

Yeah, it's probably safer not to have that much room for undefined behavior 
since people will just try it and assume that their implementation is the 
universal behavior...

Would there be a good way to say "don't care" about the longest-vs-leftmost 
matching semantics?  Would it be worthwhile to have longest-trumps-leftmost 
as an optional modifier?  (This might be very expensive if implemented in a 
backtracking engine, since it could no longer shortcut alternations...)

Dan suggested :dfa for DFA semantics -- is that the best answer, or would 
it be better to define the modifiers in terms of visible behavior rather 
than implementation, if possible?

Deven

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Steve Fink


On Wed, Aug 28, 2002 at 12:55:44PM -0400, Deven T. Corzine wrote:
 On Wed, 28 Aug 2002, Dan Sugalski wrote:
  At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
 
 On the other hand, :, ::, ::: and <commit> don't necessarily need to be a 
 problem if they can be treated as hints that can be ignored.  If allowing 
 the normal engine to backtrack despite the hints would change the results, 
 that might be a problem.  I don't know; <cut> may pose special problems.

They do change the semantics.

 (June|Jun) ebug

matches the string "Junebug", but

 (June|Jun): ebug

does not. Similarly,

 (June|Jun) ebug
 (June|Jun) :: ebug
 (June|Jun) ::: ebug
 (June|Jun) <commit> ebug

all behave differently when embedded in a larger grammar.
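
As a rough illustration of the first pair, here is a sketch in Python, whose re module is a backtracking engine. Python has no ":" operator, so the "no backtracking into the group" effect is emulated with the common lookahead-plus-backreference idiom; that emulation is an assumption made for this example, not the Perl 6 mechanism. The spaces are dropped because Python's default syntax treats them as literal.

```python
import re

text = "Junebug"

# Plain alternation: "June" is tried first, "ebug" then fails against "bug",
# so the engine backs up into the group and succeeds with "Jun" + "ebug".
backtracking = re.search(r"(June|Jun)ebug", text)

# Emulated commitment: a lookahead is never re-entered once it has matched,
# so group 1 stays "June", "ebug" cannot follow it, and the match fails.
committed = re.search(r"(?=(June|Jun))\1ebug", text)

print(backtracking.group(0) if backtracking else None)
print(committed)
```

The first search finds "Junebug"; the second finds nothing, so the commitment changes what the pattern accepts and is not only a speed hint.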

However, it is very possible that in many (the majority?) of actual
uses, they may be intended purely as optimizations and so any
differing behavior is unintentional. It may be worth having a flag
noting that (maybe combined with a warning: "you said this :: isn't
supposed to change what can match, but it does").

  That doesn't mean you can't write one for a specific subset of perl's 
  regexes, though. A medium-term goal for the regex engine is to note 
  where a DFA would give correct behaviour and use one, rather than 
  going through the more expensive generalized regex engine we'd 
  otherwise use.
 
 I think this is a more realistic goal, and more or less what I had in mind.
 
 I believe there are many subpatterns which might be beneficial to compile 
 to a DFA (or DFA-like) form, where runtime performance is important.  For 
 example, if a pattern is matching dates, a (Jan|Feb|Mar|Apr|...) subpattern
 would be more efficient to implement as a DFA than with backtracking.  With 
 a large amount of data to process, that represent significant savings...

I agree. I will reply to this in perl6-internals, though.
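
For concreteness, a minimal sketch of the kind of table-driven matcher meant by the month-name example quoted above, written in Python. The word list, function names, and trie layout are assumptions made for illustration only; this is not a proposal for how Parrot's regex engine would do it.

```python
def build_trie(words):
    """Build a deterministic transition table: state -> {char: next_state}."""
    transitions = [{}]      # state 0 is the start state
    accepting = set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting.add(state)
    return transitions, accepting


def match_prefix(transitions, accepting, text, start=0):
    """Return the longest listed word found at `start`, or None.

    Each input character is examined at most once; nothing is retried.
    """
    state = 0
    longest_end = None
    for pos in range(start, len(text)):
        nxt = transitions[state].get(text[pos])
        if nxt is None:
            break
        state = nxt
        if state in accepting:
            longest_end = pos + 1
    return None if longest_end is None else text[start:longest_end]


# Illustrative word list only; the original message elides the full set.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "June", "Jul"]
table, finals = build_trie(MONTHS)
print(match_prefix(table, finals, "June 28"))
print(match_prefix(table, finals, "Mar 3"))
print(match_prefix(table, finals, "Xyz"))
```

Note that this sketch returns the longest listed word ("June" rather than "Jun"), so it has exactly the longest-versus-leftmost difference raised in question (2) below.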

 (2) Would simple alternation impede DFA-style optimizations?
 
 Currently, a pattern like (Jun|June) would never match "June" because the 
 leftmost match "Jun" would always take precedence, despite the normal 
 longest-match behavior of regexes in general.  This example could be 
 implemented in a DFA; would that always be the case?

You should read Friedl's Mastering Regular Expressions, if you haven't
already. A POSIX NFA would be required to find the longest match (it
has to work as if it were a DFA). A traditional NFA produces what
would result from the straightforward backtracking implementation,
which often gives an answer closer to what the user expects. IMO,
these are often different, and the DFA would surprise users fairly
often.

 Would it be better for the matching of (Jun|June) to be undefined and 
 implementation-dependent?  Or is it best to require leftmost semantics?

I'm voting leftmost, because I've frequently seen people depend on
it. I'm not so sure that Larry's suggestion of adding a :dfa flag is
always the right approach, because I think this might actually be
something you'd want to set for subsets of a grammar or a single
expression. I don't think it's useful enough to go as far as proposing
that || mean alternate without defining the order of preference, but
perhaps some angle-bracketed thing would work. (Or can you embed
flags in expressions, like perl5's (?imsx:R) thing? Then the :dfa flag
is of course adequate!)

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine


On Wed, 28 Aug 2002, Larry Wall wrote:

 That is a worthy consideration, but expressiveness takes precedence
 over it in this case.

I see nothing wrong with expressiveness taking precedence -- I'm only 
saying that it would be best to be cognizant of any restrictions we're 
hardcoding into the design (in case there's a less restrictive means to the 
same ends) and make that design tradeoff knowingly rather than by default.

If we can find general solutions that don't demand a particular style of 
implementation, that's probably an improvement.  There may be unavoidable 
cases, in which case we decide to accept the limitation for expressiveness, 
and that's a perfectly reasonable design choice.

I'd just hate to ignore the issue now, and have someone later say "here's a 
great way it could have been done that would have allowed this improvement 
in the implementation"...

 DFAs are really only good for telling you *whether* and *where* a pattern 
 matches as a whole.  They are relatively useless for telling you *how* a 
 pattern matches.  For instance, a DFA can tell you that you have a valid 
 computer program, but can't hand you back the syntax tree, because it has
 no way to decide between shifting and reducing.  It has to do both
 simultaneously.

Yes and no.  You're right, but see below.

 : It may be that backreferences already demand backtracking.  Or some other 
 : feature might.  I don't know; I haven't thought it through.
 
 I believe you are correct that backrefs require backtracking.  Maybe some 
 smart person will find a way to trace back through the states by which a 
 DFA matched to retrieve backref info, but that's probably worth a PhD or 
 two.

Well, there are certainly PhD students out there doing new research all the 
time.  Who knows what one will come up with one day?  It would suck if one 
gets a PhD for a super-clever pattern-matching algorithm, and we find that 
we can't use it because of hardcoded assumptions in the language design...

As for backtracing states of the DFA, see below.

 Minimal matching is also difficult in a DFA, I suspect.

Is it?  I'm not sure.  Since the DFA effectively follows all branches of 
the NFA at once, perhaps minimal matching is no more difficult than maximal?

Then again, maybe not. :-)
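
To make "follows all branches of the NFA at once" concrete, here is a small Python sketch that advances a set of NFA states in lockstep and records every position where some branch accepts. The hand-built NFA for Jun|June and all names are assumptions for illustration. It covers only the easy case of a single pattern anchored at the start, so it does not settle whether minimal matching inside a larger pattern is equally easy.

```python
def simulate(transitions, start, accepting, text):
    """Advance every live NFA state in lockstep; never back up in `text`.

    Returns (shortest_end, longest_end) over all accepted prefixes,
    or (None, None) when no prefix is accepted.
    """
    current = {start}
    ends = [0] if start in accepting else []
    for pos, ch in enumerate(text):
        current = {nxt
                   for state in current
                   for nxt in transitions.get((state, ch), ())}
        if not current:
            break
        if current & accepting:
            ends.append(pos + 1)
    if not ends:
        return None, None
    return ends[0], ends[-1]


# Hand-built NFA for the alternation Jun|June (state numbers are arbitrary).
NFA = {
    (0, "J"): {1, 4},
    (1, "u"): {2}, (2, "n"): {3},                 # branch J-u-n
    (4, "u"): {5}, (5, "n"): {6}, (6, "e"): {7},  # branch J-u-n-e
}
shortest, longest = simulate(NFA, 0, {3, 7}, "Junebug")
print(shortest, longest)
```

Both the shortest and the longest accepting positions fall out of the same single pass, but the result says only where a match ends, not which alternative produced it.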

 : If we must have backtracking, so be it.  But if that's a tradeoff we're 
 : making for more expressive and powerful patterns, we should still at least 
 : make that tradeoff with our eyes open.  And if the tradeoff can be avoided, 
 : that's even better.
 
 I refer you to http:://history.org/grandmother/teach/egg/suck.  :-)

Um, huh?

 That's a tradeoff I knowingly made in Perl 1.  I saw that awk had a
 DFA implementation, and suffered for it as a useful tool.

I suspect it's not practical to have an all-DFA implementation with nearly 
the power and expressiveness of Perl 4 or Perl 5 regexes, much less Perl 6.

On the other hand, many patterns have subpatterns which might benefit from 
using a DFA as an optimization.  You don't lose the expressiveness here,
if the backtracking NFA is available as well.

I'm just asking that we consider the impact on such optimizations, and see 
if we can leave the door open to reap the benefits without compromising the 
power and expressiveness we all want.  Maybe this just amounts to adding a 
few modifiers to allow semantic variants (like longest-trumps-leftmost), to 
enable optimizations that would otherwise impinge on correctness...

 And it's not just the backrefs.  An NFA can easily incorporate
 strategies such as Boyer-Moore, which actually skips looking at many of
 the characters, or the scream algorithm used by study, which can skip
 even more.  All the DFAs I've seen have to look at every character,
 albeit only once.  I suppose it's theoretically possible to compile
 to a Boyer-Moore-ish state machine, but I've never seen it done.

Okay, I confess that I've been saying "DFA" when I don't necessarily mean 
precisely that.  What I really mean is a "non-backtracking state machine" 
of some sort, but I'm calling it a DFA because it would be similar to one 
(to the degree possible) and people know what a DFA is.  I could say "NBSM", 
but that seems confusing. :-)

Your objections to the limitations of a DFA are quite correct, of course.  
Modifications would be required to overcome the limits, and then it's no 
longer really a DFA, just like Perl 5's regular expressions are no longer 
really regular expressions in the mathematical sense.  I'm envisioning a 
state machine of some sort, which has a lot in common with a DFA but isn't 
strictly a DFA anymore.  If you prefer, I'll call it an "NBSM", or I'm open 
to better suggestions for a name!
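
For readers unfamiliar with the Boyer-Moore idea quoted above, here is a minimal Python sketch of the simpler Horspool variant for a literal string. It is not Perl's implementation; the function name and the comparison counter are assumptions added to show that many characters are never examined.

```python
def horspool_find(needle, haystack):
    """Find the first occurrence of `needle` in `haystack`.

    Returns (index, comparisons); index is -1 when there is no match.
    `comparisons` counts character comparisons, to show how much is skipped.
    """
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0, 0
    # For each needle character except the last, how far the window may
    # slide when that character is aligned under the window's last position.
    shift = {ch: m - 1 - i for i, ch in enumerate(needle[:-1])}
    comparisons = 0
    pos = 0
    while pos + m <= n:
        j = m - 1
        while j >= 0:                    # compare right to left
            comparisons += 1
            if haystack[pos + j] != needle[j]:
                break
            j -= 1
        if j < 0:
            return pos, comparisons
        # Characters absent from the needle allow a full-length jump.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1, comparisons


data = "x" * 50 + "END_OF_DATA"
index, looked_at = horspool_find("END_OF_DATA", data)
print(index, looked_at, len(data))
```

On this input the search makes fewer comparisons than there are characters in the text, which is the kind of skipping a state machine that reads every character cannot do.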

Anyway, to respond to your objections to a DFA:

* While you couldn't hand back a syntax tree from a true DFA, it should be
  possible to create an NBSM from a DFA recognizer, modified to record 
  whatever extra information is needed to execute the code that constructs 
  the syntax tree.  The

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine


On Wed, 28 Aug 2002, Steve Fink wrote:

 On Wed, Aug 28, 2002 at 12:55:44PM -0400, Deven T. Corzine wrote:
  On Wed, 28 Aug 2002, Dan Sugalski wrote:
   At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
  
  On the other hand, :, ::, ::: and <commit> don't necessarily need to be a 
  problem if they can be treated as hints that can be ignored.  If allowing 
  the normal engine to backtrack despite the hints would change the results, 
  that might be a problem.  I don't know; <cut> may pose special problems.
 
 They do change the semantics.
 
  (June|Jun) ebug
 
 matches the string "Junebug", but
 
  (June|Jun): ebug
 
 does not. Similarly,
 
  (June|Jun) ebug
  (June|Jun) :: ebug
  (June|Jun) ::: ebug
  (June|Jun) <commit> ebug
 
 all behave differently when embedded in a larger grammar.

Good point.  Okay, they definitely change the semantics.  Still, could such 
semantics be implemented in a non-backtracking state machine, whether or 
not it's a strict DFA?

 However, it is very possible that in many (the majority?) of actual
 uses, they may be intended purely as optimizations and so any
 differing behavior is unintentional. It may be worth having a flag
 noting that (maybe combined with a warning: "you said this :: isn't
 supposed to change what can match, but it does").

I think this is like the leftmost matching semantics -- it may exist for 
the sake of implementation efficiency, yet it has semantic consequences as 
well.  In many cases, those semantic differences may be immaterial, yet 
some code relies on it.  Allowing flags to specify that such differences in
semantics are immaterial to your pattern would be helpful.  (Would it make 
sense for one flag to say "don't care" to the semantic differences for BOTH 
leftmost matching and the :/::/:::/etc. operators?)

  I believe there are many subpatterns which might be beneficial to compile 
  to a DFA (or DFA-like) form, where runtime performance is important.  For 
  example, if a pattern is matching dates, a (Jan|Feb|Mar|Apr|...) subpattern
  would be more efficient to implement as a DFA than with backtracking.  With 
  a large amount of data to process, that represent significant savings...
 
 I agree. I will reply to this in perl6-internals, though.

Yes, the discussion of details belongs there, when it's not infringing on 
issues of language design, as the semantic consequences do...

  (2) Would simple alternation impede DFA-style optimizations?
  
  Currently, a pattern like (Jun|June) would never match "June" because the 
  leftmost match "Jun" would always take precedence, despite the normal 
  longest-match behavior of regexes in general.  This example could be 
  implemented in a DFA; would that always be the case?
 
 You should read Friedl's Mastering Regular Expressions, if you haven't
 already. A POSIX NFA would be required to find the longest match (it
 has to work as if it were a DFA). A traditional NFA produces what
 would result from the straightforward backtracking implementation,
 which often gives an answer closer to what the user expects. IMO,
 these are often different, and the DFA would surprise users fairly
 often.

Rather than following a traditional approach to NFA/DFA construction, would 
it be possible to use a modified approach that preserves leftmost matching?
(If so, would it be more expensive or just different?)

  Would it be better for the matching of (Jun|June) to be undefined and 
  implementation-dependent?  Or is it best to require leftmost semantics?
 
 I'm voting leftmost, because I've frequently seen people depend on it.

I was going to agree, until I read your next paragraph.

However, it would be useful to be able to say "don't care" to the semantic 
distinction -- it might even be useful to be able to demand longest-match 
take precedence over leftmost matching, but that would incur an extra cost 
in the normal regex engine...

 I'm not so sure that Larry's suggestion of adding a :dfa flag is
 always the right approach, because I think this might actually be
 something you'd want to set for subsets of a grammar or a single
 expression. I don't think it's useful enough to go as far as proposing
 that || mean alternate without defining the order of preference, but
 perhaps some angle-bracketed thing would work. (Or can you embed
 flags in expressions, like perl5's (?imsx:R) thing? Then the :dfa flag
 is of course adequate!)

You know, when you bring up an idea like ||, I start thinking that maybe 
the default should be NOT to have a preference (since it normally doesn't 
matter) and to only guarantee the leftmost short-circuit behavior with || 
instead of |.  That would allow for more implementation flexibility, and 
provide a beautiful parallel with C -- in C, only || short-circuits and 
the | operator still evaluates all parts.  (Granted, that's because it's 
bitwise, but there's still a nice parallel there.)

For the few cases where someone wants to rely on leftmost matching of any 
alternation, they could simply use ||.
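
To make the (Jun|June) distinction concrete, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for a traditional backtracking engine; the helper names `leftmost_alt` and `longest_alt` are invented for illustration and are not part of any proposal in this thread.

```python
import re

def leftmost_alt(alternatives, text):
    """Ordered alternation: the first alternative that matches at the start wins."""
    pattern = "|".join(re.escape(a) for a in alternatives)
    m = re.match(pattern, text)
    return m.group(0) if m else None

def longest_alt(alternatives, text):
    """POSIX-style longest match, emulated by checking every alternative."""
    hits = [a for a in alternatives if text.startswith(a)]
    return max(hits, key=len) if hits else None

print(leftmost_alt(["Jun", "June"], "June"))  # ordered alternation stops at "Jun"
print(longest_alt(["Jun", "June"], "June"))   # longest-match picks "June"
```

Reordering the alternatives changes the first result but not the second, which is the property a "don't care" alternation would give up.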

Re: Hypothetical synonyms

2002-08-28 Thread Markus Laire


On 28 Aug 2002 at 16:04, Steffen Mueller wrote:

 Piers Cawley wrote:
  Uri Guttman [EMAIL PROTECTED] writes:
  ... regex code ...
 
  Hmm... is this the first Perl 6 golf post?
 
 Well, no, for two reasons:
 a) There's whitespace.
 b) The time's not quite ready for Perl6 golf because Larry's the only one
 who would qualify as a referee.

I think that time is just right for starting to golf in perl6. Parrot 
with languages/perl6 already supports a working subset of perl6.

I'm currently trying to get factorial-problem from last Perl Golf 
working in perl6, and it has proven to be quite a challenge... 
(only 32bit numbers, modulo not fully working, no capturing regexps, 
)

And I'm definitely going to try any future PerlGolf challenges also 
in perl6.

-- 
Markus Laire 'malaire' [EMAIL PROTECTED]

Re: rule, rx and sub

2002-08-28 Thread Damian Conway


Sean O'Rourke wrote:

 I hope this is wrong, because if not, it breaks this:
 
 if 1 { do something }
 foo $x;
 
 in weird ways.  Namely, it gets parsed as:
 
 if(1, sub { do something }, foo($x));
 
 which comes out as wrong number of arguments to `if', which is just
 strange.

Any subroutine/function like C<if> that has a signature (parameter list)
that ends in a C<&sub> argument can be parsed without the trailing
semicolon. So C<if>'s signature is:

    sub if (bool $condition, &block);

So the trailing semicolon isn't required.

Likewise I could write my own C<perhaps> subroutine:

    sub perhaps (bool $condition, num $probability, &block) {
        return unless $condition;
        return unless $probability > rand;
        $block();
    }

and then code:

    perhaps $x<$y, 0.25 { print "Happened to be less than\n" }
    perhaps $x>$y, 0.50 { print "Happened to be greater than\n" }

without the trailing semicolons.
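
As a rough analogue of the perhaps subroutine above, here is a Python sketch. Python has no trailing-block syntax, so the block is passed as an ordinary callable; the `rng` parameter is an assumption added only so the random draw can be replaced in tests.

```python
import random

def perhaps(condition, probability, block, rng=random.random):
    """Run block() only if condition holds and a random draw falls below probability."""
    if not condition:
        return None
    if not probability > rng():
        return None
    return block()

x, y = 1, 2
# Each call fires at most once, and only some of the time.
perhaps(x < y, 0.25, lambda: print("Happened to be less than"))
perhaps(x > y, 0.50, lambda: print("Happened to be greater than"))
```

The point of the original proposal is purely syntactic: in Perl 6 the final block argument could follow the other arguments without a comma or a trailing semicolon, which Python cannot express.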

Damian

Re: Hypothetical synonyms

2002-08-28 Thread Nicholas Clark


On Tue, Aug 27, 2002 at 08:59:09PM -0400, Uri Guttman wrote:
  LW == Larry Wall [EMAIL PROTECTED] writes:
 
   LW On 27 Aug 2002, Uri Guttman wrote: : and quoteline might even
   LW default to  for its delim which would make : that line:
   LW : 
   LW : my ($fields) = /(quotelike|\S+)/;
 
   LW That just looks like:
 
   LW my $field = /shellword/;

 and it would be nice to have a dictionary of builtin rules. :)

my $data = /xml/;

It would make 1 liners very powerful.

How long before someone writes that and ships it with parrot?

And the $64,000 question - will the perl regexp engine be faster than
calling expat? Or will they be the same (because the regexp compiler has
certain builtin rules that are actually implemented as calls to C code
(unless they are over-ridden))?

Nicholas Clark
-- 
Even better than the real thing:http://nms-cgi.sourceforge.net/

Re: Hypothetical synonyms

2002-08-28 Thread Nicholas Clark


On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
 And I'm definitely going to try any future PerlGolf challenges also 
 in perl6.

Is it considered better if perl6 uses more characters than perl5 (i.e.,
implying probably less line noise), or fewer (getting your job done more
tersely)?

It would be interesting to see whether there are classes of problems that
go in different directions.

Nicholas Clark
-- 
Even better than the real thing:http://nms-cgi.sourceforge.net/

RE: rule, rx and sub

2002-08-28 Thread Thom Boyer


Damian Conway wrote:
 Any subroutine/function like C<if> that has a signature (parameter list)
 that ends in a C<&sub> argument can be parsed without the trailing
 semicolon. So C<if>'s signature is:
 
 sub if (bool $condition, &block);

So what does the signature for C<while> look like? I've been wondering about
this for a long time, and I've searched the Apocalypses and the
perl6-language archive for an answer, but I've had no success.

It seems like C<while>'s signature might be something like one of these:

  sub while (bool $test, &body);
  sub while (&test, &body);

But neither of these really works. 

The first would imply that the test is evaluated only once (and that once is
before 'sub while' is even called). That'd be useless.

The second would allow multiple evaluations of the test condition (since
it's a closure). But it seems that it would also require the test expression
to have curly braces around it. And possibly a comma between the test-block
and the body-block. That'd be ugly.
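
The difference between the two candidate signatures can be sketched in Python, where the "closure" version corresponds to passing a callable. The names `while_eager` and `while_lazy` are hypothetical and exist only for this illustration.

```python
def while_eager(test, body):
    """First signature: test is a plain value the caller evaluated exactly once."""
    runs = 0
    if test:  # the value can never change, so this sketch runs the body at most once
        body()
        runs += 1
    return runs

def while_lazy(test, body):
    """Second signature: test is a closure, re-evaluated before every iteration."""
    runs = 0
    while test():
        body()
        runs += 1
    return runs

counter = {"n": 0}

def bump():
    counter["n"] += 1

while_lazy(lambda: counter["n"] < 3, bump)
print(counter["n"])  # the closure version loops until the test turns false
```

The `lambda:` wrapper plays the role of the curly braces the second signature would force around the test expression, which is exactly the ugliness being objected to.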

I can create a hypothetical bareblock rule that says:

  When an argument's declaration contains an ampersand sigil,
  then you can pass an expression block (i.e., a simple 
  expression w/o surrounding curlies) to that argument.

Is there such a rule for Perl 6? 

On the positive side, this would be a reasonable generalization of the Perl
5 handling of expressions given to map or grep. On the negative side, this
rule makes it impossible to have such arguments fulfilled by evaluating an
expression that returns the desired closure (i.e., the expression you type
as an argument isn't intended to be the block you pass, but rather it is
intended to generate the block you want to pass).

In summary: assuming Perl 6 allows user-defined while-ish structures, how
would it be done?

=thom
   The rowboat glided gently across the lake, exactly like a bowling ball
wouldn't.

RE: rule, rx and sub

2002-08-28 Thread David Whipp


Thom Boyer [mailto:[EMAIL PROTECTED]] wrote:
   sub while (bool $test, &body);
   sub while (&test, &body);
 
 But neither of these really works. 
 
 The first would imply that the test is evaluated only once 
 (and that once is
 before 'sub while' is even called). That'd be useless.

It seems to me that this can be thought of as analogous, in a strange kind
of way, to hyper-operator things. Thus:

 sub while (bool $^test, &body)
 {
   return unless $^test;
   body;
   redo;
 }

Dave.

Re: Hypothetical synonyms

2002-08-28 Thread Steffen Mueller


Nicholas Clark wrote:
 On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
 And I'm definitely going to try any future PerlGolf challenges also
 in perl6.

 Is it considered better if perl6 use more characters than perl5? (ie
 implying probably less line noise)
 or less (getting your job done more tersely?)

From the bit of Perl6 information I've gathered from the Apocalypses, the
Exegesises (is that really the plural? Sounds horrible.), and my
perl6-language reading, I'd say Perl6 is not only going to be a bit more
verbose (unless you use the dreaded use Perl5; pragma ;) ), but it'll also
be a Good Thing.

Applying that to Perl Golf, however, isn't possible. It doesn't make sense
to ask whether less line noise is better in golf. Anybody who has seen any
of the winning solutions should realize that whoever wrote that either used
some random string generator or tried to create ASCII art from a color
scan of bird droppings.

Maybe I am just a bit frustrated that I had such a hard time understanding
some of the solutions. :)

 It would be interesting to see whether there are classes of problems
 that go in different directions.

I guess over 90 percent of problems will be longer; possibly about 60
percent being significantly longer. (Mainly because of the changes of A5.)

Steffen
--
n=(544290696690,305106661574,116357),$b=16,c=' ,JPacehklnorstu'=~
/./g;for$n(n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse _}

Re: auto deserialization

2002-08-28 Thread Steffen Mueller


Nicholas Clark wrote:
[...]
 If the compiler were able to see that my Date $bday = 'June 25, 2002';
 is one statement that both types $bday as Date, and then assigns a
 constant to it, is it possible to do the conversion of that constant
 to a constant $bday object at compile time? (and hence get compile
 time checking) Without affecting general run time behaviour.

While that may be possible (I can't tell, I gladly take Dan's word for it),
it doesn't make much sense IMHO. It means that you can only initialize those
objects with constants. That's not a problem for people who know Perl well,
but it is going to be one hell of a confusion for anybody learning Perl. I
can see people whining on clpm why they can't do my Dog $rex =
sub_returning_string();. Again IMHO, taking Perl's flexibility in *some*
cases is much worse than making it Java.

Steffen
--
n=(544290696690,305106661574,116357),$b=16,c=' ,JPacehklnorstu'=~
/./g;for$n(n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse _}

RE: rule, rx and sub

2002-08-28 Thread Larry Wall


On Wed, 28 Aug 2002, Thom Boyer wrote:
: Damian Conway wrote:
:  Any subroutine/function like C<if> that has a signature (parameter list)
:  that ends in a C<&sub> argument can be parsed without the trailing
:  semicolon. So C<if>'s signature is:
:  
:  sub if (bool $condition, &block);
: 
: So what does the signature for C<while> look like? I've been wondering about
: this for a long time, and I've searched the Apocalypses and the
: perl6-language archive for an answer, but I've had no success.
: 
: It seems like C<while>'s signature might be something like one of these:
: 
:   sub while (bool $test, &body);
:   sub while (&test, &body);
: 
: But neither of these really works. 

That's correct.  Maybe something like

  sub while (&test is expr, &body);

But that would be shorthand for something more general--see below.

: The first would imply that the test is evaluated only once (and that once is
: before 'sub while' is even called). That'd be useless.
: 
: The second would allow multiple evaluations of the test condition (since
: it's a closure). But it seems that it would also require the test expression
: to have curly braces around it. And possibly a comma between the test-block
: and the body-block. That'd be ugly.

Maybe we could have something like:

 sub while (&test is rx/<expr>/, &body);

or some such.  That probably isn't sufficient to pick <expr> out of Perl's
grammar rather than the current lexical scope.

: I can create a hypothetical bareblock rule that says:
: 
:   When an argument's declaration contains an ampersand sigil,
:   then you can pass an expression block (i.e., a simple 
:   expression w/o surrounding curlies) to that argument.
: 
: Is there such a rule for Perl 6? 

Not at the moment.  It'd be pure obfuscation if people did that where
curlies *are* expected.  I still want the curlies required on an else,
for instance.

: On the positive side, this would be a reasonable generalization of the Perl
: 5 handling of expressions given to map or grep.

I don't particularly like the old map and grep syntax.

: On the negative side, this
: rule makes it impossible to have such arguments fulfilled by evaluating an
: expression that returns the desired closure (i.e., the expression you type
: as an argument isn't intended to be the block you pass, but rather it is
: intended to generate the block you want to pass).

Well, we could make the same sort of rule that we (eventually) did
for bare blocks--if you want to return a closure in that circumstance
you'd have to use `sub' (or `return', in the case of a bare block).

: In summary: assuming Perl 6 allows user-defined while-ish structures, how
: would it be done?

I think the secret is to allow easy attachment of regex rules to sub
and parameter declarations.  There's little point in re-inventing
regex syntax using declarations.  The whole point of making Perl 6
parse itself with regexes is to make this sort of stuff easy.

Larry

Re: Hypothetical synonyms

2002-08-28 Thread Sean O'Rourke


On Thu, 29 Aug 2002, Markus Laire wrote:
 (only 32bit numbers, modulo not fully working, no capturing regexps,
 )

Where does modulo break?

/s

Re: auto deserialization

2002-08-28 Thread Larry Wall


On Thu, 29 Aug 2002, Steffen Mueller wrote:
: Nicholas Clark wrote:
: [...]
:  If the compiler were able to see that my Date $bday = 'June 25, 2002';
:  is one statement that both types $bday as Date, and then assigns a
:  constant to it, is it possible to do the conversion of that constant
:  to a constant $bday object at compile time? (and hence get compile
:  time checking) Without affecting general run time behaviour.
: 
: While that may be possible (I can't tell, I gladly take Dan's word for it),
: it doesn't make much sense IMHO. It means that you can only initialize those
: objects with constants. That's not a problem for people who know Perl well,
: but it is going to be one hell of a confusion for anybody learning Perl. I
: can see people whining on clpm why they can't do my Dog $rex =
: sub_returning_string();. Again IMHO, taking Perl's flexibility in *some*
: cases is much worse than making it Java.

We're not going to define it so they can only initialize with constants.
That would be silly.  I think Dan is talking about the case where we
can detect that it is a constant at compile time.  As such, it's just
constant folding, on the assumption that we also know the constructor
isn't going to change.

Again, though, assignment to a normal variable is unlikely to invoke
a constructor in any case.

Larry

Re: rule, rx and sub

2002-08-28 Thread Sean O'Rourke


On Wed, 28 Aug 2002, Damian Conway wrote:
 Any subroutine/function like C<if> that has a signature (parameter list)
 that ends in a C<&sub> argument can be parsed without the trailing
 semicolon. So C<if>'s signature is:

   sub if (bool $condition, &block);

 So the trailing semicolon isn't required.

Okay, so curlies always make surrounding commas optional (or verboten?),
and make trailing semis optional when no more arguments are expected.
This seems natural, and naturally extended to allow this

$x = { 1 => 2, ... }
$y = $x;

or even this

$x = { ... } $y = $x;

since the parser sees ($x, =, {) and, knowing that it only wants a
single value, takes the closing } to be the end of the statement.  This
would let you do ugly things like this:

@xs = (1 { $^x + 2 } 3, 4); # second element is a closure

but most of the time, people would probably write readable code by
accident.

Also, to follow up in two directions...

First, if `if' can be defined as above, is this a syntactic or a semantic
error (or not an error at all):

if $test { ... }
some_other_thing();
elsif $test2 { ... }  # matching if above.

I personally think it would be nifty, and would fit in with the ability to
mix code with whens in a given.  There'd be a bit of extra overhead
involved in tracking whether or not we'd seen a true condition yet in the
current if-sequence, but that's peanuts compared to other overhead.
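
The bookkeeping described here, remembering whether a branch of the current if-sequence has already fired, might look something like the following Python sketch. The `IfChain` class and its method names are hypothetical, and unlike a real `elsif` this sketch evaluates every condition eagerly.

```python
class IfChain:
    """Tracks whether any branch of the current if/elsif sequence has already fired."""

    def __init__(self):
        self.taken = False

    def if_(self, condition, block):
        self.taken = False  # a fresh `if` starts a new sequence
        return self._branch(condition, block)

    def elsif(self, condition, block):
        return self._branch(condition, block)

    def _branch(self, condition, block):
        # Skip the block if an earlier branch fired or this condition is false.
        if self.taken or not condition:
            return None
        self.taken = True
        return block()

log = []
seq = IfChain()
seq.if_(False, lambda: log.append("if"))
log.append("some_other_thing")  # unrelated code between the branches
seq.elsif(True, lambda: log.append("elsif"))
seq.elsif(True, lambda: log.append("second elsif"))
print(log)  # only the first true branch runs
```

The single `taken` flag is the "bit of extra overhead" in question; everything else is an ordinary conditional call.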

Second, is there a prototype-way to specify the arguments to for
(specifically, the first un-parenthesized multidimensional array argument)?
In other words, is that kind of signature expected to be used often enough
to justify not forcing people to explicitly extend the grammar?

/s

Re: Hypothetical synonyms

2002-08-28 Thread Luke Palmer


On Thu, 29 Aug 2002, Steffen Mueller wrote:

 Nicholas Clark wrote:
  On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
  And I'm definitely going to try any future PerlGolf challenges also
  in perl6.
 
  Is it considered better if perl6 use more characters than perl5? (ie
  implying probably less line noise)
  or less (getting your job done more tersely?)
 
 From the bit of Perl6 information I've gathered from the Apocalypses, the
 Exegesises (is that really the plural? Sounds horrible.), and my

  Exegeses (like parentheses)

 perl6-language reading, I'd say Perl6 is not only going to be a bit more
 verbose (unless you use the dreaded use Perl5; pragma ;) ), but it'll also
 be a Good Thing.

No, not necessarily.  If you do a line-by-line translation, yes.  But the 
fact is, Perl 6 will be able to do more in a single line (cleanly) than 
Perl 5.  For instance, hyper-operators.  So, Perl 6 will contain less 
line-noise and more whitespace than Perl 5, but code will end up being 
shorter, too.  You can see that in Exegesis 4 (or 3, not sure), where Damian 
takes Perl5ish Perl6 code, and then writes it back out in idiomatic Perl 
6. You see how much shorter it becomes.

Luke

Re: rule, rx and sub

2002-08-28 Thread Luke Palmer


 Second, is there a prototype-way to specify the arguments to for
 (specifically, the first un-parenthesized multidimensional array argument)?
 In other words, is that kind of signature expected to be used often enough
 to justify not forcing people to explicitly extend the grammar?

If you're talking about parallel iteration, I know what you mean.  I think 
there's a time for a special case, and that's one of them.  But it 
wouldn't be hard to extend that into a signature, I suppose.

If you're talking about the regular syntax:

    for @a, @b -> $x { ... }

Would that be:

    sub rof (array *@ars, &body) {...}

or

    sub rof (*@ars is array, &body) {...}

Saying specifically a list of arrays.  Also, would that list gobble up 
everything, or would it actually allow that coderef on the end?

Luke

Re: rule, rx and sub

2002-08-28 Thread Sean O'Rourke


On Wed, 28 Aug 2002, Luke Palmer wrote:

  Second, is there a prototype-way to specify the arguments to for
  (specifically, the first un-parenthesized multidimensional array argument)?
  In other words, is that kind of signature expected to be used often enough
  to justify not forcing people to explicitly extend the grammar?

 If you're talking about parallel iteration, I know what you mean.

Yeah, that's what I was talking about, though IIRC parallel iteration
refers to how the data is used.  I may be on crack here, but I think that
stuff before the arrow is just a multidimensional array, like

   my @a = (1, 2; 3, 4)

but, since we're expecting it, the parens are optional.

 I think there's a time for a special case, and that's one of them.

I probably agree here (if mucking with the parser is a straightforward
thing to do).

 If you're talking about the regular syntax:

   for @a, @b -> $x { ... }

 Would that be:

   sub rof (array *@ars, &body) {...}

 or

   sub rof (*@ars is array, &body) {...}

Being able to specify fixed arguments after a splat looks illegal, or at
least immoral.  It opens the door to backtracking in argument parsing,
e.g.:

sub foo (*@args, &func, *@more_args, $arg, &func) { ... }

 Saying specifically a list of arrays.  Also, would that list gobble up
 everything, or would it actually allow that coderef on the end?

I would expect it to be a syntax error, since the slurp parameter has to
be the last.
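
For comparison only, Python sidesteps the same ambiguity by making any parameter declared after a slurpy one keyword-only, so no backtracking over the argument list is ever needed. The sketch below reuses the name `rof` from the message above; passing the block as `body=` is a Python convention, not anything proposed for Perl 6.

```python
import itertools

def rof(*arrays, body):
    """Apply body to every element of the flattened arrays; body is keyword-only."""
    return [body(x) for x in itertools.chain.from_iterable(arrays)]

a, b = [1, 2], [3, 4]
print(rof(a, b, body=lambda x: x * 10))

# Passing the callable positionally is rejected: it would be swallowed by *arrays,
# leaving the required keyword-only parameter unfilled.
try:
    rof(a, b, lambda x: x * 10)
except TypeError as err:
    print("rejected:", err)
```

The cost of this rule is the explicit `body=` at the call site, which is the kind of syntactic noise the trailing-block proposal is trying to avoid.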

/s

Re: auto deserialization

2002-08-28 Thread Dan Sugalski


At 5:19 PM -0700 8/28/02, Larry Wall wrote:
On Thu, 29 Aug 2002, Steffen Mueller wrote:
: Nicholas Clark wrote:
: [...]
:  If the compiler were able to see that my Date $bday = 'June 25, 2002';
:  is one statement that both types $bday as Date, and then assigns a
:  constant to it, is it possible to do the conversion of that constant
:  to a constant $bday object at compile time? (and hence get compile
:  time checking) Without affecting general run time behaviour.
:
: While that may be possible (I can't tell, I gladly take Dan's word for it),
: it doesn't make much sense IMHO. It means that you can only initialize those
: objects with constants. That's not a problem for people who know Perl well,
: but it is going to be one hell of a confusion for anybody learning Perl. I
: can see people whining on clpm why they can't do my Dog $rex =
: sub_returning_string();. Again IMHO, taking Perl's flexibility in *some*
: cases is much worse than making it Java.

We're not going to define it so they can only initialize with constants.
That would be silly.  I think Dan is talking about the case where we
can detect that it is a constant at compile time.  As such, it's just
constant folding, on the assumption that we also know the constructor
isn't going to change.

I actually had something a bit more subversive in mind, where the 
assignment operator for the Date class did some magic the same way we 
do now when we do math on strings.

On second thought, that's not a great idea. I think just passing 
parameters in to the class's initialization method is better; 
otherwise we'll have string auto-conversion going on all over the 
place.
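
A minimal Python sketch of the explicit alternative, where the string goes to a named initialization method rather than being converted by assignment. The `Date` class, its fields, and `from_string` are all hypothetical names chosen for this illustration.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

class Date:
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def from_string(cls, text):
        """Parse text such as 'June 25, 2002'; raises ValueError on malformed input."""
        month_name, day_part, year_part = text.replace(",", " ").split()
        if month_name not in MONTHS:
            raise ValueError(f"unknown month: {month_name!r}")
        return cls(int(year_part), MONTHS.index(month_name) + 1, int(day_part))

# The conversion is spelled out at the call site instead of hiding in assignment.
bday = Date.from_string("June 25, 2002")
print(bday.year, bday.month, bday.day)
```

Because the conversion is an ordinary method call, it works the same for a literal and for a string returned by a subroutine.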

-- 
 Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: rule, rx and sub

2002-08-28 Thread Larry Wall


On Wed, 28 Aug 2002, Sean O'Rourke wrote:
: Being able to specify fixed arguments after a splat looks illegal, or at
: least immoral.  It opens the door to backtracking in argument parsing,
: e.g.:
: 
: sub foo (*@args, &func, *@more_args, $arg, &func) { ... }
: 
:  Saying specifically a list of arrays.  Also, would that list gobble up
:  everything, or would it actually allow that coderef on the end?
: 
: I would expect it to be a syntax error, since the slurp parameter has to
: be the last.

This sort of thing must be done with real parsing rules.  These can return
a list of args as a single args argument without having to play with splat.

Larry
