Re: Disappearing code

2003-01-12 Thread Ken Fox
Damian Conway wrote:

sub debug is immediate is exported (@message) {
return $debugging ?? { print $*STDERR: @message; } :: {;}
}


Won't @message need lazy evaluation? How will Perl know to
delay interpolation until the result of the macro is called
at run time?

- Ken




Re: right-to-left pipelines

2002-12-10 Thread Ken Fox
Damian Conway wrote:

For that reason, even if we can solve this puzzle, it might be far kinder
just to enforce parens.


I might be weird, but when I use parens to clarify code in Perl, I
like to use the Lisp convention:

  (method $object args)

Hopefully that will still work even if Perl 6 requires parens.

- Ken




Use CPAN to help define built-ins and globals?

2002-12-09 Thread Ken Fox
Smylers wrote:

Ken Fox wrote:

How about formalizing global namespace pollution with something like
the Usenet news group formation process?  Ship Perl 6 with a very
small number of global symbols and let it grow naturally.


If the initial release of Perl 6 doesn't have commonly-required
functions then people will write their own.  People will do these in
incompatible ways ...


You're taking that out of context. Ship the commonly required
functionality, but don't introduce new global symbols.

A global C<push> function:

  push @array, 1

A class C<push> method:

  push @array: 1

Methods work well with AUTOLOAD, so they probably don't require
C<use> statements. Anyways, I'd rather have C<use> statements than
globals. I know others disagree -- I even disagree when I'm
trying to write a one-liner on the command line.

Perl 6 is the community rewrite. One of the pillars of the
community is CPAN. Could CPAN help resolve simple library and
namespace issues? Adding C<purge> or C<part> is not a language
design issue.

- Ken




Re: purge: opposite of grep

2002-12-08 Thread Ken Fox
Damian Conway wrote:

sub part ($classifier, *@list) {



return @parts;
}


Given the original example

  (@foo,@bar,@zap) := part [ /foo/, /bar/, /zap/ ] @source;

this binds the contents of @parts to (@foo,@bar,@zap)? The
array refs in @parts are not flattened though. Is it correct
to think of flattening context as a lexical flattening? i.e.
only terms written with @ are flattened and the types of
the terms can be ignored?

BTW, if part were declared as an array method, the syntax
becomes

  @source.part [ /foo/, /bar/, /zap/ ]

or

  part @source: [ /foo/, /bar/, /zap/ ]

Can part be a multi-method defined in the array class
so the original example syntax can be used? (I'd prefer
the code too because the switch statement is eliminated.)


sub convert_to_sub ($classifier is topic) is cached {


Very nice.


for @classifiers.kv -> $index, test {


An array::kv method? Very useful for sparse arrays, but
is this preferred for all arrays? An explicit index counter
seems simpler in this case.


my @indices = map { defined .key()($nextval) ?? .value :: () } %classifiers;

That map body looks like a syntax error, but it isn't. Can I add
extra syntax like

  map { defined(.key.($nextval)) ?? .value :: () }

to emphasize the fact that .key is returning a code ref?

Last, but not least, the Hash case returns a junction (most
likely of a single value). Junctions don't collapse like
superpositions, so I'm wondering what really happens.

Can you describe the evaluation? I'm really interested in how
long the junction lasts (how quickly it turns into an integer
index), and what happens with a duplicate (ambiguous?) index.

Sorry for so many questions. The code you wrote was just a
really, really good example of many Perl 6 features coming
together.

[This is out of order; Damian wrote it in another message.]
 Everything doesn't. Everything shouldn't be. Just the really common,
 important stuff.

So CGI.pm is in?

I don't think "really common, important" is a good criterion for
being in the core. IMHO it should be language defining, awkward or
impossible to implement as a module.

Perhaps the part method can be implemented as a mix-in module that
extends array without subclassing it? AUTOLOAD can do that now
for packages. Are classes sealed or will they use AUTOLOAD too?

- Ken




Re: purge: opposite of grep

2002-12-07 Thread Ken Fox
Michael Lazzaro wrote:

(@foo,@bar,@zap) := classify { /foo/ ;; /bar/ ;; /zap/ } @source;
A shorthand for:

  for @source {
      given {
          when /foo/ { push @foo, $_ }
          when /bar/ { push @bar, $_ }
          when /zap/ { push @zap, $_ }
      }
  }


How about just

  (@foo,@bar,@zap) := classify [ rx/foo/, rx/bar/, rx/zap/ ] @source;

and implement classify as a normal sub? Why does everything
have to be built into the first version of Perl 6?

Is there any reason classify can't be a normal sub? e.g. can
a sub return ( [], [], [] ) and have that bound to 3 array
variables? What about return @AoA when @AoA = ( [], [], [] )?
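
As a rough illustration in Python rather than Perl 6, an ordinary function can return a list of lists that the caller unpacks into separate variables. The name classify and its first-match-wins behaviour are assumptions taken from the given/when expansion quoted above, not a proposed built-in.

```python
import re

def classify(patterns, source):
    """Partition source into one bucket per pattern; the first matching pattern wins."""
    buckets = [[] for _ in patterns]
    for item in source:
        for bucket, pattern in zip(buckets, patterns):
            if re.search(pattern, item):
                bucket.append(item)
                break  # like given/when: stop at the first match
    return buckets

# The returned list of lists is unpacked into three separate variables.
foo, bar, zap = classify(
    [r"foo", r"bar", r"zap"],
    ["foo1", "a bar", "zap!", "foobar", "none"],
)
```

Items that match no pattern are silently dropped here, which mirrors a given block with no default case.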

- Ken




Re: Continuations

2002-11-18 Thread Ken Fox
Damian Conway wrote:

my $iter = fibses();
for  $iter  {...}

(Careful with those single angles, Eugene!)


Operator  isn't legal when the grammar is expecting an
expression, right? The  must begin the circumfix  operator.

Is the grammar being weakened so that yacc can handle it? The
rule engine is still talked about, but sometimes I get the
feeling that people don't want to depend on it.

That  $iter  syntax reminds me too much of C++.

- Ken




Re: Continuations

2002-11-18 Thread Ken Fox
Damian Conway wrote:

Ken Fox wrote:

The  must begin the circumfix  operator.


Or the circumfix ... operator. Which is the problem here.


This is like playing poker with God. Assuming you can get over
the little hurdles of Free Will and Omniscience, there's still
the problem of Him pulling cards out of thin air.

What does the circumfix ... operator do? [1]

Here docs are re-syntaxed and the  introducer was stolen
for the ... operator? [2]


Yes. But since iterating an iterator to get another iterator that
is immediately iterated will (I sincerely hope!) be a very rare
requirement, I doubt it will be anything like the serious inconvenience
it is in C++.


True. I suppose even multi-dimensional data structures will
rarely be iterated over with a simple:

  for  $array  {
  }

Most people will probably want more control:

  for $array {
      for $_ {
      }
  }

Anyways, I was wondering about the general principle of using
C++ style hacks to make yacc happy. I should have known better
coming from the author of C++ Resyntaxed. Did the immodest
proposal fixsyntax? ;)

- Ken

[1] I can't google for . Anybody know if Google can add perl6
operators to their word lists? Seriously!

[2] Hmm. Will the uproar on here docs beat string concatenation?




Re: Continuations

2002-11-18 Thread Ken Fox
Damian Conway wrote:

It's [...] the ASCII synonym for the «...» operator, which
is a synonym for the qw/.../ operator.



Nope. Heredocs still start with .


Hey! Where'd *that* card come from? ;)

Seriously, that's a good trick. How does it work? What do these
examples do?

  print a b c;

  print a
  b
  c;
  a

Is it illegal now to use quotes in qw()?

- Ken




Re: String concatentation operator

2002-11-14 Thread Ken Fox
Andy Wardley wrote:


Can we overload + in Perl 6 to work as both numeric addition
and string concatenation ...


Isn't there some nifty Unicode operator perl6 could enlist? ;)

How about concatenating adjacent operands? ANSI C does this
with string constants and it works very well. It would become
one of those great Perl sound bites too: the community
couldn't decide on the operator, so perl6 left it out.

- Ken




Re: String concatentation operator

2002-11-14 Thread Ken Fox
Michael G Schwern wrote:

Before this starts up again, I hereby sentence all potential repliers to
first read:

string concatenation operator - please stop
http://archive.develooper.com/perl6-language;perl.org/msg06710.html


The bike shed thing is like Godwin's Law. Only I don't know
which side loses. ;)

Wasn't one of the main problems with Jarkko's juxtaposition
proposal that it would kill indirect objects? Have we chased
our tail on this subject after the colon became required for
indirect objects?

If the assignment variant of the invisible concatenation
operator could be solved, juxtaposition seems like a reasonable
approach. (Line ending juxtaposition problems could be fixed with
a special rule similar to the '} by itself' rule.)

- Ken




Keywords global or only in context?

2002-11-05 Thread Ken Fox
Me wrote:


YAK for marking something.


I've been assuming that a keyword will only have
meaning in contexts where the keyword is valid.
Given the shiny new top-down grammar system, there's
no requirement for keywords to be global. (Context
sensitive keywords fall out of Perl 6 grammars
naturally -- just the opposite of yacc.)

Is this a valid assumption?

What's the parse of the following code?

sub topic ($x is topic) {
   $x.topic
}

- Ken




Re: Supercomma! (was Re: UTF-8 and Unicode FAQ, demos)

2002-11-05 Thread Ken Fox
Jonathan Scott Duff wrote:


Um ... could we have a zip functor as well?  I think the common case
will be to pull N elements from each list rather than N from one, M
from another, etc.  So, in the spirit of timtowtdi:

	for zip(@a,@b,@c) -> $x,$y,$z { ... }


sub zip (\:ref repeat{1,}) {
   my $max = max(map { $_.length } @_);
   my $i = 0;
   while ($i < $max) {
       for (@_) {
           yield $_[$i]
       }
       ++$i
   }
   return ( )
}

That prototype syntax is probably obsolete, but I'm not sure
what the current proposal is. It might be better to force scalar
context on the args so that both arrays and array refs can be
zipped.

I really like the idea of using generic iterators instead of
special syntax. Sometimes it seems like we're discussing 6.x
instead of just 6.0.

This iterator is nice too:

sub pairs (\@a, \@b) {
   my $max = max(@a.length, @b.length);
   my $i = 0;
   while ($i < $max) {
       yield @a[$i] => @b[$i];
       ++$i
   }
   return ( )
}

for pairs (@a, @b) {
   print .x, .y
}
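
For comparison, here is a minimal sketch of the same two iterators as Python generators. The names zip_flat and pairs_of are made up for the illustration, and padding short lists with None (standing in for undef) is an assumption about what should happen when the lists differ in length.

```python
def zip_flat(*lists):
    """Yield element i of every list, then element i+1, padding short lists with None."""
    longest = max((len(lst) for lst in lists), default=0)
    for i in range(longest):
        for lst in lists:
            yield lst[i] if i < len(lst) else None

def pairs_of(a, b):
    """Yield (a[i], b[i]) tuples, padding the shorter list with None."""
    longest = max(len(a), len(b))
    for i in range(longest):
        left = a[i] if i < len(a) else None
        right = b[i] if i < len(b) else None
        yield (left, right)

flat = list(zip_flat([1, 2], [3, 4], [5]))
paired = list(pairs_of(["x", "y"], [10]))
```

A consumer that wants N values per iteration just pulls N items at a time from zip_flat, so no special syntax is needed for the common case.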

- Ken




Re: UTF-8 and Unicode FAQ, demos

2002-11-04 Thread Ken Fox
Damian Conway wrote:

Larry Wall wrote:

That suggests to me that the circumlocution could be *. 

A five character multiple symbol??? I guess that's the penalty for not
upgrading to something that can handle unicode.


Unless this is subtle humor, the Huffman encoding idea is getting
seriously out of hand. That 5 char ASCII sequence is *identically*
encoded when read by the human eye. Humans can probably type the 5
char sequence faster too. How does Unicode win here?

I know I'm just another sample point in a sea of samples, but
my embedded symbol parser seems optimized for alphabetic symbols.
The cool non-alphabetic Unicode symbols are beautiful to look at,
but they don't help me read or write faster. There are rare
exceptions (like grouping) where I strongly prefer non-alphabetics,
but otherwise alphabetics help me get past the "what is this code?"
phase and into the "what does this code do?" phase as quickly as
possible.

(I just noticed that all the non-alphabetic symbols (except '?')
in the previous paragraph are used for grouping. Weird.)

- Ken




Re: How to set your Windows keyboard to ¶erl-mode

2002-11-04 Thread Ken Fox
Austin Hastings wrote:


At this point, Meestaire ISO-phobic Amairecain Programmaire, you have
achieved keyboard parity with the average Swiss six-year-old child.


The question is not about being ISO-phobic or pro-English. **

The question is whether we want a pictographic language. I like
the size of the English alphabet. It produces fairly short words,
but the words are very robust (people can read words in all
orientations, backwards, upside down, in crazy fonts, hand-written,
etc.) This is the opposite of Huffman encoding, but just
as useful IMHO.

I've had the unpleasant job of turning math into software. Hand
written formulae can be very difficult to read because mathematics
worships Huffman encoding. Multiplication is specified by *nothing*.
Exponents are just written a bit smaller and a bit raised. Is this
what we want in the core?

Does anyone have any references for reading and comprehension
rates for different types of languages? I'm ignorant on the subject
and this seems like something a Perl programmer should know.

- Ken

** I'm probably both. ISO-phobic because I actually represented my
company on an ISO standard committee. Pro-English because it's what
I use -- being pro-English doesn't make me against everything else.
A language would have to be pretty bad to have its native speakers
advocate something else!




Re: How to set your Windows keyboard to ¶erl-mode

2002-11-04 Thread Ken Fox
Austin Hastings wrote:


The  and  ... are just as pictographic (or
not) as [ and ].


I'm not particularly fond of  or  either. ;) Damian just
wrote that he prefers non-alphabetic operators to help
differentiate nouns and verbs. I find it helpful when people
explain their biases like that. What's yours?


 They look the same from top or bottom, and are
unmistakable in direction when looked at from either side.


Well, anything can look like itself, that wasn't the point. The
goal is to not look like anything else in any orientation. The
chars O and 0 fail badly, but A and T are excellent. I'm not
sure where  and  fall because I don't have any experience
with them.

Programming languages probably get away with more because
most programmers don't spray paint algorithms on the side of
a bridge. (Well, Lisp programmers maybe. ;) My three points
against arbitrary punctuation as symbols are
 (1) it's impossible to identify symbol boundaries when
 reading punctuation -- you just have to guess,
 (2) it's harder to work with punctuation in non-digital
 communication, and
 (3) my memory doesn't work well on punctuation symbols!

Perl has some nice features like sigils that clue people in on
how to read a sentence. But...


difference between ' (apostrophe) and ` (tick)


is a horrible abomination. ;)


If every keyboard and operating system had the ability to simply
generate arbitrary expressions of the form (expr-a) ** (expr-b), ad
infinitum (a ** b ** c ** d ** e) then we'd be remiss not to use it.
But they can't, so we don't.


Non sequitur. Written language prior to the printing press had
no technological reason to limit alphabet size. Some languages
developed very large pictographic representations, others
developed small alphabets with word formation rules. I have no
idea what the design pressures were that caused these different
solutions. Do you? What are the strengths and weaknesses of the
approaches? Why should we select one over the other?

- Ken




Re: Blocks and semicolons

2002-09-12 Thread Ken Fox

Luke Palmer wrote:
 This requires infinite lookahead to parse.  Nobody likes infinite 
 lookahead grammars.

Perl already needs infinite lookahead. Anyways, most people
don't care whether a grammar is ambiguous or not -- if we did,
natural human languages would look very different.

People want expressive languages. (Some people even consider
ambiguity a feature. Poets? Comedians? Lawyers?)

I don't know how we should handle code blocks, but I do know
that the answer should solve human problems, not yacc's.

Perl 5 doesn't accept either of these loops:

   if ($_) { print "$_\n" } for (qw(1 0));
   print "$_\n" if ($_) for (qw(1 0));

The code must be written as:

   for (qw(1 0)) {
       print "$_\n" if ($_)
   }

Why won't a similar solution work for user-defined syntax in
Perl 6? (It would be no worse than Perl 5...)

- Ken




Re: Blocks and semicolons

2002-09-12 Thread Ken Fox

Luke Palmer wrote:
 On Thu, 12 Sep 2002, Ken Fox wrote:
  Perl already needs infinite lookahead.
 
 Really? Where?

Indirect objects need infinite lookahead and they are
in the core language. Hyper operators may need lookahead.
Place holders may need lookahead. User defined rules
will definitely need infinite lookahead. (Remember we
are switching from LR to LL parsing. LL(1) parsing is
not as powerful as LR(1), so Perl 6 will need lookahead
even in places where Perl 5 doesn't.)

 Computers don't like ambiguous grammars ...

The dangling else problem is ambiguous. Every time you
get a shift-reduce conflict in yacc, you have an ambiguous
grammar. Computers don't care about ambiguous grammars,
but some of the common tools (yacc) disambiguate whether
we like it or not. ;)

BTW, there are some parser generators that handle
ambiguous grammars -- they either support backtracking,
infinite lookahead, or simultaneously parse all possible
derivations. In the case of the simultaneous parse, they
can actually return multiple parse trees and let the
code generator decide how to interpret things.
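
To make the "multiple parse trees" idea concrete, here is a small Python sketch (not taken from any real parser generator) that enumerates every derivation of the deliberately ambiguous grammar E -> E '-' E | NUM. The function names parses and evaluate are hypothetical.

```python
def parses(tokens):
    """Return every parse tree for the ambiguous grammar  E -> E '-' E | NUM."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] != "-" else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok != "-":
            continue
        # Try this '-' as the top-level operator and combine all sub-parses.
        for left in parses(tokens[:i]):
            for right in parses(tokens[i + 1:]):
                trees.append((left, "-", right))
    return trees

def evaluate(tree):
    """Interpret one parse tree; a later stage chooses which tree to keep."""
    if isinstance(tree, str):
        return int(tree)
    left, _, right = tree
    return evaluate(left) - evaluate(right)

trees = parses(["1", "-", "2", "-", "3"])
values = sorted(evaluate(t) for t in trees)
```

The input 1 - 2 - 3 has two trees, one per grouping, and they evaluate differently. A tool like yacc picks one silently; a parser that returns both leaves the decision to whoever consumes the trees.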

 But in Perl 5, C<if> is not a sub.
 ...
 Because subs have to work as expressions.

In Perl 6 the difference between sub and syntax is
almost nonexistent. Some subs will behave like built-in
syntax, some subs will behave like normal function
calls. (AFAIK, the only difference is that subs can not
provide lazy argument evaluation. Maybe the "is rx"
property eliminates even that? BTW, does anybody else
find "is rx" funny? This is your argument. This is your
argument on drugs. (Rx is an abbreviation for drug
prescription in the U.S.))

I think Perl 5's solution for handling "if" can be
applied to Perl 6 subs that look like syntax. It might
not be an improvement over Perl 5, but at least it's
no worse.

A better solution may be to continue the parse until
we see if "for" is followed by a block. (That's really
hard to do with yacc, which might be why Perl 5 does
what it does.)

- Ken



Re: Hypothetical variables and scope

2002-09-08 Thread Ken Fox

Damian Conway wrote:
 Though leaving optimization in the hands of the programmer
 is generally a Bad Idea.

That doesn't sound like a Perl slogan.

 It's also a matter of syntactic consistency. It has to be := for
 inlined bindings (i.e. rx/ $name:=ident /) because otherwise
 we make = meta (which is *not* a good idea). So it probably should be
 := for explicit C<let>s as well.

If let only works on bindings, it really bites into the
expressiveness of the language. For example, the natural way
to skip text within a match is to do something like:

   / (\w+) \d+ (\w+) { let $1 _= $2; let $2 = undef } /

This feels natural too:

   / (\w+ \d+ \w+) { let $1 =~ s/\d+// } /

Binding might be really fast for some implementations of Perl,
but slow for others. Just like string eval may be impossible in
some, but trivial in others.

- Ken




Re: Suggestion for perl 6 regex syntax

2002-09-07 Thread Ken Fox

Mr. Nobody wrote:
 /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/
 
 would actually become longer:
 
 /^(<[+-]>?)<before \d|\.\d>\d*(\.\d*)?(<[Ee]>(<[+-]>?\d+))?$/

Your first expression uses capturing parens, but the captures
don't bind anything useful, so you should probably compare
non-capturing versions of the regex:

/^[+-]?(?=\d|\.\d)\d*(?:\.\d*)?(?:[Ee][+-]?\d+)?$/

vs

/^<[+-]>?<before \d|\.\d>\d*[\.\d*]?[<[Ee]><[+-]>?\d+]?$/

The <[Ee]> isn't the way I'd write it in Perl 6 -- I'd shift
into case-insensitive mode temporarily because those
hand-written [Cc][Aa][Ss][Ee] insensitive matches are hard
to read.

/^<[+-]>?<before \d|\.\d>\d*[\.\d*]?[:i e<[+-]>?\d+]?$/

Now Perl 6 is just 5 characters longer. That's a horrible
pattern to read though. Can Perl 6 fix that? I think so.

I'd change the <[+-]> fragments to use a sub-rule because
repeated constants make things harder to read. (Not so bad
in this case, but it's a good general rule -- and you're making
generalizations about regex syntax.)

/^<sign>?<before \d|\.\d>\d*[\.\d*]?[:i e<sign>?\d+]?$/

I'd put in some white space to clarify the different logical
pieces of the rule:

/^ <sign>? <before \d | \.\d>
   \d* [\.\d*]?
   [:i e <sign>? \d+]? $/

Now it's pretty obvious that the :i can be moved outside the
rule without screwing anything up. I'd rather have modifiers
affect the whole rule rather than remembering where they begin
and end inside it.

:i /^ <sign>? <before \d | \.\d>
      \d* [\.\d*]?
      [e <sign>? \d+]? $/

That's how I'd write your Perl 5 regex in Perl 6. (Well,
actually it's probably just /^ <number> $/, but would you
call that cheating? ;)

It does have more characters than the Perl 5 regex. Looking
at it another way, it has fewer symbols. It's faster to read.
How many times are you going to write it? How many times are
you going to read it?
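
The same readability steps (a named sign fragment, free white space, one case-insensitivity flag for the whole pattern) can be tried today with Python's re module. This is only a sketch of the idea in another regex dialect; the names SIGN, NUMBER and is_number are invented for the example.

```python
import re

SIGN = r"[+-]"  # the reusable "sign" fragment

NUMBER = re.compile(
    rf"""
    ^ {SIGN}? (?= \d | \.\d )
      \d* (?: \.\d* )?
      (?: e {SIGN}? \d+ )? $
    """,
    re.VERBOSE | re.IGNORECASE,  # free white space; one flag covers the whole pattern
)

def is_number(text):
    """True if text looks like a decimal number with an optional exponent."""
    return NUMBER.match(text) is not None

accepted = [s for s in ["1", "-1.5e10", ".5", "+3.", "1E-3"] if is_number(s)]
rejected = [s for s in ["", ".", "e5", "abc", "1e", "--1"] if not is_number(s)]
```

The pattern is longer in characters than the compact Perl 5 original, but each line corresponds to one logical piece: sign and lookahead, mantissa, exponent.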

When I was reading A5, I was concerned about character
classes too, but mostly because of the regex style that I
learned from the Friedl book:

   opening normal* ( special normal* )* closing

which can be used to match quoted strings for example:

   /"[^"\\]*(\\.[^"\\]*)*"/

The direct Perl 6 equivalent is not very pretty:

   / " <-["\\]>* [ \\. <-["\\]>* ]* " /

It's hard to come up with a good name for the character
class used there. not_a_quote_or_slash? special_char_in_quote?

I'm not concerned about it anymore because I think the
Perl 6 style will be:

   opening ( special :: | . )*? closing

The non-greedy match makes so many things easier to write
and the backtracking control prevents the special case
from accidentally matching the normal one.

I'd write the string match in Perl 6 like this:

   / " [ \\. :: | . ]*? " /

The only possible problem with this is non-greedy
iteration is slower. It doesn't have to be though -- and
the optimizations needed to get Perl 6 rules to match
full grammars should fix this.

If the pattern is rewritten as a grammar, we can
talk about first and follow sets.

   quoted_string:   " quoted_char_seq "
   quoted_char_seq: null
                  | quoted_char quoted_char_seq
   quoted_char:     \ any
                  | any

The reason non-greedy matching is slow is because the
rule quoted_char_seq can be empty, i.e. it always
matches the current spot. However, the follow set of
quoted_char_seq is the quote. That means the *only*
thing that can follow quoted_char_seq is a quote.
There's no point in returning (taking the null route)
unless the rule is looking at a quote. This reduces
backtracking tremendously.

The other problem would normally be in the conflict
between the first sets of quoted_char. The slash
character is also an any character, so if the slash
alternative is taken, the system has to prepare to
backtrack to the any alternative. The :: backtracking
control eliminates the backtracking point, so it's
impossible for an escape sequence to be re-parsed
as two separated characters.
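
The re-parsing hazard is easy to reproduce with Python's re module, which has no :: control. In the sketch below, UNROLLED is the Friedl-style pattern, NAIVE is the non-greedy alternation with nothing stopping the backtrack, and STRICT emulates the commit by making the second alternative refuse a backslash. All three names are invented for the example.

```python
import re

UNROLLED = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"')  # opening normal* (special normal*)* closing
NAIVE = re.compile(r'"(?:\\.|.)*?"')                # escape | any, free to backtrack
STRICT = re.compile(r'"(?:\\.|[^\\])*?"')           # escape | any-but-backslash

good = '"a\\"b"'   # the six characters  " a \ " b "
bad = '"abc\\"'    # ends in an escaped quote, so the string is unterminated

results = {
    name: (rx.fullmatch(good) is not None, rx.fullmatch(bad) is not None)
    for name, rx in [("unrolled", UNROLLED), ("naive", NAIVE), ("strict", STRICT)]
}
```

NAIVE accepts the unterminated string because, after the escape alternative fails to reach a closing quote, the engine retries the backslash as an ordinary character. Removing that retry is the job the :: control does in the Perl 6 version.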

Damian wrote several good examples of Perl 5 -> Perl 6
conversions. Take a look at E5 and experiment some
more. The built-in named rules may simplify a lot of
things too -- we're going to have a much richer library
than just \d, \w, etc.

- Ken




Re: Request for default rule modifiers in a grammar

2002-09-05 Thread Ken Fox

Damian Conway wrote:
 I would imagine that modifiers would be passed some
 kind of hierarchical representation of the rule
 they're modifying (i.e. a parse tree of it), and
 would be expected to manipulate that structure
 representation.

Excellent. Will there be an abstract syntax for tree
rewriting or is it Perl 6 all the way down?

This is really amazing stuff. I was expecting some
support for building mini languages on top of Perl 6,
but it really looks like Perl 6 is going to become
the de facto language prototyping tool. (Bye bye
yacc!) In many ways Perl 6 is more language neutral
than Parrot.

- Ken




Re: Hypothetical variables and scope

2002-09-05 Thread Ken Fox

Damian Conway wrote:
 Because what you do with a hypothetical has to be reversible.
 And binding is far more cheaply reversible than assignment.

Why not leave it in the language spec then? If it's too
hard to implement, then the first release of Perl 6 can
leave it out. Someday somebody might come up with an
implementation.

BTW, I thought the Apocalypses were moving us away from
the-tarball-defines-the-language. Am I wrong to think that
Perl 6 the spec may have more than Perl 6 the release?

- Ken




Re: Multimethod Dispatch

2002-09-04 Thread Ken Fox

David Wheeler wrote:
 Ah, yes, the same thing exists in Java. I remember, now.

I thought Java only has overloading?

Overloading is what C++ has. It is not the same as
multi-dispatch. The trouble with overloading is that the
compiler uses static (compile-time) type information to
select the overloaded method. This can create subtle
bugs when people try to re-use code by sub-classing.
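
Here is a toy Python sketch of the difference. The registry, the multimethod decorator and the dispatch function are all made up for illustration, and the search order is a simplification rather than a full most-specific-match algorithm. The point is only that the choice depends on the run-time classes of every argument, not on declared types.

```python
from itertools import product

_registry = {}

def multimethod(*types):
    """Register the decorated function for one combination of argument classes."""
    def register(fn):
        _registry[types] = fn
        return fn
    return register

def dispatch(*args):
    """Pick an implementation from the run-time classes of all arguments."""
    for combo in product(*(type(arg).__mro__ for arg in args)):
        fn = _registry.get(combo)
        if fn is not None:
            return fn(*args)
    raise TypeError("no applicable method")

class Shape:
    pass

class Circle(Shape):
    pass

@multimethod(Shape, Shape)
def overlap_generic(a, b):
    return "generic"

@multimethod(Circle, Circle)
def overlap_circles(a, b):
    return "circle/circle"

s: Shape = Circle()           # declared as the base class, really a Circle
result = dispatch(s, Circle())
```

With compile-time overloading, the declared type Shape would select the generic version; here the actual class of s selects the Circle/Circle version.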

- Ken




Re: regex args and interpolation

2002-09-04 Thread Ken Fox

David Whipp wrote:
 But can I use a non-constant date?

You didn't show us the iso_date rule.

 Obviously we could put the onus on the module writer to write super-flexible
 rules/grammars. But will there be an easy way to force interpolative context
 onto this type of regex-valued subroutine arg?

With this rule you need literal numbers:

   rule iso_date { $year:=(\d{4}) -
   $month:=(\d{2}) -
   $day:=(\d{2}) }

This rule is more flexible:

   rule iso_date { (<Perl.term>) - { let $year := eval $1 }
                   (<Perl.term>) - { let $month := eval $2 }
                   (<Perl.term>)   { let $day := eval $3 }
                   <($year =~ /^\d{4}$/ &&
                     $month =~ /^\d{2}$/ &&
                     $day =~ /^\d{2}$/)> }

The eval is very restrictive though -- it forces the terms to
have values at compile time and breaks lexical scoping. (Unless
eval can play games with %MY and somehow evaluate in the
caller's scope. Maybe caller.{MY}.eval?)

We can tidy that up a little with another rule:

   rule value { <Perl.term>
                { let $0 := caller(2).{MY}.eval($term) } }

   rule iso_date { $year:=<value> -
                   $month:=<value> -
                   $day:=<value>
                   <($year =~ /^\d{4}$/ &&
                     $month =~ /^\d{2}$/ &&
                     $day =~ /^\d{2}$/)> }

There's still the compile-time value problem though.

What is really needed is something that converts the date syntax
to normal Perl code:

   rule iso_date { (<Perl.term>) -
                   (<Perl.term>) -
                   (<Perl.term>)
   { use grammar Perl::AbstractSyntax;
 $0 := (expr (invoke 'new (class 'Date) $1 $2 $3))) }

This rule just generates code -- the $0 return result is spliced
into the syntax tree at compile time. All the valid value checking
is done at run-time in the Date object constructor. (Or maybe at
compile time if the compiler can figure out there aren't any
side-effects and the object is serializable. But that's a whole
different thread.) It's very efficient because it only parses the
source code once. All the compiler optimizations (like constant
folding) get applied to your special syntax. But you're writing
Parrot macros (IMCC on steroids maybe?) and not Perl!

Another approach is to delay the final parse until run-time. In
this case, the compiler runs the rule over the input source
code and records the matched text, but without running any of the
code blocks. When the sub is called at run-time, the matched text
is parsed by the rule again, but this time with the code blocks
enabled. The second rule above (using eval) would work fine this
way (as long as the %MY games are possible). The down-side to
this approach is that you have lots of string evals happening
at run-time.

Maybe all of these techniques will be possible?

   rule x is macro { ... }
   rule y is delayed { ... }
   rule z is syntax { ... }

- Ken




Re: Hypothetical variables and scope

2002-09-03 Thread Ken Fox

Peter Haworth wrote:
 Also the different operators used (:= inside the rule, = inside the code)
 seems a bit confusing to me; I can't see that they're really doing anything
 different:
 
  / $x := (gr\w+) /   vs   / (gr\w+) { let $x = $1 } /
 
 Shouldn't they both use C<:=>?

Depends on what you want. The $x := in the rule binds the
first match to $x -- it does not copy the value. The $x =
in the code block copies the value of the first match. You
can use either binding or assignment in the code block.

Both of them will be undone during backtracking. It's more
efficient to bind, but the copy guarantees changes to $x and $1
are independent.
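
A loose Python analogy, with lists standing in for Perl containers: giving a second name to the same object behaves like binding, while making a copy behaves like assignment. The variable names are invented, and hypothetical (undo-on-backtrack) semantics are not modelled here.

```python
match1 = ["green"]      # stands in for the capture $1 (a mutable container)

bound = match1          # like  $x := $1  -- a second name for the same container
copied = list(match1)   # like  $x = $1   -- an independent copy of the value

match1[0] = "grey"      # a later change to the capture...

# ...is visible through the bound name but not through the copy.
```

The binding costs nothing extra but shares every later change; the copy pays once up front and is then independent.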

- Ken



Re: Request for default rule modifiers in a grammar

2002-09-02 Thread Ken Fox

Damian Conway wrote:
  One possibility is that a modifier is
 implemented via a special class:
 
 my class Decomment is RULE::Modifier
is invoked(:decomment) {
 method SETUP ($data, $rule) {
 ...
 }
 # etc.
  }

I'm messing around with regex code generation by
converting first to a grammar. The modifiers seem
to need intimate knowledge of regex -> grammar
conversion. This may be a quirk of my approach.
People using tree traversal or generating code
directly from the regex might see something else.
I suspect modifiers will still be deeply connected
with the internals.

Is this why you're thinking of modifiers as
classes? So the modifier can guide the regex
engine (tree generator, code generator, etc.) by
hooking in at certain spots? (I don't envy the
job of standardizing those hooks -- it seems like
every modifier I've thought of needs a different
hook.)

One alternative I was thinking of is treating
modifiers as filters -- something like a tree
transformation language that runs after the core
has parsed a regex. I like filters because they
don't need much knowledge of the regex engine.
They're slower though, so maybe a combination of
the two would be best. Built-in modifiers like :w
and :i could be embedded in the core, but user
defined modifiers could be filters. (Some
modifiers are really just new rules in disguise.
Those are the easiest kind. For example :p5 could
just be an attribute checked by the regex rule:
   rule regex { <($p5)> :: <p5_regex> | <p6_regex> }
Oh. Wait. That should be 'rule' not 'regex'... ;)

BTW, what code is legal in a grammar block? Can
we introduce new lexicals? I'd like to write
something like this:

grammar foo {
   my $parsing_x;

   rule x { { let $parsing_x = true } ... }
   rule y { <($parsing_x)> ... | ... }
}

(Hypotheticals are so cool!)

It's no big deal to put the grammar state in an
outer block, but I like the idea of treating
grammars like classes.

- Ken




Request for default rule modifiers in a grammar

2002-09-01 Thread Ken Fox

The thing I'd like to do right now is turn on :w
for all rules. A Fortran grammar might want to turn
on :i for all rules.

Maybe add modifiers to the grammar declaration?

   grammar Fortran :i { ... }

It would also be convenient to allow the :w
modifier to have lexically scoped behavior so a
grammar can change the definition of a word. For
example, a C grammar might want to skip comments.

Would it be better to create a new (user-defined)
modifier instead of changing the meaning of :w?

BTW, how do we create user-defined modifiers?

- Ken




Re: Regex stuff...

2002-08-31 Thread Ken Fox

Piers Cawley wrote:
  Unless I'm very much mistaken, the order of execution will
 look like:
 
   $2:=$1; $1:=$2;

You're not binding $2:=$1. You're binding $2 to the first
capture. By default $1 is also bound to the first capture.

Assuming that numbered variables aren't special, the order
of execution is:

   $2:=$1:=first; $1:=$2:=second;

That doesn't make any sense though, so numbered variables
must be treated specially -- an explicit numbered binding
replaces the default numbered binding. So, the order of
execution is really:

   $2:=first; $1:=second;

I think this solves both of your puzzles.

One last thing though. Binding might be done at compile-time
because it changes variables, not the values of variables.
Thinking about binding as a compile-time declaration might
be easier than thinking about run-time execution order.

Thinking about binding as a compile-time thing, the rule

   / $2:=(\S+) lt= $1:=(\S+) /

becomes

   / begin $2[\S+]end $2 lt= begin $1[\S+]end $1 /

- Ken




Re: @array = %hash

2002-08-31 Thread Ken Fox

Simon Cozens wrote:
 Damian Conway writes:
 
  %hash4 = ("Something", "mixing", pairs => and, "scalars");

That's perfectly okay (except you forgot the quotes around the "and"
and you have an odd number of elements initializing the hash).
 
 Urgh, no. Either a pair is an atomic entity or it isn't. Which?

"Odd" meaning "not correct"... When initializing the hash, the pair
comes off as a single element. That leaves "scalars" as a key without
a value. So there's an even, but insufficient, number of elements
initializing the hash.
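
One way to picture that reading, sketched in Python with a 2-tuple standing in for a pair: a pair is consumed as one element, and plain scalars are consumed two at a time as key and value. The function name hash_from_list and the choice to raise an error for a dangling key are assumptions, not anything the Perl 6 design has settled.

```python
def hash_from_list(items):
    """Build a dict from a flat list in which a 2-tuple counts as one ready-made pair."""
    result = {}
    it = iter(items)
    for item in it:
        if isinstance(item, tuple) and len(item) == 2:
            result[item[0]] = item[1]      # a pair is a single, atomic element
            continue
        try:
            value = next(it)               # a plain scalar key needs a following value
        except StopIteration:
            raise ValueError(f"key {item!r} has no value") from None
        result[item] = value
    return result

ok = hash_from_list(["Something", "mixing", ("pairs", "and")])
```

Appending "scalars" to that list leaves a key with nothing after it, which is the insufficient case described above.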

- Ken




Re: atomicness and \n

2002-08-31 Thread Ken Fox

Damian Conway wrote:
 No. It will be equivalent to:
 
   <[\x0a\x0d...]>

I don't think \n can be a character class because it
is a two-character sequence on some systems. Apoc 5
said \n will be the same everywhere, so won't it be
something like

   rule \n { \x0d \x0a | \x0d | \x0a }

Hmm. Now that I read that, I'm thinking some characters
will be multi-byte sequences. Is there going to be
multi-byte magic for line endings? Even in ASCII data
streams?

- Ken




Re: backtracking into { code }

2002-08-30 Thread Ken Fox

Damian Conway wrote:
 rule expr1 {
     <term> { m:cont/<operators>/ or fail } <term>
 }
 
 Backtracking would just step back over the rule as if it were atomic
 (or followed by a colon).

Ok, thanks. (The "followed by a colon" is just to explain the behavior,
right? It's illegal to follow a code block with a colon, isn't it?)

After talking to Aaron yesterday, I was wondering if sub-rules are
meant to backtrack at all.

Does the following example backtrack into foo?

   rule foo { b+ }
   rule bar { a <foo> b }

If it doesn't, I think we're restricted to pure LL(1) grammars.
That would suck. Apoc 5 is so close to ANTLR or precc that it would
be a shame not to be LL(k).

I've been playing around with converting regex operators into
Perl 6 attribute grammars and they look good. If backtracking into
rules doesn't work though, these conversions don't work.

   a ?      rule x { a | <null> }
   a *      rule x { a <x> | <null> }
   a +      rule x($i=0) { a <x($i+1)> | <($i>0)> <null> }
   a {n,m}  rule x($i=0) { <($i<$m)> a <x($i+1)> | <($i>=$n)> <null> }

The non-greedy versions just reverse the productions, so for example

   a *?     rule x { <null> | a <x> }
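
A minimal sketch of why these conversions depend on backtracking into
rules, written in Python rather than Perl 6. Everything here (lit, seq,
alt, atomic, star, lazy_star) is a made-up helper for illustration, not
part of any Apocalypse. Each rule is a generator that yields every end
position it can reach, so a caller can ask it for another answer.

```python
# Each rule is a generator yielding every end position it can reach,
# in preference order, so the caller can backtrack into it.

def lit(ch):
    def rule(text, pos):
        if text.startswith(ch, pos):
            yield pos + len(ch)
    return rule

def null(text, pos):
    yield pos                      # matches the empty string

def seq(*rules):
    def rule(text, pos):
        if not rules:
            yield pos
            return
        rest = seq(*rules[1:])
        for mid in rules[0](text, pos):
            yield from rest(text, mid)
    return rule

def alt(*rules):
    def rule(text, pos):
        for r in rules:
            yield from r(text, pos)
    return rule

def atomic(inner):
    """Keep only the first result: a rule that cannot be backtracked into."""
    def rule(text, pos):
        for end in inner(text, pos):
            yield end
            return
    return rule

a = lit("a")

def star(text, pos):               # a*   as   x -> a x | null
    yield from alt(seq(a, star), null)(text, pos)

def lazy_star(text, pos):          # a*?  as   x -> null | a x
    yield from alt(null, seq(a, lazy_star))(text, pos)

greedy_ends = list(star("aaa", 0))
lazy_ends = list(lazy_star("aaa", 0))
with_backtracking = list(seq(star, a)("aaa", 0))
without_backtracking = list(seq(atomic(star), a)("aaa", 0))
print(greedy_ends, lazy_ends, with_backtracking, without_backtracking)
```

The greedy rule offers the longest match first and the lazy rule the
shortest. Wrapping the sub-rule in atomic() makes "a* a" fail on "aaa",
which is the failure mode the conversions would hit if sub-rules could
not be re-entered.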

- Ken




Re: backtracking into { code }

2002-08-30 Thread Ken Fox

Larry Wall wrote:
 On Fri, 30 Aug 2002, Ken Fox wrote:
 : Ok, thanks. (The "followed by a colon" is just to explain the behavior,
 : right? It's illegal to follow a code block with a colon, isn't it?)
 
 I don't see why it should be illegal--it could be useful if the closure
 has played continuation games of some sort to get backtracking.

Apoc 5 has "It is an error to use : on any atom that does no
backtracking." Code blocks don't backtrack (at least that's what
I understood Damian to say). Are zero-width atoms treated specially?

And can you give me an example of a "continuation game"? That sounds
sort of like my original question.

Great news about backtracking into sub-rules. Perl 6 is going to
be a lovely system to work with. I think it's going to suffer a bit
from the same declarative-face vs procedural-heart** that Prolog
does, but it hits the little language target perfectly.

- Ken

** Prolog uses a cut (!) operator to control backtracking just like
Perl 6. A big problem (at least for me...) is learning when ! just
makes things run faster vs. when ! gives me the wrong answer. Maybe
I just haven't used Prolog enough to get my brain wrapped around it.




Re: backtracking into { code }

2002-08-30 Thread Ken Fox

Larry Wall wrote:
 There's a famous book called "Golf is Not a Game of Perfect."

Well now I'm *totally* confused. I looked that up on Amazon
and it has something to do with clubs and grass and stuff. That's
completely different than what I thought golfing was. ;)

Seriously, though. I have a positive and confident outlook that
Perl 6 will be a lovely system for hacking grammars.

- Ken




backtracking into { code }

2002-08-29 Thread Ken Fox

A question: Do rules matched in a { code } block set backtrack points for
the outer rule? For example, are these rules equivalent?

  rule expr1 {
    <term> { /<operators>/ or fail } <term>
  }

  rule expr2 {
    <term> <operators> <term>
  }

And a comment: It would be nice to have procedural control over
backtracking so that { code } can fail, succeed (not fail), or succeed and
commit. Right now we can follow { code } with ::, :::, etc. but that does
not allow much control. I'm a little afraid of what happens in an LL(Inf)
grammar if backtracking states aren't aggressively pruned.

- Ken



Re: backtracking into { code }

2002-08-29 Thread Ken Fox

Aaron Sherman wrote:
 rule { <term> { /<operators>/.commit(1) or fail } <term> }
 
 The hypothetical commit() method being one that would take a number and

That would only be useful if the outer rule can backtrack into the
inner /operators/ rule. Can it?

I agree with you that a commit method would be useful -- especially when
used on $self. I'd probably write your example as:

  rule { <term> { m/<operators> { $self.commit(1) }/ or fail } <term> }

which is of course just a complicated

  rule { <term> { m/<operators> :/ or fail } <term> }

BTW, why isn't fail a method? Then a rule could pass itself to a sub-rule
and allow the sub-rule to fail its parent, but not the entire match. Isn't
failing just invoking the last continuation on the backtrack stack?

- Ken



Re: Light ideas

2002-08-03 Thread Ken Fox

Dave Storrs wrote:
 why didn't you have to write:
 
   rule ugly_c_comment {
       /
       \/ \*  [ .*? <ugly_c_comment>? ]*?  \* \/
       { let $0 := " " }
       /
   }

Think of the curly braces as the regex quotes. If { is the quote
then there's nothing special about / and it doesn't need to be
escaped. Also, I don't think you want spaces between / and *
because / * isn't a comment delimiter.

 2) As written, I believe that the ugly_c_comment rule would permit nested
 comments (that is, /* /**/ */), but would break if the comments were
 improperly nested (e.g., /* /* */).  Is that correct?

It wouldn't fail, but it would scan to EOF and then backtrack.
Basically the inner ugly_c_comment succeeds and then the rest
of the file is scanned for '*/'. When that fails, the regex
backtracks to the inner ugly_c_comment, fails that and then
skips the unbalanced /* with .*?. I'd like to add ::: to fail
the entire comment if the inner comment fails, but I'm not sure
how to do it. Does this work?

   /\* [ .*? | <ugly_c_comment> ::: ]*? \*/

 3) The rule will replace the comment with a single, literal space.  Why is
 this replacement necessary...isn't it sufficient to simply define it as
 whitespace, as was done above?

Probably. I think it's a hold-over from thinking of parser vs lexer,
but that may not be true depending on how the rest of the grammar
uses white space. IMHO the value bound to the white space production
should be the actual text (the comment in this case).

- Ken




Rebinding can change type? [was: Static Values and Variable Bindings]

2001-11-02 Thread Ken Fox

Garrett Goebel wrote:
 Just does compile-time typing for $foo? Not inlining the constant?

You can't assume that the value associated with the symbol is
the same each time through the code, so how can it be inlined?

 I was thinking lowercase typed variables couldn't be rebound, because
 they were compile-time optimized... Can they? Or are we back to the
 selective use of yet to be named pragmas?

Binding normally means associating a value with a symbol, so binding
to a different type depends upon whether the type information is
associated with the symbol or the value.

I can't recall what Perl 6 does. I suspect that it allows binding
to change types because binding is supposed to replace messing with
globs.

This code should work, yes?

  my int $foo;

  ... $foo is a tiny little int

  { my $bar; $foo := $bar }

  ... $foo is a big hulking scalar

Why would sticking const on $foo change anything?

- Ken



Re: Static Values and Variable Bindings [was RE: Perl 6 - Cheerleader s?]

2001-11-01 Thread Ken Fox

Garrett Goebel wrote:
 worried about the loss of data-hiding with Perl6's lexicals.
 Was that ever resolved?

Here in the 10-step Perl 6 program we don't talk about
resolution. We just learn to cope with change. ;)

There were two issues I had. As a Perl 6 user I felt
uncomfortable that Perl 6 is encouraging people to violate
lexical scoping. The other was that it seemed impossible
to compile any code if a caller at run-time can yank
the lexical carpet out from underneath a sub.

Happily I've learned to cope with both. I think in serious
code people won't do it -- there will probably be pragmas
to disable run-time mucking with lexicals. Same thing for
compilation. Some Perl 6 features just might not work when
compiled to JVM or .Net or native code. (I think Java
Python is proof that this isn't a huge problem in
practice.)

Re-defining constants is a similar thing and I have
similar reservations. It might make reading programs
harder. It will definitely hurt compilation. The trouble
will be that the compiler can't inline a constant. If
it can't inline the constant, then it won't be able to
do constant folding, dead code elimination and a whole
bunch of other cool stuff.

On the other hand, people live with C's preprocessor
and its #undef/#define of constants. If C programmers
don't mind having different parts of a program compiled
with different values for the same constant, then why
should Perl programmers? ;)

- Ken



Re: What's up with %MY?

2001-09-06 Thread Ken Fox

Dan Sugalski wrote:
 I think you're also overestimating the freakout factor.

Probably. I'm not really worried about surprising programmers
when they debug their code. Most of the time they've requested
the surprise and will at least have a tiny clue about what
happened.

I'm worried a little about building features with global effects.
Part of Perl 6 is elimination of action-at-a-distance, but now
we're building the swiss-army-knife-of-action-at-a-distance.

What worries me the most is that allowing %MY to change at run-time
slows down code that doesn't do it. Maybe we can figure out how
to reduce the impact, but that's time IMHO better spent making
existing code run faster.

You wrote on perl6-internals:

   get_lex P1, $x  # Find $x
   get_type I0, P1 # Get $x's type

   [ loop using P1 and I0 ]

That code isn't safe! If %MY is changed at run-time, the
type and location of $x will change. You really need to put
the code for finding $x *inside the loop*.

Maybe we can detect a few cases when it's safe to move
get_lex out of a loop, but if the loop calls any subs or
non-core ops we're stuck.

- Ken



Re: What's up with %MY?

2001-09-06 Thread Ken Fox

Dan Sugalski wrote:
 At 02:05 PM 9/6/2001 -0400, Ken Fox wrote:
 You wrote on perl6-internals:
 
 get_lex P1, $x  # Find $x
 get_type I0, P1 # Get $x's type
 
 [ loop using P1 and I0 ]
 
 That code isn't safe! If %MY is changed at run-time, the
 type and location of $x will change. You really need to put
 the code for finding $x *inside the loop*.
 
 Only if $x is active. (i.e. tied) In that case we need to do some other
 things as well. I was assuming the passive case, for which the code was
 valid since there wasn't any way for it to be changed.

Could you compile the following for us with the assumption that
g() does not change its caller?

  sub f {
    my $sum = 0;
    for (0..9) {
      $sum += g()
    }
    $sum
  }

Now what if g() is:

  sub g {
    my $parent = caller().{MY};
    my $zero = 0;
    $parent{'$sum'} = \$zero;
    1
  }

What if g() *appears* to be safe when perl compiles the loop, but
later on somebody replaces its definition with the scope-changing
one? Does perl go back and re-compile the loop?
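
A rough Python model of the hazard, under loudly stated assumptions: the
"pad" dict stands in for %MY, a Cell stands in for the storage behind one
variable, and the lookup in the second version is assumed to happen after
g() returns. None of these names come from Parrot or Perl 6. The point is
only that hoisting the lookup out of the loop and repeating it inside the
loop give different answers once a callee rebinds the name.

```python
class Cell:
    """A mutable box standing in for the storage behind one variable."""
    def __init__(self, value):
        self.value = value

def g(caller_pad):
    # Rebinds the caller's '$sum' to a fresh cell holding 0.
    caller_pad["$sum"] = Cell(0)
    return 1

def f_hoisted():
    pad = {"$sum": Cell(0)}
    total = pad["$sum"]              # lookup done once, before the loop
    for _ in range(10):
        total.value += g(pad)        # keeps updating the original cell
    # What the name '$sum' now refers to, and what the hoisted cell holds.
    return pad["$sum"].value, total.value

def f_lookup_each_time():
    pad = {"$sum": Cell(0)}
    for _ in range(10):
        increment = g(pad)
        pad["$sum"].value += increment   # lookup repeated after every call
    return pad["$sum"].value

print(f_hoisted())            # (0, 10)
print(f_lookup_each_time())   # 1
```

In the hoisted version the loop faithfully adds up 10 in a cell that the
name no longer refers to. Whichever result is "right" is a language
question; the sketch only shows that the two compilation strategies are
not interchangeable.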

The compiler could watch for uses of %MY, but I bet that most
modules will eventually use %MY to export symbols. Can the
compiler tell the difference between run-time and compile-time
usage of %MY?

 Now, granted, it might be such that a single "uses string eval" or "uses
 MY" in the program shuts down optimization the same way that $& kills RE
 performance in perl 5, but we are in the position of tracking that.

To quote Timon: "And you're okay with that?"

- Ken



Re: What's up with %MY?

2001-09-06 Thread Ken Fox

Dan Sugalski wrote:
 On the other hand, if we put the address of the lexical's PMC into a
 register, it doesn't matter if someone messes with it, since they'll be
 messing with the same PMC, and thus every time we fetch its value we'll Do
 The Right Thing.

Hmm. Shouldn't re-binding affect only the *variable* and not
the value bound to the variable? Maybe I misunderstand a PMC, but
if the PMC represents a value, then re-binding a lexical should
create a new PMC and bind it to the variable.

I think we have a language question... What should the following
print?

  my $x = 1;
  my $y = \$x;
  my $z = 2;
  %MY::{'$x'} = \$z;
  $z = 3;
  print "$x, $$y, $z\n"

a. 2, 1, 3
b. 2, 2, 3
c. 3, 1, 3
d. 3, 3, 3
e. exception: not enough Gnomes

I think I would expect behavior (c), but it's not obvious to me.
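
Here is one way to arrive at (c), sketched in Python under the assumption
that re-binding changes which storage a name points at and leaves the old
storage alone. The pad dict and the one-element lists are illustrative
stand-ins, not a description of how PMCs work.

```python
# Each variable's storage is a one-element list, so names and storage
# can be rebound independently. The pad maps names to storage.
pad = {}
pad["$x"] = [1]              # my $x = 1;
y = pad["$x"]                # my $y = \$x;  (a reference to that storage)
pad["$z"] = [2]              # my $z = 2;
pad["$x"] = pad["$z"]        # %MY::{'$x'} = \$z;  rebinds the name only
pad["$z"][0] = 3             # $z = 3;

result = (pad["$x"][0], y[0], pad["$z"][0])
print(result)                # (3, 1, 3), i.e. answer (c)
```

The reference taken before the re-binding still sees the old storage,
while the name $x now shares storage with $z.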

Anyways, it looks like you just reached the same conclusion I have: we
can't shadow a named variable in a non-PMC register. This might have
a surprising effect on the speed of

  foreach (1..10)

vs.

  foreach my $i (1..10)

- Ken



Re: What's up with %MY?

2001-09-06 Thread Ken Fox

Bryan C. Warnock wrote:
 Generically speaking, modules aren't going to be running amok and making a
 mess of your current lexical scope - they'll be introducing, possibly
 repointing, and then possibly deleting specific symbols

How much do you want to pay for this feature? 10% slower code? 50%
slower? Do you want the feature at any price?

I don't like run-time frobbing of the symbol table. Not even
precise tweaking. ;) I think it's in bad style and inconsistent with
the purpose of lexicals. *But* bad style isn't a good argument
and I wouldn't be pursuing this if it were just a style issue.

The trouble lies in running the code. Lexicals used to be known at
compile time. Now they can change practically anywhere. It's like
using C and having *everything* be volatile. Except worse because
you don't even know the address where something is going to be.

A simple solution would be to allow lexical scope editing at
compile time, but not run-time. Change a BEGIN block's caller() so
that it is the scope being compiled instead of main. This achieves
the majority of the benefits (lexical imports at compile time)
without any downside.

There are two other things that are easy to add. If the
compiler knew in advance which lexicals might dynamically change,
it could generate better code and not slow everything down. A
trait called :volatile or something. IMHO this would also
show intent to the people reading the code that something funny
might happen to the variable. (Macros or compile-time injected
lexicals could declare the :volatile trait, so I would imagine
that some pretty interesting packages could still be written.)

The other thing is to permit attaching attributes to a
lexical scope. This allows undetected channels of communication
between callees. There are interesting things that could be
used for (carrying state between function calls in the same
scope or simple thread-local storage). And it wouldn't impact
compiled code at all.

- Ken



Re: What's up with %MY?

2001-09-05 Thread Ken Fox

[EMAIL PROTECTED] wrote:
 Clearly caller() isn't what we want here, but I'm not
 quite sure what would be the correct incantation.

I've always assumed that a BEGIN block's caller() will
be the compiler. This makes it easy for the compiler to
lie about %MY:: and use the lexical scope being compiled
instead of the compiler's lexical scope.

- Ken



Re: What's up with %MY?

2001-09-04 Thread Ken Fox

Damian Conway wrote:
 It would seem *very* odd to allow every symbol table *except*
 %MY:: to be accessed at run-time.

Well, yeah, that's true. How about we make it really
simple and don't allow any modifications at run-time to
any symbol table?

Somehow I get the feeling that "*very* odd" can't be
fixed by making the system more normal. ;)

 Is stuff like:

   %MY::{'$lexical_var'} = \$other_var;

 supposed to be a compile-time or run-time feature?
 
 Run-time.

A definition of run-time would help too since we have
things like BEGIN blocks. I consider it run-time if the
compiler has already built a symbol table and finished
compiling code for a given scope. Is that an acceptable
definition of run-time? This allows BEGIN blocks to
modify their caller's symbol tables even if we prohibit
changes at run-time.

Can we have an example of why you want run-time
symbol table manipulation? Aliases are interesting,
but symbol table aliases don't seem very friendly.
It would be simple to write:

  %MY::{'@point'} = [ $x, $y ];

But that probably won't work and using [ \$x, \$y ]
doesn't make sense either. What seems necessary is:

  %MY::{'$x'} = \$point[0];
  %MY::{'$y'} = \$point[1];

If the alias gets more complicated, I'm not sure the
symbol table approach works well at all.

 Modifying the caller's environment:
 
   $lexscope = caller().{MY};
   $lexscope{'die'} = die_hard;

This only modifies the caller's scope? It doesn't modify
all instances of the caller's scope, right? For example,
if I have a counter generator, and one of the generated
closures somehow has its symbol table modified, only that
*one* closure is affected even though all the closures
were cloned from the same symbol table.
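
The per-closure behavior being asked about can be sketched in Python,
where each closure keeps its captured variables in its own cells. Writing
to one closure's cell (possible since Python 3.7) is used here as a
stand-in for modifying that closure's symbol table; make_counter is a
hypothetical example, and this shows one possible answer rather than what
Perl 6 does.

```python
def make_counter():
    count = 0
    def counter():
        nonlocal count
        count += 1
        return count
    return counter

first = make_counter()
second = make_counter()

# Change the captured `count` of `first` only.
first.__closure__[0].cell_contents = 100

after_edit = first()     # continues from the modified value
untouched = second()     # has its own `count`, so it starts from zero
print(after_edit, untouched)   # 101 1
```

Both counters were produced by the same code, but each call to
make_counter created a separate environment, so the edit is visible in
only one of them.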

What about if the symbol doesn't exist in the caller's scope
and the caller is not in the process of being compiled? Can
the new symbol be ignored since there obviously isn't any
code in the caller's scope referring to a lexical with that
name?

 Between source filters and Inline I can do pretty much whatever I like
 to your lexicals without your knowledge. ;-)

Those seem more obvious. There will be a use declaration
I wrote and I already know that use can have side-effects on
my current name space. IMHO this could become a significant problem
as we continue to make Perl more expressive. Macros, filters,
self-modifying code, mini-languages ... they all make expressing
a solution easier, and auditing code harder. Do we favor
expression too much over verification? I'm not qualified to
answer because I know I'm biased towards expression. (I'm raising
the %MY issues mostly because of performance potential.)

 I would envisage that mucking about with symbol tables would be no more
 common in Perl 6 than it is in Perl 5. But I certainly wouldn't want to
 restrict the ability to do so.

We also want Perl 6 to be fast and cleanly implemented.

This particular issue is causing trouble because it has a big
impact on local variable analysis -- which then causes problems
with optimization. I'd hate to see lots of pragmas for turning
features on/off because it seems like we'll end up with a more
fragmented language that way.

 How am I expected to produce fresh wonders if you won't let me warp the
 (new) laws of the Perl universe to my needs?

You constantly amaze me and everyone else. That's never
been a problem.

One of the things that I haven't been seeing is the exchange
of ideas between the implementation side and the language side.
I've been away for a while, so maybe it's just me.

It vaguely worries me though that we'll be so far down the
language side when implementation troubles arise that it will
be hard to change the language. Are we going to end up with
hacks in the language because certain Very Cool And Important
Features turned out too hard to implement?

- Ken



Re: What's up with %MY?

2001-09-04 Thread Ken Fox

Damian wrote:
 Dan wept:
 I knew there was something bugging me about this.
 
 Allowing lexically scoped subs to spring into existence (and
 variables, for that matter) will probably slow down sub and
 variable access, since we can't safely resolve at compile time what
 variable or sub is being accessed. 
 
 Understood. And that's why you get the big bucks. ;-)

Efficiency is a real issue! I've got 30,000 lines of *.pm in my
latest application -- another 40,000 come from CPAN. The lines
of code run a good deal less, but it's still a pretty big chunk
of Perl.

The thought of my app suddenly running slower (possibly *much*
slower after seeing the semantics of Perl 6 lexicals) doesn't
make me real happy. IMHO it would fork the language, even if
the fork was only done with pragmas.

- Ken



Re: Prototypes

2001-09-03 Thread Ken Fox

Bryan C. Warnock wrote:
 {
 my $a = sub ($$) { code };
 gork($a);
 }
 
 sub gork {
 my ($a) = shift;
 $a->(@some_list);  # <- Here
 }
 
 The reason prototypes aren't checked at "Here" is because there really
 isn't a way to know what the prototype was.

Um, that's not true. ML can do stuff like that -- all automatically and
without any type declarations.

What happens is the type of gork's $a is determined, which cascades
to the type of gork's $_[0], which cascades to your first block's $a.
ML even has polymorphic functions where the output type depends on the
input type.
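
A toy illustration of that cascade, in Python. This is a bare-bones
unification sketch and not ML's actual algorithm: it has no occurs check
and no polymorphism, and TypeVar, resolve, unify and the ("fn", args,
result) tuples are all invented for the example.

```python
class TypeVar:
    """An unknown type that can later be bound to something concrete."""
    def __init__(self, name):
        self.name = name
        self.bound = None

def resolve(t):
    # Follow bindings until reaching a concrete type or an unbound variable.
    while isinstance(t, TypeVar) and t.bound is not None:
        t = t.bound
    return t

def unify(a, b):
    a, b = resolve(a), resolve(b)
    if a is b:
        return
    if isinstance(a, TypeVar):
        a.bound = b
    elif isinstance(b, TypeVar):
        b.bound = a
    elif isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            unify(x, y)
    elif a != b:
        raise TypeError(f"cannot unify {a} with {b}")

# my $a = sub ($$) { code };
fn_type = ("fn", ("Scalar", "Scalar"), "Any")
block_a = TypeVar("$a in the block")
unify(block_a, fn_type)

# sub gork { my ($a) = shift; ... }
gork_arg = TypeVar("$_[0] in gork")
gork_a = TypeVar("$a in gork")
unify(gork_a, gork_arg)

# gork($a);
unify(gork_arg, block_a)

inferred = resolve(gork_a)
print(inferred)
```

After the three constraints are recorded, the type of gork's $a resolves
to the two-scalar function type, so the prototype is known at the call
site without any declaration inside gork.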

It is possible. It's just a question of whether we want to do it.

- Ken



What's up with %MY?

2001-09-03 Thread Ken Fox

I haven't seen details in an Apocalypse, but Damian's
Perl 6 overview has a bit about it. The Apocalypse
specifically mentions *compile-time* scope management,
but Damian is, uh, Damian. (DWIMery obviously. ;)

Is stuff like:

  %MY::{'$lexical_var'} = \$other_var;

supposed to be a compile-time or run-time feature?

Modifying the caller's environment:

  $lexscope = caller().{MY};
  $lexscope{'die'} = die_hard;

is especially annoying because it means that I can't
trust lexical variables anymore. The one good thing
about Damian's caller() example is that it appears
in an import() function. That implies compile-time,
but isn't as clear as Larry's Apocalypse.

This feature has significant impact on all parts of
the implementation, so it would be nice if a little
more was known. A basic question: how much performance
is this feature worth?

- Ken



Re: !< and !>

2001-09-02 Thread Ken Fox

Bryan C. Warnock wrote:
 I'm waiting for someone to say that in tri-state logic, '!<' != '>='

That's what I thought it was. $a !< $b might be !defined($a) || $a >= $b.
In SQL this is $a IS NULL or $a >= $b.

- Ken



Re: Will subroutine signatures apply to methods in Perl6

2001-09-01 Thread Ken Fox

Uri Guttman wrote:
[Re: use strict 'typing'; my $rex = new Dog; $rex.bark]
 then it should be a compile time error at the assignment to $rex
 and not later. you can't trace $rex at compile time to see what
 kind of object (if any) was assigned to it. so the illegal method
 call can't (easily) be detected at compile time. it has to be a
 runtime error.

I agree with this comment, but I think the approach has serious
usability problems. From my experience with using const in C++,
it looks like this pragma will be *very* difficult to use.
Perl code also tends to be highly generic and polymorphic.

Wouldn't it be better to handle strict typing as a warning in
the places where the type information isn't known? In the Dog
example, my $rex = new Dog would generate a warning, unless
Dog::new was typed.

Also, I think it would be excellent to have an assumptions
pragma to complement strict typing. If the Dog package does not
declare type information, I could write an assumption to quiet
the strict type warnings. If the assumption is false (maybe
the author of Dog adds type declarations someday), then an
invalid assumption error should occur at compile time.

- Ken



Re: explicitly declare closures???

2001-08-28 Thread Ken Fox

Dave Mitchell wrote:
 The whole point is that closed variables *aren't* 'just local variables'.
 The inner $x's in the following 2 lines are vastly different:
 
 sub foo { my $x= ... { $x } }
 sub foo { my $x= ... sub { $x } }

You really need to learn what a closure is. There's a very nice book
called "Structure and Interpretation of Computer Programs" that can
give you a deep understanding. **

Anyways, it's important to understand that closures do not change the
scoping rules. A closure simply *captures* an existing environment.
If the environment isn't captured properly or changes after capture,
then you have buggy closures.

 causes the behaviour to change is that the middle $x implicitly gives
 foo() a copy of $x at compile time. When the anon sub is cloned,
 it picks up the current value of foo()'s $x. Without the middle $x, the
 cloned sub picks up the outer $x instead.

You're speaking in Perl implementation terms. I've already told you
that if Perl acts the way you say it does, then Perl has buggy
closures. You don't need to explain a bug to know that one exists!

On page 260 in Programming Perl (3rd ed), Larry/Tom/Jon talk about
how Perl's closures (should) behave:

  In other words, you are guaranteed to get the same copy of
  a lexical variable each time ...

IMHO bugs in Perl 5 shouldn't carry over to Perl 6. (Unless, of course,
we *like* the bugs... ;)

- Ken

** Unfortunately the term "closure" has two important meanings that
are not really related. We're talking about closing a subroutine's
environment, which is not how SICP uses the word. If you want a
"Closures For 21 Dummies" sort of book, this is not it.



Re: Expunge implicit @_ passing

2001-08-27 Thread Ken Fox

Michael G Schwern wrote:
 I can't think of any reason why this feature is useful anymore, and it
 can be a really confusing behavior, so what say we kill it in Perl 6?

I've always thought it was pretty useful for implementing generic
redirectors. I wrote a frame system that allows instances to over-ride
class methods. The basic idea is

  sub foo {
    my $method = $_[0]{_foo} || $_[0]->can("_foo");
    &{$method};
  }

The only thing I'd like to change is to make foo a tail call instead
of a normal function call. But I guess that would *really* confuse
people.
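
For comparison, the same redirector idea in Python, which has no implicit
argument passing, so the forwarding has to be spelled out with *args. The
Frame class and its slots dict are hypothetical names for this sketch and
are not the frame system described above.

```python
class Frame:
    def __init__(self):
        self.slots = {}      # per-instance overrides, like $_[0]{_foo}

    def _foo(self, *args):
        return ("class", args)

    def foo(self, *args):
        # Prefer an instance-level override, else fall back to the class method.
        method = self.slots.get("_foo") or type(self)._foo
        return method(self, *args)   # the whole argument list is forwarded unchanged

plain = Frame()
special = Frame()
special.slots["_foo"] = lambda self, *args: ("instance", args)

plain_result = plain.foo(1, 2)
special_result = special.foo(1, 2)
print(plain_result, special_result)
```

The redirector never needs to know what arguments the target expects,
which is the property the implicit @_ form gives for free in Perl 5.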

- Ken



Re: explicitly declare closures???

2001-08-26 Thread Ken Fox

Dave Mitchell [EMAIL PROTECTED] wrote:
 John Porter [EMAIL PROTECTED] wrote:
  Dave Mitchell wrote:
   I think closures are a lot harder (or at least subtler) than
   people think ...
  
  ... The scenario you gave seems rather far-fetched to me, in terms
  of real-world programming.
 
 Perhaps, although the following code demonstrates it, and isn't (too)
 contrived. Comment out the 'superfluous' $counters[$i] = 0; line and
 the code stops working.

We must be very careful not to confuse closure with Perl's
current implementation of closure. You've stumbled onto a bug in
Perl, not discovered a feature of closures. Perl's closures
were horribly buggy until release 5.004. (Thanks Chip!)

Closed variables are just local variables. There's nothing special
about referring to a variable from an inner scope. You don't want
to write

  sub foo {
    my $x;

    if (...) { my outer $x; $x = 0 }
    else     { my outer $x; $x = 1 }

    $x;
  }

do you? So why make people using closures do it?

My only closure bug due to scoping was because I named a variable
poorly and accidentally shadowed it in an inner scope. Your
proposal does nothing to eliminate that bug.

A more common problem I have is with reading code that uses a
variable a long way from where it was introduced. Your proposal
adds a nearby declaration, but it's a useless declaration that
doesn't help find the original variable.

IMHO, every declaration we write should make a program more
understandable by *people*. If the only reason we declare
something is to help the computer understand it, we should think
carefully on how to eliminate the declaration.

- Ken



Re: JWZ on s/Java/Perl/

2001-02-11 Thread Ken Fox

Bart Lateur wrote:
 On Fri, 09 Feb 2001 12:06:12 -0500, Ken Fox wrote:
  1. Cheap allocations. Most fast collectors have a one or two
 instruction malloc. In C it looks like this:
 
   void *malloc(size) { void *obj = heap; heap += size; return obj; }
  ...
 
 That is not a garbage collector.

I said it was an allocator not a garbage collector. An advanced
garbage collector just makes very simple/fast allocators possible.

 That is "drop everything you don't need, and we'll never use it
 again." Oh, sure, not doing garbage collection at all is faster than
 doing reference counting.

You don't have a clue. The allocator I posted is a very common allocator
used with copying garbage collectors. This is *not* a "pool" allocator
like Apache uses. What happens is when the heap fills up (probably on a
seg fault triggered by using an obj outside the current address space),
the collector is triggered. It traverses live data and copies it into a
new space (in a simple copying collector these are called "from" and "to"
spaces). Generational collectors often work similarly, but they have
more than two spaces and special rules for references between spaces.
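
A toy model of that from-space/to-space scheme, in Python. It is a sketch
under simplifying assumptions: addresses are list indexes, "heap full" is
an explicit capacity check instead of a seg fault, and the Heap class with
its alloc, collect and roots members is invented for illustration.

```python
class Heap:
    """Toy two-space copying collector; addresses are indexes into a list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.space = []          # current space; len() acts as the bump pointer
        self.roots = {}          # name -> address of a live object

    def alloc(self, value, refs=()):
        # Bump allocation: take the next free slot, collecting first if full.
        if len(self.space) >= self.capacity:
            refs = self.collect(refs)
            if len(self.space) >= self.capacity:
                raise MemoryError("heap is full of live data")
        self.space.append({"value": value, "refs": list(refs)})
        return len(self.space) - 1

    def collect(self, extra=()):
        to_space = []
        forward = {}             # old address -> new address

        def copy(addr):
            if addr not in forward:
                forward[addr] = len(to_space)
                to_space.append(dict(self.space[addr]))
            return forward[addr]

        for name in self.roots:
            self.roots[name] = copy(self.roots[name])
        extra = [copy(addr) for addr in extra]

        # Scan the copied objects, copying whatever they point to.
        scan = 0
        while scan < len(to_space):
            obj = to_space[scan]
            obj["refs"] = [copy(addr) for addr in obj["refs"]]
            scan += 1

        self.space = to_space    # garbage is never visited, just dropped
        return extra

heap = Heap(capacity=4)
child = heap.alloc("child")
heap.roots["parent"] = heap.alloc("parent", refs=[child])
for i in range(10):
    heap.alloc(f"temp {i}")      # never rooted, so it is garbage immediately

parent = heap.space[heap.roots["parent"]]
kept = [heap.space[addr]["value"] for addr in parent["refs"]]
print(parent["value"], kept, len(heap.space))
```

Allocation is a single append, and each collection touches only the two
reachable objects no matter how many temporaries were created. Note that
addresses change during a collection, which is why the collector has to
know about every place a pointer is held.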

  2. Work proportional to live data, not total data. This is hard to
 believe for a C programmer, but good garbage collectors don't have
 to "free" every allocation -- they just have to preserve the live,
 or reachable, data. Some researchers have estimated that 90% or
 more of all allocated data dies (becomes unreachable) before the
 next collection. A ref count system has to work on every object,
 but smarter collectors only work on 10% of the objects.
 
 That may work for C, but not for Perl.

Um, no. It works pretty well for Lisp, ML, Prolog, etc. I'm positive
that it would work fine for Perl too.

 sub test {
 my($foo, $bar, %baz);
 ...
 return \%baz;
 }
 
 You may notice that only PART of the locally malloced memory, gets
 freed. the memory of %baz may well be in the middle of that pool. You're
 making a huge mistake if you simply declare the whole block dead weight.

You don't understand how collectors work. You can't think about individual
allocations anymore -- that's a fundamental and severe restriction on
malloc(). What happens is that the garbage accumulates until a collection
happens. When the collection happens, live data is saved and the garbage
over-written.

In your example above, the memory for $foo and $bar is not reclaimed
until a collection occurs. %baz is live data and will be saved when
the collection occurs (often done by copying it to a new heap space).
Yes, this means it is *totally* unsafe to hold pointers to objects in
places the garbage collector doesn't know about. It also means that
memory working-set sizes may be larger than with a malloc-style system.

There are lots of advantages though -- re-read my previous note.

The one big down-side to non-ref count GC is that finalization is
delayed until collection -- which may happen relatively infrequently when
there's lots of memory. Data flow analysis can allow us to trigger
finalizers earlier, but that's a lot harder than just watching a ref
count.

- Ken



Re: JWZ on s/Java/Perl/

2001-02-11 Thread Ken Fox

[Please be careful with attributions -- I didn't write any
 of the quoted material...]

Russ Allbery wrote:
   sub test {
   my($foo, $bar, %baz);
   ...
   return \%baz;
   }

 That's a pretty fundamental aspect of the Perl language; I use that sort
 of construct all over the place.  We don't want to turn Perl into C, where
 if you want to return anything non-trivial without allocation you have to
 pass in somewhere to put it.

There are no problems at all with that code. It's not going to break under
Perl 6. It's not going to be deprecated -- this is one of the ultimate
Keep Perl Perl language features!

I think that there's a lot of concern and confusion about what it means to
replace perl's current memory manager (aka garbage collector) with something
else. The short-term survival guide for dealing with this is "only believe
what Dan says." The longer-term guide is "only believe what Benchmark says."

There are only three Perl-visible features of a collector that I can think
of (besides the obvious "does it work?"):

1. How fast does it run?
2. How efficient is it? (i.e. what's the overhead?)
3. When does it call object destructors?

The first two are too early to talk about, but if Perl 6 is worse than
Perl 5 something is seriously wrong.

The last has never been defined in Perl, but it's definitely something to
discuss before the internals are written. Changing it could be a *major*
job.

- Ken



Re: JWZ on s/Java/Perl/

2001-02-09 Thread Ken Fox

Branden wrote:
 Ken Fox wrote:
  Some researchers have estimated that 90% or
  more of all allocated data dies (becomes unreachable) before the
  next collection. A ref count system has to work on every object,
  but smarter collectors only work on 10% of the objects.
 
 Does this 90/10 ratio mean that the memory usage is actually 10 times it
 needs to be? (if it were even _possible_ to pack all the data without
 fragmentation problems)

The general rule is the more space you "waste" the faster the collector
is. If you have memory to spare, then don't run the garbage collector as
often and your program will spend less total time garbage collecting.
In other words, the collection cost per object approaches zero.

If you "need" to go faster, then waste more memory.

If you "need" to use less memory, then go slower and collect more
frequently.
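The trade-off can be put into a back-of-the-envelope model. The sketch below assumes a copying collector that touches only live objects and runs once each time the free part of a semi-space fills up; the function name and the numbers are illustrative assumptions, not measurements.

```perl
use strict;
use warnings;

# Simplified model: one collection copies $live objects and frees room for
# ($semispace - $live) new allocations, so the copying work is amortized
# over that many allocations.
sub copy_cost_per_allocation {
    my ($live, $semispace) = @_;
    die "semi-space must be larger than the live data\n" if $semispace <= $live;
    return $live / ($semispace - $live);
}

my $live = 1_000;
for my $semispace (2_000, 4_000, 16_000) {
    printf "semi-space %6d: %.3f objects copied per allocation\n",
        $semispace, copy_cost_per_allocation($live, $semispace);
}
```

With the live set held constant, the amortized cost shrinks as the heap grows, which is the sense in which wasting memory buys speed.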

When comparing the memory management efficiency of different approaches,
it's very important to remember all the costs that the approaches have.
C-style malloc has quite a bit of overhead per object and tends to
fragment the heap. Many garbage collectors don't have either of these
problems.

Garbage collectors are very good from an efficiency perspective, but
tend to be unreliable in a mixed language environment and sometimes
impose really nasty usage requirements.

- Ken



Re: Garbage collection (was Re: JWZ on s/Java/Perl/)

2001-02-09 Thread Ken Fox

Dan Sugalski wrote:
 At 04:09 PM 2/9/2001 -0200, Branden wrote:
  If I change the way some objects are used so
  that I tend to create other objects instead of reusing the old ones, I'm
  actually not degrading GC performance, since its work is proportional to
  live data. Right?
 
 Correct. Whether reuse is a win overall is a separate question.

It's totally dependent upon hardware. From a software big-O type of
analysis, creating new objects is never slower than reusing objects.

The problems come about if (a) memory is low and the OS decides to
page without telling the application to prepare for paging or (b) if all
memory isn't the same speed, e.g. caches are faster than main memory.

  This increases memory usage, though, right? Would this
  cause some thrashing if the excessive memory usage causes degrading to
  virtual memory? ...
 
 It depends on whether the old structures are really unused. If they are,
 one of the GC passes will reclaim the space they're taking.

It also depends on locality of reference. Semi-space-based collectors
are not bad at preserving locality -- mark-sweep and malloc-like allocators
are terrible.

The weird thing is that a collector can actually *improve* locality by
moving objects "close" to the things they refer to. In perl's case, the
collector could move the underlying value representation close to the PMC
that refers to it. (But we may want to pin a PMC so that foreign code
can keep references to it. Argh.)

 (It's safe to assume that if perl 6's garbage collector causes otherwise
 small programs to swap then it's busted and needs fixing)

If you mean small as in "tight loop" then I agree. If you mean small as
in a "quick one liner" then I'm not sure. The quick one liners run quickly
and speeding memory management up/down by 100% might not even be noticeable.

 The less memory you chew through the faster your code will probably be (or
 at least you'll have less overhead). Reuse is generally faster and less
 resource-intensive than recycling. What's true for tin cans is true for memory.

The electrons are re-used whether you allocate a new object or not... ;)

 Going to a more advanced garbage collection scheme certainly isn't a
 universal panacea--mark and sweep in perl 6 will *not* bring about world
 peace or anything. It will (hopefully) make our lives easier, though.

Mark-sweep doesn't have a cheap allocator or good locality. At this point
in history, I think if we don't go with a more advanced system we're not
learning.

- Ken



Re: more POST recitation

2001-02-09 Thread Ken Fox

"David L. Nicol" wrote:
 # with POST
 sub find_first_line_matching_array($\@){
 open F, shift or die "could not open: $!";
 POST{close F};
 while(<F>){
 foreach $w (@{$_[0]}){
 return $_ if /$w/;
 }   }   }

I'd rather not use POST for resource cleanup at all. Why not
just:

sub find_first_line_matching_array($\@){
   my $f = open shift or die "could not open: $!";
   while(<$f>){
      foreach $w (@{$_[0]}){
         return $_ if /$w/;
      }
   }
}

We already have object destructors invoked when they go out
of scope. Why not push that technique until we reach a situation
where it doesn't work? Do you have something in mind?
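The same idea already works in Perl 5 with a lexical filehandle, which is closed when its last reference goes away. A runnable sketch, where the sub name and the temporary file are illustrative assumptions:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the first line matching any pattern. There is no explicit close:
# $fh is closed when it goes out of scope, on every return path.
sub first_line_matching {
    my ($path, $patterns) = @_;
    open my $fh, '<', $path or die "could not open $path: $!";
    while (my $line = <$fh>) {
        for my $pattern (@$patterns) {
            return $line if $line =~ /$pattern/;
        }
    }
    return;
}

# Build a small input file to search.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh or die "could not close $tmp_name: $!";

my $hit = first_line_matching($tmp_name, [qr/^g/, qr/et/]);
print "first match: $hit";
```

This relies on timely destruction, so it is also an example of code whose behavior depends on when the collector runs destructors.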

- Ken



Re: Why shouldn't sleep(0.5) DWIM?

2001-02-01 Thread Ken Fox

Dan Sugalski wrote:
 At 11:57 PM 1/31/2001 +0100, [EMAIL PROTECTED] wrote:
  On Wed, Jan 31, 2001 at 05:35:03PM -0500, Michael G Schwern wrote:
   grossly UNIX specific things like getpwnam's [can be pulled]
 
  But why? What is it going to buy you?
 
 Not that much. More than anything else the ability to deal with them
 externally for non-unix platforms.

So on Unix-like platforms getpwnam is pre-loaded and on all the others
it's auto-loaded (a Unix compatibility module or something). If it's
faster pre-loading a module than telling the auto-loader about it, we
should pre-load. Speed always wins over size in perl.

I might be off base, but there seems to be a movement to take Unix out
of Perl. Sometimes I'm not sure if we're arguing about "time() returns
an int" or "time() looks like Unix so it has to change." IMHO I'd like
to see more Unix in Perl, not less. (We can do better than the C API
though.)

- Ken



Re: Really auto autoloaded modules

2001-02-01 Thread Ken Fox

Dan Sugalski wrote:
 At 02:04 PM 2/1/2001 -0500, Ken Fox wrote:
 Isn't the trick to detect the necessary modules at compile time?
 
 Nope, no trick at all. The parser will have a list of functions--if it sees
 function X, it loads in module Y. (Possibly version Z) Nothing fancy needs
 to be done.

Hmm. I was hoping that Perl 6 would be a bit smarter about this than
Perl 5. Is the following code going to work the same as Perl 5?

  sub getpwnam {
 print "OK\n";
  }

  getpwnam();

If it does then I can understand doing it in the parser. If it calls
the user-defined sub though then I think it will have to be done after
symbol resolution (and Perl doesn't require forward declaration, so this
means waiting until the entire program source has been seen).

Anyways, it's not something that can be done now with Perl 5 -- which
is what I meant by "the trick".
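For comparison, this is how Perl 5 resolves a user-defined sub that shares a name with a built-in. The sketch uses time rather than getpwnam so that it runs on any platform and needs no arguments; that substitution is mine, not part of the original example.

```perl
use strict;
use warnings;
no warnings 'ambiguous';    # silence "Ambiguous call resolved as CORE::time()"

# A user-defined sub with the same name as a built-in.
sub time { return 'user-defined' }

my $plain     = time();     # resolves to the built-in
my $ampersand = &time();    # the & sigil forces the user-defined sub

print "time():  $plain\n";
print "&time(): $ampersand\n";
```

In Perl 5 the built-in wins at parse time unless the sub was imported from another package, which is why a parser-level trigger list and symbol resolution can disagree.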

 The list of functions that trigger this automagic use-ing of modules should
 be reasonably small.

Seems to be some debate over how big the list should be. Does it include
time()? ;)

Something else to consider is whether auto-loading can occur in response
to something other than an unbound symbol (or a symbol "on the list" if
we use your implementation). Can we auto-load "bigint" if we see
10_000_000_000? Auto-load different optimizers or garbage collectors
if we see lots of closures? Auto-load the Perl parser if we see string
eval?

It'd be really powerful if auto-load were tied to pattern matching on
the parse tree. That code could implement auto-load, macros and lots of
other language features.

- Ken



Re: Really auto autoloaded modules

2001-02-01 Thread Ken Fox

Dan Sugalski wrote:
 At 12:33 PM 2/1/2001 -0500, Michael G Schwern wrote:
  Have a look at AnyLoader in CPAN.
 
 Looks pretty close to what's needed. Care to flesh it out (and streamline
 it where needed) to a PDD?

Isn't the trick to detect the necessary modules at compile time? Run-time
can always be handled with a UNIVERSAL AUTOLOAD -- it doesn't need to be
part of the core. Run-time autoload should be a replaceable module like
the debugger. (Perhaps the presence of an AUTOLOAD should turn off compile-
time autoloading too?)

We're also going to need MakeMaker support for updating the registry of
sub name -> module mappings. We don't want to force full module names for
everything. Maybe the EXPORT list should be registered without module
names and the EXPORT_OK list with module names?
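A rough Perl 5 sketch of the run-time half, assuming a hand-written registry. The %registry hash and its contents are hypothetical; a real one would presumably be generated at install time.

```perl
use strict;
use warnings;

# Hypothetical registry mapping sub names to the modules that provide them.
my %registry = (
    sum => 'List::Util',
    max => 'List::Util',
);

my $autoload_calls = 0;
our $AUTOLOAD;

# Called by perl when an undefined sub in package main is invoked.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';
    $autoload_calls++;

    my $module = $registry{$name}
        or die "Undefined subroutine &main::$name called\n";
    (my $file = "$module.pm") =~ s{::}{/}g;
    require $file;

    my $code = $module->can($name)
        or die "$module does not define $name\n";
    {
        no strict 'refs';
        *{"main::$name"} = $code;    # install it so later calls skip AUTOLOAD
    }
    goto &$code;
}

my $total   = sum(1, 2, 3);
my $again   = sum(10, 20);
my $largest = max(4, 9, 2);
print "sum: $total, $again; max: $largest; autoloads: $autoload_calls\n";
```

Each name goes through AUTOLOAD only once. Compile-time detection would avoid even that, but it needs the registry to be available to the parser.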

- Ken



safe signals + sub-second alarms [was: sleep(0.5) should DWIM]

2001-01-31 Thread Ken Fox

Branden wrote:
 Actually, with event loops and threading issues, probably things like
 the perl built-ins sleep and alarm won't ever be passed to the syscalls
 sleep(3) and alarm(3).

Sleep isn't usually a syscall -- it's often a library routine that sets
an alarm and blocks or uses some other general purpose syscall to block.
Perl *must* use one of these OS features unless you want to busy wait.

 Perl will probably block that instance of the
 interpreter internally and do some other stuff. It will probably use
 its internal clock to measure the time to unblock it, and that clock
 will probably have sub-second precision.

That's silly. You want perl to block a thread and then busy wait until
it's time for the thread to wake up? Even if perl has other threads
going it will be incredibly wasteful to check a timer between every OP.

Lots of discussion on signal delivery has taken place and it seems like
people have agreed to have async delivery with the signal handler called
at the next "safe" opportunity -- between expression boundaries for
example. IMHO making alarm a high-resolution timer is not consistent
with safe signal delivery. If a user asks for alarm(10) and gets a signal
12 seconds later that's probably ok. If a user asks for alarm(0.005) and
gets a signal 1 second later that's a problem. (Perl isn't Java -- we
have lots of OPs that can take a long time to run.)

Basically I'm afraid of advertising high-res timers and then having
so many caveats that they aren't useful or portable. Stuff in the core
should be dependable.

- Ken

P.S. I didn't realize anybody was doing video games or device drivers
in Perl. Has anybody ever written code where the resolution of alarm was
a problem? I've only used alarm to prevent blocked I/O from hanging
servers. For graphical programs I've always used the toolkit dependent
alarm features.
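The blocked-I/O use mentioned in the P.S. usually looks like the following on a Unix-like system. This is a sketch of the common eval/alarm idiom, with a silent pipe standing in for a stalled client.

```perl
use strict;
use warnings;

# A pipe whose write end stays open but silent: reading from it blocks.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $line = eval {
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm 1;                  # whole seconds are enough here
    my $input = <$reader>;    # stand-in for a read from a stalled client
    alarm 0;
    $input;
};
my $error = $@;
alarm 0;                      # make sure no alarm is left pending

print defined $line ? "read: $line" : "gave up: $error";
```

One-second granularity is plenty for this purpose, since the timeout only has to be much shorter than "forever".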



Re: safe signals + sub-second alarms [was: sleep(0.5) should DWIM]

2001-01-31 Thread Ken Fox

Bart Lateur wrote:
 What if we take the ordinary sleep() for the largest part of the
 sleeping time (no busy wait), and the 4 argument select for the
 remainder, i.e. subsecond?

You're trying to solve a problem that doesn't exist.

Sleep doesn't have the signal delivery problems that alarm has,
but IMHO sleep and alarm must have identical argument semantics.
Since we can't reasonably provide sub-second alarm resolution
then sleep can't have it either.

If you need sub-second resolution and accuracy then use an
external module. It will be a tough module to implement -- and
very platform specific.
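For readers who want to see what is available in Perl 5 without changing the core sleep: the four-argument select accepts a fractional timeout, and the Time::HiRes module provides fractional replacements for sleep and time. A small sketch, with durations chosen arbitrarily:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);    # fractional-second versions of time and sleep

my $start = time;

select(undef, undef, undef, 0.05);    # four-argument select as a 50 ms timer
my $after_select = time;

sleep(0.05);                          # Time::HiRes::sleep accepts fractions
my $after_sleep = time;

printf "select waited %.3f s, sleep waited %.3f s\n",
    $after_select - $start, $after_sleep - $after_select;
```

Neither call promises accuracy: the process wakes up no earlier than requested, but how much later depends on the platform and the scheduler.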

- Ken



Re: RFC 124 usefulness implementation suggestion

2000-10-18 Thread Ken Fox

Bart Lateur wrote:
 But isn't there going to be a large overhead, in populating such a
 "hash"?

If you need an ordered data structure the overhead would be lower
than using a hash.

 Doesn't the tree have to be reorganized every time you add a
 single new entry?

No. Sometimes you may have to re-balance the tree, but that only
requires examining the path to the item, not the entire tree. BTW,
inserting into a hash also requires occasional reorganization. You
want to keep both the number of buckets and collisions small, which
forces reorganization as a hash grows.

 Reading can be fast, I grant you.

Faster than hashes sometimes.

Anyways, hash syntax can be used to interface to balanced trees so
there isn't any reason to debate them.

- Ken



Re: RFC 277 (v1) Eliminate unquoted barewords from Perl entirely

2000-10-17 Thread Ken Fox

Nathan Wiger wrote:
 Your point is assuming that STDERR retains its weirdness, and does not
 become a simple scalar object ...

sub STDERR () { $STDERR }

or am I missing something?

 Making STDERR into $STDERR is all hinged on fast vtable stuff in core ...

Absolutely false. $STDERR does not depend on vtables in any conceivable way.

[Yeah, this is a response to a really old post, but we shouldn't let a
 mythology of vtables develop.]

- Ken



Re: Ideas that need RFCs?

2000-08-31 Thread Ken Fox

Dan Sugalski wrote:
 I expect we'd want to have some sort of heavy-duty regex optimizer, then,
 to detect common prefixes and subexpressions and suchlike things, otherwise
 we end up with a rather monstrous alternation sequence...

We need a regex merge function too -- then we could write macros that
extend the parser. Domain languages! Lexically scoped of course. ;)

Obviously we're going to need a new expansion for "regex" because these
things aren't regular expressions anymore. (Not that Perl has had regular
expressions for a long time...) Really easy grammar experiments?

- Ken



Re: implied pascal-like with or express

2000-08-29 Thread Ken Fox

"David L. Nicol" wrote:
 Ken Fox wrote:
  IMHO, curries have nothing to do with this. All "with" really does is
  create a dynamic scope from the contents of the hash and evaluate its
  block in that scope.
...
 But that doesn't give us the speed win we want from compiling offset lookups
 into a static record structure, at the cost of some funny "in-the-record"
 syntax, as in other languages that support this (pascal, VB, C)

The hash keys (symbol lookups) could be pre-computed. Unless you are
proposing something really radical, like replacing blessed hashes with
fixed-size structs, that's the best you can do. IMHO, the "with" proposal
should not assume other RFC proposals. It will be obvious to optimize
"with" if, for example, strong types are available.

- Ken



Re: RFC 54 (v1) Operators: Polymorphic comparisons

2000-08-13 Thread Ken Fox

Bart Lateur wrote:
 On Tue, 08 Aug 2000 23:43:26 -0400, Ken Fox wrote:
 (assuming min() is polymorphic):
 
   min($a, $b) eq $a
 
 Ugly, but minimal changes to the language.
 
 We could adopt a syntax similar to sort():
 
 $lowest  = min($x, $y, $z); # default: numerical (?)
 
 $first = min { $a cmp $b } ($x, $y, $z);  # alphabetical

That's missing the point that min() should be polymorphic and
information preserving though. I think min() should attempt to
downcast its arguments until they are either the same type or
information would be lost. If information would be lost, then
upcast the simpler one to the other.

The definition of "information loss" is slippery though. The
string "10.0" has more information than 10 or even 10.0 (stored
as a fixed precision float). IMHO we could come up with a practical
definition that made generic algorithms possible.

- Ken



Re: RFC 84 (v1) Replace => (stringifying comma) with =>

2000-08-13 Thread Ken Fox

Piers Cawley wrote:
$ints_from = ^1 => sub {$ints_from->(^1+1)};
$ints = $ints_from->(1);

I think pairs should use array range syntax ".." and be folded
into the array generator RFC (or at least referenced in that RFC).

In particular, using a pair in an array context should interpret
the pair as a sequence -- just like generated arrays.

  my $pair = 1..4;

  key $pair == 1
  value $pair == 4
  @{$pair} == (1, 2, 3, 4)

  my $list = 1..2..3..nil;

  key $list == 1
  value $list == 2..3..nil
  @{$list} == (1, 2, 3)

Of course in array context these would be lazily generated, i.e. streams.
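The pair-as-stream idea can be approximated in Perl 5 with a two-element array whose second slot is a closure. The helper names ints_from, head, tail and take are illustrative and not taken from the RFC.

```perl
use strict;
use warnings;

# A stream is [ value, sub that builds the rest of the stream ].
sub ints_from {
    my ($n) = @_;
    return [ $n, sub { ints_from($n + 1) } ];
}

sub head { return $_[0][0] }
sub tail { return $_[0][1]->() }    # forces only the next element

# Collect the first $count elements of a stream into a list.
sub take {
    my ($count, $stream) = @_;
    my @out;
    while ($count-- > 0) {
        push @out, head($stream);
        $stream = tail($stream);
    }
    return @out;
}

my @first_four = take(4, ints_from(1));
print "@first_four\n";
```

Only the elements that are asked for ever get computed, so the unbounded sequence costs nothing until it is consumed.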

For complex generated arrays, the left or right hand side of the pair
can be a bounding object which might have a generator function or step
value.

That's three special cases for turning a pair into a stream, but at
least it confines all this new magic to a single place in the language.
I think the idea that => is "just another comma" is pretty widespread
now.

BTW, regardless of whether => or .. are used, the operator has to be
right associative with low precedence. 1..2..3 should be 1..(2..3) and
not (1..2)..3. 1..$a+b..$d*e should be 1..(($a+b)..($d*e)).

Assignments to (key $pair) and (value $pair) should also do the right
thing and replace the key and value in the pair.

The use of => for named parameters IMHO is very distinct from pairs and
should be an implementation detail.

- Ken

P.S. I think it's funny that this RFC proposes a head and tail function
 that are nearly as obscure as Lisp's car and cdr.



Re: ISA number

2000-08-08 Thread Ken Fox

Peter Scott wrote:
 Have often wanted a way to tell whether a scalar was a number

 way to get at the SvIOK and SvNOK results would be great.

SvIOK, SvNOK and "is a number" are not the same thing at all.
Often numbers are strings that just look like numbers. Perl doesn't
eagerly convert stuff into numbers -- it waits until it needs to.

For example:

if (/(\d+)/) {
  ...
}

$1 is not SvIOK, but it is definitely a number. I seriously
doubt if Perl 6 is going to change this behavior (if it ain't
broken don't fix it...)

IMHO, the language shouldn't know about SvIOK etc. Perhaps
ref() can be supplemented with a typeof() function taking one
or two args. typeof($s) would just return the type of $s.
typeof($s, 'integer') could check to see if $s is coercible
to integer.
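A two-argument check along these lines can be sketched in Perl 5 with Scalar::Util's looks_like_number. The name coercible_to and the two type names are hypothetical stand-ins for the proposed typeof($s, 'integer').

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for typeof($s, $type): asks whether the value could
# be used as the given type, not how perl happens to store it right now.
sub coercible_to {
    my ($value, $type) = @_;
    return 0 unless defined $value;
    return looks_like_number($value) ? 1 : 0 if $type eq 'number';
    return $value =~ /^[+-]?\d+\z/ ? 1 : 0   if $type eq 'integer';
    die "unknown type '$type'\n";
}

my ($captured) = "order 42 shipped" =~ /(\d+)/;    # a string that holds digits

print "captured is an integer\n"  if coercible_to($captured, 'integer');
print "4.5 is a number\n"         if coercible_to('4.5', 'number');
print "4.5 is not an integer\n"   unless coercible_to('4.5', 'integer');
print "abc is not a number\n"     unless coercible_to('abc', 'number');
```

The captured value passes the check even though it came out of a regex as a string, which is the distinction from SvIOK made above.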

- Ken



Re: RFC 54 (v1) Operators: Polymorphic comparisons

2000-08-08 Thread Ken Fox

Michael Fowler wrote:
 I would argue that you should be manipulating your data, or checking values,
 so that numbers and strings are sorted how you wish.  The proposed isnum(),
 or way-to-determine-if-a-scalar-is-a-number would help.  This should be an
 explicit check, though, because you have a very definite idea about how you
 want things sorted.

That's not very generic programming is it? People should be able to write
code and only say "this must be ordered", and let Perl figure out what ordered
means. In C++ you have to write an overloaded comparison operator and then
use a template function. Damian is saying Perl should make this hard
thing easier.

I think Glenn Linderman implied that the string comparison is Perl's generic
ordering function. It isn't. Equality is perfect, but ordering isn't. If it
were, the results of lt would be the same as < for all numbers.

  "20" lt "3"  # true
  20 lt 3  # true -- should be false

I don't think it's possible to retrofit different behavior into Perl though,
mostly because it's weakly typed between strings and numbers.

Perhaps the new min() and max() operators should have polymorphic behavior
with the type chosen as the simplest type representation that does not lose
information? eq is already generic like Glenn said.

Here's a proper polymorphic <= (assuming min() is polymorphic):

  min($a, $b) eq $a

Ugly, but minimal changes to the language. What might be nicer is to complement
min() and max() with ordered(). ordered() will just return true if all its
arguments are "well ordered". The same information preserving downcasting
should be used in selecting an ordering function.
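One possible reading of these functions, sketched in Perl 5. The rule used here (compare numerically when both values look like numbers, otherwise as strings) is a simplifying assumption and not the RFC's definition, and the names poly_cmp, poly_min and ordered are illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Pick the comparison from the values: numeric if both look like numbers,
# string comparison otherwise.
sub poly_cmp {
    my ($x, $y) = @_;
    return looks_like_number($x) && looks_like_number($y)
        ? $x <=> $y
        : $x cmp $y;
}

sub poly_min {
    my $min = shift;
    for my $candidate (@_) {
        $min = $candidate if poly_cmp($candidate, $min) < 0;
    }
    return $min;
}

# True if every argument is <= its successor under poly_cmp.
sub ordered {
    for my $i (1 .. $#_) {
        return 0 if poly_cmp($_[$i - 1], $_[$i]) > 0;
    }
    return 1;
}

print "string order puts 20 before 3\n" if "20" lt "3";
print "poly_min picks ", poly_min("20", "3"), "\n";
print "3, 20, 100 are ordered\n"        if ordered(3, "20", 100);
print "apple, banana are ordered\n"     if ordered("apple", "banana");
```

The slippery part is mixed input such as "10.0" against 10, where this rule silently treats both as numbers and discards the string's extra formatting.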

- Ken



Re: RFC 23 (v2) Higher order functions

2000-08-08 Thread Ken Fox

 Higher order functions

This is a very nice proposal and I'd like to see it adopted. Since
many other RFCs are bound to assume this functionality, can we put
this one on the "fast track" approval process? ;)

- Ken



Re: RFC 73 (v1) All Perl core functions should return ob

2000-08-08 Thread Ken Fox

Dan Sugalski wrote:
 The number of different vtables needed to deal with this (along with
 the functions in those tables) is rather formidable, and it will tend
 to impact performance.

Hey! That sounds like an implementation topic... ;) (The internals
should be able to handle this if the language wants it, right?)

- Ken



Re: RFC17

2000-08-06 Thread Ken Fox

Dan Sugalski wrote:
 At 02:09 AM 8/6/00 -0400, Chaim Frenkel wrote:
  uplevel 0, $Perl:Warnings=1;# Hit everyone
  uplevel -1, $Perl:Warnings=0;   # Hit my wrapper
 Yeah, I can see that. We're going to need a mechanism to hoist things to
 outer scope levels internally (for when we return objects from subs) so it
 might be worth generalizing things.

Huh? I'm not sure if Chaim is proposing the same thing as Tcl's upvar,
but that's a hack to Tcl just to get pass-by-ref semantics. I'm not fond
of Tcl's uplevel hack either. Perl's already got references, dynamic
variables *and* lexical closures; we don't need Tcl's hacks.

It isn't clear to me why things like warnings can't be a flag on the
current lexical environment. It should be cheap enough to get at a flag
on the environment. (We already have a proposal to identify the magical
function __ at compile time. We might as well generalize that and identify
flags too.)
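Perl 5.6's warnings pragma already behaves roughly this way, as a lexically scoped compile-time flag. A small sketch, with made-up sub names, that captures warnings to show the scoping:

```perl
use strict;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

sub describe_loudly {
    use warnings;                 # applies to this block only
    my ($value) = @_;
    return "value: " . $value;    # warns when $value is undef
}

sub describe_quietly {
    no warnings;                  # same code, warnings off in this block
    my ($value) = @_;
    return "value: " . $value;
}

describe_loudly(undef);
describe_quietly(undef);

print scalar(@caught), " warning(s) caught\n";
```

Only the sub compiled under "use warnings" reports the uninitialized value, regardless of who calls it.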

   "DS" == Dan Sugalski [EMAIL PROTECTED] writes:
 DS> I'm not entirely sure that tossing the global nature of these things is a
 DS> bad idea. It is kinda convenient to be able to mess with things (like
 DS> $^W) and have them stay messed-with.

Using -w on the command line could have different semantics than turning
on $^W. Also, turning on $^W at the toplevel would allow all the nested
scopes to inherit it by default. Isn't that what we want?

- Ken



Re: RFC 49 (v1) Objects should have builtin string SCALA

2000-08-06 Thread Ken Fox

Nathan Wiger wrote:
$pw = getpwnam('nwiger');
print "$pw";  # calls $pw->SCALAR, which prints 'nwiger'
die "Bad group" unless $pw->gid == 100;

I'm ashamed that this feature would mess with my (bad?) habit of
re-writing "$pw" to just $pw on the assumption that whoever wrote
it didn't know what the hell he was doing. Would anybody else be
caught like that?

What if it were a per-variable modifier? Something like $"var"? I
know that collides with special variables, but it seems a bit
more obvious that stringify behavior is wanted.

- Ken



Re: RFC 50 (v1) BiDirectional Support in PERL

2000-08-06 Thread Ken Fox

Perl6 RFC Librarian wrote:
 BiDirectional Support in PERL

I know nothing about bi-directional output. It doesn't seem
like Perl has any assumption of left-to-right at all. What's
the difference between right-to-left support in Perl and just
editing/using Perl in an xterm that does right-to-left?

Please understand that I'm not at all opposed to adding support
for other languages. I've seen German/French/Japanese/etc. C
code and it looked like a really odd mixture of English keywords
and standard library routines combined with the native language.
(And what I assume to be a weird spelling of the native language
since the comments were usually 8-bit *something* and the code
itself was ASCII.)

BTW, is anybody planning on doing an RFC for native language
keywords? It may be pointless unless CPAN modules also come
in different translations, but it seems like something I'd like
if I didn't think in English.

- Ken



Re: RFC17

2000-08-06 Thread Ken Fox

Dan Sugalski wrote:
 But, if we toss refcounts, and split GC cleanup and
 end of scope actions anyway, we need to have a mechanism to hoist things
 out of the current scope.

Why say hoist when we can say return? I can think of several ways of
returning values that don't require the caller to allocate a binding for
the return value. Variants of the existing perl 5 stack push would work
fine. We could also use a special "register" variable that return values
are shoved into.

None of these require the hackery of Tcl's upvar.

Shouldn't we adopt a policy of never adding stuff to Perl 6 the language
just because it'd be easy with a particular release of perl 6 the software?

- Ken



Re: Different higher-order func notation? (was Re: RFC 23 (v1) Higher order functions)

2000-08-06 Thread Ken Fox

[Sorry, spent too much time thinking in the editor and did not
 see this before my reply.]

Mike Pastore wrote:
 - ^foo is the placeholder 'foo'

That already has perfectly good meaning: XOR with the function foo().

 Although, I suppose '' would not work.

Why not? I think it would work great.

- Ken



Re: Different higher-order func notation? (was Re: RFC 23 (v1) Higher order functions)

2000-08-06 Thread Ken Fox

Jeremy Howard wrote:
 Anyhoo, there's no reason why you can't have ^1, ^2, and so forth, _and_
 allow named placeholders too. Although I don't see what this buys you.

Argument ordering. We might be constrained by the caller as to what order
the placeholders are passed in. Also, we want to make partial application,
i.e. recurrying, as useful/simple as possible. So it's important to
have the argument order independent of where the placeholders appear
in the expression.

  my $f = (^x < ^max) ? ^x * ^scale : ^y * ^scale;

has to be called

  $f($x, $max, $scale, $y)

First off this might not be the order the caller expects them in and
we're sunk. Also, that's a pain to re-curry if we know $max and $scale
but want $x and $y free:

  my $g = $f(^_, 10, 2, ^_);

Seems better to just write $f as:

  my $f = (^2 < ^1) ? ^2 * ^0 : ^3 * ^0;

Alright, yeah, maybe not. That's total gibberish isn't it. ;) So how
about taking the *original* $f and rebinding the order of all the
arguments:

  my $f = $f(^2, ^0, ^1, ^3);

And the $g becomes:

  my $g = $f(10, 2);

Anyways, ^_ has my vote. (Although I really have a soft spot for that
Oracle SQL*Plus variable syntax... ;)

- Ken



Re: RFC: type inference

2000-08-04 Thread Ken Fox

Chaim Frenkel wrote:
 The Bytecode representation should be mutable and contain enough information
 for type/data flow analysis.

What do you mean by "mutable"? Wouldn't the dataflow analysis for a
given bytecode be immutable? Or do you mean the implementation should
be hackable?

 (Do you think this is possible? If it is a question of speed, would
 making it optional still have it work?)

I was thinking that dataflow information collected during compilation
would be saved in the bytecode format. The interpreter would generally
ignore it, but if the compiler read a bytecode file, it could just use
the information already there and not worry about having to analyze
bytecode.

- Ken



Re: RFC 23 (v1) Higher order functions

2000-08-04 Thread Ken Fox

[Could we get the librarian to cc: the RFC owner and hide
 replies from the announcement list?]

Perl6 RFC Librarian wrote:
 That is, the expression:
 
 $check = __ < 2 + __ * atan($pi/__) or die __;
 
 is equivalent to:
 
 $check = sub (;) {
 $_[0] < 2 + $_[1] * atan($pi/$_[3]) or die $_[4]
 };

I *really* like it. If the internals implement this well (and it
has to be *really* fast and slim) this could have enormous impact.
This feature combined with a decent macro system could implement
many core-like special forms in external modules.

 $root->traverse( $sum += __ );

There's a syntactic ambiguity here. I assumed that __ "poisons" an
expression so that an entire parse tree gets transformed into a
closure. If you isolate the parse tree based on expression precedence,
then I'm not sure why the += example works.

And this one is worse:

 $check = sub (;) {
   @_==0 ?  __ < 2 + __ * atan($pi/__) or die __
 : @_==1 ?  $_[0] < 2 + __ * atan($pi/__) or die __
 : @_==2 ?  $_[0] < 2 + $_[1] * atan($pi/__) or die __
 : @_==3 ?  $_[0] < 2 + $_[1] * atan($pi/$_[1]) or die __
 :  $_[0] < 2 + $_[1] * atan($pi/$_[1]) or die $_[1]
 ;
 };

Why isn't the ?: operator part of the curry?

I think we need a curry context and that all curries must be surrounded
by parens. There should be a curry prototype for subs too. reduce can be
written as:

  sub reduce (_@) { ... }

Curries formed outside of a default curry context must be forced into
the context with curry().

If we adopt curry context and parens, then the __ can be automatically
inserted in some situations.

Here are some examples.

  sub traverse ($_);
  my $sum = 0;
  $root->traverse(($sum += __));

  my @flattened = ();
  $root->traverse((push @flattened, __));

  sub reduce (_@);
  $sum = reduce (+) (0, @nums);

  $pred = shift || curry(__ ne '');
  if ($pred($x)) { ... }

The use of parens as the curry delimiter is probably a poor choice
because of confusion with function call syntax and list context. I'm
pretty certain the parser could handle it, but I'm not sure about
the perl programmers...
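For comparison, here is what the traverse examples look like today with explicit closures in Perl 5. The Node class is a made-up minimal tree; the placeholder syntax would replace each sub { ... $_[0] ... } with a bare expression.

```perl
use strict;
use warnings;

package Node;

sub new {
    my ($class, $value, @children) = @_;
    return bless { value => $value, children => \@children }, $class;
}

# Pre-order walk: call $visit with the value of each node.
sub traverse {
    my ($self, $visit) = @_;
    $visit->($self->{value});
    $_->traverse($visit) for @{ $self->{children} };
    return;
}

package main;

my $root = Node->new(1, Node->new(2, Node->new(3)), Node->new(4));

my $sum = 0;
$root->traverse(sub { $sum += $_[0] });             # $sum += __

my @flattened;
$root->traverse(sub { push @flattened, $_[0] });    # push @flattened, __

my $pred = sub { $_[0] ne '' };                     # __ ne ''

print "sum: $sum, flattened: @flattened\n";
print "non-empty\n" if $pred->('x');
```

Each closure here has an explicit boundary, which is exactly what the bare placeholder form has to infer from precedence or from a curry context.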

- Ken



Re: RFC 28 (v1) Perl should stay Perl.

2000-08-04 Thread Ken Fox

Perl6 RFC Librarian wrote:
 not because language design is a fun thing to do of an evening.

Huh? You mean I'm supposed to pretend to not enjoy myself? I keep
all my hair shirts at work, thanks.

 If that's the case, nobody wins if we bend the Perl language out of all
 recognition, because it won't be Perl any more. So let's not do this.

Do you use closures? You know, sub { ... }. That's a blow-your-mind-cool
feature and it's the most important part of Perl *for me*. If you don't
use them that's fine by me, that's what Perl's all about. (Closures didn't
actually work right until around 5.004, but IMHO they were worth waiting
for.)

If anybody proposes a feature that destroys the dialect of Perl you
speak, complain loudly. If somebody proposes a feature you don't
understand, please take the time to learn. I don't see how we can
be a community unless we understand each other. I think I understand
that Perl might be a fragile thing and we should be careful when
changing it. However, it might be a fragile thing that dies because
we don't put any new ideas into it. (Very unlikely problem with
Damian around...)

With that said, I could agree with RFC 28 if the insulting and anti-
community rhetoric about computer science and new-ideas-are-bad was
toned down. (Remember Larry's slide with the Perl influences on it?
Linguistics, Art, Common Sense *and* Computer Science.)

- Ken



Re: RFC 23 (v1) Higher order functions

2000-08-04 Thread Ken Fox

Jeremy Howard wrote:
 Unless I'm missing something here, you're not filling in the args correctly.
 I think you mean:
 
 $check = sub (;) {
   @_==0 ?  __ < 2 + __ * atan($pi/__) or die __
 : @_==1 ?  $_[0] < 2 + __ * atan($pi/__) or die __
 : @_==2 ?  $_[0] < 2 + $_[1] * atan($pi/__) or die __
 : @_==3 ?  $_[0] < 2 + $_[1] * atan($pi/$_[2]) or die __
 :  $_[0] < 2 + $_[1] * atan($pi/$_[1]) or die $_[3]
 ;
 };

And neither did you... ;) The last line should be:
 :  $_[0] < 2 + $_[1] * atan($pi/$_[2]) or die $_[3]

That's why curried functions should automatically re-curry themselves
instead of depending upon lazy programmers. ;)

  Arguments other than the last can also be bound by the explicit use of
  placeholders:
 
 What do you mean 'other than the last'. Isn't your example showing that
 _all_ the arguments can get bound?

Not really. In the last case the arguments don't get bound, they get
evaluated. (Since __ doesn't appear, the last case is just a plain
expression.) In all the other cases, the arguments are *not* evaluated
at all, only bound.

Of course, Damian may have meant 'other than the first' and just made
a mistake. The default for a curried function is to re-curry itself if
it is given less than the number of arguments it needs. The re-curry
binds from left to right (first to last). If you don't want the default
left to right binding, then you have to use __ to skip the arguments
you don't want to bind.
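The default re-curry rule can be mimicked in Perl 5 with a small wrapper. The name autocurry and the explicit arity argument are assumptions of this sketch; the RFC would derive the arity from the placeholders instead.

```perl
use strict;
use warnings;

# Wrap $code so that calling it with too few arguments returns a new
# function that remembers them (bound left to right) and waits for the rest.
sub autocurry {
    my ($arity, $code, @bound) = @_;
    return sub {
        my @args = (@bound, @_);
        return @args >= $arity
            ? $code->(@args)
            : autocurry($arity, $code, @args);
    };
}

my $combine = autocurry(3, sub { $_[0] + $_[1] * $_[2] });

my $all_at_once = $combine->(1, 2, 3);
my $one_by_one  = $combine->(1)->(2)->(3);
my $two_then_one = $combine->(1, 2)->(3);

print "$all_at_once $one_by_one $two_then_one\n";
```

The body is evaluated only once enough arguments have arrived; every earlier call merely binds.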

- Ken



Re: RFC 25 (v1) Multiway comparisons

2000-08-04 Thread Ken Fox

Jonathan Scott Duff wrote:
 On Fri, Aug 04, 2000 at 10:52:24PM -0400, Ken Fox wrote:
  Why would we ever use source filters when we're going to have a
  beautiful extend-syntax macro system.
 
 Because source filters *are* that macro system.  Why would we invent
 yet another language within a language when we can use a language we
 already know--Perl.  We just need to make source filters part of the
 language rather than a module (Perl 5 is already almost there).

Look at the hoops Damian had to go through to implement switch -- he
basically had to *parse perl by hand*. That means the macro system
is weaker than it should be. Any macro system that doesn't know how
to tokenize Perl is too weak.

Source filtering is great for things like compression and encryption.
It's a *long* way from being a good macro system.

- Ken



Re: RFC 22 (v1) Builtin switch statement

2000-08-04 Thread Ken Fox

Glenn Linderman wrote:
 For instance,
 
   if ( $v == @foo )
 
 is clearly testing the size of foo against the value of $v.  But
 
   switch ( $v ) {
 case 1: { ... }
 case 2: { ... }
 case @foo { ... }
   }
 
 does not!

Then write the switch as:

  switch ( __ ) {
case $v == 1: { ... }
case $v == 2: { ... }
case $v == @foo { ... }
  }

It might take you a little while to get your head around the __
symbol. I'm not sure it's useful to think of it as a variable;
poison is more like it. Or a Midas touch. Any expression it
touches turns into a subroutine. All the case statement does is
call the subroutine.
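A Perl 5 approximation of that model, in which each case is an explicit subroutine that the switch calls in turn. The function switch_on and the case layout are hypothetical and only illustrate the "case calls the subroutine" reading.

```perl
use strict;
use warnings;

# Try each [test, action] pair in order; run the action of the first test
# that returns true for the value.
sub switch_on {
    my ($value, @cases) = @_;
    for my $case (@cases) {
        my ($test, $action) = @$case;
        return $action->($value) if $test->($value);
    }
    return;
}

my @foo = (10, 20, 30, 40, 50);

my @cases = (
    [ sub { $_[0] == 1 },    sub { 'one' } ],
    [ sub { $_[0] == 2 },    sub { 'two' } ],
    [ sub { $_[0] == @foo }, sub { 'same as the size of @foo' } ],
);

my $small = switch_on(2, @cases);
my $sized = switch_on(5, @cases);
my $none  = switch_on(9, @cases);

print "$small\n$sized\n";
print "no case matched 9\n" unless defined $none;
```

Because each test is ordinary code, $v == @foo keeps its usual meaning of comparing against the array's size.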

- Ken