Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-30 Thread David L. Nicol

Dan Sugalski wrote:
 
 At 02:52 PM 3/29/2001 -0800, Russ Allbery wrote:
 James Mastros [EMAIL PROTECTED] writes:
 
   Ahh, bingo.  That's what a number of people (inculding me) are
   suggesting -- a :functional / :pure / :stateless /
   :somthingelseIdontrecall attribute attachable to a sub.
 
 The experience from gcc, which has a similar attribute, is that such an
 attribute will be fairly rarely used and that most of your gains will come
 from managing to teach the compiler to figure out that information for
 itself.

 I'm hoping to have this stage of optimization in perl.
 
 Dan


  define non-MY to mean,  dynamic or lexical from outside of the function,
  so, non-my relative to the function.

obviously_clean(\subX){ # could be defined as:
not (any appearance of known-unclean functions)
   and
not (non-MY appears at all)
}

  subs and functions can get flagged clean/unclean fairly quickly then.
  known-unclean are things like "rand" and any functions that have not
  been marked "clean" already.

  The attribute would force a non-obviously clean subroutine to be treated
as clean, and memoized, unless use-less-memory is in effect. It would allow
for some pretty subtle abuses.




Re: Schwartzian transforms

2001-03-29 Thread Bart Lateur

On Wed, 28 Mar 2001 11:11:20 -0500, Dan Sugalski wrote:

   "Can perl automatically optimize away function and tie calls inside
a sort function, and under what circumstances?"

It doesn't really matter if the functions inside the sort function are 
idempotent--what matters is whether it's OK for us to go and memoize the 
things (or whatever else we might choose to do)

Exactly. This whole discussion borders on the edge of the ridiculous.
Any sort algorithm ALWAYS assumes that comparisons are constant, i.e.
return consequent results on subsequent calls. They always infer sorting
information out of what they got yet. That's why they hardly ever need
to compare every item with every other item. The fewser comparisons, the
better the algorithm.

As MJD very recently wrote:

An optimal sort function will not call the comparator if it
already knows what the result should be!

So true.

That implies that side effects and otherwise unmemoizability of sort
functions may *always* safely be ignored. If not, the programmer has a
personal problem, i.e. a bad starting point. Otherwise, the results of
sort would largely depend upon which sort algorithm was used internally!

-- 
Bart.



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread James Mastros

On Wed, Mar 28, 2001 at 03:41:42PM -0800, Hong Zhang wrote:
 Are we over-optimizing? The Perl is just an interpreter language.
 Who really needs this kind of optimization for Perl? Even C does
 not provide this feature. 
Umm, art thou sure?  C can optimize better then we currently do many times,
because it doesn't have to worry about side-efects as often because it
doesn't have the concept of ties/overriden operators.  (It does, and we do,
have to worry about aliasing, but that is somthing of a smaller problem.)

Just because C doesn't memonize, doesn't mean we shouldn't have that
optimization available to us.  So many other optimizations that are doable
in C aren't in perl.

 Though Pascal/Ada have distinctions
 like function/procedure, it does not make them any faster than C.
Umm, I don't know Ada, but in Pascal, the only difference is that one
returns a value and the other does not (IE like void vs. nonvoid functions
in C, or sub vs. function in VB).  

 Just given its ugly name, I hate to see it in the core language.
 If people really want to optimize Perl, they can write a native
 compiler for Perl with advanced garbage collector, just like
 Scheme or Strongtalk compiler?
We want to make it as fast as reasonably possible.  Writing a native
compiler might not be _reasonably_ possible.  And an advanced GC will almost
certianly be part of perl6; they're orthogonal issues.

 -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Jarkko Hietaniemi

On Thu, Mar 29, 2001 at 10:25:06AM -0500, James Mastros wrote:
 On Wed, Mar 28, 2001 at 03:41:42PM -0800, Hong Zhang wrote:
  Are we over-optimizing? The Perl is just an interpreter language.
  Who really needs this kind of optimization for Perl? Even C does
  not provide this feature. 
 Umm, art thou sure?  C can optimize better then we currently do many times,

Somewhat tangentially: this reminds me of a message a week ago or so
(can't find it anymore in my inbox) which proposed writing C (or C++)
code for Perl 6 so that "modern CPU architectures are happy" (no
pipeline stalls because of "if"-s, etc.)  Hello?  This is a very
high-level language we are writing, not a DSP core.  Optimizing by
choosing good algorithms and data structures, yes, microoptimizing,
maybe, only after the code works first, and even then we would be
following the mirage since CPU architectures do evolve, and in
general, for large codebases, the C compilers are much, much, better
in optimizing than humans.  Yes, a human can sit down and read
the databooks and optimize a simple algorithm to hell and back.
But megabytes of source code?  Get real.

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread John Porter

Jarkko Hietaniemi wrote:
 ... proposed writing C (or C++)
 code for Perl 6 so that "modern CPU architectures are happy" (no
 pipeline stalls because of "if"-s, etc.)  ... in
 general, for large codebases, the C compilers are much, much,
 better in optimizing than humans.

I totally agree.  That would just be absurd.

We need to keep our priorities straight.  We have to
live with the fact that perl might scream a little more
on one platform than on another.  Far better for us
to write code that is maintainable for the long haul.
Shoot, by the time perl7 comes out, the C compilers 
will just be that much better (and the hardware that
much faster).

-- 
John Porter




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Dave Mitchell

Jarkko Hietaniemi [EMAIL PROTECTED] wrote:
 Somewhat tangentially: this reminds me of a message a week ago or so
 (can't find it anymore in my inbox) which proposed writing C (or C++)
 code for Perl 6 so that "modern CPU architectures are happy" (no
 pipeline stalls because of "if"-s, etc.)  Hello?  This is a very
 high-level language we are writing, not a DSP core.  Optimizing by
 choosing good algorithms and data structures, yes, microoptimizing,
 maybe, only after the code works first, and even then we would be
 following the mirage since CPU architectures do evolve, and in
 general, for large codebases, the C compilers are much, much, better
 in optimizing than humans.  Yes, a human can sit down and read
 the databooks and optimize a simple algorithm to hell and back.
 But megabytes of source code?  Get real.

That may have been me:

http://archive.develooper.com/perl6-internals%40perl.org/msg02685.html

(PDD for coding conventions)

The main thrust of that was whether a PDD on coding conventions
should have sections on:
* Coding style
* Naming conventions
* Commenting conventions
* Portability guidelines
* Performance guidelines

Based on your comments above (which I hearily agree with), I guess
we can safely dispense with that last entry.

Dave M.




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread John Porter

Dave Mitchell wrote:
 The main thrust of that was whether a PDD on coding conventions
 should have sections on:
...
 * Performance guidelines
 
 ...I guess we can safely dispense with that last entry.

No, performance guidelines are probably still appropriate;
but doing hand-coded peephole optimizations is not the
Right Thing.  As Jarkko, said:
  Optimizing by
  choosing good algorithms and data structures, yes

-- 
John Porter




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Dan Sugalski

At 10:25 AM 3/29/2001 -0500, James Mastros wrote:
On Wed, Mar 28, 2001 at 03:41:42PM -0800, Hong Zhang wrote:
  Are we over-optimizing? The Perl is just an interpreter language.
  Who really needs this kind of optimization for Perl? Even C does
  not provide this feature.
Umm, art thou sure?  C can optimize better then we currently do many times,
because it doesn't have to worry about side-efects as often because it
doesn't have the concept of ties/overriden operators.  (It does, and we do,
have to worry about aliasing, but that is somthing of a smaller problem.)

Aliasing is actually one of the bigger problems with C, or so I'm lead to 
believe. It gets in the way of a number of optimizations rather badly. (So 
say some of Compaq's C and Fortran compiler folks, and I have no reason to 
doubt them. The Fortran compiler often generates faster code than the C 
compiler for this reason apparently)

Just because C doesn't memonize, doesn't mean we shouldn't have that
optimization available to us.  So many other optimizations that are doable
in C aren't in perl.

That's not true. Perl's probably more optimizable than C for many things.

  Though Pascal/Ada have distinctions
  like function/procedure, it does not make them any faster than C.
Umm, I don't know Ada, but in Pascal, the only difference is that one
returns a value and the other does not (IE like void vs. nonvoid functions
in C, or sub vs. function in VB).

Well, functions without return values are mildly faster than functions with 
return values. No stack and/or register manipulation needed for the return.


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Dan Sugalski

At 03:41 PM 3/28/2001 -0800, Hong Zhang wrote:
Are we over-optimizing? The Perl is just an interpreter language.

Perl's not just an interpreter language, and hasn't been for a while. 
(Granted the bytecode compiler's not fully functional, but it does work in 
some cases)

Who really needs this kind of optimization for Perl?

I do. Lots of people with web apps do. Pretty much anyone with a large or 
long-running perl program does.

Even C does
not provide this feature.

C doesn't have the sort of sort function that makes this possible in the 
first place, nor does it have the sort of data that makes it a reasonable 
thing to do. Perl does, so some optimization is reasonable. Whether it's a 
good first spot to optimize is an open question, but it will be a big win 
for quite a few people.

Though Pascal/Ada have distinctions
like function/procedure, it does not make them any faster than C.

Depends on how the function's written in C. A function that has no 
meaningful return yet still returns one which gets inevitably thrown away 
will cost more than one that never returns anything, and there is a 
tendency to return *something* since you can.

Just given its ugly name, I hate to see it in the core language.

What, the ST? Why on earth would you think that we'd make much of a big 
deal over it in the docs? I'd just soon not mention it and just put it in 
as an optimization and have it be a pleasant performance win.

If people really want to optimize Perl, they can write a native
compiler for Perl with advanced garbage collector, just like
Scheme or Strongtalk compiler?

What, as opposed to the interpreter with advanced garbage collector? :)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Jarkko Hietaniemi

On Thu, Mar 29, 2001 at 11:29:16AM -0500, Dan Sugalski wrote:
 At 05:19 PM 3/29/2001 +0100, Dave Mitchell wrote:
 Jarkko Hietaniemi [EMAIL PROTECTED] wrote:
   Somewhat tangentially: this reminds me of a message a week ago or so
   (can't find it anymore in my inbox) which proposed writing C (or C++)
   code for Perl 6 so that "modern CPU architectures are happy" (no
   pipeline stalls because of "if"-s, etc.)  Hello?  This is a very
   high-level language we are writing, not a DSP core.  Optimizing by
   choosing good algorithms and data structures, yes, microoptimizing,
   maybe, only after the code works first, and even then we would be
   following the mirage since CPU architectures do evolve, and in
   general, for large codebases, the C compilers are much, much, better
   in optimizing than humans.  Yes, a human can sit down and read
   the databooks and optimize a simple algorithm to hell and back.
   But megabytes of source code?  Get real.
 
 That may have been me:

(Sorry if I sounded too blunt, the other stuff in the PDD was fine.)

 http://archive.develooper.com/perl6-internals%40perl.org/msg02685.html
 
 (PDD for coding conventions)
 
 The main thrust of that was whether a PDD on coding conventions
 should have sections on:
 * Coding style
 * Naming conventions
 * Commenting conventions
 * Portability guidelines
 * Performance guidelines
 
 Based on your comments above (which I hearily agree with), I guess
 we can safely dispense with that last entry.
 
 Don't. Hand-optimizing the C code's pretty silly generally, but there are 
 some reasonable rules of thumb that can gain enough of a performance boost 
[snip]

Yes.  Rules of thumb are good, sort of like common lore of how
CPUs in general work.  I would like to add the overall principle of
optimizing for speed first (that is, at the expense of space, e.g.
by using wide integer types) (worry about space later, if needed),
and avoiding division/modulus.  Perhaps the rule #0 would be:
measure before you optimize.  Concentrate on the hotspots, instead
of wasting your time honing some obscure rarely executed code.

An additional coding guideline could be:

* Extensibility guidelines

(This is not the same as 'portability.')

One particular detail I've noticed cropping up would be to avoid using
char* (or {char*,length}) if it is feasible to later use a SV*
(perl5-speak) at the same spot.  An example: UTF-8 hash keys in Perl5.
The same goes for flag fields: only if you can be fairly certain
that the number of flags never exceeds 32, should you use an int.

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Hong Zhang

 Who really needs this kind of optimization for Perl?

 I do. Lots of people with web apps do. Pretty much anyone with a large or
 long-running perl program does.

I have to say that I agree to disagree. Since it has been so controversal,
I just don't think this optimization is a good one.

 C doesn't have the sort of sort function that makes this possible in the
 first place, nor does it have the sort of data that makes it a reasonable
 thing to do. Perl does, so some optimization is reasonable. Whether it's a
 good first spot to optimize is an open question, but it will be a big win
 for quite a few people.

C has qsort() and bsearch(). Anyway, I will not put it on my optimization
list.

 Though Pascal/Ada have distinctions
 like function/procedure, it does not make them any faster than C.

 Depends on how the function's written in C. A function that has no
 meaningful return yet still returns one which gets inevitably thrown away
 will cost more than one that never returns anything, and there is a
 tendency to return *something* since you can.

The function in Ada can not have any side effect, i.e. no change to globals.
The procedure can have side effect. It gives compilers some more chances
for optimizations. For example (pseudo code),
  function comp(int n, int m) : int;
the compiler can safely remember the result of comparison for the same
arguments.

Hong





Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread James Mastros

On Thu, Mar 29, 2001 at 10:36:48AM -0800, Hong Zhang wrote:
 I have to say that I agree to disagree. Since it has been so controversal,
 I just don't think this optimization is a good one.
Hmm, we aren't talking sort() specificly anymore.  Look at the subject line.
G

 The function in Ada can not have any side effect, i.e. no change to globals.
 The procedure can have side effect. It gives compilers some more chances
 for optimizations. For example (pseudo code),
   function comp(int n, int m) : int;
 the compiler can safely remember the result of comparison for the same
 arguments.
Ahh, bingo.  That's what a number of people (inculding me) are suggesting --
a :functional / :pure / :stateless / :somthingelseIdontrecall attribute
attachable to a sub.

   -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Juanma Barranquero

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 29 Mar 2001 10:36:48 -0800, "Hong Zhang" [EMAIL PROTECTED]
wrote:

 The function in Ada can not have any side effect, i.e. no change to
 globals.  

Unless my reading of the Ada 95 standard is wrong, there's nothing
that precludes functions having side effects. The only thing an Ada
function cannot do is having "out" or "in out" parameters, so it
cannot modify its arguments, and even that can be circumvented (as
you can pass "in" access parameters).

   /L/e/k/t/u


-BEGIN PGP SIGNATURE-
Version: PGPfreeware 6.5.8 for non-commercial use http://www.pgp.com

iQA/AwUBOsOiu/4C0a0jUw5YEQKjQwCeP2Aeu6aJsdpP3asvcQDS+dyk2aoAoKS5
2gJ1o0xX1lfa0hXrVff1tPvx
=t4je
-END PGP SIGNATURE-




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Piers Cawley

Dan Sugalski [EMAIL PROTECTED] writes:
 True enough. This is a small subset of general optimizations. For example,
 this:
 
 
$i = 0;
foreach (1..1000) {
 $i++;
}
 
 can be easily optimized to:
 
$i = 1000;
 
 and most language implementations with any sort of optimizer will do this. If
 $i isn't actually used after that point without another assignment it might
 well be completely optimized away. With perl, though, this does potentially
 unexpected things if $i is tied. Do we still optimize it away? Do we only do
 it if we can tell that $i's not tied? Do we force the programmer to explicitly
 note which variables are potentially tied with the "my Dog $spot" syntax and
 assume that everything else is fair game? Can we even do that in the face of
 runtime requires, dos, or evals? (Or does that force a complete reevaluation
 of the optimized bytecode)

How painful would an 'potential' optimization that marked that area in
the bytecode/optree/whatever, with something along the lines of the
following be?

If you get to this point and $i is not tied, and '=' is not
overridden for $i's class 
then $i = 1000
otherwise, do the loop.

Of course, it means that the static runtime size is going to increase,
but presumably these potential optimizations wouldn't get done in the
presence of 'use less qw/space/'

-- 
Piers




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Dan Sugalski

At 10:57 PM 3/29/2001 +0100, Piers Cawley wrote:
How painful would an 'potential' optimization that marked that area in
the bytecode/optree/whatever, with something along the lines of the
following be?

 If you get to this point and $i is not tied, and '=' is not
 overridden for $i's class
 then $i = 1000
 otherwise, do the loop.

I've considered an "iftied" opcode that would essentially do this. The 
downside is the extra size of the bytecode as you've mentioned. I'm not 
sure if the speed boost would be worth it, but it's on my list of things to 
try.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Russ Allbery

Dan Sugalski [EMAIL PROTECTED] writes:

 Aliasing is actually one of the bigger problems with C, or so I'm lead
 to believe. It gets in the way of a number of optimizations rather
 badly. (So say some of Compaq's C and Fortran compiler folks, and I have
 no reason to doubt them. The Fortran compiler often generates faster
 code than the C compiler for this reason apparently)

Hence the introduction of the restrict keyword in C99 and several of gcc's
attribute extensions for marking pure functions to try to get a handle on
the problem.  *wry grin*  Yeah, that's the main thing that gets in the way
of optimizing C.

-- 
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Russ Allbery

James Mastros [EMAIL PROTECTED] writes:

 Ahh, bingo.  That's what a number of people (inculding me) are
 suggesting -- a :functional / :pure / :stateless /
 :somthingelseIdontrecall attribute attachable to a sub.

The experience from gcc, which has a similar attribute, is that such an
attribute will be fairly rarely used and that most of your gains will come
from managing to teach the compiler to figure out that information for
itself.

This will probably be harder in Perl than in C because C can afford to
take more time to do global optimization passes.

-- 
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Dan Sugalski

At 02:52 PM 3/29/2001 -0800, Russ Allbery wrote:
James Mastros [EMAIL PROTECTED] writes:

  Ahh, bingo.  That's what a number of people (inculding me) are
  suggesting -- a :functional / :pure / :stateless /
  :somthingelseIdontrecall attribute attachable to a sub.

The experience from gcc, which has a similar attribute, is that such an
attribute will be fairly rarely used and that most of your gains will come
from managing to teach the compiler to figure out that information for
itself.

This will probably be harder in Perl than in C because C can afford to
take more time to do global optimization passes.

I'm hoping to have this stage of optimization in perl. Off by default with 
a normal parse-and-go run (though certainly enableable if you want), on by 
default with the bytecode compiler.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Uri Guttman

 "DS" == Dan Sugalski [EMAIL PROTECTED] writes:

   This will probably be harder in Perl than in C because C can afford to
   take more time to do global optimization passes.

  DS I'm hoping to have this stage of optimization in perl. Off by
  DS default with a normal parse-and-go run (though certainly
  DS enableable if you want), on by default with the bytecode compiler.

for long running daemon type programs, any extra optimization overhead
at startup will be well worth it. this would be a great win if it wrings
out a decent amount of speedup. or you could make it only doable for
bytecode and then you run that. this shouldn't be an issue with many
systems who want that optimization pass.

uri

-- 
Uri Guttman  -  [EMAIL PROTECTED]  --  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  ---  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  --  http://www.northernlight.com



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread Dan Sugalski

At 06:22 PM 3/29/2001 -0500, Uri Guttman wrote:
  "DS" == Dan Sugalski [EMAIL PROTECTED] writes:

This will probably be harder in Perl than in C because C can afford to
take more time to do global optimization passes.

   DS I'm hoping to have this stage of optimization in perl. Off by
   DS default with a normal parse-and-go run (though certainly
   DS enableable if you want), on by default with the bytecode compiler.

for long running daemon type programs, any extra optimization overhead
at startup will be well worth it. this would be a great win if it wrings
out a decent amount of speedup. or you could make it only doable for
bytecode and then you run that. this shouldn't be an issue with many
systems who want that optimization pass.

There's not much point in making it doable for bytecode but not normal perl 
runs, as the same engine will be there for both. (Though I suppose the main 
perl interpreter would be smaller if there wasn't an optimizer in it, but I 
wouldn't bet it would take up much space) Since I'm going to tend to fire 
off long-running stuff in perl, I expect it'll be there for the asking. :)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




RE: What can we optimize (was Re: Schwartzian transforms)

2001-03-29 Thread David Whipp

 From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
 I'm hoping to have this stage of optimization in perl. Off by 
 default with 
 a normal parse-and-go run (though certainly enableable if you 
 want), on by 
 default with the bytecode compiler.

Don't forget about run-time information: You could mark candidates
for optimisation at compile time then, at run time, if the size
of the array is "big" then spend the time to see if you can
optimise.

I'm not sure how to define "big", but I'm sure that's solvable.


Dave.



Schwartzian transforms

2001-03-28 Thread Dan Sugalski

While all the discussion around Schwartzian transforms is interesting, the 
single ultimate question is:

   "Can perl automatically optimize away function and tie calls inside
a sort function, and under what circumstances?"

It doesn't really matter if the functions inside the sort function are 
idempotent--what matters is whether it's OK for us to go and memoize the 
things (or whatever else we might choose to do)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Schwartzian transforms

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 11:11:20AM -0500, Dan Sugalski wrote:
"Can perl automatically optimize away function and tie calls inside
 a sort function, and under what circumstances?"
Agreed.

 It doesn't really matter if the functions inside the sort function are 
 idempotent--what matters is whether it's OK for us to go and memoize the 
 things (or whatever else we might choose to do)
As far as I can see, this, in essence, gives a few basic cases:
1) The sort function is ill-defined.
2) The sort function is stateless.
3) The sort function is simply internaly stateless.
4) The function is well-defined, but not stateless whatsoever.

In case 1, The sort won't work anyway, so we can ignore this case.

I'm of the opinion that we should consider 3 to be Just Plain Silly and not
worth worring about overmuch.  (These would be functions that increment a
counter every time they are accessed, for example.)

I think that the difference between 43 dosn't matter.  We only have things
in 4 and not 3 that vary in abs(), but not sign.

We're left with 12, and for 1, the sort won't work anyway.

So long as we consider 2 Just Plain Silly, we're OK memonizing.

  -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: Schwartzian transforms

2001-03-28 Thread Dan Sugalski

At 11:22 AM 3/28/2001 -0500, John Porter wrote:
Dan Sugalski wrote:
  It doesn't really matter if the functions inside the sort function are
  idempotent--what matters is whether it's OK for us to go and memoize the
  things (or whatever else we might choose to do)

Exactly, that's what I've been trying to say.
And that's why I propose the :constant/:function/:pure/:stateless
attribute, so that perl only has to trust the programmer to say
which functions can be memoized.

I'm actually considering whether we even need to care what the programmer's 
said. If we can just flat-out say "We may optimize your sort function, and 
we make no guarantees as to the number of times tied data is fetched or 
subs inside the sort sub are called" then life becomes much easier.

Of course, we may not be able to say that, in which case hints of any sort 
are a Good Thing.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Schwartzian transforms

2001-03-28 Thread John Porter

James Mastros wrote:
 I'm of the opinion that we should consider 3 to be Just Plain Silly and not
 worth worring about overmuch.  

AFAICT, you're worrying about everything overmuch.

It suffices, I believe, to put the following contract on the
sort() function:

  If the user-supplied comparison function is not relative-idempotent,
  then the results of sort() are not guaranteed to be sorted.

  If the user-supplied key extraction function is tagged with
  :function/:pure (or whatever), then perl is free to optimize
  the operation of sort() by memoizing the results of calls to
  that function.

-- 
John Porter




Re: Schwartzian transforms

2001-03-28 Thread Bryan C. Warnock

On Wednesday 28 March 2001 11:47, Dan Sugalski wrote:
 At 11:22 AM 3/28/2001 -0500, John Porter wrote:
 Dan Sugalski wrote:
   It doesn't really matter if the functions inside the sort function are
   idempotent--what matters is whether it's OK for us to go and memoize
   the things (or whatever else we might choose to do)
 
 Exactly, that's what I've been trying to say.
 And that's why I propose the :constant/:function/:pure/:stateless
 attribute, so that perl only has to trust the programmer to say
 which functions can be memoized.

 I'm actually considering whether we even need to care what the programmer's
 said. If we can just flat-out say "We may optimize your sort function, and
 we make no guarantees as to the number of times tied data is fetched or
 subs inside the sort sub are called" then life becomes much easier.


But you can't.  A complex sort can currently by simplified, if desired.  To 
invert the behavior (simplification first), you'd still need a way to 
GWBrecomplexify/GWB it, for the folks who need a fetch every time.

 Of course, we may not be able to say that, in which case hints of any sort
 are a Good Thing.

Yes.  One way or t'other.

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: Schwartzian transforms

2001-03-28 Thread John Porter

Dan Sugalski wrote:
...
 subs inside the sort sub are called" then life becomes much easier.

Easier for perl.  Don't we want to make life easier for the programmer?
I mean, in general, it would be nice if there were a way to have
perl memoize for us, rather than have to arrange it ourself.
It could benefit a lot of situations besides sorting.

-- 
John Porter




Re: Schwartzian transforms

2001-03-28 Thread Simon Cozens

On Wed, Mar 28, 2001 at 11:59:19AM -0500, John Porter wrote:
 I mean, in general, it would be nice if there were a way to have
 perl memoize for us, rather than have to arrange it ourself.

Again with the over-specific solutions! What you seem to want is for 
(for instance)

sub foo :memoize  # Colon Rule conformance
{
# THING
}

to automagically memoize the subroutine. That's the specific solution.

The general solution is "allow people to register user-defined behaviour
triggered by certain attributes", hence:

use attribute::sub memoize = sub {
my ($coderef, $argument) = @_;
exists $cache{$coderef}{$argument} ? 
   $cache{$coderef}{$argument} :
   $cache{$coderef}{$argument} = $coderef-($argument);
}

-- 
An ASCII character walks into a bar and orders a double.  "Having a bad
day?" asks the barman.  "Yeah, I have a parity error," replies the ASCII
character.  The barman says, "Yeah, I thought you looked a bit off." 
-- from Skud



Re: Schwartzian transforms

2001-03-28 Thread Bryan C. Warnock

Since I'm supposed to be summarizing this thread for Simon's weekly write-up, 
let me make sure I have the four(?) basic suggestions/stances.

1) There are too many variations/problems/issues to bother having Perl try to 
handle them all.  If folks want an optimized sort, they should continue to 
use the ST, or roll something similar.

2) Perl should create some form of special syntax explicitly for doing an ST 
on data.  (Other than the special syntax of the ST, of course.)

3) Perl should provide a general memoization mechanism, usable outside sort, 
but that could be used to get ST-like behavior from a regular sort routine.

  sort { f'($a) cmp f''($b) | ... | f``($a) cmp f`($b) } @list;  or
  sort { $a-f' cmp $b-f'' | ... | $a-f`` cmp $b-f` } @list;

Each value in list would have the results for f() cached for subsequent 
comparisons within the sort.  This would eliminate the need for the ST.

4) Should should grok a sort as an ST.

  sort { f'($a) cmp f''($b) | ... | f``($a) cmp f`($b) } @list;  or
  sort { $a-f' cmp $b-f'' | ... | $a-f`` cmp $b-f` } @list;

Perl should see this and think aha!

  map { $_-[0] } 
  sort { $a-[1] cmp $b-[2] | ... | $a-[-2] cmp $b-[-1] }
  map { [$_, f'($_), f''($_), ... , f``($_), f`($_)] } @list;

Did I grossly miss anyone's position?

On Wednesday 28 March 2001 15:02, Dan Sugalski wrote:
 At 11:59 AM 3/28/2001 -0500, John Porter wrote:
 Dan Sugalski wrote:
  ...
   subs inside the sort sub are called" then life becomes much easier.
 
 Easier for perl.  Don't we want to make life easier for the programmer?
 I mean, in general, it would be nice if there were a way to have
 perl memoize for us, rather than have to arrange it ourself.
 It could benefit a lot of situations besides sorting.

 I'm not talking about making it easier on perl so much as making it faster.
 Basically to give us the wiggle room to recognize some simple constructs
 like

 foo($a) = bar($b)

 or

 foo($a) cmp bar($b)

 and optimize them to a table build and sort. This would work for plain perl
 data structures as well, as we might potentially be doing a fair amount of
 data conversion through the variable vtable interface. (Not to mention the
 issues of data mangling for proper Unicode sorting support)

   Dan

 --"it's like this"---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: Schwartzian transforms

2001-03-28 Thread Russ Allbery

Dan Sugalski [EMAIL PROTECTED] writes:

 I'm actually considering whether we even need to care what the
 programmer's said. If we can just flat-out say "We may optimize your
 sort function, and we make no guarantees as to the number of times tied
 data is fetched or subs inside the sort sub are called" then life
 becomes much easier.

I am strongly in favor of that approach.  I see no reason to allow for
weird side effects in Perl 6.  (Perl 5 would be a different matter, of
course.)

Not only is it simpler to deal with, it's simpler to *explain*, and that's
important.

-- 
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/



RE: Schwartzian transforms

2001-03-28 Thread David Whipp

 From: Russ Allbery [mailto:[EMAIL PROTECTED]]
 we can just flat-out say "We may optimize your
  sort function"
 
 I am strongly in favor of that approach.  I see no reason to allow for
 weird side effects in Perl 6.

Let me second the motion. "Allow optimisation" should be the default. A
programmer should, however, be able to say

  sort sub :no_memoize { $global++ ; (($a*10+$b)%3)-1 } (1..10);

if they really want to. But make the programmer say "I am doing something
wierd", not the other way round.


Dave.



Re: Schwartzian transforms

2001-03-28 Thread John Porter

Russ Allbery wrote:
 I'd really like to see a concrete example of a sane sorting function which
 cannot be memoized.  (Issues of syntax aside; just caching the result of
 comparing any two pairs of data results in caching data that a sane
 sorting algorithm will never use again.  

Well, we seem to be talking about different things.

I'm talking about memoizing the key extraction function.

Even a *sane* sorter will call get_keys($a) multiple times
for any given value of $a, in general.

But I guess your question still stands:  Why/when would we
ever *not* want to cache the result?  Any user code which
depends on how many times the key extraction function was
called is certainly kludgerific, and should not be condoned! :-)

-- 
John Porter

Give the braindead no head.




What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread Dan Sugalski

At 01:22 PM 3/28/2001 -0800, Russ Allbery wrote:
Dan Sugalski [EMAIL PROTECTED] writes:

  I'm actually considering whether we even need to care what the
  programmer's said. If we can just flat-out say "We may optimize your
  sort function, and we make no guarantees as to the number of times tied
  data is fetched or subs inside the sort sub are called" then life
  becomes much easier.

I am strongly in favor of that approach.  I see no reason to allow for
weird side effects in Perl 6.  (Perl 5 would be a different matter, of
course.)

Well, weird side effects can be rather useful to exploit. Not, mind, that I 
can think of one in the case of sorting, bur that doesn't mean there aren't 
any.

Not only is it simpler to deal with, it's simpler to *explain*, and that's
important.

True enough. This is a small subset of general optimizations. For example, 
this:

   $i = 0;
   foreach (1..1000) {
$i++;
   }

can be easily optimized to:

   $i = 1000;

and most language implementations with any sort of optimizer will do this. 
If $i isn't actually used after that point without another assignment it 
might well be completely optimized away. With perl, though, this does 
potentially unexpected things if $i is tied. Do we still optimize it away? 
Do we only do it if we can tell that $i's not tied? Do we force the 
programmer to explicitly note which variables are potentially tied with the 
"my Dog $spot" syntax and assume that everything else is fair game? Can we 
even do that in the face of runtime requires, dos, or evals? (Or does that 
force a complete reevaluation of the optimized bytecode)

I think I need to go fetch my brain out from behind the bookcase again...

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 04:36:58PM -0500, Dan Sugalski wrote:
 With perl, though, this does 
 potentially unexpected things if $i is tied. Do we still optimize it away? 
 Do we only do it if we can tell that $i's not tied? 
Yep.  And in non-trivial cases, the only way to do that might be for $i to
be :simple.

 Do we force the 
 programmer to explicitly note which variables are potentially tied with the 
 "my Dog $spot" syntax and assume that everything else is fair game? 
I'd rather see that things that will never be tied or otherwise magic be
marked as :simple.  Code always works, and will be faster with a little
effort.

 Can we
 even do that in the face of runtime requires, dos, or evals? (Or does that 
 force a complete reevaluation of the optimized bytecode)
It is a catchable error to remove a behavorial restrictor attribute such as
:simple or :stateless.  So let it be spoken, so let it be done.

This isn't any more preverse then the "you can't assign to constants" rule.

 -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 05:57:30PM -0500, James Mastros wrote:
 [A bunch of stuff]
Oh, and I agree with sombody else on this thread that unless otherwise
stated, the sort should always assume statelessness (and thus the ability to
cache at will).  If it's trivial to see that the sort function isn't
stateless (IE it's a named sub that doesn't have the :stateless attribute
set), then have an optional warning, because you probably don't want to be
using that function, or the function should be marked :stateless.

  -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/