Re: Perl culture, perl readabillity

2001-03-28 Thread Otto Wyss

> - Make readability your main objective. Readability is possibly the
> weakest part of Perl.
> 
> - Keep your eyes on modularity. Modularity is by far the best concept
> where complexity could be hidden.
> 
> - Don't forget usability. This is after all the point why people use
> Perl in the first place.
> 
It seems you are not interested in critics, so lets end this thread.

O. Wyss



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread Hong Zhang

Are we over-optimizing? The Perl is just an interpreter language.
Who really needs this kind of optimization for Perl? Even C does
not provide this feature. Though Pascal/Ada have distinctions
like function/procedure, it does not make them any faster than C.

Just given its ugly name, I hate to see it in the core language.
If people really want to optimize Perl, they can write a native
compiler for Perl with advanced garbage collector, just like
Scheme or Strongtalk compiler?

Hong

- Original Message -
From: "James Mastros" <[EMAIL PROTECTED]>
To: "Dan Sugalski" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, March 28, 2001 3:01 PM
Subject: Re: What can we optimize (was Re: Schwartzian transforms)


> On Wed, Mar 28, 2001 at 05:57:30PM -0500, James Mastros wrote:
> > [A bunch of stuff]
> Oh, and I agree with sombody else on this thread that unless otherwise
> stated, the sort should always assume statelessness (and thus the ability
to
> cache at will).  If it's trivial to see that the sort function isn't
> stateless (IE it's a named sub that doesn't have the :stateless attribute
> set), then have an optional warning, because you probably don't want to be
> using that function, or the function should be marked :stateless.
>
>   -=- James Mastros
> --
> The most beautiful thing we can experience is the mysterious.  It is the
> source of all true art and science.  He to whom this emotion is a
stranger,
> who can no longer pause to wonder and stand wrapt in awe, is as good as
dead.
> -=- Albert Einstein
> AIM: theorbtwo  homepage: http://www.rtweb.net/theorb/




Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 05:57:30PM -0500, James Mastros wrote:
> [A bunch of stuff]
Oh, and I agree with sombody else on this thread that unless otherwise
stated, the sort should always assume statelessness (and thus the ability to
cache at will).  If it's trivial to see that the sort function isn't
stateless (IE it's a named sub that doesn't have the :stateless attribute
set), then have an optional warning, because you probably don't want to be
using that function, or the function should be marked :stateless.

  -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 04:36:58PM -0500, Dan Sugalski wrote:
> With perl, though, this does 
> potentially unexpected things if $i is tied. Do we still optimize it away? 
> Do we only do it if we can tell that $i's not tied? 
Yep.  And in non-trivial cases, the only way to do that might be for $i to
be :simple.

> Do we force the 
> programmer to explicitly note which variables are potentially tied with the 
> "my Dog $spot" syntax and assume that everything else is fair game? 
I'd rather see that things that will never be tied or otherwise magic be
marked as :simple.  Code always works, and will be faster with a little
effort.

> Can we
> even do that in the face of runtime requires, dos, or evals? (Or does that 
> force a complete reevaluation of the optimized bytecode)
It is a catchable error to remove a behavorial restrictor attribute such as
:simple or :stateless.  So let it be spoken, so let it be done.

This isn't any more preverse then the "you can't assign to constants" rule.

 -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



What can we optimize (was Re: Schwartzian transforms)

2001-03-28 Thread Dan Sugalski

At 01:22 PM 3/28/2001 -0800, Russ Allbery wrote:
>Dan Sugalski <[EMAIL PROTECTED]> writes:
>
> > I'm actually considering whether we even need to care what the
> > programmer's said. If we can just flat-out say "We may optimize your
> > sort function, and we make no guarantees as to the number of times tied
> > data is fetched or subs inside the sort sub are called" then life
> > becomes much easier.
>
>I am strongly in favor of that approach.  I see no reason to allow for
>weird side effects in Perl 6.  (Perl 5 would be a different matter, of
>course.)

Well, weird side effects can be rather useful to exploit. Not, mind, that I 
can think of one in the case of sorting, bur that doesn't mean there aren't 
any.

>Not only is it simpler to deal with, it's simpler to *explain*, and that's
>important.

True enough. This is a small subset of general optimizations. For example, 
this:

   $i = 0;
   foreach (1..1000) {
$i++;
   }

can be easily optimized to:

   $i = 1000;

and most language implementations with any sort of optimizer will do this. 
If $i isn't actually used after that point without another assignment it 
might well be completely optimized away. With perl, though, this does 
potentially unexpected things if $i is tied. Do we still optimize it away? 
Do we only do it if we can tell that $i's not tied? Do we force the 
programmer to explicitly note which variables are potentially tied with the 
"my Dog $spot" syntax and assume that everything else is fair game? Can we 
even do that in the face of runtime requires, dos, or evals? (Or does that 
force a complete reevaluation of the optimized bytecode)

I think I need to go fetch my brain out from behind the bookcase again...

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Schwartzian transforms

2001-03-28 Thread John Porter

Russ Allbery wrote:
> I'd really like to see a concrete example of a sane sorting function which
> cannot be memoized.  (Issues of syntax aside; just caching the result of
> comparing any two pairs of data results in caching data that a sane
> sorting algorithm will never use again.  

Well, we seem to be talking about different things.

I'm talking about memoizing the key extraction function.

Even a *sane* sorter will call get_keys($a) multiple times
for any given value of $a, in general.

But I guess your question still stands:  Why/when would we
ever *not* want to cache the result?  Any user code which
depends on how many times the key extraction function was
called is certainly kludgerific, and should not be condoned! :-)

-- 
John Porter

Give the braindead no head.




RE: Schwartzian transforms

2001-03-28 Thread David Whipp

> From: Russ Allbery [mailto:[EMAIL PROTECTED]]
> >we can just flat-out say "We may optimize your
> > sort function"
> 
> I am strongly in favor of that approach.  I see no reason to allow for
> weird side effects in Perl 6.

Let me second the motion. "Allow optimisation" should be the default. A
programmer should, however, be able to say

  sort sub :no_memoize { $global++ ; (($a*10+$b)%3)-1 } (1..10);

if they really want to. But make the programmer say "I am doing something
wierd", not the other way round.


Dave.



Re: Schwartzian transforms

2001-03-28 Thread Russ Allbery

John Porter <[EMAIL PROTECTED]> writes:

>   If the user-supplied key extraction function is tagged with
>   :function/:pure (or whatever), then perl is free to optimize
>   the operation of sort() by memoizing the results of calls to
>   that function.

I'd really like to see a concrete example of a sane sorting function which
cannot be memoized.  (Issues of syntax aside; just caching the result of
comparing any two pairs of data results in caching data that a sane
sorting algorithm will never use again.  But provided that there's a way
to separate things out so that Perl *can* usefully memoize, I can't think
of any realistic sort function where this would be a problem.)

-- 
Russ Allbery ([EMAIL PROTECTED]) 



Re: Schwartzian transforms

2001-03-28 Thread Russ Allbery

Dan Sugalski <[EMAIL PROTECTED]> writes:

> I'm actually considering whether we even need to care what the
> programmer's said. If we can just flat-out say "We may optimize your
> sort function, and we make no guarantees as to the number of times tied
> data is fetched or subs inside the sort sub are called" then life
> becomes much easier.

I am strongly in favor of that approach.  I see no reason to allow for
weird side effects in Perl 6.  (Perl 5 would be a different matter, of
course.)

Not only is it simpler to deal with, it's simpler to *explain*, and that's
important.

-- 
Russ Allbery ([EMAIL PROTECTED]) 



Re: Schwartzian transforms

2001-03-28 Thread Bryan C. Warnock

Since I'm supposed to be summarizing this thread for Simon's weekly write-up, 
let me make sure I have the four(?) basic suggestions/stances.

1) There are too many variations/problems/issues to bother having Perl try to 
handle them all.  If folks want an optimized sort, they should continue to 
use the ST, or roll something similar.

2) Perl should create some form of special syntax explicitly for doing an ST 
on data.  (Other than the special syntax of the ST, of course.)

3) Perl should provide a general memoization mechanism, usable outside sort, 
but that could be used to get ST-like behavior from a regular sort routine.

  sort { f'($a)  f''($b) &| ... &| f``($a)  f`($b) } @list;  or
  sort { $a->f'  $b->f'' &| ... &| $a->f``  $b->f` } @list;

Each value in list would have the results for f() cached for subsequent 
comparisons within the sort.  This would eliminate the need for the ST.

4) Should should grok a sort as an ST.

  sort { f'($a)  f''($b) &| ... &| f``($a)  f`($b) } @list;  or
  sort { $a->f'  $b->f'' &| ... &| $a->f``  $b->f` } @list;

Perl should see this and think aha!

  map { $_->[0] } 
  sort { $a->[1]  $b->[2] &| ... &| $a->[-2]  $b->[-1] }
  map { [$_, f'($_), f''($_), ... , f``($_), f`($_)] } @list;

Did I grossly miss anyone's position?

On Wednesday 28 March 2001 15:02, Dan Sugalski wrote:
> At 11:59 AM 3/28/2001 -0500, John Porter wrote:
> >Dan Sugalski wrote:
> > >...
> > > subs inside the sort sub are called" then life becomes much easier.
> >
> >Easier for perl.  Don't we want to make life easier for the programmer?
> >I mean, in general, it would be nice if there were a way to have
> >perl memoize for us, rather than have to arrange it ourself.
> >It could benefit a lot of situations besides sorting.
>
> I'm not talking about making it easier on perl so much as making it faster.
> Basically to give us the wiggle room to recognize some simple constructs
> like
>
> foo($a) <=> bar($b)
>
> or
>
> foo($a) cmp bar($b)
>
> and optimize them to a table build and sort. This would work for plain perl
> data structures as well, as we might potentially be doing a fair amount of
> data conversion through the variable vtable interface. (Not to mention the
> issues of data mangling for proper Unicode sorting support)
>
>   Dan
>
> --"it's like this"---
> Dan Sugalski  even samurai
> [EMAIL PROTECTED] have teddy bears and even
>   teddy bears get drunk

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: Schwartzian transforms

2001-03-28 Thread Dan Sugalski

At 11:59 AM 3/28/2001 -0500, John Porter wrote:
>Dan Sugalski wrote:
> >...
> > subs inside the sort sub are called" then life becomes much easier.
>
>Easier for perl.  Don't we want to make life easier for the programmer?
>I mean, in general, it would be nice if there were a way to have
>perl memoize for us, rather than have to arrange it ourself.
>It could benefit a lot of situations besides sorting.

I'm not talking about making it easier on perl so much as making it faster. 
Basically to give us the wiggle room to recognize some simple constructs like

foo($a) <=> bar($b)

or

foo($a) cmp bar($b)

and optimize them to a table build and sort. This would work for plain perl 
data structures as well, as we might potentially be doing a fair amount of 
data conversion through the variable vtable interface. (Not to mention the 
issues of data mangling for proper Unicode sorting support)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Schwartzian transforms

2001-03-28 Thread Simon Cozens

On Wed, Mar 28, 2001 at 11:59:19AM -0500, John Porter wrote:
> I mean, in general, it would be nice if there were a way to have
> perl memoize for us, rather than have to arrange it ourself.

Again with the over-specific solutions! What you seem to want is for 
(for instance)

sub foo :memoize  # Colon Rule conformance
{
# THING
}

to automagically memoize the subroutine. That's the specific solution.

The general solution is "allow people to register user-defined behaviour
triggered by certain attributes", hence:

use attribute::sub memoize => sub {
my ($coderef, $argument) = @_;
exists $cache{$coderef}{$argument} ? 
   $cache{$coderef}{$argument} :
   $cache{$coderef}{$argument} = $coderef->($argument);
}

-- 
An ASCII character walks into a bar and orders a double.  "Having a bad
day?" asks the barman.  "Yeah, I have a parity error," replies the ASCII
character.  The barman says, "Yeah, I thought you looked a bit off." 
-- from Skud



Re: Schwartzian transforms

2001-03-28 Thread John Porter

Dan Sugalski wrote:
>...
> subs inside the sort sub are called" then life becomes much easier.

Easier for perl.  Don't we want to make life easier for the programmer?
I mean, in general, it would be nice if there were a way to have
perl memoize for us, rather than have to arrange it ourself.
It could benefit a lot of situations besides sorting.

-- 
John Porter




Re: Schwartzian transforms

2001-03-28 Thread Bryan C. Warnock

On Wednesday 28 March 2001 11:47, Dan Sugalski wrote:
> At 11:22 AM 3/28/2001 -0500, John Porter wrote:
> >Dan Sugalski wrote:
> > > It doesn't really matter if the functions inside the sort function are
> > > idempotent--what matters is whether it's OK for us to go and memoize
> > > the things (or whatever else we might choose to do)
> >
> >Exactly, that's what I've been trying to say.
> >And that's why I propose the :constant/:function/:pure/:stateless
> >attribute, so that perl only has to trust the programmer to say
> >which functions can be memoized.
>
> I'm actually considering whether we even need to care what the programmer's
> said. If we can just flat-out say "We may optimize your sort function, and
> we make no guarantees as to the number of times tied data is fetched or
> subs inside the sort sub are called" then life becomes much easier.
>

But you can't.  A complex sort can currently by simplified, if desired.  To 
invert the behavior (simplification first), you'd still need a way to 
recomplexify it, for the folks who need a fetch every time.

> Of course, we may not be able to say that, in which case hints of any sort
> are a Good Thing.

Yes.  One way or t'other.

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: Schwartzian transforms

2001-03-28 Thread John Porter

James Mastros wrote:
> I'm of the opinion that we should consider 3 to be Just Plain Silly and not
> worth worring about overmuch.  

AFAICT, you're worrying about everything overmuch.

It suffices, I believe, to put the following contract on the
sort() function:

  If the user-supplied comparison function is not relative-idempotent,
  then the results of sort() are not guaranteed to be sorted.

  If the user-supplied key extraction function is tagged with
  :function/:pure (or whatever), then perl is free to optimize
  the operation of sort() by memoizing the results of calls to
  that function.

-- 
John Porter




Re: Schwartzian transforms

2001-03-28 Thread Dan Sugalski

At 11:22 AM 3/28/2001 -0500, John Porter wrote:
>Dan Sugalski wrote:
> > It doesn't really matter if the functions inside the sort function are
> > idempotent--what matters is whether it's OK for us to go and memoize the
> > things (or whatever else we might choose to do)
>
>Exactly, that's what I've been trying to say.
>And that's why I propose the :constant/:function/:pure/:stateless
>attribute, so that perl only has to trust the programmer to say
>which functions can be memoized.

I'm actually considering whether we even need to care what the programmer's 
said. If we can just flat-out say "We may optimize your sort function, and 
we make no guarantees as to the number of times tied data is fetched or 
subs inside the sort sub are called" then life becomes much easier.

Of course, we may not be able to say that, in which case hints of any sort 
are a Good Thing.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Schwartzian transforms

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 11:11:20AM -0500, Dan Sugalski wrote:
>"Can perl automatically optimize away function and tie calls inside
> a sort function, and under what circumstances?"
Agreed.

> It doesn't really matter if the functions inside the sort function are 
> idempotent--what matters is whether it's OK for us to go and memoize the 
> things (or whatever else we might choose to do)
As far as I can see, this, in essence, gives a few basic cases:
1) The sort function is ill-defined.
2) The sort function is stateless.
3) The sort function is simply internaly stateless.
4) The function is well-defined, but not stateless whatsoever.

In case 1, The sort won't work anyway, so we can ignore this case.

I'm of the opinion that we should consider 3 to be Just Plain Silly and not
worth worring about overmuch.  (These would be functions that increment a
counter every time they are accessed, for example.)

I think that the difference between 4&3 dosn't matter.  We only have things
in 4 and not 3 that vary in abs(), but not sign.

We're left with 1&2, and for 1, the sort won't work anyway.

So long as we consider 2 Just Plain Silly, we're OK memonizing.

  -=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: Schwartzian transforms

2001-03-28 Thread John Porter

Dan Sugalski wrote:
> It doesn't really matter if the functions inside the sort function are 
> idempotent--what matters is whether it's OK for us to go and memoize the 
> things (or whatever else we might choose to do)

Exactly, that's what I've been trying to say.
And that's why I propose the :constant/:function/:pure/:stateless
attribute, so that perl only has to trust the programmer to say
which functions can be memoized.

-- 
John Porter

Give the braindead no head.




Schwartzian transforms

2001-03-28 Thread Dan Sugalski

While all the discussion around Schwartzian transforms is interesting, the 
single ultimate question is:

   "Can perl automatically optimize away function and tie calls inside
a sort function, and under what circumstances?"

It doesn't really matter if the functions inside the sort function are 
idempotent--what matters is whether it's OK for us to go and memoize the 
things (or whatever else we might choose to do)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Schwartzian Transform

2001-03-28 Thread John Porter

James Mastros wrote:
> This runs afoul of the halting problem real quick.  

That would be taking the entirely wrong approach.
All you'd need to do is check the return values from multiple
calls with the same arguments.  As long as they appear
idempotent, that's all you care about.


> My intuition says that it will either be sorted, for a simple def of sorted
> ($s[i] <= $s[i+1]), or will take infinite time.  (The "take infinite time"
> is, I assume, the one that made things dump core.)

Well, your intuition is not serving you well.  You're mistaken
on both counts.


> > Hm. We could call that "relative idempotency", I suppose.
> 
> I'd go with "transitive", since this is a property of the comparator,
> not the  extractor.  

It's relative, because while the return values doen't need to be
exactly the same each time, they only need to have the same 
signum(difference()).


> If you seperate the comparator and the extractor(s), then the comparator
> must be transitive, and the extractors must be internaly stateless.

Must be?  I think not.


> > Give the braindead no head.
> 
> You might want to change that to "heed".

I definitely do not.  The whole reason I put it in my .sig
is because I like the way it sounds.
If you find it offensive, well, I'll have a new one soon.

-- 
John Porter




Re: Schwartzian Transform

2001-03-28 Thread James Mastros

On Wed, Mar 28, 2001 at 09:38:59AM -0500, John Porter wrote:
> Mark-Jason Dominus wrote:
> > I have to agree with whoever followed up that this is a really dumb idea.  
> Yahbut...  (See last paragraph, below).
OK, I'm agreeing with MJD on this one, and it was my idea.  There is no easy
way to check for statelessness in all cases.

In some cases you can -- sin($a) <=> cos($b) is obviously stateless (if $a
and $b aren't magic), because it is composed only of stateless primitives.

This runs afoul of the halting problem real quick.  Or so I'm told, and I
belive it.

> If the comparison (or key extraction) function is not idempotent,
> that is a much worse situation than simply one of degraded 
> performance.  It means the result of sort() won't be (or at least
> will not be guaranteed to be) sorted!
My intuition says that it will either be sorted, for a simple def of sorted
($s[i] <= $s[i+1]), or will take infinite time.  (The "take infinite time"
is, I assume, the one that made things dump core.)

> > if my_compare(a,b) < 0, and
> >my_compare(b,c) < 0, then it should also be the case that
> >my_compare(a,c) < 0
> Hm. We could call that "relative idempotency", I suppose.
I'd go with "transitive", since this is a property of the comparator, not the 
extractor.  

If you seperate the comparator and the extractor(s), then the comparator
must be transitive, and the extractors must be internaly stateless.

> But, aaui, it is not a question of the comparison function
> being idempotent, but the key extraction function being so.
aaui?

Hm.  Let's define some terms like "idempotency".  These are also my
(current) ideas for sub attributes.

Stateless: The function neither depends on state, nor modifies it.  This makes
it a pure (IE mathematical) function.  f(a) == f(a), and there is no
side-effect.  sin() is stateless.
This means the function is memonizeable.

Internaly stateless: f(a) == f(a), but there might be sideefects that do not
effect the return value.  printf is internaly stateless (ignoring linenoize
vars).
This is the essencial property of key extractors.  Note that all stateless
functions are internaly stateless.

Transitive: 
> > if my_compare(a,b) < 0, and
> >my_compare(b,c) < 0, then it should also be the case that
> >my_compare(a,c) < 0
I can't define it better then that.  (Though there's more to it then that).
Note that only the sign of the answer is gaurnteed, so it doesn't even have
to be internaly stateless -- but it probably doesn't make sense for it not
to be.


> Give the braindead no head.
You might want to change that to "heed".

-=- James Mastros
-- 
The most beautiful thing we can experience is the mysterious.  It is the
source of all true art and science.  He to whom this emotion is a stranger,
who can no longer pause to wonder and stand wrapt in awe, is as good as dead.
-=- Albert Einstein
AIM: theorbtwo   homepage: http://www.rtweb.net/theorb/



Re: Schwartzian Transform

2001-03-28 Thread Graham Barr

On Wed, Mar 28, 2001 at 09:13:01AM -0500, Mark-Jason Dominus wrote:
> 
> > So you can say
> > 
> >   use Memoize;
> >   # ...
> >   memoize 'f';
> >   @sorted = sort { my_compare(f($a),f($b)) } @unsorted
> > 
> > to get a lot of the effect of the S word.
> 
> Yes, and of course the inline version of this technique is also
> common:
> 
>@sorted = sort { my $ac = $cache{$a} ||= f($a);
> my $bc = $cache{$b} ||= f($b);
> my_compare($ac,$bc);
>   } @unsorted;
> 
> Joseph Hall calls this the 'Orcish Maneuver'.

That does have the (slight) disadvantage that if f(x) returns
a false value then f(x) may be called multiple times. Of course
this can be fixed by using exists. It also has the overhead of
computing the hash value of a and b on every itteration

Personally I have found the fastest sort tends to be

  my @f = map { f($_) } @unsorted;
  @sorted = @unsorted[ sort { $f[$a] cmp $f[$b] } 0..$#f ];

Graham.



Re: Schwartzian Transform

2001-03-28 Thread John Porter

Mark-Jason Dominus wrote:
> 
> > I'd think /perl/ should complain if your comparison function isn't
> > idempotent (if warnings on, of course).  If nothing else, it's probably an
> > indicator that you should be using that schwartz thang.
> 
> I have to agree with whoever followed up that this is a really dumb idea.  

Yahbut...  (See last paragraph, below).


> I explained how the /o in
> /$foo/o
> represents a promise to Perl that $foo will never change, so Perl can
> skip the operation of checking to see if it has changed every time the
> match is performed.  Then there was a question from someone in the
> audience, asking if Perl would emit a warning if $foo changed.

That is not really an analogous situation.
If the comparison (or key extraction) function is not idempotent,
that is a much worse situation than simply one of degraded 
performance.  It means the result of sort() won't be (or at least
will not be guaranteed to be) sorted!


> Idempotency is not the important thing here.  The *important* property
> that the comparator needs, and the one that bad comparators usually
> lack is 
> if my_compare(a,b) < 0, and
>my_compare(b,c) < 0, then it should also be the case that
>my_compare(a,c) < 0
> 
> for all keys a, b, and c.

Hm. We could call that "relative idempotency", I suppose.


> An optimal sort function will not call the comparator if it
> already knows what the result should be!

But, aaui, it is not a question of the comparison function
being idempotent, but the key extraction function being so.

MJ's commentary on the relative idempotency of the comparison
function is valid.  But still, the real issues are the
idempotency and constness (pureness) of the key extraction
function.  Unless, that is, the results of the key extraction
function are memoized.  Given that they're *not* memoized,
non-idempotency of the function could be a problem.
However, as I said before, I think it is not worth the cost
(at least by default) for perl to assert idempotency.
And automatic memoizing could be turned on for any function
tagged with :constant (:function? :pure?).

-- 
John Porter

Give the braindead no head.




Re: Schwartzian Transform

2001-03-28 Thread Mark-Jason Dominus


> So you can say
> 
>   use Memoize;
>   # ...
>   memoize 'f';
>   @sorted = sort { my_compare(f($a),f($b)) } @unsorted
> 
> to get a lot of the effect of the S word.

Yes, and of course the inline version of this technique is also
common:

   @sorted = sort { my $ac = $cache{$a} ||= f($a);
my $bc = $cache{$b} ||= f($b);
my_compare($ac,$bc);
  } @unsorted;

Joseph Hall calls this the 'Orcish Maneuver'.

However (I don't know who suggested this, but:)

> > > > >I'd think /perl/ should complain if your comparison function isn't
> > > > >idempotent (if warnings on, of course).  If nothing else, it's probably an
> > > > >indicator that you should be using that schwartz thang.

I have to agree with whoever followed up that this is a really dumb
idea.  It reminds me of the time I was teaching the regex class at
TPC3, and I explained how the /o in

/$foo/o

represents a promise to Perl that $foo will never change, so Perl can
skip the operation of checking to see if it has changed every time the
match is performed.  Then there was a question from someone in the
audience, asking if Perl would emit a warning if $foo changed.

On the other side of the argument, however, I should mention that I've
planned for a long time to write a Sort::Test module which *would*
check to make sure the comparator function behaved properly, and would
report problems.   When you use the module, it would make all your
sorts run really slowly, but you would get a warning if your
comparator was bad. 

Idempotency is not the important thing here.  The *important* property
that the comparator needs, and the one that bad comparators usually
lack is 
if my_compare(a,b) < 0, and
   my_compare(b,c) < 0, then it should also be the case that
   my_compare(a,c) < 0

for all keys a, b, and c.

Sort::Test would run a quadratic sort such as a bubble sort, and make
sure that this essential condition held true.  Note in particular that
if the comparator has the form { my_compare(f(a),f(b)) }, then it does
not matter if f() is idempotent; what really matters is that
my_compare should have the property above.

I had also planned to have optional checks:

use Sort::Test 'self';

(Make sure that my_compare(a,a) == 0 for all a)

use Sort::Test 'twice';

(Make sure that my_compare(a,b) == my_compare(a,b) for all a,b)

This last is essentially the idempotency restriction again.  The
reason I've never implemented this module is that in perl 5, sort()
cannot be overridden, so the usefulness seemed low; you would have to
rewrite your source code to use it.  I hope this limitation is fixed
in perl 6, because it would be a cool hack.

Finally, another argument in the opposite direction yet again.  It has
always seemed to me that this 'inconsistent sort comparator' thing is
a tempest in a teapot.  In the past it has gotten a lot of attention
because some system libraries have a qsort() function that dumps core
if the comparator is inconsistent.  

To me, this obviously indicates a defective implementation of
qsort().  If the sort function dumps core or otherwise detects an
inconsistent comparator, it is obviously functioning suboptimally.  An
optimal sort will not notice that the comparator is inconsistent,
because the only you can find out that the comparator is returning
inconsistent results is if you call it in a situation where you
already know what the result should be, and it returns a different
result.  An optimal sort function will not call the comparator if it
already knows what the result should be!

For example, consider the property from above:
if my_compare(a,b) < 0, and
   my_compare(b,c) < 0, then
   my_compare(a,c) < 0.

If the qsort() already knows that a


Re: Schwartzian Transform

2001-03-28 Thread Ariel Scolnicov

Dan Sugalski <[EMAIL PROTECTED]> writes:

> At 09:26 AM 3/27/2001 -0800, Peter Buckingham wrote:
> >Dan Sugalski wrote:
> > >
> > > At 09:50 PM 3/26/2001 -0500, James Mastros wrote:
> >
> >[..]
> >
> > > >I'd think /perl/ should complain if your comparison function isn't
> > > >idempotent (if warnings on, of course).  If nothing else, it's probably an
> > > >indicator that you should be using that schwartz thang.
> > >
> > > If you figure out how, tell me. I'd like to make arrangements to be there
> > > for your Nobel acceptance speech. :) (This is, alas, a variant on the
> > > halting problem, unless you're proposing perl do the memoizing *and* still
> > > call the functions, and complain if one doesn't match the other)
> >
> >not wanting to collect my nobel prize just yet, but...
> >
> >could you not try a simple test (not guaranteed to be 100% accurate though),
> >by copying the first data element and apply it twice and then check to see
> >that the result of applying it once is the same as applying it twice.
> 
> Feels a little too magic to me, and awfully fragile. I'm not
> comfortable doing that, though arguments to the contrary are
> welcome.

There are 3 separate notions of idempotency at work here; I'm making
up their names.

1. THE MATHEMATICAL NOTION:

   f(f(x)) == f(x) for all x

   We don't care about that *at all* when we're sorting (we're sorting
   the x's in order of their f(x)'s, with not an f(f(x)) in sight!).
   I'm not going to talk about this idempotency, I just mention it
   because of the incorrect analogy from linear algebra that has been
   floating around.

2. "STRONG" IDEMPOTENCY: (PURE FUNCTIONS)

   Saying C<$a=f(x); $a=f(x)> leaves the program in the same state as
   just C<$a=f(x)>.  This is achieved by avoiding assignment, not
   using any routines with state (I/O, PRNGs, ...) and the like.

3. "WEAK" IDEMPOTENCY: ("PURE RESULT"?)

   After C<$a=f(x); more_stuff(); $b=f(x)> we have C<$a==$b>.  But the
   state of the program may have been changed.

   Examples include any f() that does caching (or I/O for other
   purposes), logging, or profiling.

   If f() is memoized then it is weakly idempotent but not strongly
   idempotent, since calling f(x) for the first time may change the
   memoized data.

   If f() is strongly idempotent then it is of course weakly
   idempotent also.


(3) is enough for the sorting examples, I think.  Unfortunately it's
also a lot harder to test syntactically than is (2).

So I'd like to suggest a cop-out, similar to the "let a module do the
work" argument.  MJD has a module Memoize, which is very easy to use.
You can only memoize a pure function (in one of the above 2 senses;
you can always memoize (2), and you can do (3) if the semantics of it
are "good enough" for what you want to do).

So you can say

  use Memoize;
  # ...
  memoize 'f';
  @sorted = sort { my_compare(f($a),f($b)) } @unsorted

to get a lot of the effect of the S word.


Yes, I know it's not the transform (never better and often worse),
since Memoize was designed with other things in mind.  But it's
probably a "good enough" solution, and has very low brain overhead.


-- 
Ariel Scolnicov|"GCAAGAATTGAACTGTAG"| [EMAIL PROTECTED]
Compugen Ltd.  |Tel: +972-2-5713025 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.|Tel: +972-3-7658117 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555http://3w.compugen.co.il/~ariels