Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread Aristotle Pagaltzis
[Cc to perl6-language as I think this is of interest]

[Oh, and please read the entire thing before responding to any
one particular point. There are a number of arguments flowing
from one another here. (I am guilty of being too quick with the
Reply button myself, hence this friendly reminder.)]

* Eric Wilhelm [EMAIL PROTECTED] [2008-02-24 02:05]:
> # from Aristotle Pagaltzis
> # on Saturday 23 February 2008 14:48:
> > I find the basic File::Fu interface interesting… but operator
> > overloading always makes me just ever so slightly queasy, and
> > this example is no exception.
>
> Is that because of the syntax, the concepts, or the fact that
> perl5 doesn't quite get it right?

It’s a matter of readability. It’s the old argument about, if
not to say against, operator overloading: you’re giving `*` a
completely arbitrary meaning that has nothing in common in any
way with what `*` means in contexts that the reader of the code
had previously encountered.

> Does it help to know that error messages will be plentiful and
> informative?  (Not to mention the aforementioned disambiguation
> between mutations and stringifications.)

It has nothing to do with any of these factors.

> > I get the desire for syntactic sugar, I really do… but looking
> > at this, I think the sane way to accommodate that desire is to
> > attach overloaded semantics to a specially denoted scope
> > rather than hang them off the type of an object.
>
> I can't picture that without an example.

Something like

path { $app_base_dir / $conf_dir / $foo_cfg . $cfg_ext }

where the operators in that scope are overloaded irrespective of
the types of the variables (be they plain scalar strings,
instances of a certain class, or whatever).
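To make the idea concrete, here is a minimal sketch in Python (chosen only as a neutral illustration; it is not Perl and not anything File::Fu provides). The helper name `path` and the use of `+` in place of Perl's `.` are assumptions made for the example. The point it tries to show is that the meaning of `/` comes from the enclosing construct, not from the operand types:

```python
import ast

def path(expr, **names):
    """Evaluate expr with '/' meaning "join path segments" and '+' meaning
    "append text", whatever the operand types are (hypothetical helper)."""
    def ev(node):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div):
            # Inside this scope '/' always joins two path segments.
            return ev(node.left).rstrip("/") + "/" + ev(node.right).lstrip("/")
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            # '+' stands in for Perl's '.' (plain concatenation).
            return ev(node.left) + ev(node.right)
        if isinstance(node, ast.Name):
            # Operands are stringified first, so their types never matter.
            return str(names[node.id])
        if isinstance(node, ast.Constant):
            return str(node.value)
        raise SyntaxError("unsupported construct in path scope")
    return ev(ast.parse(expr, mode="eval").body)

print(path("app_base_dir / conf_dir / foo_cfg + cfg_ext",
           app_base_dir="/opt/app", conf_dir="etc",
           foo_cfg="foo", cfg_ext=".cfg"))
# prints /opt/app/etc/foo.cfg
```

A number passed in behaves the same way: `path("base / n", base="logs", n=42)` gives `logs/42`, because the scope, not the value, decides what `/` does.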

Note that I’m not proposing this as something for File::Fu to
implement. It would be rather difficult, if at all possible, to
provide such an interface in Perl 5. You need macros or access to
the grammar or something like that in order to implement this at
all. Although I think that even if you have those, you wouldn’t
want to use them directly, but rather as a substrate to implement
scope-attached operator overloading as an abstraction over them.

But I think it’s desirable to use this abstraction instead of
using grammar modifications or macros directly, since it has vastly
more limited power than the former and still much less power than
the latter. It should therefore be easier both in use by the
programmer who designs the overloading scope and in readability
for the maintenance programmer who reads code that uses overload
scopes.

It would particularly help the latter, of course, because the
code’s behaviour does not vary based on the types that happen to
pass through; the source code is explicit and direct about its
meaning.

> I suspect though that having the object carry the semantics
> around with it is still going to be preferred.

There are cases where it would be.

When the object is a mathematical abstraction in some broad
sense, e.g. it’s a complex number class, or it implements some
kind of container such as a set, then being able to overload
operators based on the type of that object would be useful.

But note that in all of these examples, it is very much
self-evident what the meaning of an overloaded `+` would be: that
meaning comes from the problem domain – a problem domain that has
the rare property of having concepts such as operators and
operands.

When you leave the broader domain of mathematical and “para-”
mathematical abstractions behind and start to define things like
division on arbitrary object types that model aspects of domains
which have nothing even resembling such concepts, you’re rapidly
moving into the territory of obfuscation.

A lot of C++ programmers could sing a song about that.

However, I think the way that Java reacted to this (“only the
language designer gets to overload operators!!”) is completely
wrong. I agree fully with the underlying desire you express:

> The essential motivation is that if I can't make this
> interface work, I'm just going to slap strings together and be
> done with it.  The converse is that if I can make this
> interface work then cross-platform pathname compatibility
> becomes far less tedious.

Absolutely it is very, very useful to be able to define syntactic
sugar that makes it as easy and pleasant to do the right thing
(manipulate pathnames as pathnames) as it is to do the wrong
thing (use string operations to deal with pathnames). That is
precisely why I said that I do get why you’d want to overload
operators.

And this contradiction – that being able to declare sugar is
good, but the way that languages have permitted that so far leads
to insanity – is what sent me thinking along the lines that there
has to be some way to make overloading sane. And we all know that
all is fair if you predeclare. And that led me to the flash of
inspiration: why not make overloading a property of the source
(lexical, early-bound) rather than of the values (temporal, late-
bound)? And what 

Re: Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread Andy Armstrong

On 24 Feb 2008, at 15:00, Aristotle Pagaltzis wrote:

> Something like
>
>    path { $app_base_dir / $conf_dir / $foo_cfg . $cfg_ext }



I've wanted this often. I've also wanted a clean way to lexically  
supply a default target object. For example with HTML::Tiny you often  
write


 my $h = HTML::Tiny->new();
 $h->body($h->head($h->title('FooPage')), $h->body(...));

I'd love to be able to drop the '$h->' everywhere. Like this:

 $h->body( head( title( 'FooPage' ) ), body( ... ) );

I guess that would/could be a related mechanism.

--
Andy Armstrong, Hexten






Re: Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread Luke Palmer
On Sun, Feb 24, 2008 at 3:00 PM, Aristotle Pagaltzis [EMAIL PROTECTED] wrote:
>  Something like
>
>     path { $app_base_dir / $conf_dir / $foo_cfg . $cfg_ext }
>
>  where the operators in that scope are overloaded irrespective of
>  the types of the variables (be they plain scalar strings,
>  instances of a certain class, or whatever).

This is excellent.  I've long supported the one symbol-one meaning
idea.  There are some trade-offs, though, which I'll describe.  I'll
use Haskell as my object language, since it does this sort of operator
defining rather than operator overloading (only for infix binary
operators though, not full syntactic constructs).

Recently I implemented a Vector class in Haskell.  Vectors have
addition and multiplication, of a sort, so it makes sense to use the +
and * operators.   In Haskell overloading comes in type classes, where
the operators you overload need to have specific types and you also
have an implicit contract to obey some laws about them when you
overload.  In the context of Perl, types don't mean squat, but the
implicit laws are still important, so this argument still applies.
The Num class for these operators looks like this:

  class Num a where
    (+) :: a -> a -> a   -- take 2 arguments of type a and return 1 of type a
    (*) :: a -> a -> a
    ...

Which means that if I wanted to overload (+) to work on vectors, my
(+) would have to have type Vector -> Vector -> Vector.  That's fine,
it is that type.  And it's associative and commutative as the class
expects.

The trouble is with (*).  There are three types of multiplication on
vectors, with these types:

    Double -> Vector -> Vector
    Vector -> Double -> Vector
    Vector -> Vector -> Double  -- inner product

None of which is Vector -> Vector -> Vector as Num is expecting.

Now, not letting me overload (*) was a good idea on Num's part.
Normally if you have some v in Num, you can say (v*v) + v and it will
be legal.  But if v were a vector, then this wouldn't work, you'd end
up trying to add a scalar and a vector.  It's very simple: Vectors are
not Nums in the sense that the class requires.

The way I solved this was to create my own class with special vector
operations, so the user of the module had to differentiate between
vector operations and scalar operations.

  class Vector v where
    (^+^) :: v -> v -> v        -- vector addition
    (*^)  :: Double -> v -> v   -- scalar times vector
    (^*^) :: v -> v -> Double   -- inner product

  x ^* y  = y *^ x
  x ^-^ y = x ^+^ ((-1) *^ y)

Which could be construed as annoying.

This problem could have been ameliorated in another way, namely to
have designed Num differently.  If Num separated the notions of
addition and multiplication, or even had been designed as a Vector
space in the first place, then I could have used + and * like I wanted
to.  But it wasn't, so I couldn't.  More below.

>  I hope it's obvious how such a thing would be implemented. Now,
>  if you used type-bound overloading, then the following two
>  expressions cannot yield the same result:
>
>     ( 2 / 3 ) * $x
>     2 * $x / 3
>
>  But if overloading was scope-bound, they would!

And here is why you need ad-hoc overloading in addition to scope-based
overloading.  There are times when I mean 2/3 < 4 to mean the symbolic
data structure representing that condition, and there are other times
when I just want to check whether 2/3 is less than 4!  Here's a
contrived example:

  my $expr = 1;
  my $count = 1;
  while ($count < 10) {
      print "$count $expr\n";
      $expr = $expr + $expr;
      $count = $count + 1;
  }

With the intention of printing something like:

  1 1
  2 1 + 1
  3 (1 + 1) + (1 + 1)
  ...

Which is, of course, broken.  I would have to do something crazy with scopes:

  while ($count < 10) {
      print "$count $expr\n";
      $expr = do { use Math::Symbolic; $expr + $expr };
      $count = $count + 1;
  }

Which could also be construed as annoying.

However, here $count + 1 and $expr + $expr are the same +.  They both
mean add.  Why should I have to play with scopes for this?  Contrast
this with Java, where 3 + 4 and "hello " + "world" are different +s
(unless you're used to thinking about monoids).  Getting those two to
coexist is the thing that should require playing with scopes (or
better yet, using different symbols for the two of them).
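As a rough illustration of that point, here is a small Python sketch of the counting loop with type-bound overloading. The class `Sym` is made up for this example and has nothing to do with the real Math::Symbolic interface. The counter uses ordinary integer addition and the expression uses a symbolic one, yet both are written with the same + and no scope juggling is needed:

```python
class Sym:
    """A symbolic expression; '+' builds a description instead of a number
    (hypothetical class, not the Math::Symbolic API)."""
    def __init__(self, text, atomic=True):
        self.text, self.atomic = text, atomic

    def _wrapped(self):
        # Parenthesise compound sub-expressions so nesting stays readable.
        return self.text if self.atomic else "(" + self.text + ")"

    def __add__(self, other):
        return Sym(self._wrapped() + " + " + other._wrapped(), atomic=False)

    def __str__(self):
        return self.text

expr, count, lines = Sym("1"), 1, []
while count < 4:
    lines.append(f"{count} {expr}")
    expr = expr + expr      # symbolic '+', selected by the operand's type
    count = count + 1       # ordinary integer '+', same symbol, same idea
print("\n".join(lines))
# prints:
# 1 1
# 2 1 + 1
# 3 (1 + 1) + (1 + 1)
```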

I do think the best solution is a combination of overloading and
scoping.  That is, allow operator overloading, but not overloading the
name +, rather overloading a specific +, such as Math::infix:<+>.

This still allows poor usage as we've commonly seen with operator
overloading.  But it also allows well-behaved usage, which was
previously forbidden.

Luke


Re: Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread Larry Wall
On Sun, Feb 24, 2008 at 04:23:54PM +, Andy Armstrong wrote:
> I've wanted this often. I've also wanted a clean way to lexically supply a
> default target object. For example with HTML::Tiny you often write
>
>   my $h = HTML::Tiny->new();
>   $h->body($h->head($h->title('FooPage')), $h->body(...));
>
> I'd love to be able to drop the '$h->' everywhere. Like this:
>
>   $h->body( head( title( 'FooPage' ) ), body( ... ) );
>
> I guess that would/could be a related mechanism.

In Perl 6 you can at least get it down to the minimalistic indication
of a method vs function:

given $h {
  .body( .head( .title( 'FooPage' ) ), .body( ... ) );
}

That can be construed as clean in a way that a functional interface
with an implicit object could not.  P6 is big on distinguishing
method calls from function calls *because* of wanting to distinguish
object-centric single dispatch from function-based multiple dispatch
(including all operator dispatch) which, by the way, is generally
controlled lexically in P6, as the OP suggests.  (It's also used in
the global (or more like super-lexical) Prelude scope by the compiler
to define the base language each compilation unit starts in.)
So we're ahead of you there...  :)

Basically anything that could be construed as language mutation is
limited lexically in P6, and mixing in user-defined multimethods can
be construed as at least semantic mutation, and is also syntactic
mutation if you define new operators rather than merely overloading
existing ones.  (We have a bias toward new operators for different
semantics, also suggested in the OP.  There's lots of Unicode operators
available, I hear...)

But back to .body etc.  To go further than that without the cooperation
of the class, you'd have to curry the invocant on a class, currently
described as something like:

(use HTML::Tiny).assuming(:self(HTML::Tiny.new()));

which would presumably import all the methods as functions, give
or take the fact that that syntax would not intrinsically indicate
the desire to import anything, which is a problem.  It is probably
a common enough operation to give a shortcut to:

use HTML::Tiny :singleton;

or some such, which would automatically run new, curry all the methods
on the invocant, install those resulting functions under the singleton tag
as marked for export, then import them as you would any other tag.  Then
your call reduces to:

body( head( title( 'FooPage' ) ), body( ... ) );

But as I just described it, if you wanted to use another singleton in
a different scope, you'd end up clobbering the singleton tag, so really
you want to treat that as an anonymous tag somehow.  (Import tags
are really just subpackages in P6, so an anonymous temporary subpackage
is likely not a problem.)  Then you wouldn't have colliding curries,
and the exporting module doesn't have to be aware of who is importing
from it, which is pretty bogus when you think about it.

Presumably the HTML::Tiny protoobject can then be queried for its
singleton object if you really need to have $h for some reason.
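For readers who want to see the "curry the invocant" idea in running code, here is a loose Python analogy. The class `Tiny` and the helper `curry_invocant` are invented for the sketch; they are not the HTML::Tiny API and not how P6 import tags work. The helper creates one object, binds it as the first argument of each named method, and hands back plain functions along with the object itself:

```python
from functools import partial

class Tiny:
    """Hypothetical stand-in for an HTML builder; not the HTML::Tiny API."""
    def _tag(self, name, *children):
        return f"<{name}>" + "".join(children) + f"</{name}>"

    def head(self, *children):
        return self._tag("head", *children)

    def title(self, *children):
        return self._tag("title", *children)

    def body(self, *children):
        return self._tag("body", *children)

def curry_invocant(cls, *names):
    """Create one instance and curry it onto the named methods."""
    obj = cls()
    funcs = {n: partial(getattr(cls, n), obj) for n in names}
    return obj, funcs

# The curried functions live only in the scope that unpacks them.
h, f = curry_invocant(Tiny, "head", "title", "body")
head, title, body = f["head"], f["title"], f["body"]

page = body(head(title("FooPage")), body("..."))
print(page)
```

Because each call to `curry_invocant` returns its own object and its own set of functions, two scopes can each hold a different instance without clobbering one another, and `h` is still there if the object itself is needed.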

Larry


Re: Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread Doug McNutt
At 17:30 + 2/24/08, Luke Palmer wrote:
> On Sun, Feb 24, 2008 at 3:00 PM, Aristotle Pagaltzis [EMAIL PROTECTED] wrote:

And I read both very carefully and failed to understand most of it.

I use perl for physics and engineering mostly because I forgot most of my 
FORTRAN long ago and perl works everywhere.

I really want to use complex numbers, vectors, matrices, and sometimes 
quaternions. I really want to be able to define or use previously defined 
operators in a way that I learned in the 50's. I want my compiler to understand 
when I use vectors in which the components are complex numbers. I want dot and 
cross product to work. I want to be able to multiply a matrix by a vector and 
get a polite error message if I try that with impossible arguments.

What I think I learned from those two messages is that it's damnably difficult 
for a parser to figure out what I'm doing. Perhaps it just isn't worth while.

But. . .

I really don't mind informing my compiler in advance about what I want a 
variable to be treated as. Typedef {}, Dimension () and the like are no problem 
at all. I don't mind. And I think that would also apply to my scientifically 
oriented friends.

Wouldn't it make life easier for the parser to overload the * operator into a 
dot product whenever both arguments have been defined as vectors or been 
returned as vectors by a previous operation? One could even use ** for a cross 
product since raising to a vector power is unreasonable. Just recognizing the 
special use declared and passing the operation off to a required subroutine 
would be adequate. Yes. It can all be expressed in simple object-oriented 
language but all of the File::Fu stuff is unduly complicating the use in 
mathematics.
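A small Python sketch of that kind of type-directed dispatch, for illustration only. The class `Vec` is hypothetical, the dot product here is a plain sum of products with no complex conjugation, and the error wording is made up. The * operator is handed to a method that looks at what both arguments are and complains politely when they do not fit:

```python
class Vec:
    """Minimal vector type (hypothetical; components may be complex)."""
    def __init__(self, *xs):
        self.xs = tuple(xs)

    def __mul__(self, other):
        if isinstance(other, Vec):
            # Vec * Vec is treated as a dot product.
            if len(other.xs) != len(self.xs):
                raise ValueError(
                    "cannot take the dot product of vectors of length "
                    f"{len(self.xs)} and {len(other.xs)}")
            return sum(a * b for a, b in zip(self.xs, other.xs))
        if isinstance(other, (int, float, complex)):
            # Vec * scalar scales every component.
            return Vec(*(a * other for a in self.xs))
        return NotImplemented   # Python then raises TypeError

    __rmul__ = __mul__          # lets scalar * Vec work as well

print(Vec(1, 2, 3) * Vec(4, 5, 6))   # prints 32
print((2 * Vec(1, 2)).xs)            # prints (2, 4)
```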

Practical Extraction and Reporting are what perl is about and I know I'm 
stretching the plan but just a bit of code that will allow, but not require, 
typedefs - er classes - of special things that cause operators to be passed to 
subroutines - er class methods - to be written could make a big difference.

Even translating ^ to pow($x,$y) would be useful to some, but I remember that 
much FORTRAN. And -2^2 is -4 (correctly?) in C on a two's complement machine.

-- 

-- Life begins at ovulation. Ladies should endeavor to get every young life 
fertilized. --


Re: Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread Jonathan Lang
Aristotle Pagaltzis wrote:
>  And this contradiction – that being able to declare sugar is
>  good, but the way that languages have permitted that so far leads
>  to insanity – is what sent me thinking along the lines that there
>  has to be some way to make overloading sane. And we all know that
>  all is fair if you predeclare. And that led me to the flash of
>  inspiration: why not make overloading a property of the source
>  (lexical, early-bound) rather than of the values (temporal,
>  late-bound)? And what we need to do that is a way to say this scope
>  is special in that the operators herein follow rules that differ
>  from the normal semantics. There you have it.

So if I'm understanding you correctly, the following would be an
example of what you're talking about:

  { use text; if $a > 49 { say $a } }

...with the result being the same as Perl5's 'if $a gt 49 { say $a
}' (so if $a equals '5', it says '5').  Am I following you?  If so,
I'm not seeing what's so exciting about the concept; all it is is a
package that redefines a set of operators for whatever scopes use it.
If I'm not following you, I'm totally lost.

-- 
Jonathan "Dataweaver" Lang


Re: Musings on operator overloading (was: File-Fu overloading)

2008-02-24 Thread David Green

On 2008-Feb-24, at 2:28 pm, Jonathan Lang wrote:

>  { use text; if $a > 49 { say $a } }
> ...with the result being the same as Perl5's 'if $a gt 49 { say
> $a }' (so if $a equals '5', it says '5').  Am I following you?  If
> so, I'm not seeing what's so exciting about the concept;


The whole point is to be not exciting: instead of being kept on the
edge of your seat wondering what possible meaning > has this time,
it's right there explicitly, boringly in front of you.  As indicated,
that has advantages and disadvantages.  In general, I tend towards
solutions that do what human beings would do (as opposed to
programmers, although of course that isn't always feasible).


One thing humans do, when faced with a dot product, say, is to use a  
dot; and happily, Perl 6 makes it easy to define Unicode operators so  
we don't have to overload operators that mean something else.


Something else humans do, however, is to overload symbols: e.g. /  
for a filepath separator and for division.  I'm not convinced that  
scope-based overloading is the way to go in this particular case,  
though.  It is the way to go for regexes, which overload all sorts of
symbols that have other meanings, because regexes come in
self-contained lumps anyway.  (Although similar overloadings are used
in Signatures, which are another limited scope, come to think of it).


Of course, using a slash for division is a slightly ugly hack anyway,  
because typewriters didn't have the proper symbol (U+00F7).  P6 could  
use ÷ for division, and / for portable path-separation.  (Which is  
kind of tempting, at least to me; I must admit I've been pondering the  
use of / for filenames for some time.)


On the other hand, I think that in real code, it should be fairly  
obvious whether you're doing arithmetic or working with files, so  
perhaps the solution is to add unambiguous alternatives for those  
exceptional cases when you want to make it absolutely explicit:  
open($dir fs ($filename ~ $total div $count)).


(I also expect that files will be treated more consistently in P6,  
without as many issues about whether a file is a string or a handle  
or an object...)



-David