Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-10-01 Thread Ilya Zakharevich

On Sun, Oct 01, 2000 at 08:51:04AM +1100, Jeremy Howard wrote:
   A prototypeless-function call.
  
  get rid of them all!!
 
 Please no! Anything that makes it harder to write 'quick-and-dirty' scripts
 is never going to fly--this is part of what makes Perl special.

Why?  I see no problem in making -Mstrict and -Wall the defaults.
Then make '-E' option to mean what '-e' means today, and '-e' mean 

  -M-strict -Wnone -E

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-30 Thread Karl Glazebrook


get rid of them all!!

Ilya Zakharevich wrote:
 
 On Thu, Sep 28, 2000 at 11:39:51AM -0400, Karl Glazebrook wrote:
so what is wrong with the statement '@y = 3*@x;' then ?
  
   That other constructs *also* create an array context, in which the
   behaviour of multiplication you propose is not appropriate.
 
  for example?
 
 A prototypeless-function call.
 
 Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-30 Thread Jeremy Howard

Karl Glazebrook wrote:
 Ilya Zakharevich wrote:
  On Thu, Sep 28, 2000 at 11:39:51AM -0400, Karl Glazebrook wrote:
 so what is wrong with the statement '@y = 3*@x;' then ?
   
That other constructs *also* create an array context, in which the
behaviour of multiplication you propose is not appropriate.
  
   for example?
 
  A prototypeless-function call.
 
 get rid of them all!!

Please no! Anything that makes it harder to write 'quick-and-dirty' scripts
is never going to fly--this is part of what makes Perl special.

I would like to see array operations occur inside prototypeless function
calls, which as Ilya notes already creates array context. This is not
fundamentally 'inappropriate', although it is a change from P5. It just
means having to type 'scalar @arr' when that's what you mean--and having the
P52P6 converter do the same.





Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-28 Thread Karl Glazebrook

Ilya Zakharevich wrote:
  so what is wrong with the statement '@y = 3*@x;' then ?
 
 That other constructs *also* create an array context, in which the
 behaviour of multiplication you propose is not appropriate.

for example?


 I did not see any viable proposal on changing things in a major way.
 To design such a change is a *major* work.  We need to keep a lot of
 possible combinations with other features in mind, and understand all
 the ramifications and desired/undesired interaction.  We need
 insight.  We need to balance the tradeoffs.

This is what will happen no doubt, and what will emerge will probably
be less than the radicals hope for and more than the conservatives
would want!

 I did not mean interviews.  10 years ago I read the manual.  It was
 clearly there.

I am sure it was, the guy is nuts.

Karl.



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-26 Thread Ilya Zakharevich

On Mon, Sep 25, 2000 at 06:30:22PM -0400, Karl Glazebrook wrote:
  Well, this shows that you entirely miss the problem of cryptocontexts.
  Context is determined by the "environment" of the operation, not by
  the operation.  Context is propagated:
  
the-left-hand-side-of-assignment --- the-right-hand-side-of-assignment
 
 
 so what is wrong with the statement '@y = 3*@x;' then ?

That other constructs *also* create an array context, in which the
behaviour of multiplication you propose is not appropriate.

  Changing Perl in this respect will make one particular mode of
  operation a tiny bit simpler, but (without major changes to
  cryptocontexting - <PLUG> see for example my interview on perl.com
  </PLUG>) will make life much harder in other modes of operation.

 I think major changes are what we are talking about here.

I did not see any viable proposal on changing things in a major way.
To design such a change is a *major* work.  We need to keep a lot of
possible combinations with other features in mind, and understand all
the ramifications and desired/undesired interaction.  We need
insight.  We need to balance the tradeoffs.

I do not think we made *any* step in the correct direction yet.

  Remember: do you do your system maintenance in Mathematica?  Why?
  Remember that Wolfram *wanted* you to do this?  Perl5 is much better
  balanced.  You are pulling the blanket to your side of the bed.
 
 I am not sure what point you are trying to make about Mathematica. I
 have read interviews with Wolfram, he is clearly an egomaniac and
 thinks everything should be an expression, but I am not sure he
 was arguing for system management in Mathematica.

I did not mean interviews.  10 years ago I read the manual.  It was
clearly there.

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-23 Thread Karl Glazebrook

Christian Soeller wrote:
 
 Ilya Zakharevich wrote:
  ...Do you say you are confused by using vectors (=scalars) instead of
  arrays?
 
 I'm not having a problem with that personally but *many* users of PDL
 have complained about being confused by this.
 They assume ndim == array == perl array.
 
   Christian

Yes this is the point. I guess another way of looking at it is
saying that 3*@a operates in a list context not a scalar context
and that we will define the behaviour of '*' in this context.
(Currently it is not defined, hence @a is converted to scalar(@a)).
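
For reference, a minimal Perl 5 sketch of the behaviour described above: '*' puts both operands in scalar context, so the array contributes its element count, and the element-wise meaning has to be written with map. The variable names are only illustrative.

```perl
use strict;
use warnings;

my @a = (1, 2, 3, 4);

# Perl 5: '*' imposes scalar context on both operands,
# so @a is evaluated as its element count.
my $count_times_3 = 3 * @a;          # same as 3 * scalar(@a)

# The element-wise meaning has to be spelled out with map.
my @elementwise = map { 3 * $_ } @a;

print "3 * \@a       = $count_times_3\n";
print "map 3*\$_, \@a = @elementwise\n";
```

Under the proposal discussed in this thread, the first expression would take on the meaning of the second.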

Karl



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Karl Glazebrook

Ilya Zakharevich wrote:
 But with Fortran such things are not *needed*.  Compilers are smart
 enough to convert (equivalents to)
 
   map 3*$_, 34..67

This is true, but it is easier (and less buggy) to say exactly
what you mean: 102:201:3

Anyway the idea has been proposed, it won't break Perl, we'll see
what happens.

 
   f(3*@a)
 
 would typically be a list context - and suddenly instead of 3*(1+$#a)
 you get C<map 3*$_, @a>.

This is true, what I would propose is we declare 3*(1+$#a) outmoded and
always have it mean C<map 3*$_, @a> in all contexts.

This of course will break perl5 code. Not mine, because I always say
3*scalar(@a) because 3*@a does not look like 3*(1+$#a) to me. I don't
know how many people would depend on that feature.

There is also the problem that we are arguing somewhat in a vacuum
as we don't know how radical perl6 (in terms of syntax changes) will
be.

Anyhow the various proposals are out there, we'll see what happens.


 Why?  Currently you can make them look like references to array.  See
 Math::Pari for an implementation.  Overloading '@{}' gives yet another
 way to do this.

True but the user has to remember 'oh, I am now using a special PDL
array which means I have to always use a reference to it rather than
treat it like a perl array'. Not good.

 
  It's really hard to explain why people should use @x[1..10] for
  perl arrays and $x->slice("1:10") for PDL arrays!
 
 Use
 
   $x->[1..10]
 
 for both.

This is true, but inelegant. If perl @x arrays are not considered useful
why not get rid of them and always use references?

Karl



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Ilya Zakharevich

On Fri, Sep 22, 2000 at 11:17:40AM -0400, Karl Glazebrook wrote:
[Cryptocontext is:]

f(3*@a)
  
  would typically be a list context - and suddenly instead of 3*(1+$#a)
  you get C<map 3*$_, @a>.
 
 This is true, what I would propose is we declare 3*(1+$#a) outmoded and
 always have it mean C<map 3*$_, @a> in all contexts.

You are trading a frequently used shortcut @a == 1 + $#a for a 
rarely-used-but-beautiful/intuitive semantic.  I'm not sure it is a win.

Moreover,

  $x = 3 * @_;

suddenly being equivalent to

  $x = @_;

does not look very promising...

  Why?  Currently you can make them look like references to array.  See
  Math::Pari for an implementation.  Overloading '@{}' gives yet another
  way to do this.
 
 True but the user has to remember 'oh, I am now using a special PDL
 array which means I have to always use a reference to it rather than
 treat it like a perl array'. Not good.

No, you do not use "a special PDL array", you use "a vector".
A subtle change in wording - and no conflict.

 This is true, but inelegant. If perl @x arrays are not considered useful
 why not get rid of them and always use references?

Actually, this is what Perl is using internally (they are
softreferences==globs, but who cares?).

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Karl Glazebrook

Ilya Zakharevich wrote:
 You are trading a frequently used shortcut @a == 1 + $#a for a
 rarely-used-but-beautiful/intuitive semantic.  I'm not sure it is a win.

It's now boiling down to a matter of opinion and we'll have to agree to 
differ. Of course I use array arithmetic all the time as a heavy PDL
user.

 
 Moreover,
 
   $x = 3 * @_;
 
 suddenly being equivalent to
 
   $x = @_;
 
 does not look very promising...

But would that not be easy to catch and warn about with a p5-to-p6 converter?

 No, you do not use "a special PDL array", you use "a vector".
 A subtle change in wording - and no conflict.

sure, but vector to me means 1D and also some sort of transformation
properties whereas a PDL array is just a N-dim square container.
anyway semantics - we call them 'piddles' which is moderately
amusing but inelegant.


  This is true, but inelegant. If perl @x arrays are not considered useful
  why not get rid of them and always use references?
 
 Actually, this is what Perl is using internally (they are
 softreferences==globs, but who cares?).

Hmm

Karl



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Ilya Zakharevich

On Fri, Sep 22, 2000 at 05:24:55PM -0400, Karl Glazebrook wrote:
 It's now boiling down to a matter of opinion and we'll have to agree to 
 differ. Of course I use array arithmetic all the time as a heavy PDL
 user.

...Do you say you are confused by using vectors (=scalars) instead of
arrays?

  Moreover,
  
    $x = 3 * @_;
  
  suddenly being equivalent to
  
$x = @_;
  
  does not look very promising...
 
 But would that not be easy to catch and warn about with a p5-to-p6 converter?

Why converters?  I'm discussing Perl6 now, not converters.

  No, you do not use "a special PDL array", you use "a vector".
  A subtle change in wording - and no conflict.
 
 sure, but vector to me means 1D and also some sort of transformation
 properties whereas a PDL array is just a N-dim square container.

An N-dim container is just a vector which contains vectors...

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Jeremy Howard

Ilya Zakharevich wrote:
   Moreover,
  
     $x = 3 * @_;
  
   suddenly being equivalent to
  
 $x = @_;
  
   does not look very promising...

Why are these equivalent? RFC 82 only applies in list context. Am I missing
something?





Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Ilya Zakharevich

On Sat, Sep 23, 2000 at 09:52:51AM +1100, Jeremy Howard wrote:
  $x = 3 * @_;
   
suddenly being equivalent to
   
  $x = @_;
   
does not look very promising...
 
 Why are these equivalent? RFC 82 only applies in list context. Am I missing
 something?

Yes, the proposal to make map 3*$_ semantic to work in a scalar
context too (to avoid cryptocontext).

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Jeremy Howard

Karl Glazebrook wrote:
 Ilya Zakharevich wrote:
  You are trading a frequently used shortcut @a == 1 + $#a for a
  rarely-used-but-beautiful/intuitive semantic.  I'm not sure it is a win.

 It's now boiling down to a matter of opinion and we'll have to agree to
 differ. Of course I use array arithmetic all the time as a heavy PDL
 user.

It's not just for number-crunchers either. Array notation greatly simplifies
many frequently used operations. For instance (from RFC 82):

<quote>
  @people = ('adam', 'eve ', 'bob ');
  @scores = (7,9,5);  # Score for each person
  @histogram = '#' x @scores; # Returns ('#######','#########','#####')
  print join("\n", @people . ' ' . @histogram);

  adam #######
  eve  #########
  bob  #####
</quote>

Array notation is not 'rarely used' in languages that support it--in fact,
operations are applied to arrays and lists at least as often as scalars in
most code I see written for Mathematica, J, PDL, and so forth.
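
For comparison, a rough Perl 5 equivalent of the RFC 82 histogram example, written with explicit map calls. This is only a sketch of what the element-wise 'x' and '.' would stand for; the @lines variable is introduced here for illustration.

```perl
use strict;
use warnings;

my @people = ('adam', 'eve ', 'bob ');
my @scores = (7, 9, 5);

# Element-wise 'x': one bar per score.
my @histogram = map { '#' x $_ } @scores;

# Element-wise '.': pair each name with its bar by index.
my @lines = map { $people[$_] . ' ' . $histogram[$_] } 0 .. $#people;

print join("\n", @lines), "\n";
```

The index loop in the second map is the part the array notation would hide.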





Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Ilya Zakharevich

On Sat, Sep 23, 2000 at 10:01:11AM +1100, Jeremy Howard wrote:
  It's now boiling down to a matter of opinion and we'll have to agree to
  differ. Of course I use array arithmetic all the time as a heavy PDL
  user.

 It's not just for number-crunchers either. Array notation greatly simplifies
 many frequently used operations. For instance (from RFC 82):
 
   @people = ('adam', 'eve ', 'bob ');
   @scores = (7,9,5);  # Score for each person
   @histogram = '#' x @scores; # Returns ('#######','#########','#####')
   print join("\n", @people . ' ' . @histogram);
 
   adam #######
   eve  #########
   bob  #####

Are you trying to convince me/us that is going to be used often?

 Array notation is not 'rarely used' in languages that support it--in fact,
 operations are applied to arrays and lists at least as often as scalars in
 most code I see written for Mathematica, J, PDL, and so forth.

a) You can *already* use vectors as scalars in Perl;
b) What we are discussing is Perl, not Mathematica, J, PDL, and so
   forth.  These languages have a very narrow niche.

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Jeremy Howard

Ilya Zakharevich wrote:
 Are you trying to convince me/us that is going to be used often?

Yes, I am. You made the unsupported statement that array operations are
rarely used. I'm suggesting otherwise (although to say that they're rarely
used in Perl 5 is a tautology, of course!).

  Array notation is not 'rarely used' in languages that support it--in fact,
  operations are applied to arrays and lists at least as often as scalars in
  most code I see written for Mathematica, J, PDL, and so forth.

 a) You can *already* use vectors as scalars in Perl;

That's not what RFC 82 is proposing.

 b) What we are discussing is Perl, not Mathematica, J, PDL, and so
forth.  These languages have a very narrow niche.

That's because few such languages provide strong general purpose programming
features as well. They are either limited maths-oriented languages (like
Mathematica) or add-ons to general purpose languages that aren't fully
integrated (Python/NumPy; Perl/PDL; C++/Blitz++).

Many Perl users operate on lists of data. Requiring explicit loops every
time a programmer wants to operate on a list is asking the programmer to fit
in with how a computer thinks. That's not right.





Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread Ilya Zakharevich

On Sat, Sep 23, 2000 at 10:41:07AM +1100, Jeremy Howard wrote:
  a) You can *already* use vectors as scalars in Perl;
 
 That's not what RFC 82 is proposing.

Who cares?  This already works...

  b) What we are discussing is Perl, not Mathematica, J, PDL, and so
 forth.  These languages have a very narrow niche.
 
 That's because few such languages provide strong general purpose programming
 features as well. They are either limited maths-oriented languages (like
 Mathematica) or add-ons to general purpose languages that aren't fully
 integrated (Python/NumPy; Perl/PDL; C++/Blitz++).
 
 Many Perl users operate on lists of data. Requiring explicit loops every
 time a programmer wants to operate on a list is asking the programmer to fit
 in with how a computer thinks. That's not right.

Well, this is your opinion against mine...  ;-)

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-22 Thread c . soeller

Ilya Zakharevich wrote:
 
 On Fri, Sep 22, 2000 at 05:24:55PM -0400, Karl Glazebrook wrote:
  It's now boiling down to a matter of opinion and we'll have to agree to
  differ. Of course I use array arithmetic all the time as a heavy PDL
  user.
 
 ...Do you say you are confused by using vectors (=scalars) instead of
 arrays?

I'm not having a problem with that personally but *many* users of PDL
have complained about being confused by this.
They assume ndim == array == perl array.

  Christian



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-21 Thread Karl Glazebrook

Ilya Zakharevich wrote:
 As shipped: no.  But if this is made a primitive (which I would not
 like), then the only change which is needed is to make the
 tie::multi::range() token to be followed by 3 numbers.
 
 [Aside: Why not make ternary-range operator into 10 :: 20 :: 2 ?]

That would work. My point is that having a stride is a fundamental
feature in other array languages (IDL, Matlab, PDL) and would be
useful in the perl core.
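
As a point of comparison, a strided slice can be emulated in Perl 5 today with a small index generator. The helper name stride() is hypothetical and is not part of any RFC; it only sketches what 10..20 with a step of 2 would select.

```perl
use strict;
use warnings;

# Hypothetical helper: indices from $from to $to in steps of $step.
sub stride {
    my ($from, $to, $step) = @_;
    die "step must be positive\n" unless $step > 0;
    return () if $to < $from;
    my $last = int(($to - $from) / $step);   # number of whole steps that fit
    return map { $from + $step * $_ } 0 .. $last;
}

my @y = map { $_ * $_ } 0 .. 30;
my @x = @y[ stride(10, 20, 2) ];             # elements 10, 12, ..., 20
print "@x\n";
```

This builds the full index list in memory, which is exactly the cost a core stride syntax could avoid.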


  Finally as an overload expert what do you think about the proposals
  to make arrays overloadable objects so one can say things like:
 
  @x = 3 * @y;
 
 This is not an overloading issue, this is the context resolution
 issue.  IMO, the cryptocontext turns out to be evil with an exception
 of extremely short scripts - and this is with what we have now.
 
 A proposal like this would make a nuisance into a nightmare.  Yes, it
 looks nice, but it contradicts many rules, so in the long run it is
 going to be a significant step back.
 
 ...Unless the whole idea of cryptocontext is turned to become something else...

I am not sure what you mean by "cryptocontext"?

I guess the motivation here is to make non-core arrays (such as PDL
objects) look as much as possible like Perl arrays to simplify the
appearance to users.

It's really hard to explain why people should use @x[1..10] for
perl arrays and $x->slice("1:10") for PDL arrays!

I can see that allowing expressions on @x would require considerable
changes to perl core.

Is there a nice way to resolve this problem?

Karl



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-21 Thread Buddha Buck

At 03:26 PM 9/21/00 -0400, Karl Glazebrook wrote:

   Finally as an overload expert what do you think about the proposals
   to make arrays overloadable objects so one can say things like:
  
   @x = 3 * @y;



I can see that allowing expressions on @x would require considerable
changes to perl core.

Is there a nice way to resolve this problem?

What do you think of:

   $x[|i] = 3 * $y[|i];

or

   @x = 3 * $y[|i];

It's not as clean as @x = 3 * @y, but it is cleaner context-wise.

(Working on RFC207(v2) even as I write)



Karl




Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-21 Thread Buddha Buck

At 03:35 PM 9/21/00 -0400, Buddha Buck wrote:
At 03:26 PM 9/21/00 -0400, Karl Glazebrook wrote:

   Finally as an overload expert what do you think about the proposals
   to make arrays overloadable objects so one can say things like:
  
   @x = 3 * @y;

What do you think of:

   $x[|i] = 3 * $y[|i];

or

   @x = 3 * $y[|i];

It's not as clean as @x = 3 * @y, but it is cleaner context-wise.

And one could argue that:

@x = map 3*^_, @y;

is cleaner yet...

Karl




Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-21 Thread Ilya Zakharevich

On Thu, Sep 21, 2000 at 03:26:39PM -0400, Karl Glazebrook wrote:
  [Aside: Why not make ternary-range operator into 10 :: 20 :: 2 ?]
 
 That would work. My point is that having a stride is a fundamental
 feature in other array languages (IDL, Matlab, PDL) and would be
 useful in the perl core.

Did not use any steps more than 1 for a decade or so.  But in 80's,
when people did not believe in 10^4..10^7 speedups my algos were
claiming, I needed to actually code them in Fortran ;-).  I think I
used larger-than-1 steps that time.

But with Fortran such things are not *needed*.  Compilers are smart
enough to convert (equivalents to)

  map 3*$_, 34..67

into efficient code...

  A proposal like this would make a nuisance into a nightmare.  Yes, it
  looks nice, but it contradicts many rules, so in the long run it is
  going to be a significant step back.
  
  ...Unless the whole idea of cryptocontext is turned to become something else...
 
 I am not sure what you mean by "cryptocontext"?

See p5p archives.  (Significant) switching of the meaning of operations
based on the context looks good on paper and for small examples, but
it breaks badly in slightly more complicated situations.  The problem
is that the context is not always what you think.  Say,

  f(3*@a)

would typically be a list context - and suddenly instead of 3*(1+$#a)
you get C<map 3*$_, @a>.
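
A small Perl 5 sketch of the f(3*@a) case may help here. The function f() is a hypothetical helper that just reports its arguments; the second call spells out what the element-wise proposal would make the first call mean.

```perl
use strict;
use warnings;

# Hypothetical helper: reports what it received.
sub f { return scalar(@_) . " arg(s): @_" }

my @a = (10, 20, 30);

# The argument list is a list context, but in Perl 5 '*' still
# forces scalar context on @a, so f() receives one number.
print f(3 * @a), "\n";                 # 3 * (1 + $#a)

# What the proposal would make the same call mean:
print f(map { 3 * $_ } @a), "\n";
```

The same source text would change from passing one argument to passing three, depending only on the context the call site supplies.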

 I guess the motivation here is to make non-core arrays (such as PDL
 objects) look as much as possible like Perl arrays to simplify the
 appearance to users.

Why?  Currently you can make them look like references to array.  See
Math::Pari for an implementation.  Overloading '@{}' gives yet another
way to do this.
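
A minimal sketch of the '@{}' overloading route, assuming a hypothetical class named Vec. This is not how Math::Pari does it; it only shows that an object held in a scalar can support both element-wise arithmetic and array-reference syntax in Perl 5.

```perl
package Vec;   # hypothetical class name, for illustration only
use strict;
use warnings;
use overload
    '*'   => \&scale,
    '@{}' => sub { $_[0]->() },      # let $v->[i] and @$v work
    '""'  => sub { join ' ', @{ $_[0] } };

# The object is a blessed code ref that returns the data array ref.
# Using a non-array representation avoids recursing into the
# '@{}' overload when the class itself needs the data.
sub new {
    my ($class, @data) = @_;
    return bless sub { \@data }, $class;
}

# Overload handlers are called as (object, other operand, swapped).
sub scale {
    my ($self, $factor) = @_;
    return Vec->new(map { $factor * $_ } @$self);
}

package main;

my $y = Vec->new(1, 2, 3);
my $x = 3 * $y;                       # element-wise, via overloaded '*'
my @slice = @{$x}[0, 1];              # ordinary slice syntax on the object
print "x = $x\n";
print "slice = @slice\n";
```

The user still writes $x rather than @x, which is the objection raised elsewhere in this thread.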

 It's really hard to explain why people should use @x[1..10] for
 perl arrays and $x->slice("1:10") for PDL arrays!

Use

  $x->[1..10]

for both.

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-18 Thread Karl Glazebrook

Hi Ilya,

I have three questions about your RFC:

Firstly does your proposal allow for a slice like 10..20:2  (i.e. with
a stride of 2) ?

If not is there an easy way to incorporate that?

Secondly, what about having multidim support in the core so that the
tie-tokenisers get optimised away? i.e. would we be able to
say something like:

@x = @y[10..20; 1..3]

for core arrays

Finally as an overload expert what do you think about the proposals
to make arrays overloadable objects so one can say things like:

@x = 3 * @y;


Karl



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-18 Thread Christian Soeller

 
 Finally as an overload expert what do you think about the proposals
 to make arrays overloadable objects so one can say things like:
 
 @x = 3 * @y;

Is this where RFC 231's suggestion for OO slicing comes in (see quote)?

 For example,

   $matrix1->[2..5; 2..4] * $matrix2->[1,3,5; 11..64];

 would denote: create two new objects for the specified submatrices, apply
 (overloaded) multiplication to these objects. Such a request is illegal
 for untie()d arrays; for tie()d arrays it is converted to a call to
 FETCH_SLICE in a scalar context. (Alternative: introduce two new tie()d
 methods: FETCH_SUBOBJECT, STORE_SUBOBJECT.)

or is this supposed to be orthogonal?

Another question re RFC 231. Is it really required to make the
syntactical distinction between ranges (..) and bi_ranges (...)? Some
more explanation would be appreciated.

  Christian



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-16 Thread Ilya Zakharevich

On Sat, Sep 16, 2000 at 11:08:18AM +1100, Jeremy Howard wrote:
  proposes a convenient syntax to slice multi-dimensional arrays;

 It is hard to evaluate this proposal without more context. In particular:
 
  - How does it relate to RFC 204? Is it an alternative, or an addition?

204 cannot be implemented since it prohibits usage of overloaded
objects as array indices.

  - How does it relate to RFC 81? The semantics of '..' seems to conflict.

What I say concerns the usage of '..' inside an index only.  It cannot
conflict with anything else.

  - Why is it better to make ';' "special inside a hash/array index only"

Because ',' is already special there.  There is little chance that ';'
operator is created as a general-purpose operator.

  - Why is a special token for a separator necessary "to avoid the (giant)
 overhead of creation of anonymous arrays"? Don't RFC 203 arrays and RFC
 81/205 lazy generation avoid this?

a) "Lazy generation" is not defined, as stated it is a good wish only.
   What is

 @a = (0, 2..99, 200..9998, 100);
 f(@a);

   ?  My proposal has completely defined behaviour (AFAICS).

   [Yes, I was proposing lazy evaluation for a long time.  But I know
   that it can be further than it appears.]

b) The call for $a[2,3;5,6] is

  *) Put already-available SV pointers for $a, 2,3,5,6 and the cached SV*
 for tie::multi::separator() on stack;

  *) Put the (cached) CV* for the method on stack;

  *) invoke the call frame;

This is not *very* quick, but at least it may be "not that slow".
While all the alternatives require creation of anonymous lists, which
(I expect) will slow things down 7..10 times for the call above.  For
$a[1..100;1..100] it may easily be 100..1000 times slower.

Your way was my way when I was designing Math::Pari.  When I
*implemented* Math::Pari, it took some time to determine why it was so
much slower than what I expected.  My proposal is based on this
experience.

Creation of [1,2,3] is *very* slow.

  - Overall, what is the problem in the existing array RFCs that this is
 designed to solve?

*) They are not compatible with overloading (unless overloaded things
   are dramatically changed);

*) They create a lot of temporary anonymous arrays the only purpose of
   which is to group arguments;

*) They go very high on the bizarreness scale.

  - Can we incorporate a solution into the existing RFCs without creating a
 new conflicting one?
 
 If there are implementation challenges around the existing RFCs, I would
 rather make changes required to overcome them within those RFCs.

I see no way how the existing RFC can be accepted.  (No, I could not
read the "include all the PDL" proposal to the end, so I cannot
comment on this.)

 That way we
 get the benefit of the thought we've all put into the syntax of these RFCs,
 plus the benefit of Ilya's deep understanding of Perl internals.

Thank you for suggesting that I do not need to think to create a RFC.

Ilya



Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-16 Thread Jeremy Howard

Ilya Zakharevich wrote:
 On Sat, Sep 16, 2000 at 11:08:18AM +1100, Jeremy Howard wrote:
   - How does it relate to RFC 204? Is it an alternative, or an addition?

 204 cannot be implemented since it prohibits usage of overloaded
 objects as array indices.

Why is it important for overloaded objects to be used as array indices? Why
does RFC 204 rule that out? RFC 204 simply specifies that a list reference
as an index provides multidimensional access:

  $a[ [1,1] ] == $a[1][1];
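
In Perl 5 terms, the equivalence above could be emulated with a helper along these lines. The function name at() is hypothetical; RFC 204 proposes this as built-in index syntax, not as a function.

```perl
use strict;
use warnings;

# Hypothetical helper: walk one nesting level per index in @$idx.
sub at {
    my ($array, $idx) = @_;
    my $node = $array;
    $node = $node->[$_] for @$idx;
    return $node;
}

my @a = ([1, 2, 3], [4, 5, 6]);
print at(\@a, [1, 1]), "\n";   # same element as $a[1][1]
```

Note that each call builds an anonymous array for the index list, which is the overhead Ilya objects to.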

   - How does it relate to RFC 81? The semantics of '..' seems to conflict.

 What I say concerns the usage of '..' inside an index only.  It cannot
 conflict with anything else.

RFC 81 expands on the existing operator '..' in a list context to allow more
generic list generation. It is particularly useful to generate lists to act
as array slices:

  @a[ 1..5 : 3] == @a[1,3,5];

This would seem to conflict with the meaning of '..' outlined in RFC 231.

   - Why is it better to make ';' "special inside a hash/array index only"

 Because ',' is already special there.  There is little chance that ';'
 operator is created as a general-purpose operator.

When we first discussed ';' on the list, we looked at making it special in
an index only. But the more generic approach of making it a cartesian
product operator seems cleaner--it avoids 'special' meanings in favour of
providing a generic operator.

Why is there little chance of creating ';' as a general-purpose operator?

   - Why is a special token for a separator necessary "to avoid the (giant)
  overhead of creation of anonymous arrays"? Don't RFC 203 arrays and RFC
  81/205 lazy generation avoid this?

 a) "Lazy generation" is not defined, as stated it is a good wish only.
What is

  @a = (0, 2..99, 200..9998, 100);
  f(@a);

Lazy generation is a well understood concept in other languages. I'm most
familiar with C++, so I'll draw from that. In libraries that provide lazy
evaluation, f(@lazy_list) is a 'promise' to apply f() to the elements of
@lazy_list when an element of f(@lazy_list) needs to be calculated.
Sometimes this is all done at runtime (MTL, newmat), sometimes parts are
done at compile time ('expression templates' in POOMA and Blitz++). These
C++ examples and many others are indexed at:

  http://www.oonumerics.org/oon/

 b) The call for $a[2,3;5,6] is

   *) Put already-available SV pointers for $a, 2,3,5,6 and the cached SV*
  for tie::multi::separator() on stack;

   *) Put the (cached) CV* for the method on stack;

   *) invoke the call frame;

 This is not *very* quick, but at least it may be "not that slow".
 While all the alternatives require creation of anonymous lists, which
 (I expect) will slow things down 7..10 times for the call above.  For
 $a[1..100;1..100] it may easily be 100..1000 times slower.

Lists of lists of known simple type are proposed by RFC 203 to be stored as
true arrays (i.e. contiguously in memory). Their overhead is not the same as
Perl 5 lists of lists.

The index in $a[1..100;1..100] should be generated lazily. An individual
element can be calculated directly from the index parameters as required.

 Your way was my way when I was designing Math::Pari.  When I
 *implemented* Math::Pari, it took some time to determine why it was so
 much slower than what I expected.  My proposal is based on this
 experience.

 Creation of [1,2,3] is *very* slow.

I hope we can change how [1,2,3] is created by:

 - Creating a true numeric array if it is an array of known simple types
 - Generating the elements lazily where it is more efficient to do so

If we can not do these, then I agree that RFCs 204 and 205 are not plausible
in their current form.

   - Overall, what is the problem in the existing array RFCs that this is
  designed to solve?

 *) They are not compatible with overloading (unless overloaded things
are dramatically changed);

There are a number of RFCs proposing substantially changing overloading.
What specific changes would we need to ensure were incorporated in P6 to
avoid this incompatibility?

 *) They create a lot of temporary anonymous arrays the only purpose of
which is to group arguments;

Yes, if we can't get any lazy generation to work.

 *) They go very high on the bizarreness scale.

Bizarre??? Which RFC?

RFC 82: The concept of all array operations being applied element-wise to
arrays is very widely used in languages oriented to numeric programming--it
is certainly not 'bizarre'. There has been debate around '||' and '&&',
although I find the alternative meaning of these in a list context proposed
by RFC 45 more bizarre. ...But I think that this point is already well
discussed...

RFCs 90 and 91: These builtins are in almost all languages with rich array
functionality. 'merge' and 'demerge' are more frequently called 'zip' and
'unzip', but those terms were almost universally rejected on -language.

RFC 203: If we know that a list of lists is of a simple type, why not store
it efficiently? And why not 

Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-16 Thread Ilya Zakharevich

On Sat, Sep 16, 2000 at 07:15:34PM +1100, Jeremy Howard wrote:
 Why is it important for overloaded objects to be used as array indices?

Overloaded objects should behave the same way as non-objects.

 Why 
 does RFC 204 rule that out? RFC 204 simply specifies that a list reference
 as an index provides multidimensional access:
 
   $a[ [1,1] ] == $a[1][1];

I repeat: what does

$a[ $ind ]

do if $ind is a (blessed) reference to array (1,1), but behaves as
if it were 11 (due to overloading)?
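
To make the question concrete, here is a Perl 5 sketch of such an object. The class name Idx and its numification rule are assumptions chosen so that the pair (1,1) behaves as 11; any real overloaded class would define its own conversion.

```perl
package Idx;   # hypothetical class, for illustration only
use strict;
use warnings;
use overload
    '0+' => sub { 10 * $_[0][0] + $_[0][1] },   # (1,1) numifies to 11
    fallback => 1;

sub new { my ($class, @i) = @_; return bless [@i], $class }

package main;

my @a   = map { "elem$_" } 0 .. 20;
my $ind = Idx->new(1, 1);

# Perl 5 numifies the index through the overload: this is $a[11].
print $a[$ind], "\n";
```

If an array reference used as an index meant multidimensional access, the same expression would instead have to mean $a[1][1], and the object would stop behaving like the number it overloads to.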

 RFC 81 expands on the existing operator '..' in a list context to allow more
 generic list generation. It is particularly useful to generate lists to act
 as array slices:
 
   @a[ 1..5 : 3] == @a[1,3,5];
 
 This would seem to conflict with the meaning of '..' outlined in RFC 231.

Sorry, I see no conflict.  (Assuming that ternary '..' is allowed, the
token tie::multi::range() would be followed by 3 numbers, not 2.)

These calls will result in

  tied(@a)->FETCH_RANGE(tie::multi::range(), 1, 5, 3)
  tied(@a)->FETCH_RANGE(1, 3, 3)

If FETCH_RANGE uses tie::multi::inline() to preprocess the keys, this
*by definition* will result in the same array of keys.  If not, it
is the responsibility of FETCH_RANGE to ensure the equivalence.

And $a[ 1..5e6 ] would not need to create 5e6 Perl objects the only
purpose of which is to inform the range extractor that it needs to
create an object representing the slice.
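
A minimal Perl 5 sketch of the preprocessing step described above.
Everything in it is an assumption made for illustration: $RANGE stands in
for the tie::multi::range() token, expand_keys() plays the role attributed
to tie::multi::inline(), and the three numbers after the token are read as
(start, stop, step). A FETCH_RANGE that does not need the individual keys
could use the three numbers directly and never build the list.

```perl
use strict;
use warnings;

# Hypothetical sentinel standing in for tie::multi::range();
# a unique reference, so it cannot collide with an ordinary key.
my $RANGE = [];

# Hypothetical helper: turn a flat argument list, in which a range is
# passed as (token, start, stop, step), into the explicit list of keys.
sub expand_keys {
    my @args = @_;
    my @keys;
    while (@args) {
        my $arg = shift @args;
        if (ref($arg) && $arg == $RANGE) {
            my ($start, $stop, $step) = splice @args, 0, 3;
            for (my $k = $start; $k <= $stop; $k += $step) {
                push @keys, $k;
            }
        }
        else {
            push @keys, $arg;    # an ordinary key passes through unchanged
        }
    }
    return @keys;
}

my @from_range = expand_keys($RANGE, 1, 5, 2);
my @from_list  = expand_keys(1, 3, 5);
print "range form: @from_range\n";
print "list form:  @from_list\n";
```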

  Because ',' is already special there.  There is little chance that ';'
  operator is created as a general-purpose operator.

 When we first discussed ';' on the list, we looked at making it special in
 an index only. But the more generic approach of making it a cartesian
 product operator seems cleaner--it avoids 'special' meanings in favour of
 providing a generic operator.

No, it is not a generic operator.  Its behavior depends on whether it
is used *inside parens*, or not.  Additionally, the behaviour of
cartesian product makes very little sense: if you did not want it 3
times, you should not insert it into the language.

  a) "Lazy generation" is not defined, as stated it is a good wish only.
 What is
 
   @a = (0, 2..99, 200..9998, 100);
   f(@a);

 Lazy generation is a well understood concept in other languages.

Maybe.  But it is not defined in the corresponding RFC nevertheless.
At least: all I could deduce was that the following constructs are
made synonymous:

  @a = ($a .. $b);
  tie @a, Array::Range, $a, $b;

No other usage of .. is covered.
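
The equivalence above can be sketched in Perl 5 with the standard tie
interface. The package name Array::Range::Sketch and its details are
assumptions, not anything RFC 81 specifies. The point is only that the
range is stored as two numbers and elements are computed on FETCH.

```perl
use strict;
use warnings;

# Hypothetical read-only tied array representing ($from .. $to)
# without materialising the elements.
package Array::Range::Sketch;

sub TIEARRAY {
    my ($class, $from, $to) = @_;
    return bless { from => $from, to => $to }, $class;
}

sub FETCHSIZE {
    my $self = shift;
    my $size = $self->{to} - $self->{from} + 1;
    return $size > 0 ? $size : 0;
}

sub FETCH {
    my ($self, $index) = @_;
    return undef if $index < 0 || $index >= $self->FETCHSIZE;
    return $self->{from} + $index;    # computed on demand
}

package main;

tie my @a, 'Array::Range::Sketch', 1, 5_000_000;
print "size:  ", scalar(@a), "\n";
print "a[41]: $a[41]\n";
print "last:  $a[-1]\n";
```

Only the two endpoints are stored, however large the range is.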

  b) The call for $a[2,3;5,6] is
 
*) Put already-available SV pointers for $a, 2,3,5,6 and the cached SV*
   for tie::multi::separator() on stack;
 
*) Put the (cached) CV* for the method on stack;
 
*) invoke the call frame;
 
  This is not *very* quick, but at least it may be "not that slow".
  While all the alternatives require creation of anonymous lists, which
  (I expect) will slow things down 7..10 times for the call above.  For
  $a[1..100;1..100] it may easily be 100..1000 times slower.

 Lists of lists of known simple type are proposed by RFC 203 to be stored as
 true arrays (i.e. contiguously in memory). Their overhead is not the same as
 Perl 5 lists of lists.

Maybe.  But you still need to create a 200-element temporary array
the only purpose of which is to inform the tied array that you need
the upper-left 100x100 submatrix.

*You do not want to create new values unnecessarily*.  This is too
slow.  Quick operations should reuse already available values
instead.  See how scratchpads work...

Even if it is creation of a "streamlined" array, creation will still
take much more time than operation dispatch - which is in turn
painfully slow.

 The index in $a[1..100;1..100] should be generated lazily.

This is *exactly* what my proposal is doing.  The difference is that
it defines what "lazily" means.

  *) They are not compatible with overloading (unless overloaded things
 are dramatically changed);

 There are a number of RFCs proposing substantially changing overloading.
 What specific changes would we need to ensure were incorporated in P6 to
 avoid this incompatibility?

I see no way how they can be made compatible.  Overloading allows
objects to behave *both* as numbers and as array references.

Well, maybe there is a solution: 2 new overloaded accessors in
addition to '""', '0+', 'bool', '@{}', '${}' etc: "extract the value
as the array/hash index", defaulting to '0+' and '""' correspondingly.

  *) They go very high on the bizarreness scale.
 
 Bizarre??? Which RFC?

Binary ';'.

 RFCs 90 and 91: These builtins are in almost all languages with rich array
 functionality. 'merge' and 'demerge' are more frequently called 'zip' and
 'unzip', but those terms were almost universally rejected on -language.

These are convenience functions.  I do not see what they have to do
with the language design...

 RFC 204: Isn't it fairly intuitive that:
 
   $a[ [1,1] ] == $a[1][1];

It may be - for people who do not 

Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-16 Thread Jeremy Howard

Ilya Zakharevich wrote:
 On Sat, Sep 16, 2000 at 07:15:34PM +1100, Jeremy Howard wrote:
  Why is it important for overloaded objects to be used as array indices?

 Overloaded objects should behave the same way as non-objects.

  Why does RFC 204 rule that out? RFC 204 simply specifies that a list
  reference as an index provides multidimensional access:
 
$a[ [1,1] ] == $a[1][1];

 I repeat: what does

 $a[ $ind ]

 do if $ind is a (blessed) reference to array (1,1), but behaves as
 if it were 11 (due to overloading)?

How $ind is implemented (ie the actual structure that is blessed) does not
matter. What matters is what interface its class provides. If it overloads
operators such that dereferencing it does not provide an array, then it
shouldn't be expected to work as a multidimensional array index. If it
provides operators that give it the same interface as a list ref, then it
should work everywhere a list ref does.

  RFC 81 expands on the existing operator '..' in a list context to allow
  more generic list generation. It is particularly useful to generate lists
  to act as array slices:
 
@a[ 1..5 : 3] == @a[1,3,5];
 
  This would seem to conflict with the meaning of '..' outlined in RFC 231.

 Sorry, I see no conflict.  (Assuming that ternary '..' is allowed, the
 token tie::multi::range() would be followed by 3 numbers, not 2.)

 These calls will result in

   tied(@a)->FETCH_RANGE(tie::multi::range(), 1, 5, 3)
   tied(@a)->FETCH_RANGE(1, 3, 3)

 If FETCH_RANGE uses tie::multi::inline() to preprocess the keys, this
 *by definition* will result in the same array of keys.  If not, it
 is the responsibility of FETCH_RANGE to ensure the equivalence.

 And $a[ 1..5e6 ] would not need to create 5e6 Perl objects the only
 purpose of which is to inform the range extractor that it needs to
 create an object representing the slice.

From RFC 81:

<quote>
When a lazy list is passed to a function it is not evaluated. The function
can then access only the elements it needs, which are calculated as
required. Furthermore, the arguments that generated the list are available
as attributes of the list, and can therefore be used directly without
actually accessing the list
</quote>

It is not necessary to create 5e6 objects.

Furthermore, RFC 81 proposes syntax beyond just ($start..$stop: $step).
Implementing it using tie::multi::range() followed by 3 numbers would not be
enough. Anyway, we're defining a language interface here, not an
implementation, so we don't really need to nail this down immediately.

  When we first discussed ';' on the list, we looked at making it special in
  an index only. But the more generic approach of making it a cartesian
  product operator seems cleaner--it avoids 'special' meanings in favour of
  providing a generic operator.

 No, it is not a generic operator.  Its behavior depends on whether it
 is used *inside parens*, or not.  Additionally, the behaviour of
 cartesian product makes very little sense: if you did not want it 3
 times, you should not insert it into the language.

I'm not wedded to allowing ';' outside of a list index. However, it does
lead to both consistency and convenience with how list slicing is done in
Perl 5:

  # Perl 5 behaviour
  @indices = (1,3);
  @list = (3,4,5,6);
  @list[@indices] = (1,2);   # (3,1,5,2)

  # Multidim extension
  @2d_indices = ([0,0],[1,1]);
  @2d_arr = ([3,4,5],[6,7,8]);
  @2d_arr[@2d_indices] = (1,2);   # ([1,4,5],[6,2,8])

  # Slice syntax extension
  @2d_slice = (0..1 ; 0..1);   # ([0,0],[0,1],[1,0],[1,1])
  @2d_arr = ([3,4,5],[6,7,8]);
  @2d_arr[@2d_slice] = ([0,1],[0,1]);   # ([0,1,5],[0,1,8])

The implementation of ';' when used as a list index and then thrown away
clearly should not create an actual list of lists, for efficiency reasons. I
don't see why this case can't be dealt with appropriately.
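
For comparison, the semantics of the examples above can be emulated in
today's Perl 5. This is only a sketch: cartesian() and assign_at() are
hypothetical helpers, and the list of index pairs is built eagerly, which
is exactly the cost a real implementation of ';' inside an index would
want to avoid. The values to assign are passed as a flat list here.

```perl
use strict;
use warnings;

# Hypothetical helper: cartesian product of any number of ranges,
# returned as a list of index tuples (array references).
sub cartesian {
    my @result = ([]);
    for my $range (@_) {
        @result = map {
            my $prefix = $_;
            map { [ @$prefix, $_ ] } @$range
        } @result;
    }
    return @result;
}

# Hypothetical helper: assign one value per [row, column] index pair.
sub assign_at {
    my ($arr, $indices, @values) = @_;
    for my $n (0 .. $#$indices) {
        my ($i, $j) = @{ $indices->[$n] };
        $arr->[$i][$j] = $values[$n];
    }
    return;
}

my @slice = cartesian([0 .. 1], [0 .. 1]);
my @arr2d = ([3, 4, 5], [6, 7, 8]);
assign_at(\@arr2d, \@slice, 0, 1, 0, 1);

print join(' ', map { '[' . join(',', @$_) . ']' } @slice), "\n";
print join(' ', map { '[' . join(',', @$_) . ']' } @arr2d), "\n";
```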

 Maybe.  But it is not defined in the corresponding RFC nevertheless.
 At least: all I could deduce was that the following constructs are
 made synonymous:

   @a = ($a .. $b);
   tie @a, Array::Range, $a, $b;

 No other usage of .. is covered.

RFC 81 defines 4 uses of C<..>. It does not propose a specific
implementation in terms of C<tie>, or anything else--it simply defines a
language interface.

 *You do not want to create new values unnecessarily*.  This is too
 slow.  Quick operations should reuse already available values
 instead.  See how scratchpads work...

Agreed. RFC 81 proposes that generated lists be memoized, and that new
values are only created when required.

 Even if it is creation of a "streamlined" array, creation will still
 take much more time than operation dispatch - which is in turn
 painfully slow.

We should optimise special cases when we know which are causing problems.
Perl 5 may or may not provide useful experience here--the operation dispatch
approach in Perl 6 may be quite different, given how the -internals
discussions are progressing.

  RFC 204: Isn't it fairly intuitive that:
 
$a[ 

Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-16 Thread Ilya Zakharevich

On Sun, Sep 17, 2000 at 11:07:09AM +1100, Jeremy Howard wrote:
  I repeat: what does
 
  $a[ $ind ]
 
  do if $ind is a (blessed) reference to array (1,1), but behaves as
  if it were 11 (due to overloading)?
 
 How $ind is implemented (ie the actual structure that is blessed) does not
 matter. What matters is what interface its class provides.

As I said: it provides *both* numeric value and list reference
interface (as complex values may do).

 <quote>
 When a lazy list is passed to a function it is not evaluated. The function
 can then access only the elements it needs, which are calculated as
 required. Furthermore, the arguments that generated the list are available
 as attributes of the list, and can therefore be used directly without
 actually accessing the list
 </quote>

f(1, 10..1e6, 1e8..2e8, 1e9)

How can the body of f() query the "attributes" to see that it got
something lazy?

 Furthermore, RFC 81 proposes syntax beyond just ($start..$stop: $step).
 Implementing it using tie::multi::range() followed by 3 numbers would not be
 enough.

Another example of a "bizarre" (and not completely defined) interface.
I would think it stands very little chance of becoming reality.

Apparently, the authors of RFC81 assume that iterators become better
integrated if they are introduced by a funny syntax.  Since what they
want to accomplish is exactly this...

  my $iter = new iterator start => $a, next => sub {};
  foreach my $i (each $iter) {...}

Here an iterator is something which overloads '<>' (in Perl5 speak).
A way to integrate iterators would be very convenient indeed.  As you
see, in principle it does not need any funny syntax...
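
A minimal Perl 5 sketch of such an iterator. The class name
Iterator::Sketch and the convention that the 'next' callback returns undef
to stop are assumptions. The 'each $iter' form in foreach is not available
in Perl 5, so the loop below reads values through the overloaded '<>'
instead.

```perl
use strict;
use warnings;

# Hypothetical iterator class: holds the pending value and a callback
# that computes the following value from the current one.
package Iterator::Sketch;
use overload
    '<>'     => sub { my $self = shift; return $self->fetch },
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless { pending => $args{start}, step => $args{next} }, $class;
}

sub fetch {
    my $self  = shift;
    my $value = $self->{pending};
    # Advance only while the sequence has not ended (undef means "done").
    $self->{pending} = $self->{step}->($value) if defined $value;
    return $value;
}

package main;

my $iter = Iterator::Sketch->new(
    start => 1,
    next  => sub { my $cur = shift; return $cur < 5 ? $cur + 1 : undef },
);

my @seen;
while (defined(my $i = <$iter>)) {
    push @seen, $i;
}
print "@seen\n";
```

No value is computed before the loop asks for it, and no new syntax is
involved.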

 Anyway, we're defining a language interface here, not an
 implementation, so we don't really need to nail this down immediately.

No, an interface without a feasible implementation in mind is not viable.

   # Slice syntax extension
   @2d_slice = (0..1 ; 0..1);   # ([0,0],[0,1],[1,0],[1,1])

This is very expensive.  Do you know any example when such a list is
needed as a final result, not as a temporary?

@a = ($a .. $b);
tie @a, Array::Range, $a, $b;
 
  No other usage of .. is covered.
 
 RFC 81 defines 4 uses of C<..>.

Sorry, the only context which I could find is the one above.

   The index in $a[1..100;1..100] should be generated lazily.
 
  This is *exactly* what my proposal is doing.  The difference is that
  it defines what "lazily" means.
 
 Except that your proposal changes the language interface. In particular, it
 doesn't allow the creation of contiguous slices, AFAICS. @a[1..100;1..100]
 should refer to the whole box bounded by (1,1) and (100,100).

I have no idea what you are talking about.  What else can it *mean*
but the whole box?  Having different calling conventions does not mean
that the *results* are different.

 It's very important. It shows that a particular syntax is intuitive enough
 that it is understood by people with a wide range of backgrounds. Intuitive
 syntax is an important language design goal.

The syntax and the access semantics of RFC81 and of RFC231 are the same.
However, RFC231 explains how these semantics can be achieved via simple
tie() interfaces.

 RFC 231 does not (yet) effectively cover the same range of problems that the
 array RFCs do. We need multidimensional slicing (not just multiple
 indexing)

Is in RFC231.

 flexible list generation,

This is orthogonal.  And I do not see why this needs to be in the
core language at all.  I would guess that an appropriate module with
interfaces to generate efficient arrays is a better place for this
(see the example above).

 multiple levels of indirection,

Do not know what you mean here.

 and fast and compact reshaping.

If we want the reshaping to be supported by builtin arrays *and*
transparently by overloaded arrays, then yes, it needs to be in
the core.  But I see no need for this.

Again, this looks as belonging to a module, not to the core.

Ilya