RFC 148 (v2) Add reshape() for multi-dimensional array reshaping

2000-09-18 Thread Perl6 RFC Librarian


This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Add reshape() for multi-dimensional array reshaping

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Aug 2000
  Last Modified: 18 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 148
  Version: 2
  Status: Developing

=head1 CHANGES

   1. Altered syntax to increase flexibility

   2. Removed arbitrary interleaving feature

   3. Almost had a stroke trying to update all my RFC's

=head1 ABSTRACT

Currently, there is no easy way to reshape existing arrays into multiple
arrays or matrices. This makes nifty array manipulation and complex math
hard.

A general-purpose tool that can do arbitrary multi-dimensional array
reshaping, from which other array manipulation functions can be derived,
makes data manipulation easier.

=head1 DESCRIPTION

Let's jump in. This RFC proposes a C<reshape> builtin with the following
syntax:

  @reshaped = reshape [$x, $y, $z, ..], @a, @b ...

The prototype would look something like this:

  sub reshape (\@;\@\@\@\@\@\@...);

The first argument is an array of dimensions, where C<$x> and C<$y> are
the shape of the array to produce. The order and meaning of the
arguments are the same ones used by the C<:shape> array attribute
described in B<RFC 203>.

We only need one C<reshape> since it is a multipurpose tool that works
in any direction, serving as its own inverse.

The dimensions used are subject to the following properties:

  1. Less data than specified causes C<reshape> to
     return undef

  2. More data than specified is silently discarded

If any of the dimensions is specified as C<-1>, then that indicates a
wildcard and grabs enough data to fill up the list. See below.

=head2 Single Array Form - SPLIT

When one array is passed in, it is split up. Here, the C<$x> and C<$y>
determine the dimensions of the resulting lists. The C<$i> determines
the interleave. For example, assume reshape is called with the list
(1..23) in the following forms:

   @a = 1..23;
   @results = reshape [$x,$y], @a;

   $x,$y   @results
   -----   --------
   3, 2    ( [1,2,3], [4,5,6] )
   2, 4    ( [1,2],[3,4],[5,6],[7,8] )
   14,20   undef - not enough data to fill

Notice how both dimensions work together to C<reshape> the array. As
such, the combination of the arguments is more significant than the
individual arguments themselves. Also, note that any excess data left
over after the dimensions have been fulfilled is discarded. In the final
example, undef is returned, allowing you to easily check if you have
enough data:

   @matrix = reshape [14,20], @input or die "Not enough data!";

In addition, wildcards can be used. With a fixed C<$y>, only that many
lists are returned.  However, with a wildcard C<$y>, any number of
C<$x>-long lists are returned:

   $x,$y   @results
   -----   --------
   4, -1   ( [1,2,3,4], [5,6,7,8],
             [9,10,11,12], [13,14,15,16],
             [17,18,19,20] )   # lose 21..23

Note that we lose data here because we can't get an exact number of
lists of length C<$x>. With a fixed C<$x>, every list returned I<must>
have that fixed length.

However, with a wildcard C<$x>, lists will be expanded to fill the
number specified by C<$y>, even in mismatched sizes:

   $x,$y   @results
   -----   --------
   -1, 2   ( [1,2,3,4,5,6,7,8,9,10,11,12],
             [13,14,15,16,17,18,19,20,21,22,23] )

Here, all the data is guaranteed to be preserved. It is simply split
into exactly the number of parts specified by C<$y>, even if that
results in some lists being different sizes.
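
The split rules described in this section can be sketched in plain Perl 5.
This is only an illustration under stated assumptions, not the proposed
builtin: C<reshape_split> is a hypothetical helper name; it returns the
empty list rather than a literal undef so that the C<or die> idiom works
with list assignment; with a wildcard C<$x> it hands the extra elements to
the earlier lists; and the double wildcard, which this section does not
define, also returns the empty list.

```perl
use strict;
use warnings;

# Hypothetical helper (not the proposed builtin): split a flat list into rows.
# $dims is [$x, $y]; -1 in either slot is the wildcard.
sub reshape_split {
    my ($dims, @data) = @_;
    my ($x, $y) = @$dims;
    my $n = @data;

    # Assumption: both wildcards is undefined for the split form.
    return if $x == -1 && $y == -1;

    if ($x == -1) {
        # Wildcard $x: exactly $y rows, all data kept.
        # Assumption: earlier rows absorb the remainder.
        my ($base, $extra) = (int($n / $y), $n % $y);
        my @rows;
        for my $i (0 .. $y - 1) {
            my $len = $base + ($i < $extra ? 1 : 0);
            push @rows, [ splice @data, 0, $len ];
        }
        return @rows;
    }

    # Wildcard $y: as many full rows of length $x as the data allows.
    $y = int($n / $x) if $y == -1;

    # Too little data: empty list, which is false in "or die" checks.
    return if $n < $x * $y;

    # Fixed shape: $y rows of $x elements; any excess is discarded.
    return map { [ @data[ $_ * $x .. ($_ + 1) * $x - 1 ] ] } 0 .. $y - 1;
}

my @fixed = reshape_split([3, 2],  1 .. 23);   # ( [1,2,3], [4,5,6] )
my @wideY = reshape_split([4, -1], 1 .. 23);   # 5 rows of 4; 21..23 dropped
my @wideX = reshape_split([-1, 2], 1 .. 23);   # rows of 12 and 11 elements
```

With this helper, a call such as
C<my @m = reshape_split([14,20], @input) or die "Not enough data!"> dies
whenever C<@input> holds fewer than 280 elements.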

=head2 Multiple Array Form - JOIN

In this form, multiple arrays are joined back together. Here, C<$x>
and C<$y> specify the dimensions to use to rejoin the lists, not
to split them up. The dimensions simply work in reverse: Rather than
specifying how many lists to create, they specify which elements of the
input lists are joined back together.

So, we'll assume an input array of the form:

   ( [1,2,3,4], [5,6,7,8], [9,10,11,12] )

Which is called by C<reshape> with the following dimensions:

   $x,$y   @results
   -----   --------
   -1,-1   ( 1,2,3,4,5,6,7,8,9,10,11,12 )  # simple concat
   3, -1   ( 1,2,3,5,6,7,9,10,11 )         # 3 vals from all lists
   -1, 2   ( 1,2,3,4,5,6,7,8 )             # all vals from 2 lists
   3, 2    ( 1,2,3,5,6,7 )                 # 3 vals x 2 lists

Hopefully this is easy to understand. C<$x> controls how many elements
of each list are used, and C<$y> controls how many lists are used. This
is just like the splitting operation, but in reverse. Again, wildcards
of C<-1> can be used here as well.
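
Read literally, the rule in the paragraph above (C<$x> elements taken from
each of C<$y> lists) might be sketched as follows. C<reshape_join> is a
hypothetical helper name, and it takes array references explicitly because
plain Perl 5 has no prototype matching the one proposed here. Returning the
empty list when a fixed dimension asks for more than is available is an
assumption carried over from the "less data than specified" rule.

```perl
use strict;
use warnings;

# Hypothetical helper (not the proposed builtin): join rows into a flat list.
# $x = elements to take from each row, $y = number of rows to use; -1 = all.
sub reshape_join {
    my ($dims, @rows) = @_;
    my ($x, $y) = @$dims;

    # Assumption: asking for more rows than exist counts as "not enough data".
    return if $y != -1 && $y > @rows;
    $y = @rows if $y == -1;

    my @out;
    for my $row (@rows[0 .. $y - 1]) {
        # Assumption: a row shorter than a fixed $x is also "not enough data".
        return if $x != -1 && $x > @$row;
        my $take = $x == -1 ? scalar @$row : $x;
        push @out, @{$row}[0 .. $take - 1];
    }
    return @out;
}

my @in = ( [1,2,3,4], [5,6,7,8], [9,10,11,12] );

my @all   = reshape_join([-1, -1], @in);   # (1 .. 12)
my @three = reshape_join([3, -1],  @in);   # (1,2,3,5,6,7,9,10,11)
my @two   = reshape_join([-1, 2],  @in);   # (1 .. 8)
my @both  = reshape_join([3, 2],   @in);   # (1,2,3,5,6,7)
```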

=head2 Matrix Calculations and Extensions

It is the opinion of the author that extensive matrix calculations and
manipulations be left to external modules. A function such as this
should be able to take care of most of the 

Re: RFC 148 (v2) Add reshape() for multi-dimensional array reshaping

2000-09-18 Thread Jeremy Howard

 Let's jump in. This RFC proposes a C<reshape> builtin with the following
 syntax:

Err... this syntax isn't what I expected at all! I thought the first
argument would define the shape of the result, like NumPy or PDL...

 When one array is passed in, it is split up. Here, the C<$x> and C<$y>
 determine the dimensions of the resulting lists. The C<$i> determines
 the interleave.

This $i definition should be removed now.

 So, we'll assume an input array of the form:

    ( [1,2,3,4], [5,6,7,8], [9,10,11,12] )

 Which is called by C<reshape> with the following dimensions:

    $x,$y   @results
    -----   --------
    -1,-1   ( 1,2,3,4,5,6,7,8,9,10,11,12 )  # simple concat

I think that a simple concat is:

  reshape ([-1], @a);

since here the rank of a list is one, so the length of the first argument to
reshape is one. The -1 means 'use up the whole list'. It should be an error
to have more than one arg of -1.

    3, -1   ( 1,2,3,5,6,7,9,10,11 )   # 3 vals from all lists

That should be a rank 2 matrix of shape (3,4), i.e.
([1,2,3],[4,5,6],[7,8,9],[10,11,12]).

    -1, 2   ( 1,2,3,4,5,6,7,8 )   # all vals from 2 lists

That's a rank 2 matrix of shape (6,2).

    3, 2    ( 1,2,3,5,6,7 )   # 3 vals x 2 lists

That's a rank 2 matrix of shape (3,2), which would discard the last 6
elements.

 Hopefully this is easy to understand. C<$x> controls how many elements
 of each list are used, and C<$y> controls how many lists are used. This
 is just like the splitting operation, but in reverse. Again, wildcards
 of C<-1> can be used here as well.

I don't think that there's 2 types of reshape(). There's one, and it takes
an array of one shape, and returns an array of a different shape. The shape
of the new array is specified by the first argument. The second argument is
a list, so it succumbs to Perl's normal list flattening behaviour.

The behaviour of reshape() should reflect:

 - PDL's reshape()
 - NumPy's reshape() -- which is the only one allowing '-1' in the shape
 - J's '$' verb

which all behave the same way.

Also, the examples should show reshaping into and out of arrays other than
rank 1 or 2.
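
A rough Perl 5 sketch of the shape-first behaviour described above may help
make it concrete. Everything here is an assumption for illustration:
reshape_shape is a hypothetical name, the first extent is treated as the
innermost length (matching the (3,4) example given earlier in this message),
excess data is dropped, too little data yields the empty list, and nested
input is expected to be flattened by the caller. It is not a statement of
how PDL, NumPy or J order their dimensions.

```perl
use strict;
use warnings;

# Hypothetical shape-first reshape: $shape holds one extent per dimension,
# innermost first; a single -1 is sized from the amount of data supplied.
sub reshape_shape {
    my ($shape, @data) = @_;
    my @dims = @$shape;

    die "shape extents must be positive or -1\n"
        if grep { $_ != -1 && $_ < 1 } @dims;

    my @wild = grep { $dims[$_] == -1 } 0 .. $#dims;
    die "at most one -1 allowed in shape\n" if @wild > 1;

    if (@wild) {
        my $known = 1;
        $known *= $_ for grep { $_ != -1 } @dims;
        $dims[ $wild[0] ] = int(@data / $known);   # use up the whole list
    }

    my $total = 1;
    $total *= $_ for @dims;
    return if @data < $total;                # assumption: not enough data

    my @level = @data[0 .. $total - 1];      # assumption: excess is dropped

    # Group by each extent except the last, building the nesting inside-out.
    for my $len (@dims[0 .. $#dims - 1]) {
        my @next;
        push @next, [ splice @level, 0, $len ] while @level;
        @level = @next;
    }
    return @level;
}

my @flat  = reshape_shape([-1],      1 .. 12);  # (1 .. 12), the simple concat
my @rank2 = reshape_shape([3, -1],   1 .. 12);  # ([1,2,3],[4,5,6],[7,8,9],[10,11,12])
my @cut   = reshape_shape([3, 2],    1 .. 12);  # ([1,2,3],[4,5,6]), last 6 dropped
my @rank3 = reshape_shape([2, 3, 2], 1 .. 12);  # 2 blocks, each 3 rows of 2
```

The rank 3 call returns ( [[1,2],[3,4],[5,6]], [[7,8],[9,10],[11,12]] ).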

Sorry Nate--I know we thought we were on the same wavelength here, but it
looks like we weren't at all! Would you like me to redraft this for you, or
create a new RFC?





Re: RFC 148 (v2) Add reshape() for multi-dimensional array reshaping

2000-09-18 Thread Nathan Wiger

 Sorry Nate--I know we thought we were on the same wavelength here, but it
 looks like we weren't at all! Would you like me to redraft this for you, or
 create a new RFC?

It's all yours. My brain is toast, and I'm totally RFC'ed out. The only
thing I care about is that the lists wind up on the end, i.e.

   @results = reshape [...], @a, @b, ...;

Sorry for screwing it up. Please take over as maintainer.

-Nate