Re: [S09] Whatever indices and shaped arrays

2007-03-08 Thread Dr.Ruud
David Green wrote:
> Jonathan Lang:
>
>> (In fact, the semantics for @x[*+n] follows directly from the fact
>> that an array returns the count of its elements in scalar context.)
>> And @x[*] would be the same as @x[0..^*] or @x[0..(*-1)].
>
> That's an elegance in its favour.

In Perl5 a + can creep in, for example:

$ perl -wle '$s = "-123"; $n = -123; print -$s; print -$n'
+123
123

so maybe it is not a bad idea to keep treating a unary + as (almost) a
no-op.

-- 
Anyway, Ruud

Ordinary is a tiger.



Re: [S09] Whatever indices and shaped arrays

2007-03-07 Thread Jonathan Lang

OK: before I submit a patch, let me make sure that I've got the
concepts straight:

@x[0] always means the first element of the array; @x[-1] always
means the last element of the array; @x[*+0] always means the
first element after the end of the array; @x[*-1] always means the
first element before the beginning of the array.  That is, the
indices go:

..., *-3, *-2, *-1, 0, 1, 2, ..., -3, -2, -1, *+0, *+1, *+2, ...
                    ^                      ^
                    |                      |
                  first                   last

As well, a Whatever in square braces is treated as an array of the
valid positions; so @x[*] is equivalent to @x[0..-1].

If you want to use sparse indices and/or indices that begin somewhere
other than zero, access them using curly braces.  Consider an array
with valid indices ranging from -2 to +2: @x{-2} means element -2,
which would be equivalent to @x[0]; @x{+2} means element 2, which
would be equivalent to @x[-1].  Likewise, @x{0} is the same as @x[2],
@x{-3} is the same as @x[*-1], @x{+3} is the same as @x[*+0], and so
on.  If @y has a series of five indices that start at 1 and double
with each step, then @y{1} will be the same as @y[0]; @y{4} will be
the same as @y[2], and so on.

A Whatever in curly braces is treated as an array of the valid index
names; so @x{*} means @x{-2..+2}, and @y{*} means @y{1, 2, 4, 8, 16}.
Because it is treated as an array, individual index names can be
accessed by position: @x{*[0]} is a rather verbose way of saying
@x[0].  This lets you embed ordinal indices into slices involving
named indices.  Conversely, using *{...} inside square braces lets you
embed named indices into slices involving ordinal indices: @x[*{-2}]
is the same as @x{-2}.
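
A minimal sketch of the two subscript styles described above, written in Python only because the Perl 6 syntax is still being designed. The class `KeyedArray` and its methods `at_ordinal` and `at_key` are hypothetical names: `at_ordinal` stands in for @x[...] and `at_key` for @x{...}.

```python
class KeyedArray:
    """Hypothetical model: an ordered list of index names plus values."""

    def __init__(self, keys, values):
        self.keys = list(keys)
        self.values = list(values)

    def at_ordinal(self, n):
        # Models @x[n]: 0 is the first element, -1 the last.
        return self.values[n]

    def at_key(self, k):
        # Models @x{k}: find the position of the index name, then fetch.
        return self.values[self.keys.index(k)]


x = KeyedArray(range(-2, 3), "abcde")      # valid indices -2..+2
y = KeyedArray([1, 2, 4, 8, 16], "pqrst")  # indices double with each step

print(x.at_key(-2) == x.at_ordinal(0))     # @x{-2} vs @x[0]
print(x.at_key(2) == x.at_ordinal(-1))     # @x{+2} vs @x[-1]
print(y.at_key(4) == y.at_ordinal(2))      # @y{4}  vs @y[2]
print(x.at_key(x.keys[0]) == x.at_ordinal(0))  # @x{*[0]} vs @x[0]
```

In this model `x.keys[0]` plays the part of `*[0]` inside curly braces: a position turned back into an index name.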

Multidimensional arrays follow the above conventions for each of their
dimensions; so a single-splat provides a list of every index in a given
dimension, a 0 refers to the first index in that dimension, and so on.
A double-splat extends the concept to a multidimensional list that
handles an arbitrary number of dimensions at once.

--

Commentary: I find the sequence of ordinals outlined above to be a bit
messy, especially when you start using ranges of indices: you need to
make sure that @x[0..-1] dwims, that @x[-1..(*+0)] dwims, that
@x[(*-2)..(*+3)] dwims, and so on.  This is a potentially very ugly
process.  As well, the fact that @x[-1] doesn't refer to the element
immediately before @x[0] is awkward, as is the fact that @x[*-1]
doesn't refer to the element immediately before @x[*+0].  IMHO, it
would be cleaner to have @x[n] count forward and backward from the
front of the array, while @x[*+n] counts forward and backward from
just past the end of the array:

..., -3, -2, -1, 0, 1, 2, ..., *-3, *-2, *-1, *+0, *+1, *+2, ...
                 ^                        ^
                 |                        |
               first                     last

So perl 5's $x[-1] would always translate to @x[*-1] in perl 6.
Always.  Likewise, @x[+*] would be the same as @x[*+0].  (In fact,
the semantics for @x[*+n] follows directly from the fact that an
array returns the count of its elements in scalar context.)  And
@x[*] would be the same as @x[0..^*] or @x[0..(*-1)].

You would lose one thing: the ability to select an open-ended Range of
elements.  For a five-element list, @x[1..^*] means @x[1, 2, 3,
4], not @x[1, 2, 3, 4, 5, 6, 7, 8, ...].
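
A rough Python sketch of this alternative, assuming that * inside a subscript simply stands for the element count. The helper `resolve` and its `from_end` flag are made-up names for illustration, not proposed syntax.

```python
def resolve(length, n, from_end=False):
    """Map a subscript to a plain offset from the front of the array.

    from_end=False models @x[n]; from_end=True models @x[*+n],
    where * stands for the number of elements.
    """
    return length + n if from_end else n


x = ["a", "b", "c", "d", "e"]
end = resolve(len(x), 0, from_end=True)          # *+0, just past the end

last = x[resolve(len(x), -1, from_end=True)]     # @x[*-1]
whole = [x[i] for i in range(0, end)]            # @x[0..^*]
tail = [x[i] for i in range(1, end)]             # @x[1..^*]
print(last, whole, tail)
```

For the five-element list, the last line picks positions 1 through 4 only, which is the loss of open-ended ranges mentioned above.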

Technically, one could say @x{+*} to reference the index that
coincides with the number of indices; but it would only be useful in
specific cases, such as referencing the last element of a one-based
contiguous array.

--
Jonathan Dataweaver Lang


Re: [S09] Whatever indices and shaped arrays

2007-03-07 Thread David Green

On 3/7/07, Jonathan Lang wrote:

summary snipped


Looks good to me.

As well, the fact that @x[-1] doesn't refer to the element 
immediately before @x[0] is awkward, as is the fact that @x[*-1] 
doesn't refer to the element immediately before @x[*+0].  IMHO, it 
would be cleaner to have @x[n] count forward and backward from the 
front of the array, while @x[*+n] counts forward and backward from 
just past the end of the array:


I suggested that at one point, so I'd agree that makes sense too.  It 
avoids the discontinuity at either end of the array -- although 
arguably, points off the end of a list aren't in the same boat as 
elements that actually exist, so the discontinuity might be 
conceptually justified.  (Make the weird things look weird?)


(In fact, the semantics for @x[*+n] follows directly from the fact 
that an array returns the count of its elements in scalar context.) 
And @x[*] would be the same as @x[0..^*] or @x[0..(*-1)].


That's an elegance in its favour.

One possible downside is that it wouldn't work for cyclic/wrap-around 
arrays (where the indices are always interpreted mod n) -- since any 
number would always refer to an existing element.  Oh -- but if an 
index isn't a plain counter, then it should be a named key, so scrap 
that.
(The question then is: how to have reducible hash keys?  By which I 
mean different keys that get reduced to the same thing, e.g. %x{1} 
=== %x{5} === %x{9} === %x{13}, etc.  Presumably you can just 
override the .{} method on your hash, right?)


You would lose one thing: the ability to select an open-ended Range 
of elements.  For a five-element list, @x[1..^*] means @x[1, 2, 
3, 4], not @x[1, 2, 3, 4, 5, 6, 7, 8, ...].


Except wouldn't the .. interpret the * before the [] did?  So 1..* 
would yield a range-object from 1 to Inf, and then the array-deref 
would interpret 1..Inf accordingly.


Actually, it seems more useful if the * could mean the count; you can 
always say 1..Inf if that's what you want, but otherwise how would 
you get [1..^*] meaning [1,2,3,4]?  Perhaps the range could note when 
it's occurring in []-context, and interpret the * as count rather 
than as Inf?
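
To make the two readings concrete, here is a small Python sketch. It assumes the subscript, not the range, decides where to stop; the function name `slice_with_range` is hypothetical.

```python
import itertools


def slice_with_range(array, start, stop=None):
    """Fetch array elements for start..^stop; stop=None models 1..* (1..Inf)."""
    if stop is None:
        indices = itertools.count(start)   # open-ended, like 1..Inf
    else:
        indices = range(start, stop)       # bounded, like 1..^5
    # Indices that run past the end of the array are simply dropped.
    in_bounds = itertools.takewhile(lambda i: i < len(array), indices)
    return [array[i] for i in in_bounds]


x = [10, 20, 30, 40, 50]
print(slice_with_range(x, 1))           # * read as Inf, clipped by the array
print(slice_with_range(x, 1, len(x)))   # * read as the count
```

With clipping in the subscript, both readings of * select the same elements for an ordinary array, so the choice only matters where nothing clips the range.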



-David


Re: [S09] Whatever indices and shaped arrays

2007-03-06 Thread Larry Wall
I like it.  I'm a bit strapped for time at the moment, but if you send
me a patch for S09 I can probably dig up a program to apply it with.  :)

Larry


Re: [S09] Whatever indices and shaped arrays

2007-03-06 Thread Jonathan Lang

Larry Wall wrote:

I like it.  I'm a bit strapped for time at the moment, but if you send
me a patch for S09 I can probably dig up a program to apply it with.  :)


Could someone advise me on how to create patches?

--
Jonathan Dataweaver Lang


Re: [S09] Whatever indices and shaped arrays

2007-03-06 Thread Juerd Waalboer
Jonathan Lang wrote 2007-03-06 13:35 (-0800):
> Could someone advise me on how to create patches?

Single file:

diff -u oldfile newfile

Entire tree:

diff -Nur oldtree/ newtree/

See also diff(1), and note that when diffing trees, you want to
distclean them first :)
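
If diff(1) is not to hand, the same unified format can be produced from a script. This is only a sketch using Python's standard difflib module; the two line lists are invented sample content, and the labels oldfile and newfile just mirror the command above.

```python
import difflib

# Invented sample content standing in for the old and new file.
old = ["first line\n", "second line\n"]
new = ["first line\n", "second line, revised\n"]

# unified_diff yields the lines of a 'diff -u' style patch.
patch = "".join(
    difflib.unified_diff(old, new, fromfile="oldfile", tofile="newfile")
)
print(patch)
```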
-- 
kind regards,

  juerd waalboer:  perl hacker  [EMAIL PROTECTED]  http://juerd.nl/sig
  convolution: ict solutions and consultancy [EMAIL PROTECTED]

I do not trust voting computers.
See http://www.wijvertrouwenstemcomputersniet.nl/.


Re: [S09] Whatever indices and shaped arrays

2007-03-05 Thread David Green

On 2/27/07, Jonathan Lang wrote:

David Green wrote:
So I end up back at one of Larry's older ideas, which basically is: 
[] for counting, {} for keys.


What if you want to mix the two?  I want the third element of row 
5. In my proposal, that would be @array[5, *[2]]; in your 
proposal, there does not appear to be a way to do it.


Unless the two approaches aren't mutually exclusive: @array{5, 
*[2]}.  [...] Since this is an unlikely situation, the fact that 
nesting square braces inside curly braces is a bit uncomfortable 
isn't a problem: this is a case of making hard things possible, not 
making easy things easy.


Oh, good point.  Yes, I think that mixing them together that way makes sense.
It also suggests that you could get at the named keys by applying {} to *:
%foo[0, 1, *{'bar'}]; #first column, second row, bar layer

The one gotcha that I see here is with the possibility of 
multi-dimensional arrays.  In particular, should  multi-dimensional 
indices be allowed inside square braces? [...]  With that promise, 
you can always guarantee that the wrap-around semantics will work 
inside [], while nobody will expect them to work inside {}.


Right, I don't see a problem with handling any number of dimensions that way.

Furthermore, you could do away with the notion of shaped vs. 
unshaped: just give everything a default shape.  The default shape 
for arrays would be '[*]' - that is, one dimension with an 
indeterminate number of ordinals.


Meanwhile, shapes for {} would continue to use the current syntax.
'[$x, $y, $z]' would be nearly equivalent to '{0..^$x; 0..^$y; 0..^$z}'.


Agreed.

it can work in the usual way: start at 0, end at -1.  It is useful 
to be able to count past the ends of an array, and * can do this by 
going beyond the end: *+1, *+2, etc., or before the beginning: *-1, 
*-2, etc.  (This neatly preserves the notion of * as all the 
elements -- *-1 is the position before everything, and  *+1 is the 
position after everything else.)


Regardless, I would prefer this notion to the offset from the 
endpoint notion currently in use.  Note, however, that [*-1] 
wouldn't work in the ordinals paradigm; there simply is nothing 
before the first element.  About the only use I could see for it 
would be to provide an assignment equivalent of unshift: 
'@array[*-1] = $x' could be equivalent to 'unshift @array, $x'.  But 
note that, unlike the 'push'-type assignments, this would change 
what existing ordinals point to.


I figured that *-1 or *+1 would work like unshift/push, which 
effectively does change what the ordinals point to (e.g.  unshifting 
a P5 array).  If the array is not extensible, then it should fail in 
the same way as unshift/push would.


Meanwhile, {*-1} would only make sense in cases where keys are 
ordered and new keys can be auto-generated.  Note also that {*+$x} 
is compatible with {*[$x]}: the former would reference outside of 
the known set of keys, while {*[$x]} would reference within them.


Exactly.


-David


Re: [S09] Whatever indices and shaped arrays

2007-02-27 Thread David Green

On 2/24/07, Jonathan Lang wrote:
In effect, using * as an array of indices gives us the ordinals 
notation that has been requested on occasion: '*[0]' means 'first 
element', '*[1]' means 'second element', '*[-1]' means 'last 
element',
'*[0..2]' means 'first three elements', and so on - and this works 
regardless of what the actual indices are.


Using * that way works, but it still is awkward, which makes me think 
there's something not quite dropping into place yet.  We have the 
notion of keyed indexing via [] and counting/ordinal indexing via 
[*[]], which is rather a mouthful.  So I end up back at one of 
Larry's older ideas, which basically is: [] for counting, {} for keys.


To put a slight twist on it: instead of adding {}-indexing to arrays, 
consider that what makes something an array is that it doesn't have 
keys -- it's a collection of things that you can count through, as 
opposed to a collection that you search through by meaningful 
keys/names/tags/references/etc.  (E.g., consider positional vs. named 
params, and how they naturally map onto an array and a hash 
respectively.)


Now something that is countable doesn't have to have meaningful keys, 
but any keyed collection can be counted through; hence it makes sense 
to give hashes an array-like [] accessor for getting the 
first/last/nth item in the hash.  In fact, this is basically what 
%h.values gives you -- turning the hash values into an array (well, a 
list).  Saying %h[n] would amount to a direct way of saying 
@(%h.values)[n].
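
As a rough analogy in Python (whose dicts keep insertion order), counting through a keyed collection looks like this. The helper name `nth` is made up; it models %h[n] as @(%h.values)[n].

```python
h = {"apple": 3, "pear": 5, "plum": 7}   # keyed access: h["pear"]


def nth(mapping, n):
    # Ordinal access: ignore the keys and count through the values.
    return list(mapping.values())[n]


print(h["pear"])    # by key, like %h{'pear'}
print(nth(h, 0))    # first item, like %h[0]
print(nth(h, -1))   # last item, like %h[-1]
```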


This becomes much more handy in P6, because hashes can be ordered. 
(Not that there's anything stopping you from counting through an 
unordered hash; %h[0] is always the first element of %h, you just 
might not know what that is, the same as with %h.values.)  If Perl 
knows how to generate new keys on the fly (say, because your possible 
hash keys were declared as something inc-/dec-rementable), then you 
can even access elements off the ends of your hash (push/unshift).


What about shaped arrays?  A shape means the indices *signify* 
something (if they didn't, you wouldn't care, you'd just start at 
0!).  So they really are *keys*, and thus should use a hash (which 
may not use any hash tables at all, but it's still an associative 
array because it associates meaningful keys with elements).  I'm not 
put off by calling it a hash -- I trust P6 to recognise when I 
declare a hash that is restricted to consecutive int keys, is 
ordered, etc. and to optimise accordingly.


If there are no meaningful lookup keys, if all I can do to get 
through my list is count the items, then an array is called for, and 
it can work in the usual way: start at 0, end at -1.  It is useful to 
be able to count past the ends of an array, and * can do this by 
going beyond the end: *+1, *+2, etc., or before the beginning: *-1, 
*-2, etc.  (This neatly preserves the notion of * as all the 
elements -- *-1 is the position before everything, and *+1 is the 
position after everything else.)



Well, at least this keeps the easy stuff (counting) easy, and the 
barely-harder stuff (keying) possible.  In fact, since hashes would 
always have both views available, nothing is lost; we get ordinals 
for hashes, shaped collections, and ones that you can pass to a sub 
without losing their shape, it solves the problem of distinguishing 
between ordinal vs. funny indices (and the related issues of 
wrap-around), you can count past the edges, and all while preserving 
familiar array behaviour (especially for P5 veterans), the meaning of 
* as everything, and uncluttered syntax.



-David


Re: [S09] Whatever indices and shaped arrays

2007-02-27 Thread Jonathan Lang

David Green wrote:

On 2/24/07, Jonathan Lang wrote:
In effect, using * as an array of indices gives us the ordinals
notation that has been requested on occasion: '*[0]' means 'first
element', '*[1]' means 'second element', '*[-1]' means 'last
element',
'*[0..2]' means 'first three elements', and so on - and this works
regardless of what the actual indices are.

Using * that way works, but it still is awkward, which makes me think
there's something not quite dropping into place yet.  We have the
notion of keyed indexing via [] and counting/ordinal indexing via
[*[]], which is rather a mouthful.  So I end up back at one of
Larry's older ideas, which basically is: [] for counting, {} for keys.


What if you want to mix the two?  I want the third element of row 5.
In my proposal, that would be @array[5, *[2]]; in your proposal,
there does not appear to be a way to do it.

Unless the two approaches aren't mutually exclusive: @array{5,
*[2]}.  That is, allow subscripted Whatevers within curly braces
to enable the mixing of ordinals and keys.  Since this is an unlikely
situation, the fact that nesting square braces inside curly braces is
a bit uncomfortable isn't a problem: this is a case of making hard
things possible, not making easy things easy.


What about shaped arrays?  A shape means the indices *signify*
something (if they didn't, you wouldn't care, you'd just start at
0!).  So they really are *keys*, and thus should use a hash (which
may not use any hash tables at all, but it's still an associative
array because it associates meaningful keys with elements).  I'm not
put off by calling it a hash -- I trust P6 to recognise when I
declare a hash that is restricted to consecutive int keys, is
ordered, etc. and to optimise accordingly.


The one gotcha that I see here is with the possibility of
multi-dimensional arrays.  In particular, should multi-dimensional
indices be allowed inside square braces?  My gut instinct is yes;
conceptually, the third row of the fourth column is perfectly
reasonable terminology to use.  The thing that would distinguish []
from {} would be a promise to always use zero-based, consecutive
integers as your indices, however many dimensions you specify.  With
that promise, you can always guarantee that the wrap-around semantics
will work inside [], while nobody will expect them to work inside {}.

In short, the distinction being made here isn't unshaped vs.
shaped; it's ordinal indices vs. named indices, or ordinals
vs. keys.

That said, note that - in the current conception, at least - one of
the defining features of a shaped array is that trying to access
anything outside of the shape will cause an exception.  How would
shapes work with the ordinals-and-keys paradigm?

First: Ordinals have some severe restrictions on how they can be
shaped, as specified above.  The only degrees of freedom you have are
how many dimensions are allowed and, for each dimension, how many
ordinals are permitted.  Well, also the value type (although the key
type is fixed as Int where 0..*).  So you could say something like:

 my @array[2, 3, *]

...which would mean that the array must be three-dimensional; that the
first dimension is allowed two ordinals, the second is allowed three,
and the third is allowed any number of them - i.e., 'my @array[^2; ^3;
0..*]' in the current syntax.  Or you could say:

 my @array[2, **, 2]

...meaning that you can have any number of dimensions, but the first
and the last would be constrained to two ordinals each: 'my @array[^2;
**; ^2]'.

Note the use of commas above.  Since each dimension can only take a
single value (a non-negative integer), there's no reason to use a
multidimensional list to define the shape.  Personally, I like this
approach: it strikes me as being refreshingly uncluttered.
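
A small Python sketch of what such a comma-separated shape would have to enforce. It assumes one limit per dimension, with None standing in for *; the function `in_shape` is a hypothetical name, and the ** case (any number of dimensions) is deliberately left out.

```python
def in_shape(shape, subscript):
    """Check a tuple of ordinals against a shape.

    shape has one entry per dimension: an int limit, or None for '*'
    (any number of ordinals).
    """
    if len(subscript) != len(shape):
        return False          # wrong number of dimensions
    return all(
        i >= 0 and (limit is None or i < limit)
        for limit, i in zip(shape, subscript)
    )


shape = (2, 3, None)                    # models 'my @array[2, 3, *]'
print(in_shape(shape, (1, 2, 100)))     # third dimension is open-ended
print(in_shape(shape, (2, 0, 0)))       # first dimension allows only 0 and 1
```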

Furthermore, you could do away with the notion of shaped vs.
unshaped: just give everything a default shape.  The default shape
for arrays would be '[*]' - that is, one dimension with an
indeterminate number of ordinals.

Meanwhile, shapes for {} would continue to use the current syntax.
'[$x, $y, $z]' would be nearly equivalent to '{0..^$x; 0..^$y;
0..^$z}'.


If there are no meaningful lookup keys, if all I can do to get
through my list is count the items, then an array is called for, and
it can work in the usual way: start at 0, end at -1.  It is useful to
be able to count past the ends of an array, and * can do this by
going beyond the end: *+1, *+2, etc., or before the beginning: *-1,
*-2, etc.  (This neatly preserves the notion of * as all the
elements -- *-1 is the position before everything, and *+1 is the
position after everything else.)


Regardless, I would prefer this notion to the offset from the
endpoint notion currently in use.  Note, however, that [*-1] wouldn't
work in the ordinals paradigm; there simply is nothing before the
first element.  About the only use I could see for it would be to
provide an assignment equivalent of unshift: '@array[*-1] = $x'
could 

Re: [S09] Whatever indices and shaped arrays

2007-02-24 Thread Jonathan Lang

Jonathan Lang  wrote:

Larry Wall wrote:
 : If you want the last index, say '*[-1]' instead of '* - 1'.
 : If you want the first index, say '*[0]' instead of '* + 0'.

 So the generic version of leaving off both ends would be *[1]..*[-2]
 (ignoring that we'd probably write *[0]^..^*[-1] for that instead).

Correct - although that assumes that the indices are consecutive (as
opposed to, say, 1, 2, 4, 8, 16...); this version of * makes no such
assumption.


Another thought: '*[1..-2]' or '*[0^..^-1]' would do the trick here -
except for the fact that the Range 1..-2 doesn't normally make sense.
Suggestion: when dealing with Ranges in unshaped arrays, negative
endpoints are treated like negative indices (i.e., '$_ += [EMAIL PROTECTED]').

In effect, using * as an array of indices gives us the ordinals
notation that has been requested on occasion: '*[0]' means 'first
element', '*[1]' means 'second element', '*[-1]' means 'last element',
'*[0..2]' means 'first three elements', and so on - and this works
regardless of what the actual indices are.


Like I said, I tend to miss intricacies.  For instance, I never
considered what would be involved in applying a subscriptor to a
multidimensional Whatever (e.g., what can you do with '**[...]'?).
Part of that is that I'm not yet comfortable with multidimensional
slices (or arrays, for that matter); when reading about them, I keep
on getting the feeling that there's something going on here that the
big boys know about that I don't - implicit assumptions, et al.


I think I've got a better grip on it now.  Here's how I understand it to work:

A multidimensional array is defined by providing a list of lists, each
giving all of the valid indices along one axis (i.e., in one
dimension).  The overall shape of the array will be rectangular, or a
higher-dimensional analog of rectangular.  There may be gaps in the
indices (in which case the array is a sparse array as well as a
multidimensional array); but if there are, the gaps also conform to
the rectangular structure: it's as if you carved a solid rectangle
into two or more rectangular pieces and pulled them apart a bit.  That
is, @array[-1, +1; -1, +1] is effectively a 2x2 square array with valid
x-indices of -1 and +1 and valid y-indices of -1 and +1.

To access an element in a multidimensional array, use a
semicolon-delimited list of indices in the square braces:
'@cube[1;1;1]' will access the center element of a [^3;^3;^3] shaped
array, while '@cube[*;*;1]' will access a 3x3 horizontal slice of it.

When putting together a list literal, things work a bit differently.
Create a one-dimensional literal by means of a comma-delimited list of
values; create a two-dimensional literal by means of a
semicolon-delimited list of comma-delimited lists of values:

1, 2, 3 # one-dimensional list literal with a length of 3
(1, 2, 3; 4, 5, 6) # two-dimensional list literal with a length of 2
                   # and a width of 3
(1; 2; 3) # two-dimensional list literal with a length of 3 and a width of 1

I would guess that you would build higher-dimensional literals by
nesting parentheses-enclosed semicolon-delimited lists:

(( 0,  1;  2,  3;  4,  5;  6,  7;  8,  9);
 (10, 11; 12, 13; 14, 15; 16, 17; 18, 19);
 (20, 21; 22, 23; 24, 25; 26, 27; 28, 29))
# three-dimensional list literal with a length of 3, a width of 5, and
# a height of 2.

The outermost set of semicolons delimits the first dimension, and the
commas delimit the last dimension.  That is, semicolon-delimited lists
nest, and comma-delimited lists flatten.

Furthermore, the list literal gets assigned to the array by means of
ordinal coordinates:

 my @cube[-1..+1; -1..+1; -1..+1] =
   (( 1,  2,  3;  4,  5,  6;  7,  8,  9);
    (10, 11, 12; 13, 14, 15; 16, 17, 18);
    (19, 20, 21; 22, 23, 24; 25, 26, 27));

would be equivalent to

 my @cube[-1..+1; -1..+1; -1..+1];
 @cube[**[0; **]] = (1, 2, 3; 4, 5, 6; 7, 8, 9);
 @cube[**[1; **]] = (10, 11, 12; 13, 14, 15; 16, 17, 18);
 @cube[**[2; **]] = (19, 20, 21; 22, 23, 24; 25, 26, 27);

or

 my @cube[-1..+1; -1..+1; -1..+1];
 @cube[**[0; 0; *]] = 1, 2, 3;
 @cube[**[0; 1; *]] = 4, 5, 6;
 @cube[**[0; 2; *]] = 7, 8, 9;
 @cube[**[1; 0; *]] = 10, 11, 12;
 @cube[**[1; 1; *]] = 13, 14, 15;
 @cube[**[1; 2; *]] = 16, 17, 18;
 @cube[**[2; 0; *]] = 19, 20, 21;
 @cube[**[2; 1; *]] = 22, 23, 24;
 @cube[**[2; 2; *]] = 25, 26, 27;

or

 my @cube[-1..+1; -1..+1; -1..+1];
 @cube[**[0; 0; 0]] = 1;
 @cube[**[0; 0; 1]] = 2;
 @cube[**[0; 0; 2]] = 3;
 @cube[**[0; 1; 0]] = 4;
 @cube[**[0; 1; 1]] = 5;
 @cube[**[0; 1; 2]] = 6;
 ...

where

 say @cube[**[1; 1; 1]];

would be equivalent to

 say @cube[0; 0; 0];
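
The same bookkeeping as a Python sketch, assuming a cube whose valid indices along every axis are -1, 0 and +1. The dict `cube` and the helper `by_ordinal` are hypothetical; `by_ordinal` models @cube[**[i; j; k]].

```python
import itertools

keys = [-1, 0, 1]   # valid index names along each axis

# Fill the cube in ordinal order: the last axis varies fastest.
cube = {
    idx: n
    for n, idx in enumerate(itertools.product(keys, repeat=3), start=1)
}


def by_ordinal(i, j, k):
    # Translate positions into index names, then look the element up.
    return cube[(keys[i], keys[j], keys[k])]


print(by_ordinal(0, 0, 0), cube[(-1, -1, -1)])   # first element both ways
print(by_ordinal(1, 1, 1), cube[(0, 0, 0)])      # centre element both ways
```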

Do I have the general idea?

--
Jonathan Dataweaver Lang


[S09] Whatever indices and shaped arrays

2007-02-23 Thread Jonathan Lang

From S09:

When you use * with + and -, it creates a value of Whatever but Num,
which the array subscript interpreter will interpret as the subscript
one off the end of that dimension of the array.

Alternately, *+0 is the first element, and the subscript dwims from
the front or back depending on the sign. That would be more
symmetrical, but makes the idea of * in a subscript a little more
distant from the notion of 'all the keys', which would be a loss, and
potentially makes +* not mean the number of keys.

If '*+0' isn't the first element, then '*+$x' is only meaningful if $x < 0.

That said, I think I can do one better:

Ditch all of the above.  Instead, '*' always acts like a list of all
valid indices when used in the context of postcircumfix:[ ].

If you want the last index, say '*[-1]' instead of '* - 1'.
If you want the first index, say '*[0]' instead of '* + 0'.

So the four corners of a two-dimensional array would be:

 @array[ *[0]; *[0] ];  @array[ *[-1]; *[0] ];
 @array[ *[0]; *[-1] ]; @array[ *[-1]; *[-1] ];
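
For comparison, a Python sketch of the same four corners, treating * in each dimension as a plain list of valid indices. The lists `rows` and `cols` are invented examples, chosen to be non-zero-based and non-consecutive.

```python
import itertools

rows = [10, 20, 30]    # hypothetical valid indices of the first dimension
cols = [1, 2, 4, 8]    # hypothetical valid indices of the second dimension

# *[0] is the first valid index and *[-1] the last, whatever they are.
corners = list(itertools.product((rows[0], rows[-1]), (cols[0], cols[-1])))
print(corners)
```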

The only thing lost here is that '@array[+*]' is unlikely to point
just past the end of a shaped array.  But then, one of the points of
shaped arrays is that if you point at an invalid index, you get a
complaint; so I don't see why one would want to knowingly point to
one.

--

Also, has the syntax for accessing an array's shape been determined
yet?  If not, I'd like to propose the following:

@array.shape returns a list of lists, with the top-level list's
indices corresponding to the dimensions of the shape and each nested
list containing every valid index in that dimension.  In boolean
context, the shape method returns true if the array is shaped and
false if not - though an unshaped array will otherwise pretend to be a
one-dimensional, zero-based, non-sparse, shaped array.

So:

 @array.shape[0][2] # the third valid index of the first dimension of the shape
 @array.shape[-1][0] # the first valid index of the last dimension of the shape
 @array.shape[1] # every valid index of the second dimension of the shape
 @array.shape[1][*] # same as @array.shape[1]

 [EMAIL PROTECTED] # is this a shaped array?

 exists @array.shape[2] # does the array have a third dimension?
 exists @array.shape[3][4] # does the fourth dimension have a fifth element?

 [EMAIL PROTECTED] # how many dimensions does the shape have?
 [EMAIL PROTECTED] # how many indices does the first dimension have?

If we use this notation, then

 @array[ *; * ]

is shorthand for

 @array[ @array.shape[0]; @array.shape[1] ]
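
A Python sketch of the proposed list-of-lists shape, with invented index values. The variable `shape` stands in for @array.shape; nothing here is existing Perl 6 API.

```python
import itertools

# One inner list per dimension, each holding every valid index.
shape = [[10, 20, 30], [1, 2, 4, 8]]

third_of_first = shape[0][2]     # like @array.shape[0][2]
first_of_last = shape[-1][0]     # like @array.shape[-1][0]
dimensions = len(shape)          # how many dimensions the shape has
first_size = len(shape[0])       # how many indices the first dimension has

# @array[ *; * ] would then visit every combination of valid indices.
every_subscript = list(itertools.product(*shape))
print(third_of_first, first_of_last, dimensions, first_size)
print(len(every_subscript))
```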

--
Jonathan Dataweaver Lang


Re: [S09] Whatever indices and shaped arrays

2007-02-23 Thread Larry Wall
On Fri, Feb 23, 2007 at 10:49:34AM -0800, Jonathan Lang wrote:
: That said, I think I can do one better:
: 
: Ditch all of the above.  Instead, '*' always acts like a list of all
: valid indices when used in the context of postcircumfix:[ ].

Ooh, shiny!  Or at least, shiny on the shiny side...

: If you want the last index, say '*[-1]' instead of '* - 1'.
: If you want the first index, say '*[0]' instead of '* + 0'.

So the generic version of leaving off both ends would be *[1]..*[-2]
(ignoring that we'd probably write *[0]^..^*[-1] for that instead).

: So the four corners of a two-dimensional array would be:
: 
:  @array[ *[0]; *[0] ];  @array[ *[-1]; *[0] ];
:  @array[ *[0]; *[-1] ]; @array[ *[-1]; *[-1] ];

A point against it visually is the nested use of [].

: The only thing lost here is that '@array[+*]' is unlikely to point
: just past the end of a shaped array.  But then, one of the points of
: shaped arrays is that if you point at an invalid index, you get a
: complaint; so I don't see why one would want to knowingly point to
: one.

I would expect that to point to one off the end in the first dimension only,
which might make sense if that dimension is extensible:

my @array[*;2;2];

Adding something at +* would then add another 2x2 under it.

: Also, has the syntax for accessing an array's shape been determined
: yet?  If not, I'd like to propose the following:
: 
: @array.shape returns a list of lists, with the top-level list's
: indices corresponding to the dimensions of the shape and each nested
: list containing every valid index in that dimension.  In boolean
: context, the shape method returns true if the array is shaped and
: false if not - though an unshaped array will otherwise pretend to be a
: one-dimensional, zero-based, non-sparse, shaped array.

That's more or less how I was thinking of it, though I hadn't got as
far as boolean context.

: So:
: 
:  @array.shape[0][2] # the third valid index of the first dimension of the 
:  shape
:  @array.shape[-1][0] # the first valid index of the last dimension of the 
:  shape
:  @array.shape[1] # every valid index of the second dimension of the shape
:  @array.shape[1][*] # same as @array.shape[1]
: 
:  [EMAIL PROTECTED] # is this a shaped array?
: 
:  exists @array.shape[2] # does the array have a third dimension?
:  exists @array.shape[3][4] # does the fourth dimension have a fifth element?
: 
:  [EMAIL PROTECTED] # how many dimensions does the shape have?
:  [EMAIL PROTECTED] # how many indices does the first dimension have?
: 
: If we use this notation, then
: 
:  @array[ *; * ]
: 
: is shorthand for
: 
:  @array[ @array.shape[0]; @array.shape[1] ]

Note also that multidimensional whatever gives us

@array[ ** ]

to mean

@array[ @@( @array.shape[*] ) ]

or some such.  Though ** might want to be even smarter than that if
we want

@array[ 0; **; 42]

to dwim.  That'd have to turn into something like:
 
@array[ 0; @@( @array.shape[*[1]..*[-2]] ); 42 ]

Also +** might return a shape vector, or maybe +«**.
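
A sketch in Python of how that expansion might go, assuming at most one ** per subscript and a shape given as a list of per-dimension index lists. The function `expand` and the string marker "**" are illustration only.

```python
def expand(subscript, shape):
    """Replace a single '**' with the index lists of the dimensions it covers."""
    if "**" not in subscript:
        return list(subscript)
    at = subscript.index("**")
    before, after = subscript[:at], subscript[at + 1:]
    # '**' soaks up whatever dimensions the explicit entries leave over.
    middle = shape[len(before):len(shape) - len(after)]
    return list(before) + middle + list(after)


shape = [[0, 1], [0, 1, 2], [5, 6], [40, 41, 42]]
print(expand([0, "**", 42], shape))   # middle two dimensions filled in
print(expand(["**"], shape))          # every dimension filled in
```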

Larry


Re: [S09] Whatever indices and shaped arrays

2007-02-23 Thread Jonathan Lang

Larry Wall wrote:

On Fri, Feb 23, 2007 at 10:49:34AM -0800, Jonathan Lang wrote:
: That said, I think I can do one better:
:
: Ditch all of the above.  Instead, '*' always acts like a list of all
: valid indices when used in the context of postcircumfix:[ ].

Ooh, shiny!  Or at least, shiny on the shiny side...


Thank you...


: If you want the last index, say '*[-1]' instead of '* - 1'.
: If you want the first index, say '*[0]' instead of '* + 0'.

So the generic version of leaving off both ends would be *[1]..*[-2]
(ignoring that we'd probably write *[0]^..^*[-1] for that instead).


Correct - although that assumes that the indices are consecutive (as
opposed to, say, 1, 2, 4, 8, 16...); this version of * makes no such
assumption.

I do find myself wondering what *[-1] would be for an infinite array,
such as @nums[0..*:by(2)].

One possible answer: Inf.

Another possible answer: the shape sets limits on the indices; it does
not set requirements.  For instance:

 my @nums[0..*:by(2)];
 @nums[2 * $_] = $_ for 0..5;
 say @nums[ *[-1] ]; # same as 'say @nums[10];'
 @nums[42] = 21;
 say @nums[ *[-1] ]; # same as 'say @nums[42];'
 say @nums[ *[-2] ]; # same as 'say @nums[40];' - whatever that means.
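
One way to picture the second answer is a sparse store where *[-1] means the highest index actually populated. This Python sketch assumes exactly that reading; the dict `nums` and the helper `last_index` are hypothetical.

```python
nums = {}                      # sparse storage: index -> value
for n in range(6):
    nums[2 * n] = n            # populate indices 0, 2, 4, 6, 8, 10


def last_index(sparse, back=1):
    # Models *[-back]: count back through the populated indices only.
    return sorted(sparse)[-back]


before = last_index(nums)      # highest populated index so far
nums[42] = 21
after = last_index(nums)       # now the newly assigned index
second = last_index(nums, 2)   # second-to-last populated index
print(before, after, second)
```

Under this reading the second-to-last index is 10 rather than the 40 that the shape alone would give, which is the ambiguity the last line of the example points at.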


: Also, has the syntax for accessing an array's shape been determined
: yet?  If not, I'd like to propose the following:
:
: @array.shape returns a list of lists, with the top-level list's
: indices corresponding to the dimensions of the shape and each nested
: list containing every valid index in that dimension.  In boolean
: context, the shape method returns true if the array is shaped and
: false if not - though an unshaped array will otherwise pretend to be a
: one-dimensional, zero-based, non-sparse, shaped array.

That's more or less how I was thinking of it, though I hadn't got as
far as boolean context.


I'm still debating the boolean context myself.  I _think_ it will
work; but I have a tendency to miss intricacies.  You might instead
want to require someone to explicitly check for definedness or
existence instead of merely truth; or you might not.


: If we use this notation, then
:
:  @array[ *; * ]
:
: is shorthand for
:
:  @array[ @array.shape[0]; @array.shape[1] ]

Note also that multidimensional whatever gives us

@array[ ** ]

to mean

@array[ @@( @array.shape[*] ) ]

or some such.


Like I said, I tend to miss intricacies.  For instance, I never
considered what would be involved in applying a subscriptor to a
multidimensional Whatever (e.g., what can you do with '**[...]'?).
Part of that is that I'm not yet comfortable with multidimensional
slices (or arrays, for that matter); when reading about them, I keep
on getting the feeling that there's something going on here that the
big boys know about that I don't - implicit assumptions, et al.


Though ** might want to be even smarter than that if
we want

@array[ 0; **; 42]

to dwim.  That'd have to turn into something like:

@array[ 0; @@( @array.shape[*[1]..*[-2]] ); 42 ]

Also +** might return a shape vector, or maybe +«**.


If by shape vector you mean something that says the array has a
length of 5, a width of 3, and a height of 2, +«** would seem to be
the more appropriate syntax.  Why you'd want that inside an array's
subscriptor is beyond me, for a similar reason to +*.  But that's what
the logic of the syntax gives.

But I _could_ see using '[EMAIL PROTECTED]' to get an array's measurements.

Hmm...

 my @@square = (1, 2; 3, 4);
 say +@@square; # say what?  '2', as in 2 dimensions?  Or '4', as in 4 items?

Answer that, and you'll know what +** would give you.

--

BTW: could the parser handle the following?

 @array[ *[0] * 2; 2 ** **[ [;] 0 x *] ]

--
Jonathan Dataweaver Lang


Re: [S09] Whatever indices and shaped arrays

2007-02-23 Thread Luke Palmer

On 2/23/07, Jonathan Lang [EMAIL PROTECTED] wrote:
> I'm still debating the boolean context myself.  I _think_ it will
> work; but I have a tendency to miss intricacies.  You might instead
> want to require someone to explicitly check for definedness or
> existence instead of merely truth; or you might not.


I should chime in something here.  It may not be practical for Perl,
given how much we have already relied on its opposite, but it is still
worth considering:

I have been extremely satisfied with Ruby's boolean truth model:  nil
and false are false, everything else is true.  So the empty string is
true, 0 is true, "0" is certainly true.  I think it's the same reason
that I like Haskell's function call model: that is, function
application binds most tightly, everything else has various looser
precedence.

I think the nice thing about these two is their extreme simplicity.
In Haskell, when I read:

   foo x ! bar

I don't need to think for a fraction of a second to associate that
correctly in my mind.  Likewise, in Ruby, when I write:

   while line = gets

I don't need to think for a fraction of a second about edge cases of
gets.  Gets returns a string when it reads something, and nil on EOF,
that's all I need to know.  And the simplicity has a way of
propagating to other areas of the language, as in that example, where
gets is able to return the most obvious thing for EOF and have it work
correctly.

So, yeah, simple rules can be a blessing if you find the right ones.
In particular, since I use boolean context a lot (i.e. without
explicit compare operators), I'm a fan of as much boolean
predictability as I can get.  Even if we don't get the same simple
model of booleans as Ruby, I'd like to keep the number of
boolean-context overloaded objects reasonably small.  This gives
functions the freedom to return false or undef as a failure mode, when
it is convenient for it to function that way.

Luke