Re: Array Dimensionality

yary Thu, 18 Jun 2009 12:14:16 -0700

I think this proposal goes too far in the dwimmery direction-

On Sat, Jun 13, 2009 at 12:58 PM, John M. Dlugosz<2nb81l...@sneakemail.com>
wrote:
> Daniel Ruoso daniel-at-ruoso.com |Perl 6| wrote:
>>
>> So, how do I deal with a multidim array? Well, TIMTOWTDI...
>>
>>  my @a = 1,[2,[3,4]];
>>  say @a[1][1][1];
>>  say @a[1;1;1]; # I'm not sure this is correct
>>
>>
>
> I think that it should be.  That is, multi-dim subscript is always the same
> as chained subscripts, regardless of whether the morphology is an array
> stored as an element, or a multi-dim container, or any mixture of that as
> you drill through them.
>
> I've not written out a full formalism yet, but I've thought about it.
> The multi-dim subscript would return a sub-array if there were fewer
> parameters than dimensions, an element if exact match, and recursively apply
> the remaining subscripts to the element if too many.
>
>
>> Or.. (I'm using the proposed capture sigil here, which has '@%a' as its
>> expanded form)
>>
>>  my ¢a = 1,(2,(3,4));
>>  say ¢a[1][1][1];
>>  say ¢a[1;1;1];
>>
>> I think that makes the semantics of the API more clear...
>>
>> daniel
>>
>>
>>
>
> The plain Array would work too, in the nested morphology:
>
>   my @a = 1,[2,[3,4]];
>
> @a has 2 elements, the second of which is type Array.
>
>   say @a[1][1][1];
>
> naturally.
>
>   say @a[1;1;1];
>
> means the same thing, intentionally.
>
>   say @a[1][1;1];
>   say @a[1;1][1];
>
> ditto.


My thought is that captures, multi-D arrays, and arrays of arrays are all
different data structures, the programmer will pick them or some mix of them
for a reason, and expect consistent access semantics. I agree that the
various types should be transparently converted when necessary, but the
dwimmery proposed on indexing could make it hard to find bugs in code
dealing with complicated data structures.
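
To make the rule under discussion concrete, here is a rough model of John's
proposal as I read it. It is written in Python rather than Perl 6 because the
subscript syntax above is still unsettled; the names Shaped and subscript are
made up for this sketch and are not from any spec. Fewer subscripts than
dimensions gives a sub-array, an exact match gives an element, and extra
subscripts are applied recursively to the element.

```python
from math import prod


class Shaped:
    """Toy multi-D container: a shape tuple plus row-major flat storage."""

    def __init__(self, shape, data):
        if prod(shape) != len(data):
            raise ValueError("data does not fill the declared shape")
        self.shape, self.data = tuple(shape), list(data)


def subscript(obj, subs):
    """Apply a list of subscripts under the proposed 'always the same' rule."""
    subs = list(subs)
    if not subs:
        return obj
    if isinstance(obj, Shaped):
        n = len(obj.shape)
        head, rest = subs[:n], subs[n:]
        # Row-major offset over the dimensions that were actually subscripted.
        offset = 0
        for i, dim in zip(head, obj.shape):
            offset = offset * dim + i
        if len(head) < n:
            # Fewer subscripts than dimensions: return a sub-array.
            tail_shape = obj.shape[len(head):]
            size = prod(tail_shape)
            return Shaped(tail_shape, obj.data[offset * size:(offset + 1) * size])
        # Exact match yields the element; leftovers drill into that element.
        return subscript(obj.data[offset], rest)
    # Plain nested list (array of arrays): one subscript per level.
    return subscript(obj[subs[0]], subs[1:])


nested = [1, [2, [3, 4]]]
grid = Shaped((2, 2), [10, 11, 12, [20, 21]])
print(subscript(nested, [1, 1, 1]))                 # like @a[1;1;1]
print(subscript(subscript(nested, [1]), [1, 1]))    # like @a[1][1;1]
print(subscript(grid, [1, 1, 0]))                   # 2-D lookup, then into the element
```

Under this model every way of splitting the subscripts gives the same answer,
which is exactly the property that hides the difference between the two
structures.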

The problem comes with nested structures. Let's talk about a multi-D array
where each element is another multi-D array. This is also an example of my
understanding of multi-D list initialization. The specs are silent on that,
other than initializing elements one at a time, e.g. "@md[1;0] = 4;".
Apologies for squeezing two topics into one post.

# Build it piece by piece, first using explicitly dimensioned sub-arrays.
# It doesn't matter if the initialization is a list, array, or capture of
# arrays. The RHS is in list context, which flattens a capture, and the
# explicit dimension will pour them all into a 2x2 array.
my @sub1[2;2]=(99,\('a',[<b BB>]; 'c'; CC) ; 88, [1,2,3]);
my @sub2[2;2]=77,[<d e f>], 66,[4,5,6];
my @sub3[2;2]=(55; [<g h i>], 44, [7,8,9]);
# Use slice context to retain the 2x2 shape
my @@sub4=([<p q r>], 33; [10,11,12], 22);

# A single column, two high
my @sub5[1;2]=([<s t u>]; [13,14,15]);

# 3 ragged rows, 1 long then 2 long then 3 long
my @sub6[3;*]=('row1'; <row2a row2b>; <row3a row3b row3c>);

=begin comment
3 ragged columns, first 3 high, then 2 high, then 3 high
c1a c2a c3a
c1b c2b c3b
c1c     c3c
=end comment
my @sub7[*;3]=(<c1a c2a c3a>; <c1b c2b c3b>; 'c1c', Nil, 'c3c'); # Perilous?

# Simulate a sparse array, set two elements
my @sub8[*;*]; @sub8[(5;6),(8;0)]=<elem5_6 elem8_0>;

# Now build a multi-dimensional array, each element of which is a multi-D
# array
my @a[2;2;2]=\(@@sub1; @@sub2; @@sub3; @@sub4; @@sub5; @@sub6; @@sub7;
@@sub8);

# This also builds an 8-element 3D cube. Not sure about , vs ; below
my @@b=\( \( \(@@sub1; @@sub2); \(@@sub3; @@sub4));
\(\(@@sub5; @@sub6); \(@@sub7; @@sub8)));

# Same as above, but no captures, use slices all the way. Valid?
my @@c=@@( @@( @@(@@sub1; @@sub2); @@(@@sub3; @@sub4));
@@(@@(@@sub5; @@sub6); @@(@@sub7; @@sub8)));

Returning to John's post: in this case these accessors do not all return the
same element.

>   say @a[1][1][1];
BB
@a[1] is accessing @a as a flat array, so that returns the 2nd element of @a
which is \('a',[<b BB>]; 'c'; CC), which is then treated as a flat list by
the next [1] subscript. The 2nd element of the 2nd element of that is BB.

>   say @a[1;1;1];
@sub8

>   say @a[1][1;1];
CC
@a[1] is \('a',[<b BB>]; 'c'; CC) which is now treated as a multi-D array.
[1;1] asks for the lower-right corner of that 2x2 array, which is CC.

>   say @a[1;1][1];
@sub8
S09 states:
You need not specify all the dimensions; if you don't, the unspecified
dimensions are "wildcarded".
So the above becomes
@a[1;1;*][1]
@a[1;1;*] is \(@@sub7;@@sub8), 2nd element of that is @sub8

S09's "Cascaded subscripting of multidimensional arrays" says the above
"will either fail or produce the same results as the equivalent semicolon
subscripts." Following that part of the spec, it should convert to @a[1;1;1]
and still return @sub8.

But what I would really like is a "strict array mode" that would give me an
error when using a subscript dimensioned differently from the array's
dimensions. I think that if an array has explicit dimensions they need to be
obeyed, with 1D access a specific allowed exception. These examples show a
necessity for distinctly different semantics for @a[1][1][1], @a[1][1;1],
and @a[1;1;1], which conflicts with S09's "Cascaded subscripting" section.
Heck, we have the "**" to be used when we don't know or don't care about
higher dimensions; use it when you mean it and always be strict with array
dimensions!
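
A minimal sketch of what such a strict mode might look like, again modeled in
Python since none of this is settled Perl 6. StrictArray, ShapeError, at and
REST are hypothetical names for illustration only; REST stands in for a
trailing "**". The assumptions are: a single subscript is flat 1D access, a
trailing REST returns everything under the given prefix, and any other
mismatch in the number of subscripts is an error.

```python
from math import prod

REST = object()  # hypothetical stand-in for a trailing "**"


class ShapeError(IndexError):
    """Raised when the subscript count disagrees with the declared shape."""


class StrictArray:
    def __init__(self, shape, data):
        if prod(shape) != len(data):
            raise ValueError("data does not fill the declared shape")
        self.shape, self.data = tuple(shape), list(data)

    def at(self, *subs):
        # The one allowed exception: a single subscript is flat 1D access.
        if len(subs) == 1 and subs[0] is not REST:
            return self.data[subs[0]]
        wild = bool(subs) and subs[-1] is REST
        head = subs[:-1] if wild else subs
        ndim = len(self.shape)
        if len(head) > ndim or (not wild and len(head) != ndim):
            raise ShapeError(
                f"{len(head)} subscripts given for a {ndim}-dimensional array")
        # Row-major offset over the explicitly subscripted dimensions.
        offset = 0
        for i, dim in zip(head, self.shape):
            if not 0 <= i < dim:
                raise IndexError(f"subscript {i} out of range for dimension {dim}")
            offset = offset * dim + i
        if not wild:
            return self.data[offset]
        # Trailing REST: return every element under the given prefix.
        size = prod(self.shape[len(head):])
        return self.data[offset * size:(offset + 1) * size]


cube = StrictArray((2, 2, 2), "abcdefgh")
print(cube.at(1, 1, 1))      # exact match
print(cube.at(7))            # flat 1D access
print(cube.at(1, 1, REST))   # explicit "don't care" about the last dimension
```

With this model, cube.at(1, 1) raises ShapeError instead of being silently
wildcarded, which is the behavior I am asking for.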

say @b[1;1][1]
In this case, another part of S09 conjectures that @b knows its shape is
[2;2;2], even though @b's declaration didn't give it any fixed shape. Here
I'm more inclined to let there be some dwimmery defined along the lines of
John's proposal and S09's "Cascaded subscripting" section.

Speaking generally, I'm a little disturbed when I see posts that treat a
multi-dimensional array as equivalent to an array of arrays.  Multi-D to
array-of-arrays transforms (and vice versa) can be useful, but they are
something that should only be done explicitly, IMHO. S09 says "...even if the
cascaded subscript form must be implemented inefficiently by constructing
temporary slice objects for later subscripts to use" but that whole section
leads to ambiguity.

Third related topic: I've been assuming that a declaration like
my @dimensioned[*;*;*];
creates an array with a shape that's read-only (in this case, 3 dimensions
of any size), whereas
my @a;
creates an array with an initially 1D "*" - or any-D "**" - or undefined? -
read-write shape.

S09 discusses the .shape method and the :shape adverb, and has the example-

my int @array[2;2] is Puddle .= new(:shape(4) <== 0,1,2,3);

which is good for accessing an array of one shape in terms of another.

What should the following say?


my @a = my int @array[2;2] is Puddle .= new(:shape(4) <== 0,1,2,3);
say @a.shape; # 2;2 or 4 or * or **?

There's the "CONJECTURE" in S09 that says "since @@x and @x are really
the same object, any array can keep track of its dimensionality, and
it only matters how you use it in contexts that care about the
dimensionality". That leads me to believe that @x's "shape" property
is mutable, so @a would take the shape of the thing it was assigned,
in this case "@array", which is 2;2.

Enough from me for now

-y
