I think this proposal goes too far in the dwimmery direction-

On Sat, Jun 13, 2009 at 12:58 PM, John M. Dlugosz<2nb81l...@sneakemail.com> wrote:
> Daniel Ruoso daniel-at-ruoso.com |Perl 6| wrote:
>>
>> So, how do I deal with a multidim array? Well, TIMTOWTDI...
>>
>> my @a = 1,[2,[3,4]];
>> say @a[1][1][1];
>> say @a[1;1;1]; # I'm not sure this is correct
>>
>>
>
> I think that it should be. That is, multi-dim subscript is always the same
> as chained subscripts, regardless of whether the morphology is an array
> stored as an element, or a multi-dim container, or any mixture of that as
> you drill through them.
>
> I've not written out a full formalism yet, but I've thought about it.
> The multi-dim subscript would return a sub-array if there were fewer
> parameters than dimensions, an element if exact match, and recursively apply
> the remaining subscripts to the element if too many.
>
>
>> Or.. (I'm using the proposed capture sigil here, which has '@%a' as its
>> expanded form)
>>
>> my ¢a = 1,(2,(3,4);
>> say ¢a[1][1][1];
>> say ¢a[1;1;1];
>>
>> I think that makes the semantics of the API more clear...
>>
>> daniel
>>
>>
>>
>
> The plain Array would work too, in the nested morphology:
>
> my @a = 1,[2,[3,4]];
>
> @a has 2 elements, the second of which is type Array.
>
> say @a[1][1][1];
>
> naturally.
>
> say @a[1;1;1];
>
> means the same thing, intentionally.
>
> say @a[1][1;1];
> say @a[1;1][1];
>
> ditto.
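For readers who want the quoted rule spelled out (sub-array if there are fewer subscripts than dimensions, element on an exact match, recurse into the element if there are too many), here is a rough sketch. It is a Python model, not Perl 6, and not anything from the spec or from John's post: the `Shaped` class and the `index` function are names made up for illustration, and row-major storage is an assumption.

```python
class Shaped:
    """Hypothetical stand-in for a fixed-shape multi-dim array (row-major).

    items must hold exactly as many values as the product of dims.
    """

    def __init__(self, dims, items):
        self.dims = tuple(dims)
        self.items = list(items)


def index(container, subs):
    """Apply subscripts under the quoted rule; all names are illustrative."""
    subs = list(subs)
    if not subs:
        return container
    if isinstance(container, list):
        # A plain nested array behaves like a 1-D shaped array.
        container = Shaped((len(container),), container)
    if not isinstance(container, Shaped):
        raise TypeError("subscripts left over, but the element is not an array")

    n = len(container.dims)
    used, rest = subs[:n], subs[n:]
    offset, stride = 0, len(container.items)
    for sub, dim in zip(used, container.dims):
        if not 0 <= sub < dim:
            raise IndexError("subscript out of range")
        stride //= dim
        offset += sub * stride

    if len(used) < n:
        # Fewer subscripts than dimensions: return the sub-array.
        return Shaped(container.dims[len(used):],
                      container.items[offset:offset + stride])
    # Exact match yields the element; extra subscripts recurse into it.
    return index(container.items[offset], rest)


a = [1, [2, [3, 4]]]                   # my @a = 1,[2,[3,4]];
print(index(a, [1, 1, 1]))             # models @a[1;1;1]
print(index(index(a, [1]), [1, 1]))    # models @a[1][1;1]
```

Under this model every chained and semicolon form reaches the same element for a purely nested array, which is the equivalence the quoted text asks for. It says nothing yet about containers whose elements are themselves shaped, which is where the forms can diverge.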

My thought is that captures, multi-D arrays, and arrays of arrays are all different data structures; the programmer will pick them or some mix of them for a reason, and expect consistent access semantics. I agree that the various types should be transparently converted when necessary, but the dwimmery proposed on indexing could make it hard to find bugs in code dealing with complicated data structures.

The problem comes with nested structures. Let's talk about a multi-D array, where each element is another multi-D array. This is also an example of my understanding of multi-D list initialization- the specs are silent on that other than initializing elements one at a time, e.g. "@md[1;0] = 4;"- apologies for squeezing two topics into one post.

# Build it piece by piece, first using explicitly dimensioned sub-arrays
# Doesn't matter if the initialization is a list, array, capture of arrays.
# The RHS is in list context which flattens a capture, and the explicit
# dimension will pour them all into a 2x2 array.
my @sub1[2;2]=(99,\('a',[<b BB>]; 'c'; CC) ; 88, [1,2,3]);
my @sub2[2;2]=77,[<d e f>], 66,[4,5,6];
my @sub3[2;2]=(55; [<g h i>], 44, [7,8,9]);

# Use slice context to retain the 2x2 shape
my @@sub4=([<p q r>], 33; [10,11,12], 22);

# A single column, two high
my @sub5[1;2]=([<s t u>]; [13,14,15]);

# 3 ragged rows, 1 long then 2 long then 3 long
my @sub6[3;*]=('row1'; <row2a row2b>; <row3a row3b row3c>);

=begin comment
3 ragged columns, first 3 high, then 2 high, then 3 high
c1a c2a c3a
c1b c2b c3b
c1c     c3c
=end comment
my @sub7[*;3]=(<c1a c2a c3a>; <c1b c2b c3b>; 'c1c', Nil, 'c3c'); # Perilous?

# Simulate a sparse array, set two elements
my @sub8[*;*];
@sub8[(5;6),(8;0)]=<elem5_6 elem8_0>;

# Now build a multi-dimensional array, each element of which is a multi-D array
my @a[2;2;2]=\(@@sub1; @@sub2; @@sub3; @@sub4; @@sub5; @@sub6; @@sub7; @@sub8);

# This also builds an 8-element 3D cube.
# Not sure about , vs ; below
my @@b=\( \( \(@@sub1; @@sub2); \(@@sub3; @@sub4)); \(\(@@sub5; @@sub6); \(@@sub7; @@sub8)));

# Same as above, but no captures, use slices all the way. Valid?
my @@c=@@( @@( @@(@@sub1; @@sub2); @@(@@sub3; @@sub4)); @@(@@(@@sub5; @@sub6); @@(@@sub7; @@sub8)));

Returning to John's post- In this case all these accessors return different elements-

> say @a[1][1][1];
BB
@a[1] is accessing @a as a flat array, so that returns the 2nd element of @a which is \('a',[<b BB>]; 'c'; CC), which is then treated as a flat list by the next [1] subscript. The 2nd element of the 2nd element of that is BB.

> say @a[1;1;1];
@sub8

> say @a[1][1;1];
CC
@a[1] is \('a',[<b BB>]; 'c'; CC) which is now treated as a multi-D array. [1;1] asks for the lower-right corner of that 2x2 array, which is CC.

> say @a[1;1][1];
@sub8
S09 states: You need not specify all the dimensions; if you don't, the unspecified dimensions are "wildcarded". So the above becomes
@a[1;1;*][1]
@a[1;1;*] is \(@@sub7;@@sub8), and the 2nd element of that is @sub8

S09's "Cascaded subscripting of multidimensional arrays" says the above "will either fail or produce the same results as the equivalent semicolon subscripts." Following that part of the spec, it should convert to @a[1;1;1] and still return @sub8. But what I would really like is a "strict array mode" that would give me an error when using a subscript dimensioned differently from the array's dimensions. I think that if an array has explicit dimensions they need to be obeyed, with 1D access a specific allowed exception.

These examples show a necessity for distinctly different semantics for @a[1][1][1], @a[1][1;1], and @a[1;1;1], which conflicts with S09's "Cascaded subscripting" section. Heck, we have the "**" to be used when we don't know or don't care about higher dimensions; use it when you mean it and always be strict with array dimensions!

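To make the "strict array mode" wish concrete, here is a minimal sketch of how such a check could behave. It is a Python model rather than Perl 6, and nothing in it comes from S09: `StrictArray`, `ShapeError` and the `at` method are hypothetical names, and treating a single subscript as flat row-major access is my reading of the "1D access" exception.

```python
class ShapeError(IndexError):
    """Raised when the subscript count does not match the declared shape."""


class StrictArray:
    """Hypothetical fixed-shape array that refuses mismatched subscripts."""

    def __init__(self, dims, items):
        self.dims = tuple(dims)
        self.items = list(items)  # row-major, one entry per cell

    def at(self, *subs):
        if len(subs) == 1:
            # Allowed exception: one subscript means flat (1D) access.
            if not 0 <= subs[0] < len(self.items):
                raise IndexError("flat subscript out of range")
            return self.items[subs[0]]
        if len(subs) != len(self.dims):
            raise ShapeError(
                f"{len(subs)} subscripts given, shape has {len(self.dims)}")
        offset = 0
        for sub, dim in zip(subs, self.dims):
            if not 0 <= sub < dim:
                raise IndexError("subscript out of range")
            offset = offset * dim + sub
        return self.items[offset]


cube = StrictArray((2, 2, 2), ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"])
print(cube.at(1, 1, 1))   # full 3-D subscript
print(cube.at(1))         # flat access, second cell
try:
    cube.at(1, 1)         # two subscripts on a 3-D shape
except ShapeError as err:
    print("rejected:", err)
```

The point of the design is that a partial subscript is an error rather than an implicit wildcard, so a mistyped dimension count fails at the access instead of silently selecting some other element.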
say @b[1;1][1]

In this case, another part of S09 conjectures that @b knows its shape is [2;2;2], even though @b's declaration didn't give it any fixed shape. Here I'm more inclined to let there be some dwimmery defined along the lines of John's proposal and S09's "Cascading" section.

Speaking generally, I'm a little disturbed when I see posts that treat a multi-dimensional array as equivalent to an array of arrays. Multi-D to array-of-arrays transforms (and vice versa) can be useful, but that is something that should only be done explicitly, IMHO. S09 says "...even if the cascaded subscript form must be implemented inefficiently by constructing temporary slice objects for later subscripts to use" but that whole section leads to ambiguity.

Third related topic: I've been assuming that a declaration like
my @dimensioned[*;*;*];
creates an array with a shape that's read-only (in this case, 3 dimensions of any size), whereas
my @a;
creates arrays with an initially 1d "*" - or any-D "**" - or undefined?- read-write shape.

S09 discusses the .shape method and the :shape adverb, and has the example-
my int @array[2;2] is Puddle .= new(:shape(4) <== 0,1,2,3);
which is good for accessing an array of one shape in terms of another.

What should the following say?
my @a = my int @array[2;2] is Puddle .= new(:shape(4) <== 0,1,2,3);
say @a.shape; # 2;2 or 4 or * or **?

There's the "CONJECTURE" in S09 that says "since @@x and @x are really the same object, any array can keep track of its dimensionality, and it only matters how you use it in contexts that care about the dimensionality". Which leads me to believe that @x's "shape" property is mutable, so @a would take the shape of the thing it was assigned- in this case "@array", which is 2;2

Enough from me for now
-y