Re: [S09] Whatever indices and shaped arrays
David Green wrote:
> Jonathan Lang:
>> (In fact, the semantics for @x[*+n] follows directly from the fact that an array returns the count of its elements in scalar context.) And @x[*] would be the same as @x[0..^*] or @x[0..(*-1)].
> That's an elegance in its favour.

In Perl 5 a + can creep in, for example:

    $ perl -wle '$s = "-123"; $n = -123; print -$s; print -$n'
    +123
    123

so maybe it is not a bad idea to keep treating a unary + as (almost) a no-op.

--
Affijn, Ruud
Gewoon is een tijger.
Re: [S09] Whatever indices and shaped arrays
OK: before I submit a patch, let me make sure that I've got the concepts straight:

@x[0] always means the first element of the array;
@x[-1] always means the last element of the array;
@x[*+0] always means the first element after the end of the array;
@x[*-1] always means the first element before the beginning of the array.

That is, the indices go:

    ..., *-3, *-2, *-1, 0, 1, 2, ..., -3, -2, -1, *+0, *+1, *+2, ...
                        ^                     ^
                        |                     |
                      first                  last

As well, a Whatever in square braces is treated as an array of the valid positions; so @x[*] is equivalent to @x[0..-1].

If you want to use sparse indices and/or indices that begin somewhere other than zero, access them using curly braces. Consider an array with valid indices ranging from -2 to +2: @x{-2} means element -2, which would be equivalent to @x[0]; @x{+2} means element 2, which would be equivalent to @x[-1]. Likewise, @x{0} is the same as @x[2], @x{-3} is the same as @x[*-1], @x{+3} is the same as @x[*+0], and so on. If @y has a series of five indices that start at 1 and double with each step, then @y{1} will be the same as @y[0]; @y{4} will be the same as @y[2], and so on.

A Whatever in curly braces is treated as an array of the valid index names; so @x{*} means @x{-2..+2}, and @y{*} means @y{1, 2, 4, 8, 16}. Because it is treated as an array, individual index names can be accessed by position: @x{*[0]} is a rather verbose way of saying @x[0]. This lets you embed ordinal indices into slices involving named indices. Conversely, using *{...} inside square braces lets you embed named indices into slices involving ordinal indices: @x[*{-2}] is the same as @x{-2}.

Multidimensional arrays follow the above conventions for each of their dimensions; so a single-splat provides a list of every index in a given dimension, a 0 refers to the first index in that dimension, and so on. A double-splat extends the concept to a multidimensional list that handles an arbitrary number of dimensions at once.
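The ordinal-versus-named distinction summarised above can be modelled in a few lines of Python. This is only a sketch, not Perl 6: the class KeyedArray and its methods by_ordinal and by_key are hypothetical names, and only in-range positions are handled (the off-the-end forms such as *-1 and *+0 are left out).

```python
class KeyedArray:
    """Toy model: values kept in order, each also reachable by a named index."""

    def __init__(self, keys, values):
        self.keys = list(keys)      # what @x{*} would list
        self.values = list(values)

    def by_ordinal(self, n):
        # Models @x[n] for in-range positions: 0 is first, -1 is last.
        return self.values[n]

    def by_key(self, k):
        # Models @x{k}: find the position of the name, then use that position.
        return self.values[self.keys.index(k)]


x = KeyedArray(range(-2, 3), "abcde")       # valid keys -2..+2
y = KeyedArray([1, 2, 4, 8, 16], "pqrst")   # keys start at 1 and double
```

With these definitions, x.by_key(-2) and x.by_ordinal(0) name the same element, as do y.by_key(4) and y.by_ordinal(2).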
--

Commentary:

I find the sequence of ordinals outlined above to be a bit messy, especially when you start using ranges of indices: you need to make sure that @x[0..-1] dwims, that @x[-1..(*+0)] dwims, that @x[(*-2)..(*+3)] dwims, and so on. This is a potentially very ugly process.

As well, the fact that @x[-1] doesn't refer to the element immediately before @x[0] is awkward, as is the fact that @x[*-1] doesn't refer to the element immediately before @x[*+0]. IMHO, it would be cleaner to have @x[n] count forward and backward from the front of the array, while @x[*+n] counts forward and backward from just past the end of the array:

    ..., -3, -2, -1, 0, 1, 2, ..., *-3, *-2, *-1, *+0, *+1, *+2, ...
                     ^                        ^
                     |                        |
                   first                     last

So perl 5's $x[-1] would always translate to @x[*-1] in perl 6. Always.

Likewise, @x[+*] would be the same as @x[*+0]. (In fact, the semantics for @x[*+n] follows directly from the fact that an array returns the count of its elements in scalar context.) And @x[*] would be the same as @x[0..^*] or @x[0..(*-1)].

You would lose one thing: the ability to select an open-ended Range of elements. For a five-element list, @x[1..^*] means @x[1, 2, 3, 4], not @x[1, 2, 3, 4, 5, 6, 7, 8, ...].

Technically, one could say @x{+*} to reference the index that coincides with the number of indices; but it would only be useful in specific cases, such as referencing the last element of a one-based contiguous array.

--
Jonathan Dataweaver Lang
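The scheme proposed in the commentary, where a plain integer counts from the front and *+n counts from just past the end, comes down to treating * as the element count. A minimal Python sketch of that reading follows; FromEnd, resolve and exists are hypothetical helper names, not anything from S09.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FromEnd:
    """Stand-in for the Perl 6 term `* + offset` (hypothetical helper)."""
    offset: int


def resolve(index, count):
    """Absolute position named by `index` in an array holding `count` items."""
    if isinstance(index, FromEnd):
        return count + index.offset  # '*' acts as the element count
    return index                     # plain integers count from the front


def exists(index, count):
    """True only for positions that actually hold an element."""
    return 0 <= resolve(index, count) < count


count = 5
last = resolve(FromEnd(-1), count)                 # models @x[*-1]
tail = list(range(1, resolve(FromEnd(0), count)))  # models @x[1..^*]
```

Under this reading a bare -1 is simply the position before the first element, so there is no discontinuity at either end of the sequence.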
Re: [S09] Whatever indices and shaped arrays
On 3/7/07, Jonathan Lang wrote:
> [summary snipped]

Looks good to me.

> As well, the fact that @x[-1] doesn't refer to the element immediately before @x[0] is awkward, as is the fact that @x[*-1] doesn't refer to the element immediately before @x[*+0]. IMHO, it would be cleaner to have @x[n] count forward and backward from the front of the array, while @x[*+n] counts forward and backward from just past the end of the array:

I suggested that at one point, so I'd agree that makes sense too. It avoids the discontinuity at either end of the array -- although arguably, points off the end of a list aren't in the same boat as elements that actually exist, so the discontinuity might be conceptually justified. (Make the weird things look weird?)

> (In fact, the semantics for @x[*+n] follows directly from the fact that an array returns the count of its elements in scalar context.) And @x[*] would be the same as @x[0..^*] or @x[0..(*-1)].

That's an elegance in its favour. One possible downside is that it wouldn't work for cyclic/wrap-around arrays (where the indices are always interpreted mod n) -- since any number would always refer to an existing element. Oh -- but if an index isn't a plain counter, then it should be a named key, so scrap that.

(The question then is: how to have reducible hash keys? By which I mean different keys that get reduced to the same thing, e.g. %x{1} === %x{5} === %x{9} === %x{13}, etc. Presumably you can just override the .{} method on your hash, right?)

> You would lose one thing: the ability to select an open-ended Range of elements. For a five-element list, @x[1..^*] means @x[1, 2, 3, 4], not @x[1, 2, 3, 4, 5, 6, 7, 8, ...].

Except wouldn't the .. interpret the * before the [] did? So 1..* would yield a range-object from 1 to Inf, and then the array-deref would interpret 1..Inf accordingly.

Actually, it seems more useful if the * could mean the count; you can always say 1..Inf if that's what you want, but otherwise how would you get [1..^*] meaning [1,2,3,4]? Perhaps the range could note when it's occurring in []-context, and interpret the * as count rather than as Inf?

-David
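The first reading raised here, where 1..* builds an endless range and the array dereference then interprets it, can be sketched in Python with a lazy counter that the subscript clips to the elements that exist. The helper names open_range and subscript are made up for the illustration, and the clipping rule is an assumption about what "accordingly" would mean.

```python
import itertools


def open_range(start):
    # `start..*` read as start..Inf: an endless, lazy counter.
    return itertools.count(start)


def subscript(items, indices):
    # Stop at the first index past the end, so an endless range
    # selects only the elements that are actually there.
    in_bounds = itertools.takewhile(lambda i: i < len(items), indices)
    return [items[i] for i in in_bounds]


letters = list("abcde")
rest = subscript(letters, open_range(1))
```

For a five-element list this yields the elements at positions 1 through 4, which is the same selection the message wants from [1..^*] when * means the count.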
Re: [S09] Whatever indices and shaped arrays
I like it. I'm a bit strapped for time at the moment, but if you send me a patch for S09 I can probably dig up a program to apply it with. :) Larry
Re: [S09] Whatever indices and shaped arrays
Larry Wall wrote:
> I like it. I'm a bit strapped for time at the moment, but if you send me a patch for S09 I can probably dig up a program to apply it with. :)

Could someone advise me on how to create patches?

--
Jonathan Dataweaver Lang
Re: [S09] Whatever indices and shaped arrays
Jonathan Lang wrote on 2007-03-06 13:35 (-0800):
> Could someone advise me on how to create patches?

Single file: diff -u oldfile newfile
Entire tree: diff -Nur oldtree/ newtree/

See also diff(1), and note that when diffing trees, you want to distclean them first :)

--
kind regards,

juerd waalboer: perl hacker [EMAIL PROTECTED] http://juerd.nl/sig
convolution: ict solutions and consultancy [EMAIL PROTECTED]

I do not trust voting computers. See http://www.wijvertrouwenstemcomputersniet.nl/.
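For anyone without diff(1) to hand, the same unified format can be produced with Python's standard difflib module. This is a small sketch, not a replacement for the commands above; the two-line file contents are placeholders, and the names oldfile and newfile simply mirror the diff -u example.

```python
import difflib

# Placeholder contents standing in for the old and new versions of a file.
old = ["first line\n", "second line\n"]
new = ["first line\n", "second line, revised\n"]

# unified_diff yields the patch line by line, like `diff -u oldfile newfile`.
patch = "".join(
    difflib.unified_diff(old, new, fromfile="oldfile", tofile="newfile")
)
print(patch)
```

The result starts with the usual --- and +++ header lines followed by an @@ hunk.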
Re: [S09] Whatever indices and shaped arrays
On 2/27/07, Jonathan Lang wrote: David Green wrote: So I end up back at one of Larry's older ideas, which basically is: [] for counting, {} for keys. What if you want to mix the two? I want the third element of row 5. In my proposal, that would be @array[5, *[2]]; in your proposal, there does not appear to be a way to do it. Unless the two approaches aren't mutually exclusive: @array{5, *[2]}. [...] Since this is an unlikely situation, the fact that nesting square braces inside curly braces is a bit uncomfortable isn't a problem: this is a case of making hard things possible, not making easy things easy. Oh, good point. Yes, I think that mixing them together that way makes sense. It also suggests that you could get at the named keys by applying {} to *: %foo[0, 1, *{'bar'}]; #first column, second row, bar layer The one gotcha that I see here is with the possibility of multi-dimensional arrays. In particular, should multi-dimensional indices be allowed inside square braces? [...] With that promise, you can always guarantee that the wrap-around semantics will work inside [], while nobody will expect them to work inside {}. Right, I don't see a problem with handling any number of dimensions that way. Furthermore, you could do away with the notion of shaped vs. unshaped: just give everything a default shape. The default shape for arrays would be '[*]' - that is, one dimension with an indeterminate number of ordinals. Meanwhile, shapes for {} would continue to use the current syntax. '[$x, $y, $z]' would be nearly equivalent to '{0..^$x; 0..^$y; 0..^$z}'. Agreed. it can work in the usual way: start at 0, end at -1. It is useful to be able to count past the ends of an array, and * can do this by going beyond the end: *+1, *+2, etc., or before the beginning: *-1, *-2, etc. (This neatly preserves the notion of * as all the elements -- *-1 is the position before everything, and *+1 is the position after everything else.) 
Regardless, I would prefer this notion to the offset from the endpoint notion currently in use. Note, however, that [*-1] wouldn't work in the ordinals paradigm; there simply is nothing before the first element. About the only use I could see for it would be to provide an assignment equivalent of unshift: '@array[*-1] = $x' could be equivalent to 'unshift @array, $x'. But note that, unlike the 'push'-type assignments, this would change what existing ordinals point to. I figured that *-1 or *+1 would work like unshift/push, which effectively does change what the ordinals point to (e.g. unshifting a P5 array). If the array is not extensible, then it should fail in the same way as unshift/push would. Meanwhile, {*-1} would only make sense in cases where keys are ordered and new keys can be auto-generated. Note also that {*+$x} is compatible with {*[$x]}: the former would reference outside of the known set of keys, while {*[$x]} would reference within them. Exactly. -David
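The shape equivalence quoted in this exchange, '[$x, $y, $z]' being nearly the same as '{0..^$x; 0..^$y; 0..^$z}' with '[*]' as the default, can be illustrated with a short Python sketch. The function names are hypothetical, and using None to stand for an unbounded '*' dimension is an assumption made for the model.

```python
def ordinal_shape_to_keys(shape):
    """Expand an ordinal shape such as (2, 3) into per-dimension key ranges."""
    return [range(n) for n in shape]


def fits(shape, index):
    """True if `index` lies inside `shape`; None models an unbounded '*'."""
    if len(index) != len(shape):
        return False
    return all(i >= 0 and (n is None or i < n) for i, n in zip(index, shape))


DEFAULT_SHAPE = (None,)          # the proposed default '[*]'
keys = ordinal_shape_to_keys((2, 3))
```

Every ordinal dimension is described by a single number, which is why the thread suggests commas rather than a multidimensional list for declaring such shapes.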
Re: [S09] Whatever indices and shaped arrays
On 2/24/07, Jonathan Lang wrote: In effect, using * as an array of indices gives us the ordinals notation that has been requested on occasion: '*[0]' means 'first element', '*[1]' means 'second element', '*[-1]' means 'last element', '*[0..2]' means 'first three elements', and so on - and this works regardless of what the actual indices are. Using * that way works, but it still is awkward, which makes me think there's something not quite dropping into place yet. We have the notion of keyed indexing via [] and counting/ordinal indexing via [*[]], which is rather a mouthful. So I end up back at one of Larry's older ideas, which basically is: [] for counting, {} for keys. To put a slight twist on it: instead of adding {}-indexing to arrays, consider that what makes something an array is that it doesn't have keys -- it's a collection of things that you can count through, as opposed to a collection that you search through by meaningful keys/names/tags/references/etc. (E.g., consider positional vs. named params, and how they naturally map onto an array and a hash respectively.) Now something that is countable doesn't have to have meaningful keys, but any keyed collection can be counted through; hence it makes sense to give hashes an array-like [] accessor for getting the first/last/nth item in the hash. In fact, this is basically what %h.values gives you -- turning the hash values into an array (well, a list). Saying %h[n] would amount to a direct way of saying @(%h.values)[n]. This becomes much more handy in P6, because hashes can be ordered. (Not that there's anything stopping you from counting through an unordered hash; %h[0] is always the first element of %h, you just might not know what that is, the same as with %h.values.) If Perl knows how to generate new keys on the fly (say, because your possible hash keys were declared as something inc-/dec-rementable), then you can even access elements off the ends of your hash (push/unshift). What about shaped arrays? 
A shape means the indices *signify* something (if they didn't, you wouldn't care, you'd just start at 0!). So they really are *keys*, and thus should use a hash (which may not use any hash tables at all, but it's still an associative array because it associates meaningful keys with elements). I'm not put off by calling it a hash -- I trust P6 to recognise when I declare a hash that is restricted to consecutive int keys, is ordered, etc. and to optimise accordingly. If there are no meaningful lookup keys, if all I can do to get through my list is count the items, then an array is called for, and it can work in the usual way: start at 0, end at -1. It is useful to be able to count past the ends of an array, and * can do this by going beyond the end: *+1, *+2, etc., or before the beginning: *-1, *-2, etc. (This neatly preserves the notion of * as all the elements -- *-1 is the position before everything, and *+1 is the position after everything else.) Well, at least this keeps the easy stuff (counting) easy, and the barely-harder stuff (keying) possible. In fact, since hashes would always have both views available, nothing is lost; we get ordinals for hashes, shaped collections, and ones that you can pass to a sub without losing their shape, it solves the problem of distinguishing between ordinal vs. funny indices (and the related issues of wrap-around), you can count past the edges, and all while preserving familiar array behaviour (especially for P5 veterans), the meaning of * as everything, and uncluttered syntax. -David
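The idea that %h[n] would be a direct way of saying @(%h.values)[n] has a close analogue in Python, whose dicts keep insertion order. A minimal sketch, assuming an ordered hash; nth_value and the sample keys are made-up names for the illustration.

```python
def nth_value(mapping, n):
    """Model of the proposed %h[n]: the n-th value in the hash's own order."""
    return list(mapping.values())[n]


layers = {"foo": 10, "bar": 20, "baz": 30}

first = nth_value(layers, 0)    # counting view, like %h[0]
last = nth_value(layers, -1)    # counting view, like %h[-1]
by_name = layers["bar"]         # keyed view, like %h{'bar'}
```

Both views stay available on the same collection, which is the point being made: nothing is lost by giving a keyed collection a counting accessor.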
Re: [S09] Whatever indices and shaped arrays
David Green wrote:
> On 2/24/07, Jonathan Lang wrote:
>> In effect, using * as an array of indices gives us the ordinals notation that has been requested on occasion: '*[0]' means 'first element', '*[1]' means 'second element', '*[-1]' means 'last element', '*[0..2]' means 'first three elements', and so on - and this works regardless of what the actual indices are.
> Using * that way works, but it still is awkward, which makes me think there's something not quite dropping into place yet. We have the notion of keyed indexing via [] and counting/ordinal indexing via [*[]], which is rather a mouthful. So I end up back at one of Larry's older ideas, which basically is: [] for counting, {} for keys.

What if you want to mix the two? I want the third element of row 5. In my proposal, that would be @array[5, *[2]]; in your proposal, there does not appear to be a way to do it. Unless the two approaches aren't mutually exclusive: @array{5, *[2]}. That is, allow subscripted Whatevers within curly braces to enable the mixing of ordinals and keys. Since this is an unlikely situation, the fact that nesting square braces inside curly braces is a bit uncomfortable isn't a problem: this is a case of making hard things possible, not making easy things easy.

> What about shaped arrays? A shape means the indices *signify* something (if they didn't, you wouldn't care, you'd just start at 0!). So they really are *keys*, and thus should use a hash (which may not use any hash tables at all, but it's still an associative array because it associates meaningful keys with elements). I'm not put off by calling it a hash -- I trust P6 to recognise when I declare a hash that is restricted to consecutive int keys, is ordered, etc. and to optimise accordingly.

The one gotcha that I see here is with the possibility of multi-dimensional arrays. In particular, should multi-dimensional indices be allowed inside square braces? My gut instinct is yes; conceptually, the third row of the fourth column is perfectly reasonable terminology to use. The thing that would distinguish [] from {} would be a promise to always use zero-based, consecutive integers as your indices, however many dimensions you specify. With that promise, you can always guarantee that the wrap-around semantics will work inside [], while nobody will expect them to work inside {}.

In short, the distinction being made here isn't unshaped vs. shaped; it's ordinal indices vs. named indices, or ordinals vs. keys.

That said, note that - in the current conception, at least - one of the defining features of a shaped array is that trying to access anything outside of the shape will cause an exception. How would shapes work with the ordinals-and-keys paradigm?

First: Ordinals have some severe restrictions on how they can be shaped, as specified above. The only degrees of freedom you have are how many dimensions are allowed and, for each dimension, how many ordinals are permitted. Well, also the value type (although the key type is fixed as Int where 0..*). So you could say something like:

    my @array[2, 3, *]

...which would mean that the array must be three-dimensional; that the first dimension is allowed two ordinals, the second is allowed three, and the third is allowed any number of them - i.e., 'my @array[^2; ^3; 0..*]' in the current syntax. Or you could say:

    my @array[2, **, 2]

...meaning that you can have any number of dimensions, but the first and the last would be constrained to two ordinals each: 'my @array[^2; **; ^2]'.

Note the use of commas above. Since each dimension can only take a single value (a non-negative integer), there's no reason to use a multidimensional list to define the shape. Personally, I like this approach: it strikes me as being refreshingly uncluttered.

Furthermore, you could do away with the notion of shaped vs. unshaped: just give everything a default shape.
The default shape for arrays would be '[*]' - that is, one dimension with an indeterminate number of ordinals. Meanwhile, shapes for {} would continue to use the current syntax. '[$x, $y, $z]' would be nearly equivalent to '{0..^$x; 0..^$y; 0..^$z}'. If there are no meaningful lookup keys, if all I can do to get through my list is count the items, then an array is called for, and it can work in the usual way: start at 0, end at -1. It is useful to be able to count past the ends of an array, and * can do this by going beyond the end: *+1, *+2, etc., or before the beginning: *-1, *-2, etc. (This neatly preserves the notion of * as all the elements -- *-1 is the position before everything, and *+1 is the position after everything else.) Regardless, I would prefer this notion to the offset from the endpoint notion currently in use. Note, however, that [*-1] wouldn't work in the ordinals paradigm; there simply is nothing before the first element. About the only use I could see for it would be to provide an assignment equivalent of unshift: '@array[*-1] = $x' could
Re: [S09] Whatever indices and shaped arrays
Jonathan Lang wrote: Larry Wall wrote: : If you want the last index, say '*[-1]' instead of '* - 1'. : If you want the first index, say '*[0]' instead of '* + 0'. So the generic version of leaving off both ends would be *[1]..*[-2] (ignoring that we'd probably write *[0]^..^*[-1] for that instead). Correct - although that assumes that the indices are consecutive (as opposed to, say, 1, 2, 4, 8, 16...); this version of * makes no such assumption. Another thought: '*[1..-2]' or '*[0^..^-1]' would do the trick here - except for the fact that the Range 1..-2 doesn't normally make sense. Suggestion: when dealing with Ranges in unshaped arrays, negative endpoints are treated like negative indices (i.e., '$_ += [EMAIL PROTECTED]'). In effect, using * as an array of indices gives us the ordinals notation that has been requested on occasion: '*[0]' means 'first element', '*[1]' means 'second element', '*[-1]' means 'last element', '*[0..2]' means 'first three elements', and so on - and this works regardless of what the actual indices are. Like I said, I tend to miss intricacies. For instance, I never considered what would be involved in applying a subscriptor to a multidimensional Whatever (e.g., what can you do with '**[...]'?). Part of that is that I'm not yet comfortable with multidimensional slices (or arrays, for that matter); when reading about them, I keep on getting the feeling that there's something going on here that the big boys know about that I don't - implicit assumptions, et al. I think I've got a better grip on it now. Here's how I understand it to work: A multidimensional array is defined by providing a list of lists, each giving all of the valid indices along one axis (i.e., in one dimension). The overall shape of the array will be rectangular, or a higher-dimensional analog of rectangular. 
There may be gaps in the indices (in which case the array is a sparse array as well as a multidimensional array); but if there are, the gaps also conform to the rectangular structure: it's as if you carved a solid rectangle into two or more rectangular pieces and pulled them apart a bit. That is, @array[-1, +1; -1, +1] is effectively a 2x2 square array with valid x-indices of -1 and +1 and valid y-indices of -1 and +1.

To access an element in a multidimensional array, use a semicolon-delimited list of indices in the square braces: '@cube[1;1;1]' will access the center element of a [^3;^3;^3] shaped array, while '@array[*;*;1]' will access a 3x3 horizontal slice of it.

When putting together a list literal, things work a bit differently. Create a one-dimensional literal by means of a comma-delimited list of values; create a two-dimensional literal by means of a semicolon-delimited list of comma-delimited lists of values:

    1, 2, 3             # one-dimensional list literal with a length of 3
    (1, 2, 3; 4, 5, 6)  # two-dimensional list literal with a length of 2 and a width of 3.
    (1; 2; 3)           # two-dimensional list literal with a length of 3 and a width of 1.

I would guess that you would build higher-dimensional literals by nesting parentheses-enclosed semicolon-delimited lists:

    (( 0,  1;  2,  3;  4,  5;  6,  7;  8,  9);
     (10, 11; 12, 13; 14, 15; 16, 17; 18, 19);
     (20, 21; 22, 23; 24, 25; 26, 27; 28, 29))
    # three-dimensional list literal with a length of 3, a width of 5, and a height of 2.

The outermost set of semicolons delimits the first dimension, and the commas delimit the last dimension. That is, semicolon-delimited lists nest, and comma-delimited lists flatten.

Furthermore, the list literal gets assigned to the array by means of ordinal coordinates:

    my @cube[-1..+1; -1..+1; -1..+1] =
        (( 1,  2,  3;  4,  5,  6;  7,  8,  9);
         (10, 11, 12; 13, 14, 15; 16, 17, 18);
         (19, 20, 21; 22, 23, 24; 25, 26, 27));

would be equivalent to

    my @cube[-1..+1; -1..+1; -1..+1];
    @cube[**[0; **]] = ( 1,  2,  3;  4,  5,  6;  7,  8,  9);
    @cube[**[1; **]] = (10, 11, 12; 13, 14, 15; 16, 17, 18);
    @cube[**[2; **]] = (19, 20, 21; 22, 23, 24; 25, 26, 27);

or

    my @cube[-1..+1; -1..+1; -1..+1];
    @cube[**[0; 0; *]] =  1,  2,  3;
    @cube[**[0; 1; *]] =  4,  5,  6;
    @cube[**[0; 2; *]] =  7,  8,  9;
    @cube[**[1; 0; *]] = 10, 11, 12;
    @cube[**[1; 1; *]] = 13, 14, 15;
    @cube[**[1; 2; *]] = 16, 17, 18;
    @cube[**[2; 0; *]] = 19, 20, 21;
    @cube[**[2; 1; *]] = 22, 23, 24;
    @cube[**[2; 2; *]] = 25, 26, 27;

or

    my @cube[-1..+1; -1..+1; -1..+1];
    @cube[**[0; 0; 0]] = 1;
    @cube[**[0; 0; 1]] = 2;
    @cube[**[0; 0; 2]] = 3;
    @cube[**[0; 1; 0]] = 4;
    @cube[**[0; 1; 1]] = 5;
    @cube[**[0; 1; 2]] = 6;
    ...

where

    say @cube[**[1; 1; 1]];

would be equivalent to

    say @cube[0; 0; 0];

Do I have the general idea?

--
Jonathan Dataweaver Lang
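The cube example can be checked with a small Python model in which a nested literal is filled in by ordinal position while lookups may use either ordinals or the declared keys. This is a sketch under the assumption that every axis has keys -1..+1, as in the first declaration; AXIS_KEYS and by_ordinal are hypothetical names.

```python
AXIS_KEYS = [-1, 0, 1]   # valid keys on every axis, as in @cube[-1..+1; ...]

literal = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[10, 11, 12], [13, 14, 15], [16, 17, 18]],
    [[19, 20, 21], [22, 23, 24], [25, 26, 27]],
]

# Assign by ordinal coordinates: position (i, j, k) in the literal goes to
# the key found at position i, j, k of each axis.
cube = {}
for i, plane in enumerate(literal):
    for j, row in enumerate(plane):
        for k, value in enumerate(row):
            cube[(AXIS_KEYS[i], AXIS_KEYS[j], AXIS_KEYS[k])] = value


def by_ordinal(i, j, k):
    """Models @cube[**[i; j; k]]: translate ordinals to keys, then look up."""
    return cube[(AXIS_KEYS[i], AXIS_KEYS[j], AXIS_KEYS[k])]
```

In this model the ordinal coordinates (1, 1, 1) and the keys (0, 0, 0) both name the centre of the cube.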
[S09] Whatever indices and shaped arrays
From S09:

    When you use * with + and -, it creates a value of Whatever but Num, which the array subscript interpreter will interpret as the subscript one off the end of that dimension of the array.

    Alternately, *+0 is the first element, and the subscript dwims from the front or back depending on the sign. That would be more symmetrical, but makes the idea of * in a subscript a little more distant from the notion of 'all the keys', which would be a loss, and potentially makes +* not mean the number of keys.

If '*+0' isn't the first element, then '*+$x' is only meaningful if $x < 0.

That said, I think I can do one better:

Ditch all of the above. Instead, '*' always acts like a list of all valid indices when used in the context of postcircumfix:<[ ]>.

If you want the last index, say '*[-1]' instead of '* - 1'.
If you want the first index, say '*[0]' instead of '* + 0'.

So the four corners of a two-dimensional array would be:

    @array[ *[0]; *[0] ];  @array[ *[-1]; *[0] ];
    @array[ *[0]; *[-1] ]; @array[ *[-1]; *[-1] ];

The only thing lost here is that '@array[+*]' is unlikely to point just past the end of a shaped array. But then, one of the points of shaped arrays is that if you point at an invalid index, you get a complaint; so I don't see why one would want to knowingly point to one.

--

Also, has the syntax for accessing an array's shape been determined yet? If not, I'd like to propose the following:

@array.shape returns a list of lists, with the top-level list's indices corresponding to the dimensions of the shape and each nested list containing every valid index in that dimension. In boolean context, the shape method returns true if the array is shaped and false if not - though an unshaped array will otherwise pretend to be a one-dimensional, zero-based, non-sparse, shaped array.

So:

    @array.shape[0][2]  # the third valid index of the first dimension of the shape
    @array.shape[-1][0] # the first valid index of the last dimension of the shape
    @array.shape[1]     # every valid index of the second dimension of the shape
    @array.shape[1][*]  # same as @array.shape[1]

    [EMAIL PROTECTED] # is this a shaped array?

    exists @array.shape[2]    # does the array have a third dimension?
    exists @array.shape[3][4] # does the fourth dimension have a fifth element?

    [EMAIL PROTECTED] # how many dimensions does the shape have?
    [EMAIL PROTECTED] # how many indices does the first dimension have?

If we use this notation, then

    @array[ *; * ]

is shorthand for

    @array[ @array.shape[0]; @array.shape[1] ]

--
Jonathan Dataweaver Lang
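The list-of-lists reading of .shape proposed in this message maps directly onto nested Python lists. The sketch below uses made-up index values for a two-dimensional array; star_star and has_dimension are hypothetical helpers, and the cross-product reading of @array[ *; * ] is an assumption about how the shorthand would expand.

```python
from itertools import product

# Made-up valid indices: one inner list per dimension.
shape = [[10, 20, 30], [1, 2, 4]]

third_of_first = shape[0][2]    # like @array.shape[0][2]
first_of_last = shape[-1][0]    # like @array.shape[-1][0]
dimensions = len(shape)         # how many dimensions the shape has


def star_star(shape):
    """Every index pair selected by the shorthand @array[ *; * ]."""
    return list(product(shape[0], shape[1]))


def has_dimension(shape, d):
    """Like `exists @array.shape[d]` for a non-negative dimension number."""
    return 0 <= d < len(shape)
```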
Re: [S09] Whatever indices and shaped arrays
On Fri, Feb 23, 2007 at 10:49:34AM -0800, Jonathan Lang wrote: : That said, I think I can do one better: : : Ditch all of the above. Instead, '*' always acts like a list of all : valid indices when used in the context of postcircumfix:[ ]. Ooh, shiny! Or at least, shiny on the shiny side... : If you want the last index, say '*[-1]' instead of '* - 1'. : If you want the first index, say '*[0]' instead of '* + 0'. So the generic version of leaving off both ends would be *[1]..*[-2] (ignoring that we'd probably write *[0]^..^*[-1] for that instead). : So the four corners of a two-dimensional array would be: : : @array[ *[0]; *[0] ]; @array[ *[-1]; *[0] ]; : @array[ *[0]; *[-1] ]; @array[ *[-1]; *[-1] ]; A point against it visually is the nested use of []. : The only thing lost here is that '@array[+*]' is unlikely to point : just past the end of a shaped array. But then, one of the points of : shaped arrays is that if you point at an invalid index, you get a : complaint; so I don't see why one would want to knowingly point to : one. I would expect that to point to one off the end in the first dimension only, which might make sense if that dimension is extensible: my @array[*;2;2]; Adding something at +* would then add another 2x2 under it. : Also, has the syntax for accessing an array's shape been determined : yet? If not, I'd like to propose the following: : : @array.shape returns a list of lists, with the top-level list's : indices corresponding to the dimensions of the shape and each nested : list containing every valid index in that dimension. In boolean : context, the shape method returns true if the array is shaped and : false if not - though an unshaped array will otherwise pretend to be a : one-dimensional, zero-based, non-sparse, shaped array. That's more or less how I was thinking of it, though I hadn't got as far as boolean context. 
: So: : : @array.shape[0][2] # the third valid index of the first dimension of the : shape : @array.shape[-1][0] # the first valid index of the last dimension of the : shape : @array.shape[1] # every valid index of the second dimension of the shape : @array.shape[1][*] # same as @array.shape[1] : : [EMAIL PROTECTED] # is this a shaped array? : : exists @array.shape[2] # does the array have a third dimension? : exists @array.shape[3][4] # does the fourth dimension have a fifth element? : : [EMAIL PROTECTED] # how many dimensions does the shape have? : [EMAIL PROTECTED] # how many indices does the first dimension have? : : If we use this notation, then : : @array[ *; * ] : : is shorthand for : : @array[ @array.shape[0]; @array.shape[1] ] Note also that multidimensional whatever gives us @array[ ** ] to mean @array[ @@( @array.shape[*] ) ] or some such. Though ** might want to be even smarter than that if we want @array[ 0; **; 42] to dwim. That'd have to turn into something like: @array[ 0; @@( @array.shape[*[1]..*[-2]] ); 42 ] Also +** might return a shape vector, or maybe +«**. Larry
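The expansion Larry describes, where ** in @array[ 0; **; 42 ] stands for every dimension not claimed by the explicit subscripts on either side, can be sketched in Python as a slice of the shape list. The function name, the use of the string "**" as a marker, and the sample shape are all assumptions for the illustration.

```python
def expand_double_star(subscript, shape):
    """Replace one "**" marker with the index lists of the dimensions it covers."""
    pos = subscript.index("**")
    before, after = subscript[:pos], subscript[pos + 1:]
    # The middle dimensions are whatever is left between the explicit ends,
    # i.e. shape[1..-2] when one subscript sits on each side.
    middle = shape[len(before):len(shape) - len(after)]
    return before + middle + after


# Made-up four-dimensional shape: one list of valid indices per dimension.
shape = [[0, 1], [5, 6], [7, 8, 9], [41, 42]]

expanded = expand_double_star([0, "**", 42], shape)
everything = expand_double_star(["**"], shape)
```

A bare ** with nothing on either side expands to the whole shape, matching the first reading of @array[ ** ] given above.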
Re: [S09] Whatever indices and shaped arrays
Larry Wall wrote:
> On Fri, Feb 23, 2007 at 10:49:34AM -0800, Jonathan Lang wrote:
> : That said, I think I can do one better:
> :
> : Ditch all of the above. Instead, '*' always acts like a list of all
> : valid indices when used in the context of postcircumfix:<[ ]>.
>
> Ooh, shiny! Or at least, shiny on the shiny side...

Thank you...

> : If you want the last index, say '*[-1]' instead of '* - 1'.
> : If you want the first index, say '*[0]' instead of '* + 0'.
>
> So the generic version of leaving off both ends would be *[1]..*[-2] (ignoring that we'd probably write *[0]^..^*[-1] for that instead).

Correct - although that assumes that the indices are consecutive (as opposed to, say, 1, 2, 4, 8, 16...); this version of * makes no such assumption.

I do find myself wondering what *[-1] would be for an infinite array, such as @nums[0..*:by(2)]. One possible answer: Inf. Another possible answer: the shape sets limits on the indices; it does not set requirements. For instance:

    my @nums[0..*:by(2)];
    @nums[2 * $_] = $_ for 0..5;
    say @nums[ *[-1] ]; # same as 'say @nums[10];'
    @nums[42] = 21;
    say @nums[ *[-1] ]; # same as 'say @nums[42];'
    say @nums[ *[-2] ]; # same as 'say @nums[40];' - whatever that means.

> : Also, has the syntax for accessing an array's shape been determined
> : yet? If not, I'd like to propose the following:
> :
> : @array.shape returns a list of lists, with the top-level list's
> : indices corresponding to the dimensions of the shape and each nested
> : list containing every valid index in that dimension. In boolean
> : context, the shape method returns true if the array is shaped and
> : false if not - though an unshaped array will otherwise pretend to be a
> : one-dimensional, zero-based, non-sparse, shaped array.
>
> That's more or less how I was thinking of it, though I hadn't got as far as boolean context.

I'm still debating the boolean context myself. I _think_ it will work; but I have a tendency to miss intricacies. You might instead want to require someone to explicitly check for definedness or existence instead of merely truth; or you might not.

> : If we use this notation, then
> :
> : @array[ *; * ]
> :
> : is shorthand for
> :
> : @array[ @array.shape[0]; @array.shape[1] ]
>
> Note also that multidimensional whatever gives us
>
>     @array[ ** ]
>
> to mean
>
>     @array[ @@( @array.shape[*] ) ]
>
> or some such.

Like I said, I tend to miss intricacies. For instance, I never considered what would be involved in applying a subscriptor to a multidimensional Whatever (e.g., what can you do with '**[...]'?). Part of that is that I'm not yet comfortable with multidimensional slices (or arrays, for that matter); when reading about them, I keep on getting the feeling that there's something going on here that the big boys know about that I don't - implicit assumptions, et al.

> Though ** might want to be even smarter than that if we want
>
>     @array[ 0; **; 42]
>
> to dwim. That'd have to turn into something like:
>
>     @array[ 0; @@( @array.shape[*[1]..*[-2]] ); 42 ]
>
> Also +** might return a shape vector, or maybe +«**.

If by shape vector you mean something that says the array has a length of 5, a width of 3, and a height of 2, +«** would seem to be the more appropriate syntax. Why you'd want that inside an array's subscriptor is beyond me, for a similar reason to +*. But that's what the logic of the syntax gives. But I _could_ see using '[EMAIL PROTECTED]' to get an array's measurements.

Hmm...

    my @@square = (1, 2; 3, 4);
    say +@@square; # say what? '2', as in 2 dimensions? Or '4', as in 4 items?

Answer that, and you'll know what +** would give you.

--

BTW: could the parser handle the following?

    @array[ *[0] * 2; 2 ** **[ [;] 0 x *] ]

--
Jonathan Dataweaver Lang
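The second answer suggested in this message for the infinite-array question, that the shape limits the indices without requiring them, can be modelled in Python. This is only one possible reading: here * lists every shape-valid index up to the highest one assigned so far, which is what makes *[-2] come out as 40 in the @nums example. EvenIndexed is a hypothetical class.

```python
class EvenIndexed:
    """Toy array whose shape allows only even, non-negative indices."""

    def __init__(self):
        self.cells = {}

    def __setitem__(self, index, value):
        if index < 0 or index % 2:
            raise IndexError(f"{index} is outside the shape")
        self.cells[index] = value

    def indices(self):
        # Model of '*': every valid index up to the highest one in use.
        if not self.cells:
            return []
        return list(range(0, max(self.cells) + 1, 2))


nums = EvenIndexed()
for n in range(6):
    nums[2 * n] = n
last_before = nums.indices()[-1]   # like *[-1] after the loop

nums[42] = 21
last_after = nums.indices()[-1]    # like *[-1] after the extra assignment
second_last = nums.indices()[-2]   # like *[-2]
```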
Re: [S09] Whatever indices and shaped arrays
On 2/23/07, Jonathan Lang [EMAIL PROTECTED] wrote:
> I'm still debating the boolean context myself. I _think_ it will work; but I have a tendency to miss intricacies. You might instead want to require someone to explicitly check for definedness or existence instead of merely truth; or you might not.

I should chime in with something here. It may not be practical for Perl, given how much we have already relied on its opposite, but it is still worth considering:

I have been extremely satisfied with Ruby's boolean truth model: nil and false are false, everything else is true. So the empty string is true, 0 is true, "0" is certainly true.

I think it's the same reason that I like Haskell's function call model: that is, function application binds most tightly, everything else has various looser precedence. I think the nice thing about these two is their extreme simplicity. In Haskell, when I read:

    foo x ! bar

I don't need to think for a fraction of a second to associate that correctly in my mind. Likewise, in Ruby, when I write:

    while line = gets

I don't need to think for a fraction of a second about edge cases of gets. Gets returns a string when it reads something, and nil on EOF, that's all I need to know. And the simplicity has a way of propagating to other areas of the language, as in that example, where gets is able to return the most obvious thing for EOF and have it work correctly.

So, yeah, simple rules can be a blessing if you find the right ones. In particular, since I use boolean context a lot (i.e. without explicit compare operators), I'm a fan of as much boolean predictability as I can get. Even if we don't get the same simple model of booleans as Ruby, I'd like to keep the number of boolean-context overloaded objects reasonably small. This gives functions the freedom to return false or undef as a failure mode, when it is convenient for it to function that way.

Luke
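The truth model praised here is easy to state as code. The Python sketch below contrasts it with Python's own truthiness rules, which resemble Perl's in treating empty and zero values as false; ruby_truthy is a hypothetical helper, with None standing in for Ruby's nil.

```python
def ruby_truthy(value):
    """Ruby-style truth: only nil (None here) and false are false."""
    return value is not None and value is not False


samples = ["", 0, "0", [], None, False]

python_view = [bool(v) for v in samples]
ruby_view = [ruby_truthy(v) for v in samples]
```

Under the simpler rule a function can return None to signal failure without any risk that a legitimate result such as 0 or an empty string is mistaken for it.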