Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sun, Oct 01, 2000 at 08:51:04AM +1100, Jeremy Howard wrote: A prototypeless-function call. get rid of them all!! Please no! Anything that makes it harder to write 'quick-and-dirty' scripts is never going to fly--this is part of what makes Perl special. Why? I see no problem in making -Mstrict and -Wall the defaults. Then make '-E' option to mean what '-e' means today, and '-e' mean -M-strict -Wnone -E Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
get rid of them all!! Ilya Zakharevich wrote: On Thu, Sep 28, 2000 at 11:39:51AM -0400, Karl Glazebrook wrote: so what is wrong with the statement '@y = 3*@x;' then ? That other constructs *also* create an array context, in which the behaviour of multiplication you propose is not appropriate. for example? A prototypeless-function call. Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Karl Glazebrook wrote: Ilya Zakharevich wrote: On Thu, Sep 28, 2000 at 11:39:51AM -0400, Karl Glazebrook wrote: so what is wrong with the statement '@y = 3*@x;' then ? That other constructs *also* create an array context, in which the behaviour of multiplication you propose is not appropriate. for example? A prototypeless-function call. get rid of them all!! Please no! Anything that makes it harder to write 'quick-and-dirty' scripts is never going to fly--this is part of what makes Perl special. I would like to see array operations occur inside prototypeless function calls, which as Ilya notes already creates array context. This is not fundamentally 'inappropriate', although it is a change from P5. It just means having to type 'scalar @arr' when that's what you mean--and having the P52P6 converter do the same.
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: so what is wrong with the statement '@y = 3*@x;' then ? That other constructs *also* create an array context, in which the behaviour of multiplication you propose is not appropriate. for example? I did not see any viable proposal on changing things in a major way. To design such a change is a *major* work. We need to keep a lot of possible combinations with other features in mind, and understand all the ramifications and desired/undesired interaction. We need insight. We need to balance the tradeoffs. This is what will happen no doubt, and what will emerge will probably be less than the radicals hope for and more than the conservatives would want! I did not mean interviews. 10 years ago I read the manual. It was clearly there. I am sure it was, the guy is nuts. Karl.
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Mon, Sep 25, 2000 at 06:30:22PM -0400, Karl Glazebrook wrote: Well, this shows that you entirely miss the problem of cryptocontexts. Context is determined by the "environment" of the operation, not by the operation. Context is propagated: the-left-hand-side-of-assignment --- the-right-hand-side-of-assignment so what is wrong with the statement '@y = 3*@x;' then ? That other constructs *also* create an array context, in which the behaviour of multiplication you propose is not appropriate. Changing Perl in this respect will make one particular mode of operation a tiny bit simpler, but (without major changes to cryptocontexting - PLUG see for example my interview on perl.com /PLUG) will make life much harder in other modes of operation. I think major changes are what we are talking about here. I did not see any viable proposal on changing things in a major way. To design such a change is a *major* work. We need to keep a lot of possible combinations with other features in mind, and understand all the ramifications and desired/undesired interaction. We need insight. We need to balance the tradeoffs. I do not think we made *any* step in the correct direction yet. Remember: do you do your system maintenance in Mathematica? Why? Remember that Wolfram *wanted* you to do this? Perl5 is much better balanced. You are pulling the blanket to your side of the bed. I am not sure what point you are trying to make about Mathematica? I have read interviews with Wolfram, he is clearly an egomaniac and thinks everything should be an expression, but I am not sure he was arguing for system management in Mathematica. I did not mean interviews. 10 years ago I read the manual. It was clearly there. Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
[EMAIL PROTECTED] wrote: Ilya Zakharevich wrote: ...Do you say you are confused by using vectors (=scalars) instead of arrays? I'm not having a problem with that personally but *many* users of PDL have complained about being confused by this. They assume ndim == array == perl array. Christian Yes this is the point. I guess another way of looking at it is saying that 3*@a operates in a list context not a scalar context and that we will define the behaviour of '*' in this context. (Currently it is not defined, hence @a is converted to scalar(@a)). Karl
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: But with Fortran such things are not *needed*. Compilers are smart enough to convert (equivalents to) map 3*$_, 34..67 This is true, but easier (and less buggy) to say exactly what you mean. 102:201:3 Anyway the idea has been proposed, it won't break Perl, we'll see what happens. f(3*@a) would typically be a list context - and suddenly instead of 3*(1+$#a) you get C<map 3*$_, @a>. This is true, what I would propose is we declare 3*(1+$#a) outmoded and always have it mean C<map 3*$_, @a> in all contexts. This of course will break perl5 code. Not mine because I always say 3*scalar(@a) because 3*@a does not look like 3*(1+$#a) to me. I don't know how many people would depend on that feature. There is also the problem that we are arguing somewhat in a vacuum as we don't know how radical perl6 (in terms of syntax changes) will be. Anyhow the various proposals are out there, we'll see what happens. Why? Currently you can make them look like references to array. See Math::Pari for an implementation. Overloading '@{}' gives yet another way to do this. True but the user has to remember 'owe I am now using a special PDL array which means I have to always use a reference to it rather than treat it like a perl array'. Not good. It's really hard to explain why people should use @x[1..10] for perl arrays and $x->slice("1:10") for PDL arrays! Use $x->[1..10] for both. This is true, but inelegant. If perl @x arrays are not considered useful why not get rid of them and always use references? Karl
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Fri, Sep 22, 2000 at 11:17:40AM -0400, Karl Glazebrook wrote: [Cryptocontext is:] f(3*@a) would typically be a list context - and suddenly instead of 3*(1+$#a) you get C<map 3*$_, @a>. This is true, what I would propose is we declare 3*(1+$#a) outmoded and always have it mean C<map 3*$_, @a> in all contexts. You are trading a frequently used shortcut @a == 1 + $#a for a rarely-used-but-beautiful/intuitive semantic. I'm not sure it is a win. Moreover, $x = 3 * @_; suddenly being equivalent to $x = @_; does not look very promising... Why? Currently you can make them look like references to array. See Math::Pari for an implementation. Overloading '@{}' gives yet another way to do this. True but the user has to remember 'owe I am now using a special PDL array which means I have to always use a reference to it rather than treat it like a perl array'. Not good. No, you do not use "a special PDL array", you use "a vector". A subtle change in wording - and no conflict. This is true, but inelegant. If perl @x arrays are not considered useful why not get rid of them and always use references? Actually, this is what Perl is using internally (they are softreferences==globs, but who cares?). Ilya
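For readers following the context argument, the Perl 5 behaviour under debate can be checked directly. The sketch below is only an illustration: count_args and first_arg are hypothetical helpers, not anything proposed in the thread. It shows that the operands of '*' are evaluated in scalar context even when the expression sits inside a function's argument list.

```perl
use strict;
use warnings;

# Hypothetical helpers, introduced only for this illustration.
sub count_args { return scalar @_ }
sub first_arg  { return $_[0] }

my @a = (10, 20, 30);

# The shortcut under discussion: in numeric context @a is its length.
my $same = (@a == 1 + $#a) ? 'yes' : 'no';

# The argument list is a list context, but '*' still puts @a in
# scalar context, so the callee receives the single value 3 * 3.
my $n_args = count_args(3 * @a);
my $value  = first_arg(3 * @a);

# The element-wise form has to be spelled out in Perl 5.
my @scaled = map { 3 * $_ } @a;

print "shortcut holds: $same\n";
print "callee saw $n_args argument with value $value\n";
print "map form: @scaled\n";
```

Under the semantics Karl proposes, the callee would instead receive three arguments (30, 60, 90), which is the change in meaning Ilya is objecting to.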
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: You are trading a frequently used shortcut @a == 1 + $#a for a rarely-used-but-beautiful/intuitive semantic. I'm not sure it is a win. It's now boiling down to a matter of opinion and we'll have to agree to differ. Of course I use array arithmetic all the time as a heavy PDL user. Moreover, $x = 3 * @_; suddenly being equivalent to $x = @_; does not look very promising... But would this not be easy to catch and warn about with a p5tp6 converter? No, you do not use "a special PDL array", you use "a vector". A subtle change in wording - and no conflict. sure, but vector to me means 1D and also some sort of transformation properties whereas a PDL array is just a N-dim square container. anyway semantics - we call them 'piddles' which is moderately amusing but inelegant. This is true, but inelegant. If perl @x arrays are not considered useful why not get rid of them and always use references? Actually, this is what Perl is using internally (they are softreferences==globs, but who cares?). Hmm Karl
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Fri, Sep 22, 2000 at 05:24:55PM -0400, Karl Glazebrook wrote: It's now boiling down to a matter of opinion and we'll have to agree to differ. Of course I use array arithmetic all the time as a heavy PDL user. ...Do you say you are confused by using vectors (=scalars) instead of arrays? Moreover, $x = 3 * @_; suddenly being equivalent to $x = @_; does not look very promising... But would this not be easy to catch and warn about with a p5tp6 converter? Why converters? I'm discussing Perl6 now, not converters. No, you do not use "a special PDL array", you use "a vector". A subtle change in wording - and no conflict. sure, but vector to me means 1D and also some sort of transformation properties whereas a PDL array is just a N-dim square container. An N-dim container is just a vector which contains vectors... Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: Moreover, $x = 3 * @_; suddenly being equivalent to $x = @_; does not look very promising... Why are these equivalent? RFC 82 only applies in list context. Am I missing something?
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sat, Sep 23, 2000 at 09:52:51AM +1100, Jeremy Howard wrote: $x = 3 * @_; suddenly being equivalent to $x = @_; does not look very promising... Why are these equivalent? RFC 82 only applies in list context. Am I missing something? Yes, the proposal to make the map 3*$_ semantic work in a scalar context too (to avoid cryptocontext). Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Karl Glazebrook wrote: Ilya Zakharevich wrote: You are trading a frequently used shortcut @a == 1 + $#a for a rarely-used-but-beautiful/intuitive semantic. I'm not sure it is a win. It's now boiling down to a matter of opinion and we'll have to agree to differ. Of course I use array arithmetic all the time as a heavy PDL user. It's not just for number-crunchers either. Array notation greatly simplifies many frequently used operations. For instance (from RFC 82):

    @people = ('adam', 'eve ', 'bob ');
    @scores = (7,9,5);            # Score for each person
    @histogram = '#' x @scores;   # Returns ('#######','#########','#####')
    print join("\n", @people . ' ' . @histogram);

    adam #######
    eve  #########
    bob  #####

Array notation is not 'rarely used' in languages that support it--in fact, operations are applied to arrays and lists at least as often as scalars in most code I see written for Mathematica, J, PDL, and so forth.
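For comparison, here is roughly how the histogram example has to be written in Perl 5 today. This is a sketch only; the intermediate @lines array is an assumption added for the illustration and is not part of RFC 82.

```perl
use strict;
use warnings;

my @people = ('adam', 'eve ', 'bob ');
my @scores = (7, 9, 5);

# In Perl 5, '#' x @scores repeats '#' scalar(@scores) == 3 times,
# so the per-element repetition is written with map instead.
my @histogram = map { '#' x $_ } @scores;

# Pair each name with its bar by index.
my @lines = map { "$people[$_] $histogram[$_]" } 0 .. $#people;

print join("\n", @lines), "\n";
```

The explicit index loop in the second map is the kind of bookkeeping the element-wise notation is meant to remove.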
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sat, Sep 23, 2000 at 10:01:11AM +1100, Jeremy Howard wrote: It's now boiling down to a matter of opinion and we'll have to agree to differ. Of course I use array arithmetic all the time as a heavy PDL user. It's not just for number-crunchers either. Array notation greatly simplifies many frequently used operations. For instance (from RFC 82):

    @people = ('adam', 'eve ', 'bob ');
    @scores = (7,9,5);            # Score for each person
    @histogram = '#' x @scores;   # Returns ('#######','#########','#####')
    print join("\n", @people . ' ' . @histogram);

    adam #######
    eve  #########
    bob  #####

Are you trying to convince me/us that is going to be used often? Array notation is not 'rarely used' in languages that support it--in fact, operations are applied to arrays and lists at least as often as scalars in most code I see written for Mathematica, J, PDL, and so forth. a) You can *already* use vectors as scalars in Perl; b) What we are discussing is Perl, not Mathematica, J, PDL, and so forth. These languages have a very narrow niche. Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: Are you trying to convince me/us that is going to be used often? Yes, I am. You made the unsupported statement that array operations are rarely used. I'm suggesting otherwise (although to say that they're rarely used in Perl 5 is a tautology, of course!). Array notation is not 'rarely used' in languages that support it--in fact, operations are applied to arrays and lists at least as often as scalars in most code I see written for Mathematica, J, PDL, and so forth. a) You can *already* use vectors as scalars in Perl; That's not what RFC 82 is proposing. b) What we are discussing is Perl, not Mathematica, J, PDL, and so forth. These languages have a very narrow niche. That's because few such languages provide strong general purpose programming features as well. They are either limited maths-oriented languages (like Mathematica) or add-ons to general purpose languages that aren't fully integrated (Python/NumPy; Perl/PDL; C++/Blitz++). Many Perl users operate on lists of data. Requiring explicit loops every time a programmer wants to operate on a list is asking the programmer to fit in with how a computer thinks. That's not right.
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sat, Sep 23, 2000 at 10:41:07AM +1100, Jeremy Howard wrote: a) You can *already* use vectors as scalars in Perl; That's not what RFC 82 is proposing. Who cares? This already works... b) What we are discussing is Perl, not Mathematica, J, PDL, and so forth. These languages have a very narrow niche. That's because few such languages provide strong general purpose programming features as well. They are either limited maths-oriented languages (like Mathematica) or add-ons to general purpose languages that aren't fully integrated (Python/NumPy; Perl/PDL; C++/Blitz++). Many Perl users operate on lists of data. Requiring explicit loops every time a programmer wants to operate on a list is asking the programmer to fit in with how a computer thinks. That's not right. Well, this is your opinion against mine... ;-) Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: On Fri, Sep 22, 2000 at 05:24:55PM -0400, Karl Glazebrook wrote: It's now boiling down to a matter of opinion and we'll have to agree to differ. Of course I use array arithmetic all the time as a heavy PDL user. ...Do you say you are confused by using vectors (=scalars) instead of arrays? I'm not having a problem with that personally but *many* users of PDL have complained about being confused by this. They assume ndim == array == perl array. Christian
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: As shipped: no. But if this is made a primitive (which I would not like), then the only change which is needed is to make the tie::multi::range() token to be followed by 3 numbers. [Aside: Why not make ternary-range operator into 10 :: 20 :: 2 ?] That would work. My point is that having a stride is a fundamental feature in other array languages (IDL, Matlab, PDL) and would be useful in the perl core. Finally as an overload expert what do you think about the proposals to make arrays overloadable objects so one can say things like: @x = 3 * @y; This is not an overloading issue, this is the context resolution issue. IMO, the cryptocontext turns out to be evil with an exception of extremely short scripts - and this is with what we have now. A proposal like this would make a nuisance into a nightmare. Yes, it looks nice, but it contradicts many rules, so in the long run it is going to be a significant step back. ...Unless the whole idea of cryptocontext is turned to become something else... I am not sure what you mean by "cryptocontext"? I guess the motivation here is to make non-core arrays (such as PDL objects) look as much as possible like Perl arrays to simplify the appearance to users. It's really hard to explain why people should use @x[1..10] for perl arrays and $x->slice("1:10") for PDL arrays! I can see that allowing expressions on @x would require considerable changes to perl core. Is there a nice way to resolve this problem? Karl
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
At 03:26 PM 9/21/00 -0400, Karl Glazebrook wrote: Finally as an overload expert what do you think about the proposals to make arrays overloadable objects so one can say things like: @x = 3 * @y; I can see that allowing expressions on @x would require considerable changes to perl core. Is there a nice way to resolve this problem? What do you think of: $x[|i] = 3 * $y[|i]; or @x = 3 * $y[|i]; It's not as clean as @x = 3 * @y, but it is cleaner context-wise. (Working on RFC207(v2) even as I write) Karl
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
At 03:35 PM 9/21/00 -0400, Buddha Buck wrote: At 03:26 PM 9/21/00 -0400, Karl Glazebrook wrote: Finally as an overload expert what do you think about the proposals to make arrays overloadable objects so one can say things like: @x = 3 * @y; What do you think of: $x[|i] = 3 * $y[|i]; or @x = 3 * $y[|i]; It's not as clean as @x = 3 * @y, but it is cleaner context-wise. And one could argue that: @x = map 3*^_, @y; is cleaner yet... Karl
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Thu, Sep 21, 2000 at 03:26:39PM -0400, Karl Glazebrook wrote: [Aside: Why not make ternary-range operator into 10 :: 20 :: 2 ?] That would work. My point is that having a stride is a fundamental feature in other array languages (IDL, Matlab, PDL) and would be useful in the perl core. Did not use any steps more than 1 for a decade or so. But in 80's, when people did not believe in 10^4..10^7 speedups my algos were claiming, I needed to actually code them in Fortran ;-). I think I used larger-than-1 steps that time. But with Fortran such things are not *needed*. Compilers are smart enough to convert (equivalents to) map 3*$_, 34..67 into efficient code... A proposal like this would make a nuisance into a nightmare. Yes, it looks nice, but it contradicts many rules, so in the long run it is going to be a significant step back. ...Unless the whole idea of cryptocontext is turned to become something else... I am not sure what you mean by "cryptocontext"? See p5p archives. (Significant) switching of the meaning of operations based on the context looks good on paper and for small examples, but it breaks badly in slightly more complicated situations. The problem is that the context is not always what you think. Say, f(3*@a) would typically be a list context - and suddenly instead of 3*(1+$#a) you get C<map 3*$_, @a>. I guess the motivation here is to make non-core arrays (such as PDL objects) look as much as possible like Perl arrays to simplify the appearance to users. Why? Currently you can make them look like references to array. See Math::Pari for an implementation. Overloading '@{}' gives yet another way to do this. It's really hard to explain why people should use @x[1..10] for perl arrays and $x->slice("1:10") for PDL arrays! Use $x->[1..10] for both. Ilya
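The '@{}' overloading route mentioned above can be sketched in a few lines of Perl 5. Everything here is an assumption made for illustration: Vec is a hypothetical class, not code from PDL or Math::Pari, and it stores its data in a blessed hash so that the '@{}' handler does not recurse into itself.

```perl
use strict;
use warnings;

{
    # Hypothetical class for illustration only.
    package Vec;
    use overload
        # Let the object be dereferenced as an array: @$v, $v->[0], @$v[1..3].
        '@{}' => sub { $_[0]{data} },
        # number * vector; the "swapped" argument is ignored because
        # multiplication by a plain number commutes.
        '*'   => sub {
            my ($self, $factor) = @_;
            return Vec->new(map { $_ * $factor } @{ $self->{data} });
        };

    sub new {
        my ($class, @items) = @_;
        return bless { data => [@items] }, $class;
    }
}

my $v = Vec->new(1 .. 5);
my $w = 3 * $v;              # calls the overloaded '*'
my @slice = @$w[1 .. 3];     # ordinary slice syntax applied to the object

print "@$w\n";
print "@slice\n";
```

Note that the Perl 5 slice spelling for a reference is @$w[1..3]; the arrow form with a range subscript is an element access, not a slice, which is part of what the proposals in this thread would change.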
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Hi Ilya, I have three questions about your RFC: Firstly does your proposal allow for a slice like 10..20:2 (i.e. with a stride of 2) ? If not is there an easy way to incorporate that? Secondly, what about having multidim support in the core so that the tie-tokenisers get optimised away? i.e. would we be able to say something like: @x = @y[10..20; 1..3] for core arrays Finally as an overload expert what do you think about the proposals to make arrays overloadable objects so one can say things like: @x = 3 * @y; Karl
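To make the stride question concrete: Perl 5 has no 10..20:2 syntax, so a strided slice has to build its index list explicitly. A minimal sketch, with $from, $to and $step as illustrative names:

```perl
use strict;
use warnings;

my @y = map { "y$_" } 0 .. 30;

# Emulate the proposed 10..20:2 by generating the indices with map.
my ($from, $to, $step) = (10, 20, 2);
my @index = map { $from + $step * $_ } 0 .. int(($to - $from) / $step);

my @x = @y[@index];

print "@index\n";
print "@x\n";
```

The index list is fully materialised before the slice is taken, which is the temporary-list overhead that later messages in the thread argue about.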
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Finally as an overload expert what do you think about the proposals to make arrays overloadable objects so one can say things like: @x = 3 * @y; Is this where RFC 231's suggestion for OO slicing comes in (see quote)? For example, $matrix1->[2..5; 2..4] * $matrix2->[1,3,5; 11..64]; would denote: create two new objects for the specified submatrices, apply (overloaded) multiplication to these objects. Such a request is illegal for untie()d arrays; for tie()d arrays it is converted to a call to FETCH_SLICE in a scalar context. (Alternative: introduce two new tie()d methods: FETCH_SUBOBJECT, STORE_SUBOBJECT.) or is this supposed to be orthogonal? Another question re RFC 231. Is it really required to make the syntactical distinction between ranges (..) and bi_ranges (...)? Some more explanation would be appreciated. Christian
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sat, Sep 16, 2000 at 11:08:18AM +1100, Jeremy Howard wrote: proposes a convenient syntax to slice multi-dimensional arrays; It is hard to evaluate this proposal without more context. In particular: - How does it relate to RFC 204? Is it an alternative, or an addition? 204 cannot be implemented since it prohibits usage of overloaded objects as array indices. - How does it relate to RFC 81? The semantics of '..' seems to conflict. What I say concerns the usage of '..' inside an index only. It cannot conflict with anything else. - Why is it better to make ';' "special inside a hash/array index only" Because ',' is already special there. There is little chance that ';' operator is created as a general-purpose operator. - Why is a special token for a separator necessary "to avoid the (giant) overhead of creation of anonymous arrays"? Don't RFC 203 arrays and RFC 81/205 lazy generation avoid this? a) "Lazy generation" is not defined, as stated it is a good wish only. What is @a = (0, 2..99, 200..9998, 100); f(@a); ? My proposal has completely defined behaviour (AFAICS). [Yes, I was proposing lazy evaluation for a long time. But I know that it can be further than it appears.] b) The call for $a[2,3;5,6] is *) Put already-available SV pointers for $a, 2,3,4,5 and the cached SV* for tie::multi::separator() on stack; *) Put the (cached) CV* for the method on stack; *) invoke the call frame; This is not *very* quick, but at least it may be "not that slow". While all the alternatives require creation of anonymous lists, which (I expect) will slow things down 7..10 times for the call above. For $a[1..100;1..100] it may easily be 100..1000 times slower. Your way was my way when I was designing Math::Pari. When I *implemented* Math::Pari, it took some time to determine why it was so much slower than what I expected. My proposal is based on this experience. Creation of [1,2,3] is *very* slow.
- Overall, what is the problem in the existing array RFCs that this is designed to solve? *) They are not compatible with overloading (unless overloaded things are dramatically changed); *) They create a lot of temporary anonymous arrays the only purpose of which is to group arguments; *) They go very high on the bizarreness scale. - Can we incorporate a solution into the existing RFCs without creating a new conflicting one? If there are implementation challenges around the existing RFCs, I would rather make changes required to overcome them within those RFCs. I see no way how the existing RFC can be accepted. (No, I could not read the "include all the PDL" proposal to the end, so I cannot comment on this.) That way we get the benefit of the thought we've all put into the syntax of these RFCs, plus the benefit of Ilya's deep understanding of Perl internals. Thank you for suggesting that I do not need to think to create a RFC. Ilya
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: On Sat, Sep 16, 2000 at 11:08:18AM +1100, Jeremy Howard wrote: - How does it relate to RFC 204? Is it an alternative, or an addition? 204 cannot be implemented since it prohibits usage of overloaded objects as array indices. Why is it important for overloaded objects to be used as array indices? Why does RFC 204 rule that out? RFC 204 simply specifies that a list reference as an index provides multidimensional access: $a[ [1,1] ] == $a[1][1]; - How does it relate to RFC 81? The semantics of '..' seems to conflict. What I say concerns the usage of '..' inside an index only. It cannot conflict with anything else. RFC 81 expands on the existing operator '..' in a list context to allow more generic list generation. It is particularly useful to generate lists to act as array slices: @a[ 1..5 : 3] == @a[1,3,5]; This would seem to conflict with the meaning of '..' outlined in RFC 231. - Why is it better to make ';' "special inside a hash/array index only" Because ',' is already special there. There is little chance that ';' operator is created as a general-purpose operator. When we first discussed ';' on the list, we looked at making it special in an index only. But the more generic approach of making it a cartesian product operator seems cleaner--it avoids 'special' meanings in favour of providing a generic operator. Why is there little chance of creating ';' as a general-purpose operator? - Why is a special token for a separator necessary "to avoid the (giant) overhead of creation of anonymous arrays"? Don't RFC 203 arrays and RFC 81/205 lazy generation avoid this? a) "Lazy generation" is not defined, as stated it is a good wish only. What is @a = (0, 2..99, 200..9998, 100); f(@a); Lazy generation is a well understood concept in other languages. I'm most familiar with C++, so I'll draw from that. 
In libraries that provide lazy evaluation, f(@lazy_list) is a 'promise' to apply f() to the elements of @lazy_list when an element of f(@lazy_list) needs to be calculated. Sometimes this is all done at runtime (MTL, newmat), sometimes parts are done at compile time ('expression templates' in POOMA and Blitz++). These C++ examples and many others are indexed at: http://www.oonumerics.org/oon/ b) The call for $a[2,3;5,6] is *) Put already-available SV pointers for $a, 2,3,4,5 and the cached SV* for tie::multi::separator() on stack; *) Put the (cached) CV* for the method on stack; *) invoke the call frame; This is not *very* quick, but at least it may be "not that slow". While all the alternatives require creation of anonymous lists, which (I expect) will slow things down 7..10 times for the call above. For $a[1..100;1..100] it may easily be 100..1000 times slower. Lists of lists of known simple type are proposed by RFC 203 to be stored as true arrays (i.e. contiguously in memory). Their overhead is not the same as Perl 5 lists of lists. The index in $a[1..100;1..100] should be generated lazily. An individual element can be calculated directly from the index parameters as required. Your way was my way when I was designing Math::Pari. When I *implemented* Math::Pari, it took some time to determine why it was so much slower than what I expected. My proposal is based on this experience. Creation of [1,2,3] is *very* slow. I hope we can change how [1,2,3] is created by: - Creating a true numeric array if it is an array of known simple types - Generating the elements lazily where it is more efficient to do so If we can not do these, then I agree that RFCs 204 and 205 are not plausible in their current form. - Overall, what is the problem in the existing array RFCs that this is designed to solve? *) They are not compatible with overloading (unless overloaded things are dramatically changed); There are a number of RFCs proposing substantially changing overloading. 
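The 'promise' idea described above can be approximated in Perl 5 with a closure. This is a rough sketch under stated assumptions, not the mechanism of RFC 81 or of any C++ library named in the thread; lazy_range is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields $from, $from + $step, ...
# up to $to, computing each value only when it is asked for.
sub lazy_range {
    my ($from, $to, $step) = @_;
    $step ||= 1;
    my $next = $from;
    return sub {
        return undef if $next > $to;
        my $value = $next;
        $next += $step;
        return $value;
    };
}

# No five-million-element list is built here, only the closure.
my $range = lazy_range(1, 5_000_000);

my @first;
while (defined(my $v = $range->())) {
    push @first, $v;
    last if @first == 3;
}

print "@first\n";
```

Only three values are ever computed. What the thread leaves open is how such a promise should behave once it is assigned to an array or passed through f(@a), which is the undefined part Ilya points at.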
What specific changes would we need to ensure were incorporated in P6 to avoid this incompatibility? *) They create a lot of temporary anonymous arrays the only purpose of which is to group arguments; Yes, if we can't get any lazy generation to work. *) They go very high on the bizarreness scale. Bizarre??? Which RFC? RFC 82: The concept of all array operations being applied element-wise to arrays is very widely used in languages oriented to numeric programming--it is certainly not 'bizarre'. There has been debate around '||' and '', although I find the alternative meaning of these in a list context proposed by RFC 45 more bizarre. ...But I think that this point is already well discussed... RFCs 90 and 91: These builtins are in almost all languages with rich array functionality. 'merge' and 'demerge' are more frequently called 'zip' and 'unzip', but those terms were almost universally rejected on -language. RFC 203: If we know that a list of lists is of a simple type, why not store it efficiently? And why not
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sat, Sep 16, 2000 at 07:15:34PM +1100, Jeremy Howard wrote: Why is it important for overloaded objects to be used as array indices? Overloaded objects should behave the same way as non-objects. Why does RFC 204 rule that out? RFC 204 simply specifies that a list reference as an index provides multidimensional access: $a[ [1,1] ] == $a[1][1]; I repeat: what does $a[ $ind ] do if $ind is a (blessed) reference to array (1,1), but behaves as if it were 11 (due to overloading)? RFC 81 expands on the existing operator '..' in a list context to allow more generic list generation. It is particularly useful to generate lists to act as array slices: @a[ 1..5 : 3] == @a[1,3,5]; This would seem to conflict with the meaning of '..' outlined in RFC 231. Sorry, I see no conflict. (Assuming that ternary '..' is allowed, the token tie::multi::range() would be followed by 3 numbers, not 2.) These calls will result in tied(@a)->FETCH_RANGE(tie::multi::range(), 1, 5, 3) tied(@a)->FETCH_RANGE(1, 3, 3) If FETCH_RANGE uses tie::multi::inline() to preprocess the keys, this *by definition* will result in the same array of keys. If not, it is the responsibility of FETCH_RANGE to ensure the equivalence. And $a[ 1..5e6 ] would not need to create 5e6 Perl objects the only purpose of which is to inform the range extractor that it needs to create an object representing the slice. Because ',' is already special there. There is little chance that ';' operator is created as a general-purpose operator. When we first discussed ';' on the list, we looked at making it special in an index only. But the more generic approach of making it a cartesian product operator seems cleaner--it avoids 'special' meanings in favour of providing a generic operator. No, it is not a generic operator. Its behavior depends on whether it is used *inside parens*, or not. Additionally, the behaviour of cartesian product makes very little sense: if you did not want it 3 times, you should not insert it into the language. 
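Ilya's question about an index that is both an array reference and the number 11 can be reproduced in Perl 5. The sketch below is an illustration under assumptions: Index11 is a hypothetical class, and it is a blessed hash rather than a blessed array (so the '@{}' handler does not recurse), but it presents the same two faces through overloading.

```perl
use strict;
use warnings;

{
    # Hypothetical class for illustration only.
    package Index11;
    use overload
        '0+'  => sub { 11 },               # numeric face: behaves as 11
        '@{}' => sub { $_[0]{parts} };     # array face: dereferences to (1, 1)

    sub new { return bless { parts => [1, 1] }, shift }
}

my @a   = map { "elem$_" } 0 .. 20;
my $ind = Index11->new;

# Perl 5 numifies the subscript, so the '0+' face wins here.
my $picked = $a[$ind];

# The same object still dereferences as the list (1, 1).
my @parts = @$ind;

print "$picked\n";
print "@parts\n";
```

Under RFC 204 as quoted, the same subscript would presumably have to be read as the multidimensional index [1,1], so one of the two faces must lose; that is the incompatibility being argued.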
a) "Lazy generation" is not defined, as stated it is a good wish only. What is @a = (0, 2..99, 200..9998, 100); f(@a); Lazy generation is a well understood concept in other languages. Maybe. But it is not defined in the corresponding RFC nevertheless. At least: all I could deduce was that the following constructs are made synonymous: @a = ($a .. $b); tie @a, Array::Range, $a, $b; No other usage of .. is covered. b) The call for $a[2,3;5,6] is *) Put already-available SV pointers for $a, 2,3,4,5 and the cashed SV* for tie::multi::separator() on stack; *) Put the (cached) CV* for the method on stack; *) invoke the call frame; This is not *very* quick, but at least it may be "not that slow". While all the alternatives require creation of anonymous lists, which (I expect) will slow things down 7..10 times for the call above. For $a[1..100;1..100] it may easily be 100..1000 times slower. Lists of lists of known simple type are proposed by RFC 203 to be stored as true arrays (i.e. contiguously in memory). Their overhead is not the same as Perl 5 lists of lists. Maybe. But you still need to create 200-elements temporary array the only purpose of which is to inform the tied array that you need the upper-left 1000x1000 submatrix. *You do not want to create new values uncessesarily*. This is too slow. Quick operations should reuse already available values instead. See how scratchpads work... Even if it is creation of a "streamlined" array, creation still will takes much more time than operation dispatch - which is in turn painfully slow. The index in $a[1..100;1..100] should be generated lazily. This is *exactly* what my proposal is doing. The difference is that it defines what "lazily" means. *) They are not compatible with overloading (unless overloaded things are dramatically changed); There are a number of RFCs proposing substantially changing overloading. What specific changes would we need to ensure were incorporated in P6 to avoid this incompatibility? 
I see no way they can be made compatible. Overloading allows objects to behave *both* as numbers and as array references. Well, maybe there is a solution: 2 new overloaded accessors in addition to '""', '0+', 'bool', '@{}', '${}' etc: "extract the value as the array/hash index", defaulting to '0+' and '""' respectively.

*) They go very high on the bizarreness scale.

Bizarre??? Which RFC?

Binary ';'.

RFCs 90 and 91: These builtins are in almost all languages with rich array functionality. 'merge' and 'demerge' are more frequently called 'zip' and 'unzip', but those terms were almost universally rejected on -language.

These are convenience functions. I do not see what they have to do with the language design...

RFC 204: Isn't it fairly intuitive that:

  $a[ [1,1] ] == $a[1][1];

It may be - for people who do not
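The overloading question raised in this message can be reproduced in today's Perl 5. What follows is only a minimal sketch: the class name Pair and its numification rule (10*x + y) are assumptions made for illustration, and are not taken from any RFC. It shows one object that is a blessed reference to the array (1,1) and also behaves as the number 11, so that an index expression could plausibly mean either thing.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only: a blessed array reference
# holding (1,1) that numifies to 11 through overloading.
package Pair;
use overload
    '0+'     => sub { my $self = shift; 10 * $self->[0] + $self->[1] },
    fallback => 1;

sub new {
    my ($class, @coords) = @_;
    return bless [@coords], $class;
}

package main;

my $ind  = Pair->new(1, 1);
my @flat = map { "flat$_" } 0 .. 20;
my @grid = ([ 'a', 'b' ], [ 'c', 'd' ]);

# Perl 5 numifies the index, so the object is used as 11.
my $as_number = $flat[$ind];

# Read as a list of coordinates, the same object would select [1][1].
my $as_coords = $grid[ $ind->[0] ][ $ind->[1] ];

print "$as_number $as_coords\n";    # flat11 d
```

Under the RFC 204 reading, $a[ $ind ] would have to pick one of these two results, which is the ambiguity a separate "use as index" accessor is meant to resolve.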
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote:

On Sat, Sep 16, 2000 at 07:15:34PM +1100, Jeremy Howard wrote:

Why is it important for overloaded objects to be used as array indices?

Overloaded objects should behave the same way as non-objects.

Why does RFC 204 rule that out? RFC 204 simply specifies that a list reference as an index provides multidimensional access:

  $a[ [1,1] ] == $a[1][1];

I repeat: what does $a[ $ind ] do if $ind is a (blessed) reference to the array (1,1), but behaves as if it were 11 (due to overloading)?

How $ind is implemented (ie the actual structure that is blessed) does not matter. What matters is what interface its class provides. If it overloads operators such that dereferencing it does not provide an array, then it shouldn't be expected to work as a multidimensional array index. If it provides operators that give it the same interface as a list ref, then it should work everywhere a list ref does.

RFC 81 expands on the existing operator '..' in a list context to allow more generic list generation. It is particularly useful to generate lists to act as array slices:

  @a[ 1..5 : 3] == @a[1,3,5];

This would seem to conflict with the meaning of '..' outlined in RFC 231.

Sorry, I see no conflict. (Assuming that ternary '..' is allowed, the token tie::multi::range() would be followed by 3 numbers, not 2.) These calls will result in

  tied(@a)->FETCH_RANGE(tie::multi::range(), 1, 5, 3)
  tied(@a)->FETCH_RANGE(1, 3, 3)

If FETCH_RANGE uses tie::multi::inline() to preprocess the keys, this *by definition* will result in the same array of keys. If not, it is the responsibility of FETCH_RANGE to ensure the equivalence. And $a[ 1..5e6 ] would not need to create 5e6 Perl objects whose only purpose is to inform the range extractor that it needs to create an object representing the slice.

From RFC 81:

<quote>
When a lazy list is passed to a function it is not evaluated. The function can then access only the elements it needs, which are calculated as required. Furthermore, the arguments that generated the list are available as attributes of the list, and can therefore be used directly without actually accessing the list
</quote>

It is not necessary to create 5e6 objects. Furthermore, RFC 81 proposes syntax beyond just ($start..$stop: $step). Implementing it using tie::multi::range() followed by 3 numbers would not be enough. Anyway, we're defining a language interface here, not an implementation, so we don't really need to nail this down immediately.

When we first discussed ';' on the list, we looked at making it special in an index only. But the more generic approach of making it a cartesian product operator seems cleaner--it avoids 'special' meanings in favour of providing a generic operator.

No, it is not a generic operator. Its behavior depends on whether it is used *inside parens*, or not. Additionally, the behaviour of cartesian product makes very little sense: if you did not want it 3 times, you should not insert it into the language.

I'm not wedded to allowing ';' outside of a list index. However, it does lead to both consistency and convenience with how list slicing is done in Perl 5:

  # Perl 5 behaviour
  @indices = (1,3);
  @list = (3,4,5,6);
  @list[@indices] = (1,2); # (3,1,5,2)

  # Multidim extension
  @2d_indices = ([0,0],[1,1]);
  @2d_arr = ([3,4,5],[6,7,8]);
  @2d_arr[@2d_indices] = (1,2); # ([1,4,5],[6,2,8])

  # Slice syntax extension
  @2d_slice = (0..1 ; 0..1); # ([0,0],[0,1],[1,0],[1,1])
  @2d_arr = ([3,4,5],[6,7,8]);
  @2d_arr[@2d_slice] = ([0,1],[0,1]); # ([0,1,5],[0,1,8])

The implementation of ';' when used as a list index and then thrown away clearly should not create an actual list of lists, for efficiency reasons. I don't see why this case can't be dealt with appropriately.

Maybe. But it is not defined in the corresponding RFC nevertheless. At least: all I could deduce was that the following constructs are made synonymous:

  @a = ($a .. $b);
  tie @a, Array::Range, $a, $b;

No other usage of .. is covered.
RFC 81 defines 4 uses of C<..>. It does not propose a specific implementation in terms of C<tie>, or anything else--it simply defines a language interface.

*You do not want to create new values unnecessarily*. This is too slow. Quick operations should reuse already available values instead. See how scratchpads work...

Agreed. RFC 81 proposes that generated lists be memoized, and that new values are only created when required.

Even if it is creation of a "streamlined" array, creation will still take much more time than operation dispatch - which is in turn painfully slow.

We should optimise special cases when we know which are causing problems. Perl 5 may or may not provide useful experience here--the operation dispatch approach in Perl 6 may be quite different, given how the -internals discussions are progressing.

RFC 204: Isn't it fairly intuitive that: $a[
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
On Sun, Sep 17, 2000 at 11:07:09AM +1100, Jeremy Howard wrote:

I repeat: what does $a[ $ind ] do if $ind is a (blessed) reference to the array (1,1), but behaves as if it were 11 (due to overloading)?

How $ind is implemented (ie the actual structure that is blessed) does not matter. What matters is what interface its class provides.

As I said: it provides *both* a numeric value and a list reference interface (as complex values may do).

<quote>
When a lazy list is passed to a function it is not evaluated. The function can then access only the elements it needs, which are calculated as required. Furthermore, the arguments that generated the list are available as attributes of the list, and can therefore be used directly without actually accessing the list
</quote>

  f(1, 10..1e6, 1e8..2e8, 1e9)

How can the body of f() query the "attributes" to see that it got something lazy?

Furthermore, RFC 81 proposes syntax beyond just ($start..$stop: $step). Implementing it using tie::multi::range() followed by 3 numbers would not be enough.

Another example of a "bizarre" (and not completely defined) interface. I would think it stands very little chance of becoming reality.

Apparently, the authors of RFC81 assume that iterators become better integrated if they are introduced by a funny syntax. Since what they want to accomplish is exactly this...

  my $iter = new iterator start => $a, next => sub {};
  foreach my $i (each $iter) {...}

Here an iterator is something which overloads '<>' (in Perl5 speak). A way to integrate iterators would be very convenient indeed. As you see, in principle it does not need any funny syntax...

Anyway, we're defining a language interface here, not an implementation, so we don't really need to nail this down immediately.

No, an interface without a feasible implementation in mind is not viable.

  # Slice syntax extension
  @2d_slice = (0..1 ; 0..1); # ([0,0],[0,1],[1,0],[1,1])

This is very expensive.
Do you know any example when such a list is needed as a final result, not as a temporary?

  @a = ($a .. $b);
  tie @a, Array::Range, $a, $b;

No other usage of .. is covered.

RFC 81 defines 4 uses of C<..>.

Sorry, the only context which I could find is the one above.

The index in $a[1..100;1..100] should be generated lazily.

This is *exactly* what my proposal is doing. The difference is that it defines what "lazily" means.

Except that your proposal changes the language interface. In particular, it doesn't allow the creation of contiguous slices, AFAICS. @a[1..100;1..100] should refer to the whole box bounded by (1,1) and (100,100).

I have no idea what you are talking about. What else can it *mean* but the whole box? Having different calling conventions does not mean that the *results* are different.

It's very important. It shows that a particular syntax is intuitive enough that it is understood by people with a wide range of backgrounds. Intuitive syntax is an important language design goal.

The syntax and the access semantics of RFC81 and of RFC231 are the same. However, RFC231 explains how these semantics can be achieved via simple tie() interfaces.

RFC 231 does not (yet) effectively cover the same range of problems that the array RFCs do. We need multidimensional slicing (not just multiple indexing)

Is in RFC231.

flexible list generation,

This is orthogonal. And I do not see why this needs to be in the core language at all. I would guess that an appropriate module with interfaces to generate efficient arrays is a better place for this (see the example above).

multiple levels of indirection,

Do not know what you mean here.

and fast and compact reshaping.

If we want the reshaping to be supported by builtin arrays *and* transparently by overloaded arrays, then yes, it needs to be in the core. But I see no need for this. Again, this looks as if it belongs in a module, not in the core.

Ilya
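For readers following the "tie @a, Array::Range, $a, $b" example, here is a rough Perl 5 sketch of how such a module-level lazy range might look. The thread only names Array::Range and gives no implementation, so everything below is an assumption: the optional third $step argument, the hash layout, and the restriction to a positive step are all invented for illustration. The point is only that the tied array answers size and element queries from its bounds without ever building the list.

```perl
use strict;
use warnings;

# Hypothetical module: the name comes from the thread, the body is a guess.
package Array::Range;
use parent 'Tie::Array';

# tie @a, 'Array::Range', $start, $stop [, $step]   (step assumed positive)
sub TIEARRAY {
    my ($class, $start, $stop, $step) = @_;
    $step = 1 unless defined $step;
    return bless { start => $start, stop => $stop, step => $step }, $class;
}

# The element count is derived from the bounds alone.
sub FETCHSIZE {
    my $self = shift;
    return 0 if $self->{stop} < $self->{start};
    return int(($self->{stop} - $self->{start}) / $self->{step}) + 1;
}

# Element $i is computed on demand; nothing is stored.
sub FETCH {
    my ($self, $i) = @_;
    return $self->{start} + $i * $self->{step};
}

package main;

# A 5e6-element range costs three numbers, not 5e6 scalars.
tie my @big, 'Array::Range', 1, 5_000_000;
my $size = scalar @big;
my $last = $big[4_999_999];

# With a step, this mirrors the @a[ 1..5 : 3 ]-style generation discussed above.
tie my @odd, 'Array::Range', 1, 5, 2;
my $joined = join ',', @odd;

print "$size $last $joined\n";    # 5000000 5000000 1,3,5
```

Whether a tied interface of this kind is fast enough is exactly what the thread disputes; the sketch only shows that the laziness itself needs no new syntax.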