Re: Berkeley Help
Ask the Usenet group comp.lang.perl.misc. This list is only for discussion of the design of the upcoming Perl 6.

----- Original Message -----
From: Phil Daws [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, October 17, 2001 1:54 AM
Subject: Berkeley Help

Hi:

I have a HASH file that contains:

    domain1.co.uk ACCEPT
    domain2.co.uk ACCEPT
    domain3.co.uk ACCEPT

I am using the following code to check that a key exists. The problem is that it never finds it! What am I doing wrong???

    $file = "/file.db";
    $db = tie(%stats, 'DB_File', $file, O_CREAT|O_RDWR, 0666, $DB_HASH) || die ("Cannot open $file");
    $key = 'domain1.co.uk';
    print "Exists\n" if exists $stats{$key};
    untie %stats;
Re: Hyperoperators and RFC 207
[EMAIL PROTECTED] wrote:
> @arr3 = @arr1[^i] + @arr2[^i] # also @arr[^i] = @arr1[^i] + @arr2[^i]
>
> Hyper-operators do this just fine.
>
> Oh yes they do. The point is that the ^i-loop way is better (more powerful and simpler at the same time). Maybe the examples were not good enough.

Your examples were good enough. But they are not a replacement for the RFC, so please, anyone who wants to argue that array-threading notation is not useful, read the RFC first. Here's a direct link to it:

  http://dev.perl.org/rfc/207.html

Hyper-operators (i.e. element-wise array ops) and explicit array threading are not incompatible. In fact, I wrote RFC 82 (element-wise array ops) and was heavily involved in RFC 207 (Buddha Buck and I wrote most of RFCs 202-207 together, but we had to put one of our names on the individual RFCs). RFC 207 explicitly discusses the relationship with RFC 82.

As well as providing compact syntax for commonly used tensor operations, array-threading notation can make life easier for the optimiser when it comes to generating tight loops (which is absolutely required for useful data crunching). Examples of the syntax are abundant in the RFC. Examples of the optimisation improvements can be seen in NumPy's ufunc/fromfunc:

  http://starship.python.net/~da/numtut/array.html#SEC8
  http://starship.python.net/~da/numtut/array.html#SEC13

...and in Perl Data Language's thread-aware functions:

  http://pdl.sourceforge.net/PDLdocs/Indexing.html#Threading

...amongst many other examples.
HOF positional args (was Re: reduce via ^)
Damian Conway wrote:
>     @a ^+= reduce {$^a+$^b} @b;

What's this? Are positional args to HOFs now alphabetic rather than numeric? cf.

  http://dev.perl.org/rfc/23.html#Positional_placeholders
Re: Hyper-operators and Underscore
Erik Lechak wrote:
> 2) RFC 082: Arrays: Apply operators element-wise in a list context (hyper operators)
> and
> 3) Special variable representing index of array in foreach structure $# maybe (not in Apocalypse 3, I think)

Hi Erik. Thanks for your comments. Unfortunately it's a little early for detailed consideration of vector and matrix operations, which will be covered in more detail in a later Apocalypse. However, here are a couple of pointers from the RFCs that may appear in some form in Larry's later musings.

First, check out:

  http://dev.perl.org/rfc/207.html

This provides looping indices for multidimensional arrays. Please read RFCs 202-206 first though, since RFC 207 draws heavily on them.

Next, note that hyper-operators are a bit smarter than the behaviour you indicate. Most notably, they 'broadcast' lower dimensionality structures to the highest dimensionality structure. We're only up to 1-dim structures in the Apocalypses so far, so for now that just means that

    @a = @b ^+ 1;

works as it should. This is a more interesting issue when, for instance, broadcasting a vector across a matrix, but now I'm getting ahead of myself...

Finally, note that there is an RFC which can simplify iteration through lists n-at-a-time:

  http://dev.perl.org/rfc/90.html

So you can then pull the first element from each list with:

    my ($a,$b,$c) = merge(@a,@b,@c)
Re: Math functions? (Particularly transcendental ones)
Uri Guttman wrote:
> "BS" == Benjamin Stuhl [EMAIL PROTECTED] writes:
>
> Can anyone think of things I've forgotten? It's been a while since I've done numeric work.
>
>   BS> ln, asinh, acosh, atanh2?
>
> dan mentioned log (base anything) but i don't recall ln. and definitely the arc hyperbolics are in after i pointed them out. dunno about atanh2.

We only really need ln(). Then [log(y) base x] is simply [ln(y)/ln(x)]. There's no need to have separate functions for different bases.
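A minimal Perl 5 sketch of the change-of-base identity above. It relies only on the built-in log(), which is the natural logarithm; the helper name log_base() is made up for illustration and is not a proposed built-in.

```perl
use strict;
use warnings;

# Perl's built-in log() is ln(). log_base() is a hypothetical helper
# that derives any other base from it: log_x(y) = ln(y) / ln(x).
sub log_base {
    my ($base, $value) = @_;
    return log($value) / log($base);
}

printf "%.4f\n", log_base(2, 1024);     # log base 2 of 1024
printf "%.4f\n", log_base(10, 1000);    # log base 10 of 1000
```

The results are floating point, so compare them with a tolerance rather than with ==.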
Re: finalization
Sam Tregar wrote:
> On Wed, 29 Aug 2001, Jeremy Howard wrote:
> The answer used in .NET is to have a dispose() method (which is not a special name--just an informal standard) that the class user calls manually to clean up resources. It's not an ideal solution but there doesn't seem to be many other practical options.
>
> Well, there's the Perl 5 reference counting solution. In normal cases DESTROY is called as soon as it can be. Of course we're all anxious to get into the leaky GC boat with Java and C# because we've heard it's faster. I wonder how fast it is when it's halfway under water and out of file descriptors.

I don't think speed is where the interest is coming from. GC should fix common memory problems, such as the nasty circular references issue that has caught all of us at some time.
Re: [nice2haveit]: transpose function
David L. Nicol [EMAIL PROTECTED] wrote:
> Yes, exactly. I would like to have a transpose operator, which will work on a list of hash refs, so this:
>
>     $solids = [1..7];
>     $stripes = [9..15];
>     foreach (transpose($solids,$stripes)) {
>         print "the $_->[0] ball is the same color as the $_->[1]\n";
>     }

RFC 272 proposes a transpose function:

  http://dev.perl.org/rfc/272.html

Also see the proposal for merge():

  http://dev.perl.org/rfc/90.html
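For comparison, here is a rough Perl 5 sketch of the pairing the quoted loop expects. The name transpose_lists() is hypothetical; it is not the transpose() of RFC 272, which works on multidimensional arrays. It assumes plain array refs and stops at the shortest one.

```perl
use strict;
use warnings;

# transpose_lists() is a made-up helper: it turns a list of array refs
# (rows) into a list of array refs (columns), stopping at the shortest row.
sub transpose_lists {
    my @rows = @_;
    return () unless @rows;
    my $len = @{ $rows[0] };
    for my $row (@rows) {
        $len = @$row if @$row < $len;
    }
    return map { my $i = $_; [ map { $_->[$i] } @rows ] } 0 .. $len - 1;
}

my $solids  = [1 .. 7];
my $stripes = [9 .. 15];
for my $pair (transpose_lists($solids, $stripes)) {
    print "the $pair->[0] ball is the same color as the $pair->[1]\n";
}
```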
Re: array/hash manipulation [was :what's with 'with'?]
raptor [EMAIL PROTECTED] wrote:
> but now I'm looking at these too...
>
>   http://dev.perl.org/rfc/90.pod
>   http://dev.perl.org/rfc/91.pod
>   http://dev.perl.org/rfc/148.pod
>
> so may be what must be the order of passing the arguments and other stuff should be done via these proposed functions.
>
> PS. I was thinking of that before, what if we have something let's call it 'transform' for transformation of any structure to other structure.. but as i thought it should combine in some way the features of switch, if-else, for/foreach, do, while, array/hash-slices, assignment etc. ps I'm talking about DWIM operator. anyway... is it possible to really add such dwim function/operator that can be modified on the fly so that it suit all programmers tastes and don't make real mess...)

The generalised version of these is here:

  http://dev.perl.org/rfc/81.html

In particular, see this section:

  http://dev.perl.org/rfc/81.html#JUSTIFICATION

This RFC suggests a syntax for list comprehension, which when used as a slice index provides flexible data structure transformation.

Another useful transformation is provided by the cartesian product operator described here:

  http://dev.perl.org/rfc/205.html

HTH,
Jeremy
Re: array/hash manipulation [was :what's with 'with'?]
John Porter wrote:
> Sterin, Ilya wrote:
> Don't really know which would be more helpful, since I first need to find a scenario where I would use this facility, then what result would I expect once the shortest list runs out.
>
> Let us ask the PDL folks. In fact, I'm quite sure this has been done already.

Well, I'm not a PDL folk, but I'm a p6-data folk so perhaps I qualify. The interest in non-matching indices comes in 'broadcasting', which, assuming the element-wise operators mentioned by Larry, works like this:

    @b = (1,2,3);
    @c = (2,4,6);
    @d := @b :* @c;   # Returns (2,8,18)
    @e := 3 :* @c;    # Returns (6,12,18)

Notice that the scalar '3' is 'broadcast' across the length of @c just as if it was the list (3,3,3). Or if you prefer text-crunching examples to number crunching, it works like this:

    @people = ('adam', 'eve ', 'bob ');
    @scores = (7,9,5);             # Score for each person
    @histogram := '#' :x @scores;  # Returns ('#######','#########','#####')
    print join("\n", @people . ' ' . @histogram);

Notice that the scalar '#' has been broadcast across the length of @scores. For more information, see:

  http://dev.perl.org/rfc/82.html#Broadcasting

which explains the more interesting case of multidimensional broadcasting. Note that this RFC is a little dated now, in that Larry has proposed the adverb ':' to mean apply element-wise, so the examples in the RFC really need a ':' added before all the operators. Other than that nothing should need to change.

For Python's implementation of this concept, see:

  http://starship.python.net/~da/numtut/array.html#SEC19

See also implementations in J and APL (which is the best role model), PDL, functional languages like Haskell, and mathematical languages like Mathematica.
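The broadcasting behaviour described above can be approximated in today's Perl 5 with a small helper. This is only a sketch under stated assumptions: elementwise() is a made-up name, operands are either plain scalars or array refs, and two array operands are assumed to have the same length.

```perl
use strict;
use warnings;

# elementwise() is a hypothetical helper, not part of any RFC. It applies
# a two-argument code ref position by position; a plain scalar operand is
# reused ("broadcast") for every position of the array operand.
sub elementwise {
    my ($op, $x, $y) = @_;
    # The length comes from whichever operand is an array ref.
    my $len = ref($x) ? scalar(@$x) : ref($y) ? scalar(@$y) : 1;
    return map {
        my $l = ref($x) ? $x->[$_] : $x;
        my $r = ref($y) ? $y->[$_] : $y;
        $op->($l, $r);
    } 0 .. $len - 1;
}

my @b = (1, 2, 3);
my @c = (2, 4, 6);
my @d = elementwise(sub { $_[0] * $_[1] }, \@b, \@c);    # (2, 8, 18)
my @e = elementwise(sub { $_[0] * $_[1] }, 3,   \@c);    # (6, 12, 18)

# Text-crunching variant: '#' is broadcast across the scores.
my @histogram = elementwise(sub { $_[0] x $_[1] }, '#', [7, 9, 5]);
print join("\n", @histogram), "\n";
```

The explicit code ref is the price of doing this without language support; the proposed operators would move the loop into the interpreter.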
Re: array/hash manipulation [was :what's with 'with'?]
Sterin, Ilya [EMAIL PROTECTED] wrote:
> Hmmm. Didn't think about that. That would be a nice way, that way you can manipulate its behaviour depending on how many aliases you provide.
>
>     for my $el1, $el2 ( (@foo, @bar) ) { print "$el\n" }
>
> $el1 and $el2 would of course be aliases, right?

I don't think that this special purpose notation is necessary. With the improved 'want' proposed by Damian, the following should be easy to achieve:

    @a = (1,2,3,4);
    for ($b,$c) (@a) { print "$b $c" }
    # prints:
    # 1 2
    # 3 4

    %d = (a=>1, b=>2);
    for ($b,$c) (%d) { print "$b $c" }
    # prints:
    # a 1
    # b 2

Which with the merge() RFC makes the desired behaviour for multiple lists easy:

    @a = (1,2);
    @b = (3,4);
    for ($b,$c) merge(@a,@b) { print "$b $c" }
    # prints:
    # 1 3
    # 2 4

So, no really new syntax, no special purpose behaviour, just the obvious extension of for-iterators to list context, and the introduction of one new function (which happens to have many other applications).
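Until something like this exists in the language, the same effect can be sketched in Perl 5. merge_lists() below is a hypothetical stand-in for the merge() of RFC 90 (it takes array refs, since Perl 5 would otherwise flatten the arguments), and the n-at-a-time loop is emulated with splice. Lists of equal length are assumed.

```perl
use strict;
use warnings;

# merge_lists() is a made-up stand-in for the proposed merge(): it
# interleaves the given arrays, taking one element from each in turn.
sub merge_lists {
    my @lists = @_;
    my $max = 0;
    for my $list (@lists) {
        $max = @$list if @$list > $max;
    }
    return map { my $i = $_; map { $_->[$i] } @lists } 0 .. $max - 1;
}

my @a = (1, 2);
my @b = (3, 4);
my @merged = merge_lists(\@a, \@b);    # (1, 3, 2, 4)

# Walk the merged list two elements at a time.
while (my ($x, $y) = splice(@merged, 0, 2)) {
    print "$x $y\n";
}
```

Note that the splice loop consumes @merged; copy the list first if it is needed afterwards.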
Re: aliasing - was:[nice2haveit]
raptor [EMAIL PROTECTED] wrote:
> ... the idea of aliasing is to preserve the fast access and on the other side to shorten the accessor (i.e. the way to access the structure) and make code clearer. (mostly u can choose a name that has better meaning in your context)

This reminds me... another way to shorten the accessor discussed in the RFC process was something like Delphi and VB's 'with' syntax:

<VB>
    with Application.ActiveSheet
        .cells(1,1) = "Title"
        .language = "English"
    end with
    Application.ActiveSheet.cells(2,1) = "Slow way"
</VB>

I can't remember if this actually found its way into an RFC--anyone have a reference? I could envisage this prototyped in P5 with a source filter to deal with a syntax like this:

<Pseudo-Perl>
    with $XL->{Application}->{ActiveSheet} {
        ->cells(1,1) = "Title"
        ->language() = "English"
    }
</Pseudo-Perl>

Does such a thing exist already?
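Without a source filter, a one-element for loop already gives a similar shorthand in Perl 5, because it aliases $_ to the deep element for the duration of the block. A sketch, with a plain nested hash standing in for the spreadsheet object (the structure and keys are invented for illustration):

```perl
use strict;
use warnings;

# A made-up nested structure standing in for the spreadsheet object.
my $XL = { Application => { ActiveSheet => { cells => {}, language => undef } } };

# The one-element loop aliases $_ to the sheet, so the long accessor
# is written once.
for ($XL->{Application}{ActiveSheet}) {
    $_->{cells}{'1,1'} = 'Title';
    $_->{language}     = 'English';
}

print $XL->{Application}{ActiveSheet}{language}, "\n";
```

It is not as tidy as a real 'with', since every access still needs the explicit $_.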
Re: You can't make a hot fudge sundae with mashed potatoes instead of ice cream, either.
Michael Schwern wrote:
> mjd tricked me into reading his Strong Typing Doesn't Have To Suck talk, and now I'm looking at the typing proposals for Perl 6 and thinking... boy, it's going to be almost as bad as C. That sucks. Is there hope? I dunno, but read the talk.
>
>   http://perl.plover.com/yak/typing/

Schwern tricked me into reading mjd's Strong Typing Doesn't Have To Suck, which turned out to be a really well written exposition of how modern functional languages (mjd used ML as his example) use type checking to flag program errors without losing algorithm flexibility.

One of mjd's points about mashed potatoes is that Perl isn't ML, and ML's typing approach doesn't fit on top of Perl very well (i.e. at all). His other point about mashed potato is that it is a poor vector for hot fudge, IIRC...

Stroustrup noticed the same thing (about typing, not mashed potatoes) when looking at this issue for C++. His solution was the introduction of 'templates':

  http://babbage.cs.qc.edu/STL_Docs/templates.htm

which are now used widely by C++ programmers when they access libraries such as the STL (which is part of the Standard Library) and the very cool POOMA:

  http://www.acl.lanl.gov/pooma/

Because templates provide much-needed flexibility in algorithm and class development, C++ programmers don't have to use many of the workarounds that mjd identified. Therefore, far more compiler warnings represent real problems, which means that programmers are less likely to ignore them.

Perl 5 didn't need templates, because there wasn't compile-time typing. But with Perl 6 I want to send my compact array of integers to the same fast sum() function as my compact array of floats, and not have to wait while perl treats them both as old generic scalars. That means that my sum() function needs a typed parameter list.

There seem to be at least two potential solutions:

 - Provide a type placeholder in the parameter list (a la C++ function templates)
 - Provide a type hierarchy for all types (a la Haskell)
 - (And that 3rd option that I haven't thought of yet...)

I don't remember seeing either of these suggestions in the RFCs, but I might have forgotten since I occasionally fail all 361 of them in my head. A hierarchy of types is briefly referred to in RFC 4 but not really developed to deal with this issue:

  http://dev.perl.org/rfc/4.html
Re: Generalizing value property setting to become postits
> What I was suggesting was to consider broadening what the $foo : bar style postfix sub syntax allows/assists bar to do, so that bars can be used to set properties OR do other stuff.

What's the practical utility of this? This discussion has been pretty abstract so far... It's easy to see how properties can be used, since we've already used attributes in p5 for all kinds of stuff. Can you give an example or two of problems (with code) that the generalised postfix sub syntax would make easier to solve?

> Otherwise, I see a possibly interesting twist in which bar can do things beyond property setting, in particular, change $foo's value. Once one takes that step, : can become a generalized apply to value(s) character, and the next natural step is:
>
>     @foo := bar;    # iterate over @foo, applying bar to values.

Actually, Larry has already indicated in an earlier message to the list that ':' will work a lot like that! The examples he has shown to date basically allow ':' to be a modifier to allow element-wise array operations, much like:

  http://dev.perl.org/rfc/82.html

A good way to understand the possibilities with this kind of syntax is to examine other languages that allow their 'verbs' (operators / subroutines) to be modified by 'adverbs'. J and APL are probably the definitive source here. Although they don't need a ':' adverb, since they by _default_ apply all operations across an array argument, you can imagine how Perl could benefit from an insert adverb:

  http://www.jsoftware.com/primer/insert_adverb.htm

or a table adverb:

  http://www.jsoftware.com/primer/table_adverb.htm
Re: Larry's Apocalypse 1
Dan Sugalski wrote:
> At 09:40 PM 4/6/2001 +0100, Richard Proctor wrote:
> On Fri 06 Apr, Dan Sugalski wrote:
>
> This is, I presume, in addition to any sort of inherent DWIMmery? I don't see any reason that:
>
>     @foo[1,2] = <STDIN>;
>
> shouldn't read just two lines from that filehandle, for example, nor why
>
> Fair enough
>
>     @bar = @foo * 12;
>
> shouldn't assign to @bar all the elements of @foo multiplied by 12. (Though others might, of course)
>
> Reasonable, but what should
>
>     @bar = @foo x 2;
>
> do? Repeat @foo twice or repeat each element twice? (its current behaviour is less than useless, other than for JAPHs)
>
> I'd go for repeat every element twice.

So would I:

  http://dev.perl.org/rfc/82.html

...which includes some other examples:

<snip>
=head1 EXAMPLES

=head2 Text processing

If @first_names contains a list of people's first names, and @surnames contains their surnames, this creates a new list that concatenates the elements of the two lists:

    @full_names = @first_names . @surnames;

To quote a number of lines of a message by prefixing them all with ' ':

    @quoted_lines = ' ' . @raw_lines;

To create a histogram for a list of scores:

    @people = ('adam', 'eve ', 'bob ');
    @scores = (7,9,5);           # Score for each person
    @histogram = '#' x @scores;  # Returns ('#######','#########','#####')
    print join("\n", @people . ' ' . @histogram);

    adam #######
    eve  #########
    bob  #####

=head2 Number crunching

This snippet multiplies the absolute values of three arrays together and sums the results, in a very efficient way:

    @b = (1,2,3);
    @c = (2,4,6);
    @d = (-2,-4,-6);
    $sum = reduce ^_+^_, abs(@b * @c + @d);

Lists can be reordered or sliced with list generation functions (RFC 81) allowing flexible data manipulation:

    @a = (3,6,9);
    @reverse = (3..1);
    @b = @a * @a[@reverse];  # (3*9, 6*6, 9*3) = (27,36,27)

Slicing plus array operations makes matrix algebra easy:

    @a = (1,2,3,
          2,4,6,
          3,6,9);
    @column1of3 = (1..7:3);  # (1,4,7) - every 3rd elem from 1 to 7
    @row1of3 = (1..3);       # (1,2,3)
    $sum_col1_by_row1 = sum ( @a[@column1of3] * @a[@row1of3] );
    # (1*1+2*2+3*3)=14
</snip>
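To make the "what should @foo x 2 do?" question concrete, here is how the three readings look in current Perl 5. Nothing here is proposed syntax; it only shows existing behaviour of the repetition operator and map.

```perl
use strict;
use warnings;

my @foo = ('a', 'b', 'c');

# Current behaviour: the unparenthesised left operand is taken in scalar
# context, so @foo becomes its element count (3) and "3" is repeated.
my @scalar_rep = @foo x 2;             # ('33')

# Parenthesising the left operand repeats the whole list.
my @list_rep = (@foo) x 2;             # a b c a b c

# The element-wise reading has to be spelled out with map.
my @each_rep = map { $_ x 2 } @foo;    # aa bb cc

print "@scalar_rep | @list_rep | @each_rep\n";
```

The first form is the "less than useless" behaviour mentioned in the quoted message.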
Re: RFC from a newbie: Method References
Michael G Schwern wrote:
> Hmmm... an object which contains a method reference which contains a referent to itself.

Yup. I don't know why some people think that circular references are complex ;-)

> Something like this would be nice in a class that creates method references--it would simply need to keep a list of referred objects, and have an explicit destructor that iterates through the references and undefs them. Of course, calling the destructor would be optional where no circular reference exists.
>
> Yes, you could keep a list/hash of what you created (as weak references) and explicitly destroy them but I don't think that would help. Consider the following...
>
> following deleted...
>
> Which doesn't solve the problem... and I don't have any better ideas.

No, neither do I. So far I've been undef'ing method references semi-manually (since there are specific event callback hooks in any widget, I just need to undef these), but this isn't Lazy. Maybe a flash of inspiration will come over Christmas...
Re: RFC from a newbie: Method References
Michael G Schwern wrote:
> On Sun, Dec 17, 2000 at 12:11:01PM +1100, Jeremy Howard wrote:
> Something to be careful of--it's easy to create a circular reference when using method pointers. As a result, neither the referrer nor referee objects are ever destroyed. When using method references it's important that your class keeps track of its references, and explicitly removes them when done to allow object destruction to occur. Perhaps this could be incorporated into Class::MethRef?
>
> Circular references make my brane hurt, but I don't think there's any circularity here. Once the method reference falls out of scope, so will its reference to the object. Then the object will be garbage collected when it too falls out of scope. The other way around, if the object falls out of scope first, the method reference will still work and the object will finally be destroyed when the meth ref goes away. The only snag is if you put the method reference on the object it's referring to, that'll be circular. I'll remember to document that caveat.

There's not necessarily any circular reference. The problem is that method references are frequently used to implement event callbacks. This would generally look something like this:

    do {
        my $a = new A;
        my $b = new B;
        $b->{on_click} = sub { $a->clicked($b); };
        $b->do_click;
    };
    print "Out of scope\n";

    package A;
    sub clicked {
        my $self = shift;
        my $clicked_by = shift;
        print "I am clicked\n";
    }
    sub new {
        my $Proto = shift;
        my $Class = ref($Proto) || $Proto;
        my $Self = {};
        bless ($Self, $Class);
        return $Self;
    }
    sub DESTROY { print "bye A\n"; }

    package B;
    sub do_click {
        my $self = shift;
        print "Clicking\n";
        $self->{on_click}->();
    }
    sub new {
        my $Proto = shift;
        my $Class = ref($Proto) || $Proto;
        my $Self = {};
        bless ($Self, $Class);
        return $Self;
    }
    sub DESTROY { print "bye B\n"; }

The problem is that in this case the destructors are not called immediately after the objects go out of scope--this code prints:

    Clicking
    I am clicked
    Out of scope
    bye A
    bye B

...which creates a memory leak in an event loop.

The circular reference is generally necessary, because the callback needs to know in what form the click occurred (an object of class A in this case) and what widget was clicked (an object of class B in this case). However, the circular reference is not obvious at first glance--I've seen this kind of memory leak frequently in mod_perl programs, for instance.

My objects that get method references always get an explicit destructor, which removes all method references (which are tracked as they are created). Something like this would be nice in a class that creates method references--it would simply need to keep a list of referred objects, and have an explicit destructor that iterates through the references and undefs them. Of course, calling the destructor would be optional where no circular reference exists.

Sorry this explanation isn't as clear as it should be--too many pints of Guinness tonight ;-)
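One way to break this kind of cycle without an explicit destructor is a weak reference. The sketch below is not what the thread settled on; it assumes the Scalar::Util module and its weaken() function are available, and the class names Handler and Widget are invented. Events are pushed onto @log so the ordering can be inspected.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

our @log;    # records events in the order they happen

{
    package Handler;
    sub new     { return bless {}, shift }
    sub clicked { push @main::log, 'I am clicked' }
    sub DESTROY { push @main::log, 'bye Handler' }
}
{
    package Widget;
    sub new { return bless {}, shift }
    sub do_click {
        my $self = shift;
        push @main::log, 'Clicking';
        $self->{on_click}->();
    }
    sub DESTROY { push @main::log, 'bye Widget' }
}

{
    my $handler = Handler->new;
    my $widget  = Widget->new;

    # The closure captures a weakened copy, so the widget no longer
    # keeps itself alive through its own callback.
    my $weak_widget = $widget;
    weaken($weak_widget);
    $widget->{on_click} = sub { $handler->clicked($weak_widget) };
    $widget->do_click;
}
push @log, 'Out of scope';
print "$_\n" for @log;
```

With the weak copy in place both destructors should run when the block ends, before 'Out of scope' is logged. The callback must be prepared for $weak_widget to be undef if it can outlive the widget.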
Re: RFC from a newbie: Method References
Michael G Schwern wrote:
>     package Class::MethRef;
>     use strict;
>
>     sub meth_ref {
>         my($proto, $method, @args) = @_;
>         return sub { $proto->$method(@args) };
>     }
>
> So this...
>
>     my $meth_ref = $obj->meth_ref('foo', @some_stuff);
>     $meth_ref->();
>
> is equivalent to this..
>
>     $obj->foo(@some_stuff);
>
> You could even make the meth_ref take additional (or overriding) arguments.
>
> It's a good idea, I'll put Class::MethRef on CPAN soon.

Something to be careful of--it's easy to create a circular reference when using method pointers. As a result, neither the referrer nor referee objects are ever destroyed. When using method references it's important that your class keeps track of its references, and explicitly removes them when done to allow object destruction to occur. Perhaps this could be incorporated into Class::MethRef?
Re: Acceptable speeds (was Re: TIL redux (was Re: What will the Perl6 code name be?))
Uri Guttman wrote:
> "DS" == Dan Sugalski [EMAIL PROTECTED] writes:
>
>   DS> So unless we come up with something concrete, the goals are:
>   DS> 1) A nebulous ~10% faster
>   DS> 2) Faster in the things that annoy Dan the most
>   DS> 3) Faster in the OO bits the folks upstairs from me use
>
> 4. faster internal and language level I/O (of course driven by AIO. :)
> 5. faster startup via bytecode and/or TIL

6. Quantum::Superposition::ForReal
7. Faster operations on numeric arrays (eg add 2 lists of numbers)

BTW, although Quantum::Superposition::ForReal is tough (because it relies on existentially-challenged Quantum Computers), Quantum::Superposition::PrettyDamnFast is a possibility. Using pipelined operations on single CPU machines, and parallelising on multi-CPU machines, would be quite effective for Q::S.
Re: List Comprehensions (from Python)
Simon Cozens wrote:
> On Sun, Oct 08, 2000 at 01:12:13PM +0100, raptor wrote:
>     [expression for variable in sequence]
> Can this be done easily at the moment OR via some of the new proposals ?!!!?
>
>     map { expression } sequence

See also RFC 81.
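Spelled out as a small Perl 5 example, with grep covering the optional "if" clause of a Python comprehension (the variable names are arbitrary):

```perl
use strict;
use warnings;

my @sequence = (1 .. 10);

# Python: [x * x for x in sequence]
my @squares = map { $_ * $_ } @sequence;

# Python: [x * x for x in sequence if x % 2 == 0]
my @even_squares = map { $_ * $_ } grep { $_ % 2 == 0 } @sequence;

print "@even_squares\n";    # 4 16 36 64 100
```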
Re: Variable attributes (was Re: RFC 355 (v1) Leave $[ alone.)
Dan Sugalski wrote:
> At 11:33 AM 10/1/00 -0700, Peter Scott wrote:
> But, setting aside my visceral reaction to changing array bases, you have precisely the same problem here that has scuppered my intent to file an RFC for hashes with fixed keys; how do you apply the attribute to anonymous, let alone autovivified, arrays? If I say
>
>     my @a :base(1);
>
> then what of $a[1][1]? How to specify that the second level array also has a base of 1? Without it looking really ugly?
>
> Well, it'd be reasonable for autovivified arrays and hashes to inherit the properties of their parent, so if you had:
>
>     my int @foo;
>
> and accessed $foo[1][2], that new autovivified array would be of type int.

That's exactly what we've proposed for compact multidimensional arrays, for instance (from RFC 203):

<quote>
A list (of lists...) that contains elements of the same type can be converted to an array by specifying its type:

    my @some_LOL = ([1,2], [3,4]);
    my int @array = @some_LOL;
</quote>

I haven't got around to RFCing the more generic version (that all attributes are inherited inside nested data types), but that would certainly be a nice approach.
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Karl Glazebrook wrote:
> Ilya Zakharevich wrote:
> On Thu, Sep 28, 2000 at 11:39:51AM -0400, Karl Glazebrook wrote:
>
> so what is wrong with the statement '@y = 3*@x;' then ?
>
> That other constructs *also* create an array context, in which the behaviour of multiplication you propose is not appropriate.
>
> for example?
>
> A prototypeless-function call.
>
> get rid of them all!!

Please no! Anything that makes it harder to write 'quick-and-dirty' scripts is never going to fly--this is part of what makes Perl special.

I would like to see array operations occur inside prototypeless function calls, which as Ilya notes already create array context. This is not fundamentally 'inappropriate', although it is a change from P5. It just means having to type 'scalar @arr' when that's what you mean--and having the P52P6 converter do the same.
RFC Freeze
In 2 and a bit days all RFCs must be frozen--those not frozen will be auto-retracted by the librarian! So, could you please freeze your RFCs--the following have some still outstanding:

    Ilya Zakharevich [EMAIL PROTECTED]     8
    David Nicol [EMAIL PROTECTED]          2
    Buddha Buck [EMAIL PROTECTED]          1
    pdl-porters team [EMAIL PROTECTED]     1
    Nathan Torkington [EMAIL PROTECTED]    1
    Total                                 13

The list is here:

  http://dev.perl.org/rfc/overdue-perl6-language-data.html

When freezing your RFC, please include a discussion section at the top summarising the feedback received. If there are outstanding issues, you may want to post a message to the list asking for clarification of ideas before submitting the frozen RFC.

Feel free to shoot me an email if you want any clarification of this process. The -meta archives also contain more information about this.
Re: RFC 282 (v1) Open-ended slices
> =head1 TITLE
>
> Open-ended slices
> ...
>     @thingy = function();
>     for (@thingy[3..$#thingy]) {
>         ...
>     }
>
> Horrible, isn't it? People want something better. I thought about it last year or so, and produced a couple of patches. It seemed then that the right syntax was not, for instance:
>
>     (function())[3...-1]
>
> because sometimes you want C<$x..$y> to return the empty list, but actually:
>
>     (function())[3...]
>
> (Or C<[3..]>. It doesn't matter.)

The same syntax is proposed in RFC 205 to allow getting a whole slice of an array. It also appeared in RFC 24, which suggested allowing (0..) anywhere that C<..> is used. RFC 24 was withdrawn after it became clear that there were too many cases where this behaviour was bizarre. By restricting this behaviour to within an index, I think that we avoid the problem.

Can we extend RFC 282 so that it allows the right operand of C<..> to be omitted in any index, since the upper bound can be implied? Or does it already propose this? (...in which case please give an example of an open-ended slice on an array rather than directly on a function returning a list.)
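For reference, this is roughly what has to be written in Perl 5 today to get "everything from index 3 onwards" out of a function's return list without a temporary array. The helper name from_index() and the sample function() are invented for illustration.

```perl
use strict;
use warnings;

# from_index() is a hypothetical helper: it returns its remaining
# arguments from position $start onwards, which is what an open-ended
# (function())[3..] would give.
sub from_index {
    my $start = shift;
    return @_[$start .. $#_];
}

# Stand-in for any function that returns a list.
sub function { return ('a' .. 'f') }

my @tail = from_index(3, function());    # d e f
print "@tail\n";

# A negative upper bound does not help in Perl 5: 3 .. -1 is an empty
# range, so this list slice is empty.
my @empty = (function())[3 .. -1];
```

The second half shows why the RFC rejects the [3..-1] spelling: the range is evaluated before the slice ever sees it.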
Re: RFC 272 (v1) Arrays: transpose()
[EMAIL PROTECTED] wrote:
> Jeremy Howard wrote:
> So where is mv(), you ask? If you use the 'reorder' syntax, but don't specify all of the dimensions in the list ref, then the remaining dimensions are added in order:
>
> That sounds good. I'd say why not also allow the mv syntax? It is syntactically different from the others, may be the least often used variant but then there are N+1 ways to do it ;) But no strong feelings either way...

It might be best avoided, because it's weird to see hash refs used as parameters to builtins in this way. I'll tell you what... I'll add it in as an 'optional extra'. That way people who don't like it can't use it as an excuse to belittle the whole proposal...
Re: RFC 275 (v1) Add 'tristate' pragma to allow undef to take on NULL semantics
> =head1 TITLE
>
> Add 'tristate' pragma to allow undef to take on NULL semantics
> ...
> The C<tristate> pragma allows for undef to take on the RDBMS concept of C<NULL>, in particular:
>
> 1. Any math or string operation between a NULL and any other value results in NULL

Any math or string or logical operation...

> 2. No NULL value is equal to any other NULL
>
> 3. A NULL value is neither defined nor undefined

Can we have an isnull() function then, please. Otherwise there's no operation to test for nullness.

PS: Nullness? Nullility?
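To see what rules 1 and 2 would mean in practice, here is a rough Perl 5 emulation that treats undef as NULL. The helper names null_add() and null_eq() are made up, and this is not how the proposed pragma would be implemented; it only illustrates the propagation semantics, using defined() as the NULL test.

```perl
use strict;
use warnings;

# Hypothetical helpers that treat undef as NULL.
sub null_add {
    my ($x, $y) = @_;
    return undef unless defined($x) && defined($y);    # rule 1: NULL propagates
    return $x + $y;
}

sub null_eq {
    my ($x, $y) = @_;
    return undef unless defined($x) && defined($y);    # rule 2: NULL never equals NULL
    return $x == $y ? 1 : 0;
}

my $sum = null_add(1, undef);
print defined($sum) ? $sum : 'NULL', "\n";

my $same = null_eq(undef, undef);
print defined($same) ? $same : 'NULL', "\n";
```

Both calls yield undef, standing in for NULL, which is the three-valued behaviour the RFC describes.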
Re: RFC 275 (v1) Add 'tristate' pragma to allow undef to take on NULL semantics
Nathan Wiger wrote:
> Jeremy Howard wrote:
> Can we have an isnull() function then, please. Otherwise there's no operation to test for nullness.
>
> PS: Nullness? Nullility?
> ...
>
>     use tristate;
>     $a = undef;
>     carp "\$a is null!" unless defined($a);
>
> That way, "defined($a)" will always return false for the above, regardless of the "use tristate" pragma. It seems more consistent in keeping with the "undef really means uninitialized" idea too.

Yes, that's nullirific! It would be good to make that clear in the RFC.
Re: RFC 272 (v1) Arrays: transpose()
[EMAIL PROTECTED] wrote:
> How about (if perl6 allows passing arrays implicitly by reference without arglist flattening)
>
>     transpose @arr, $a, $b;         # xchg
>     transpose @arr, {$a => $b};     # mv
>     transpose @arr, [0,3,4,1,2];    # PDL reorder

You know, I had just logged in to post almost the same suggestion! Only I was going to suggest just:

    transpose $a, $b, @arr;         # xchg
    transpose [0,3,4,1,2], @arr;    # PDL reorder

(I'm not assuming the no-flattening thing, since that's another source of angst altogether!)

So where is mv(), you ask? If you use the 'reorder' syntax, but don't specify all of the dimensions in the list ref, then the remaining dimensions are added in order:

    # Where @arr is a rank 5 array...
    transpose ([3], @arr)   == transpose ([3,0,1,2,4], @arr);
    transpose ([0,3], @arr) == transpose ([0,3,1,2,4], @arr);

This doesn't give a shortcut like PDL's $arr->mv($a,$b) if $b>0, but normally you would use $b = 0 anyway. For $b>0, you would have to specify the whole list up to $b (like the 2nd example above).

BTW, I notice that you're using dimension numbering starting at 0 for your transpose() examples. Is everyone happy to start at 0 rather than 1?
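The "remaining dimensions are added in order" rule is easy to pin down with a few lines of Perl 5. complete_order() is a made-up helper that works on dimension numbers only, not on array data, and assumes 0-based numbering as in the examples above.

```perl
use strict;
use warnings;

# complete_order() is a hypothetical helper: given a partial list of
# dimension numbers and the rank, it appends the dimensions that were
# not listed, in ascending order.
sub complete_order {
    my ($partial, $rank) = @_;
    my %listed = map { $_ => 1 } @$partial;
    return (@$partial, grep { !$listed{$_} } 0 .. $rank - 1);
}

print join(',', complete_order([3],    5)), "\n";    # 3,0,1,2,4
print join(',', complete_order([0, 3], 5)), "\n";    # 0,3,1,2,4
```

Both results match the two equivalences given for the rank 5 array.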
Re: RFC 83 (v3) Make constants look like variables
Nathan Wiger wrote:
> Jeremy Howard wrote:
> Good question. I haven't tackled this in RFC 83, because it is a more general question about attribute syntax. We don't really have a good attribute syntax RFC yet, although Nate threw up some ideas a couple of days ago. Is someone interested in whipping up an RFC that combines Nate's suggestions with clarification of issues such as this?
>
> Nate is! ;-) I'm on it, should be out today...

Excellent! I thought you were in the RFC recovery program--but you couldn't resist writing 'just one more', could you? ;-)
Re: RFC 272 (v1) Arrays: transpose()
Karl Glazebrook wrote:
> you should look at the PDL mv() and xchg() methods and factor this into your thinking!

Actually, the RFC is based on PDL's xchg()! I forgot to document using negative numbers to count from the last dimension--I'll add that into the next version. Are there any other differences with xchg() that you think would be useful?

I haven't used mv() before, but now I look at it I can see it's pretty interesting. Is this used much? If we add it to the RFC, do you think we'd want a separate function, or add another arg to transpose:

    transpose([$a,$b], 0, @arr);   # xchg
    transpose([$a,$b], 1, @arr);   # mv
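The difference between the two operations is easiest to see on a bare list of dimension numbers. The sketch below assumes that xchg means "swap two dimensions" and mv means "take one dimension out and reinsert it at another position"; the helper names are invented and they do not touch any array data.

```perl
use strict;
use warnings;

# Both helpers are hypothetical and operate on a list of dimension
# numbers only.

# Swap the dimensions at positions $from and $to.
sub xchg_order {
    my ($from, $to, @dims) = @_;
    @dims[$from, $to] = @dims[$to, $from];
    return @dims;
}

# Remove the dimension at position $from and reinsert it at $to,
# shifting the dimensions in between.
sub mv_order {
    my ($from, $to, @dims) = @_;
    my ($moved) = splice(@dims, $from, 1);
    splice(@dims, $to, 0, $moved);
    return @dims;
}

print join(',', xchg_order(1, 3, 0 .. 4)), "\n";    # 0,3,2,1,4
print join(',', mv_order(3, 1, 0 .. 4)), "\n";      # 0,3,1,2,4
```

The swap leaves every other dimension where it was, while the move preserves the relative order of the dimensions it passes over.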
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote:
> Moreover,
>
>     $x = 3 * @_;
>
> suddenly being equivalent to
>
>     $x = @_;
>
> does not look very promising...

Why are these equivalent? RFC 82 only applies in list context. Am I missing something?
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Karl Glazebrook wrote:
> Ilya Zakharevich wrote:
> You are trading a frequently used shortcut @a == 1 + $#a for a rarely-used-but-beautiful/intuitive semantic. I'm not sure it is a win.
>
> It's now boiling down to a matter of opinion and we'll have to agree to differ. Of course I use array arithmetic all the time as a heavy PDL user.

It's not just for number-crunchers either. Array notation greatly simplifies many frequently used operations. For instance (from RFC 82):

<quote>
    @people = ('adam', 'eve ', 'bob ');
    @scores = (7,9,5);           # Score for each person
    @histogram = '#' x @scores;  # Returns ('#######','#########','#####')
    print join("\n", @people . ' ' . @histogram);

    adam #######
    eve  #########
    bob  #####
</quote>

Array notation is not 'rarely used' in languages that support it--in fact, operations are applied to arrays and lists at least as often as scalars in most code I see written for Mathematica, J, PDL, and so forth.
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: Are you trying to convince me/us that is going to be used often? Yes, I am. You made the unsupported statement that array operations are rarely used. I'm suggesting otherwise (although to say that they're rarely used in Perl 5 is a tautology, of course!). Array notation is not 'rarely used' in languages that support it--in fact, operations are applied to arrays and lists at least as often as scalars in most code I see written for Mathematica, J, PDL, and so forth. a) You can *already* use vectors as scalars in Perl; That's not what RFC 82 is proposing. b) What we are discussing is Perl, not Mathematica, J, PDL, and so forth. These languages have a very narrow niche. That's because few such languages provide strong general purpose programming features as well. They are either limited maths-oriented languages (like Mathematica) or add-ons to general purpose languages that aren't fully integrated (Python/NumPy; Perl/PDL; C++/Blitz++). Many Perl users operate on lists of data. Requiring explicit loops every time a programmer wants to operate on a list is asking the programmer to fit in with how a computer thinks. That's not right.
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Christian Soeller wrote: Karl Glazebrook wrote: Buddha Buck wrote: @x = 3 * $y[|i]; It's not as clean as @x = 3 * @y, but it is cleaner context-wise. And one could argue that: @x = map 3*^_, @y; is cleaner yet... PDL already allows $x = 3*$y why step backwards? Exactly. Those other solutions are plain ugly. Slightly different messages are conveyed in this regard anyway. Dan Sugalski wrote at some stage that overloading for arrays is a feature very likely to be in perl6 while Ilya says this will add to confusion. ?! Yes, Dan and others have been involved from the start and have regularly provided feedback on what is viable... But let's not second guess the capabilities of the -internals guys. We should specify a good language design that is conceptually viable. This means that our proposals should be internally consistent, provide a simple and intuitive interface, and avoid problems that Computer Science has not yet solved. In the example of array operations, element-wise operations, including the broadcasting of lower dimensionality arrays (and scalars) to higher dimensions, is one of the most powerful constructs in languages like NumPy and J and is well proven as a powerful and viable approach. We can propose the best semantics we can develop, which provides a platform for iteration between design and implementation that has a clear goal. Compromises in implementation are then also more clear.
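A minimal sketch of what broadcasting a scalar over an array means, in plain Python with nested lists. The elementwise() helper is hypothetical. It handles scalars against lists of any depth and lists of matching shape; it pairs a lower-rank list with a nested list along the leading axis, which is a simplification and not NumPy's trailing-axis rule.

```python
import operator


def elementwise(op, a, b):
    """Apply binary op element-wise, broadcasting scalars over lists."""
    a_is_list = isinstance(a, list)
    b_is_list = isinstance(b, list)
    if not a_is_list and not b_is_list:
        return op(a, b)  # two scalars: just apply the operator
    if a_is_list and b_is_list:
        if len(a) != len(b):
            raise ValueError("length mismatch")
        # Pair items along the leading axis and recurse.
        return [elementwise(op, x, y) for x, y in zip(a, b)]
    if a_is_list:
        return [elementwise(op, x, b) for x in a]  # broadcast scalar b
    return [elementwise(op, a, y) for y in b]      # broadcast scalar a


y = [1, 2, 3]
print(elementwise(operator.mul, 3, y))                      # like @x = 3 * @y
print(elementwise(operator.add, [[1, 2], [3, 4]], 10))      # scalar over 2-d
```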
Re: RFC 204 (v2) Arrays: Use list reference for multidimensional array access
Bart Lateur wrote: Hmm... the problem is, I think, that array references and ordinary scalars are both scalars. That's true, but they're scalars with different interfaces. In particular, an array ref can be dereferenced and provides an array in doing so. If an index can do this, then it's a multidimensional index. Of course, this isn't how it would actually be handled internally, since it's inefficient, but this is how its language interface would look. What would be the difference between $a[2] and $a[[2]] anyway? Aren't these just the same? Nearly! From the RFC: quote When a listref is used to index a list of lists, the returned list reference is automatically dereferenced: my @array = ([0,1], [1,2]); my @a = @array[[0]]; # Returns (0,1), _not_ [0,1] /quote Indexing with an integer doesn't have this feature. I considered proposing adding it (so that all list context assignments dereference the rvalue if it's a list ref), but I think it would lead to too many incompatibilities with P5. If so, why not grab back into the old box, and get the syntax for "multidimensional hashes" in perl4? single dimension: $hash{$item} 2 dimensions: $hash{$item1, $item2} Note that because of the '$' prefix, this cannot be confused with an array slice: hash slice, not multidimensional hash: @hash{$item1, $item2} So, the similar syntax for ordinary arrays would then be: $array[2, 3] not $array[[2, 2]] Please feel free to correct me if I'm wrong. The difficulty with this is that it's not clear how to specify a multidimensional slice. An index is either multidimensional, *or* a slice, but not *both*.
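To make the interface concrete, here is a small Python sketch of an index that may be either an integer or a list of integers. The fetch() helper is hypothetical. Python has no reference/list distinction, so the automatic dereferencing described in the RFC quote cannot be shown; only the "list index walks one level per element" rule is illustrated.

```python
def fetch(array, index):
    """Look up index in a nested list; a list index is multidimensional."""
    if isinstance(index, int):
        return array[index]  # ordinary one-dimensional access
    item = array
    for i in index:          # one level of nesting per index element
        item = item[i]
    return item


a = [[0, 1], [1, 2]]
print(fetch(a, 0))       # first row, like $a[0]
print(fetch(a, [0]))     # same row, reached through a one-element list index
print(fetch(a, [1, 1]))  # like $a[[1,1]], the same element as $a[1][1]
```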
Re: scalars vis-a-vis non-scalars
Ed Mills wrote: These would be perlish, nice, terse, succinct, and economical: ... ($i, $i, $k) += 2; @nums = 10 * @nums; These are both covered by RFC 82.
Re: RFC 83 (v3) Make constants look like variables
Greg Boug wrote: Apologies if these comments have already been noted... my $PI : constant = 3.1415926; my @FIB : constant = (1,1,2,3,5,8,13,21); my %ENG_ERRORS : constant = (E_UNDEF=>'undefined', E_FAILED=>'failed'); Constants can be lexically or globally scoped (or any other new scoping level yet to be defined). Just curious about how the use of 'my' and 'local' would be brought into this. Assume for the moment you don't use strict vars in a script (why you wouldn't, I don't know, but some people don't...) Would that mean that: $notusingstrict : constant = 1.2345; would be legal? Personally, I would prefer to enforce the use of strict vars for constants, as it is then obvious as to the scope... Good question. I haven't tackled this in RFC 83, because it is a more general question about attribute syntax. We don't really have a good attribute syntax RFC yet, although Nate threw up some ideas a couple of days ago. Is someone interested in whipping up an RFC that combines Nate's suggestions with clarification of issues such as this? But then again, I'd also prefer use strict to be the default... ;-) Also, presumably the following: my $notconstant = 1.2345; my $const : constant = $notconstant; would cause $const == 1.2345 and remain constant at that value... Yes, this doesn't affect $notconstant in any way. It simply creates a new constant with the value of 1.2345. Constants such as this can't be inlined at compile time, so instead they would be implemented as scalars with a constant flag.
Re: Notice of intent to freeze RFCs 204, 206, and revise 207
Buddha Buck wrote: On RFC 204 (LOL refs as indices), I have followed the discussion from Ilya that list references will have problems when objects that used blessed references to lists as their internal representation are used as indices. This does indeed seem to be a problem, but I'm uncertain how big of a problem. Would it help if the RFC stated that the index had to be either a scalar integer or an ARRAY ref of integers? Since objects would be blessed as something other than ARRAY, they would need to be converted first. If it was an object, it would try to call standard methods to convert to a scalar integer, a list of integers, or an ARRAY ref of integers. Just an idea. There are two options to resolve any potential ambiguity: 1. Require that LOLs as indexes be unblessed, or 2. Define interface precedence to resolve ambiguity (1) is obvious. (2) is simply a case of defining precedence such as: "If a scalar used as an array index overloads operators such that it has both a LOL interface, and an integer interface, it is treated as an LOL for the purpose of array indexing" I prefer (2), because an object with an LOL interface should act just like an LOL, and work anywhere an LOL works. On RFC 207 (efficient array loops), based on discussion and additional thought on my part, I want to clarify and change the syntax used in the RFC. I also want to go into more detail about how the scope of the efficient array loops is derived. I agree with all of your proposed changes. Also, incorporate the rules that define the width of the implied loop, that you included in an earlier email to the list.
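Option (2) can be sketched in Python, under the assumption that "LOL interface" maps to being iterable and "integer interface" maps to __index__. The DualIndex class and the fetch() helper are hypothetical names invented for this illustration.

```python
import operator


class DualIndex:
    """Hypothetical object exposing both a list-like and an integer interface."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):   # the "LOL interface"
        return iter(self.items)

    def __index__(self):  # the "integer interface": (1, 1) reads as 11
        return int("".join(str(i) for i in self.items))


def fetch(array, index):
    """Index a nested list; the list interface takes precedence."""
    try:
        path = list(index)  # succeeds for anything iterable
    except TypeError:
        return array[operator.index(index)]  # fall back to integer access
    item = array
    for i in path:
        item = item[i]
    return item


grid = [[0, 1], [2, 3]]
ind = DualIndex([1, 1])
print(operator.index(ind))  # the integer view of the object
print(fetch(grid, ind))     # treated as [1, 1], so this is grid[1][1]
```

The precedence rule lives entirely in fetch(): it tries the list interface first and only falls back to the integer one.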
Re: RFC 206 (v2) Arrays: @#arr for getting the dimensions of an array
Bart Lateur wrote: On 20 Sep 2000 04:07:27 -, Perl6 RFC Librarian wrote: Where an array is declared without ':bounds', @# returns the largest bounds of each dimension that has been accessed: Wouldn't that be slow? It depends. The array creation RFC proposes that LOLs declared with a simple type be stored as compact arrays. In this case their bounds would have to be stored internally (otherwise the indexes of the data structure can not be derived). The only difference between this and using :bounds is that :bounds turns off autovivification and turns on bounds checking exceptions. If it is a good-old-fashioned Perl 5 list of lists, then yes, finding the bounds would be slow (by some definition of slow). But it would be faster than a pure Perl approach... I don't imagine @# being much used outside of typed arrays, however.
Re: RFC 148 (v2) Add reshape() for multi-dimensional array reshaping
Let's jump in. This RFC proposes a C<reshape> builtin with the following syntax: Err... this syntax isn't what I expected at all! I thought the first argument would define the shape of the result, like NumPy or PDL... When one array is passed in, it is split up. Here, the C<$x> and C<$y> determine the dimensions of the resulting lists. The C<$i> determines the interleave. This $i definition should be removed now. So, we'll assume an input array of the form: ( [1,2,3,4], [5,6,7,8], [9,10,11,12] ) Which is called by C<reshape> with the following dimensions: $x,$y @results - -- -1,-1 ( 1,4,7,10,2,5,8,3,6,9 ) # simple concat I think that a simple concat is: reshape ([-1], @a); since here the rank of a list is one, so the length of the first argument to reshape is one. The -1 means 'use up the whole list'. It should be an error to have more than one arg of -1. 3, -1 ( 1,2,3,5,6,7,8,9,10 ) # 3 vals from all lists That should be a rank 2 matrix of shape (3,4), i.e. ([1,2,3],[4,5,6],[7,8,9],[10,11,12]). -1, 2 ( 1,2,3,4,5,6,7,8 ) # all vals from 2 lists That's a rank 2 matrix of shape (6,2). 3, 2 ( 1,2,3,4,5,6 ) # 3 vals x 2 lists That's a rank 2 matrix of shape (3,2), which would discard the last 6 elements. Hopefully this is easy to understand. C<$x> controls how many elements of each list are used, and C<$y> controls how many lists are used. This is just like the splitting operation, but in reverse. Again, wildcards of C<-1> can be used here as well. I don't think that there's 2 types of reshape(). There's one, and it takes an array of one shape, and returns an array of a different shape. The shape of the new array is specified by the first argument. The second argument is a list, so it succumbs to Perl's normal list flattening behaviour. The behaviour of reshape() should reflect: - PDL's reshape() - NumPy's reshape() -- which is the only one allowing '-1' in the shape - J's '$' verb which all behave the same way.
Also, the examples should show reshaping into and out of arrays other than rank 1 or 2. Sorry Nate--I know we thought we were on the same wavelength here, but it looks like we weren't at all! Would you like me to redraft this for you, or create a new RFC?
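The semantics argued for in this message can be sketched in pure Python. The assumptions are taken from the text above: the first argument is the shape, the input is flattened first, at most one dimension may be -1, the first shape entry is the length of the innermost lists, and surplus elements are discarded. The reshape() and flatten() helpers are hypothetical and do not reproduce PDL's, NumPy's or J's actual functions.

```python
def flatten(items):
    """Flatten arbitrarily nested lists, mimicking Perl's list flattening."""
    out = []
    for item in items:
        if isinstance(item, (list, tuple)):
            out.extend(flatten(item))
        else:
            out.append(item)
    return out


def reshape(shape, data):
    """Reshape data to shape; shape[0] is the innermost dimension."""
    flat = flatten(data)
    shape = list(shape)
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    if -1 in shape:
        known = 1
        for dim in shape:
            if dim != -1:
                known *= dim
        if known == 0:
            raise ValueError("cannot infer -1 next to a zero dimension")
        # -1 means: use up as much of the list as possible
        shape[shape.index(-1)] = len(flat) // known
    total = 1
    for dim in shape:
        total *= dim
    if len(flat) < total:
        raise ValueError("not enough elements for the requested shape")
    flat = flat[:total]  # discard any surplus elements
    # Group from the innermost dimension outwards.
    for dim in shape[:-1]:
        flat = [flat[i:i + dim] for i in range(0, len(flat), dim)]
    return flat


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(reshape([-1], a))       # simple concat
print(reshape([3, -1], a))    # shape (3,4)
print(reshape([3, 2], a))     # shape (3,2), last 6 elements discarded
print(reshape([2, 3, 2], a))  # rank 3
```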
Sublist -data RFC wrap-up time
We need to get our -data RFCs wrapped up. Nate said it rather well on -objects, so rather than rewrite what he said, I'll just repeat it here. I had planned to get RFCs frozen by this Wednesday, but that's looking overly optimistic, so let's aim to meet the same deadlines that -objects are working to (see the attached message): - Original Message - From: "Nathan Wiger" [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Sunday, September 17, 2000 12:28 PM Subject: Sublist -objects RFC wrap-up time All- As Nat has mentioned on -meta, it's time to start wrapping things up. In particular, I think the following "deadlines" should apply: 1. Any and all *new* RFC's should be submitted by Wednesday at the absolute latest (preferably sooner). 2. All existing RFC's should have their final "developing" versions posted by this Friday, Sep 22nd, at the latest. 3. All RFCs should be frozen by the following Wednesday, Sep 27th. The hard deadline is Sep 30th for frozen RFCs. I will make no efforts to enforce these "deadlines", however please consider that the sooner your frozen RFC is done, the sooner Larry will be able to evaluate it. He's making his decision in about 3 weeks, so the sooner the better. I will make a final dredge through this weekend and will post a list of those RFCs still in need of updating Monday (but you probably know who you are). When freezing your RFCs, please add a section up top called "NOTES ON FREEZE", under the ABSTRACT section. In this section, please include a brief little synopsis: a) what the general consensus was b) any specific highlights or issues to be resolved still c) any dependencies on other RFC's This section should be brief and honest. For an example, please see RFC 164. The intent here is to clarify key points, and avoid giving Larry RFCs that would have a top section that looked like this: =head1 NOTES ON FREEZE Everyone else hated this idea, but I really like it, so screw everybody, I'm freezing it anyways. 
If your idea was disliked overall, be honest with yourself and others. I've retracted 3 of mine personally already because they ended up being real stinkers, or superseded by better ideas. If you retract your idea, an optional "NOTES ON RETRACTION" section would be nice, since it could help ideas from being brought up again in the future. For an example of this, please see RFC 175. Final words: Let's try to avoid flooding the list with last-minute RFCs. Let's try to get the main ones out there that could affect the fundamental shape of the language. Other stuff can be added as we see fit later in Perl 6.0.1, 6.0.2, etc. And if you have a problem with an existing RFC, please don't submit a counter-RFC as your first course of action. Please try first contributing to existing discussions and reaching consensus, since that has worked quite effectively so far. Your sublist footstool, Nate
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: On Sat, Sep 16, 2000 at 11:08:18AM +1100, Jeremy Howard wrote: - How does it relate to RFC 204? Is it an alternative, or an addition? 204 cannot be implemented since it prohibits usage of overloaded objects as array indices. Why is it important for overloaded objects to be used as array indices? Why does RFC 204 rule that out? RFC 204 simply specifies that a list reference as an index provides multidimensional access: $a[ [1,1] ] == $a[1][1]; - How does it relate to RFC 81? The semantics of '..' seems to conflict. What I say concerns the usage of '..' inside an index only. It cannot conflict with anything else. RFC 81 expands on the existing operator '..' in a list context to allow more generic list generation. It is particularly useful to generate lists to act as array slices: @a[ 1..5 : 3] == @a[1,3,5]; This would seem to conflict with the meaning of '..' outlined in RFC 231. - Why is it better to make ';' "special inside a hash/array index only" Because ',' is already special there. There is little chance that ';' operator is created as a general-purpose operator. When we first discussed ';' on the list, we looked at making it special in an index only. But the more generic approach of making it a cartesian product operator seems cleaner--it avoids 'special' meanings in favour of providing a generic operator. Why is there little chance of creating ';' as a general-purpose operator? - Why is a special token for a separator necessary "to avoid the (giant) overhead of creation of anonymous arrays"? Don't RFC 203 arrays and RFC 81/205 lazy generation avoid this? a) "Lazy generation" is not defined, as stated it is a good wish only. What is @a = (0, 2..99, 200..9998, 100); f(@a); Lazy generation is a well understood concept in other languages. I'm most familiar with C++, so I'll draw from that.
In libraries that provide lazy evaluation, f(@lazy_list) is a 'promise' to apply f() to the elements of @lazy_list when an element of f(@lazy_list) needs to be calculated. Sometimes this is all done at runtime (MTL, newmat), sometimes parts are done at compile time ('expression templates' in POOMA and Blitz++). These C++ examples and many others are indexed at: http://www.oonumerics.org/oon/ b) The call for $a[2,3;5,6] is *) Put already-available SV pointers for $a, 2,3,4,5 and the cached SV* for tie::multi::separator() on stack; *) Put the (cached) CV* for the method on stack; *) invoke the call frame; This is not *very* quick, but at least it may be "not that slow". While all the alternatives require creation of anonymous lists, which (I expect) will slow things down 7..10 times for the call above. For $a[1..100;1..100] it may easily be 100..1000 times slower. Lists of lists of known simple type are proposed by RFC 203 to be stored as true arrays (i.e. contiguously in memory). Their overhead is not the same as Perl 5 lists of lists. The index in $a[1..100;1..100] should be generated lazily. An individual element can be calculated directly from the index parameters as required. Your way was my way when I was designing Math::Pari. When I *implemented* Math::Pari, it took some time to determine why it was so much slower than what I expected. My proposal is based on this experience. Creation of [1,2,3] is *very* slow. I hope we can change how [1,2,3] is created by: - Creating a true numeric array if it is an array of known simple types - Generating the elements lazily where it is more efficient to do so If we can not do these, then I agree that RFCs 204 and 205 are not plausible in their current form. - Overall, what is the problem in the existing array RFCs that this is designed to solve? *) They are not compatible with overloading (unless overloaded things are dramatically changed); There are a number of RFCs proposing substantially changing overloading.
What specific changes would we need to ensure were incorporated in P6 to avoid this incompatibility? *) They create a lot of temporary anonymous arrays the only purpose of which is to group arguments; Yes, if we can't get any lazy generation to work. *) They go very high on the bizarreness scale. Bizarre??? Which RFC? RFC 82: The concept of all array operations being applied element-wise to arrays is very widely used in languages oriented to numeric programming--it is certainly not 'bizarre'. There has been debate around '||' and '&&', although I find the alternative meaning of these in a list context proposed by RFC 45 more bizarre. ...But I think that this point is already well discussed... RFCs 90 and 91: These builtins are in almost all languages with rich array functionality. 'merge' and 'demerge' are more frequently called 'zip' and 'unzip', but those terms were almost universally rejected on -language. RFC 203: If we know that a list of lists is of a simple typ
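As an illustration of the lazy generation idea discussed in this message, Python's built-in range and map already behave this way: a range stores only the parameters that generate it, and map(f, ...) is a promise to call f() when an element is requested. This is offered as an analogy only, not as a statement of how RFC 81 lists would be implemented.

```python
from itertools import islice

calls = []


def f(x):
    """Record each call so we can see which elements were computed."""
    calls.append(x)
    return x * x


big = range(1, 5_000_001)  # no five million objects are created here

# The generating parameters are available without touching any element.
print(big.start, big.stop, big.step, len(big))

# An individual element is computed directly from those parameters.
print(big[1000])

promised = map(f, big)                   # nothing has been computed yet
first_three = list(islice(promised, 3))  # compute only what is needed
print(first_three, calls)
```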
Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices
Ilya Zakharevich wrote: On Sat, Sep 16, 2000 at 07:15:34PM +1100, Jeremy Howard wrote: Why is it important for overloaded objects to be used as array indices? Overloaded objects should behave the same way as non-objects. Why does RFC 204 rule that out? RFC 204 simply specifies that a list reference as an index provides multidimensional access: $a[ [1,1] ] == $a[1][1]; I repeat: what does $a[ $ind ] do if $ind is a (blessed) reference to array (1,1), but behaves as if it were 11 (due to overloading)? How $ind is implemented (ie the actual structure that is blessed) does not matter. What matters is what interface its class provides. If it overloads operators such that dereferencing it does not provide an array, then it shouldn't be expected to work as a multidimensional array index. If it provides operators that give it the same interface as a list ref, then it should work everywhere a list ref does. RFC 81 expands on the existing operator '..' in a list context to allow more generic list generation. It is particularly useful to generate lists to act as array slices: @a[ 1..5 : 3] == @a[1,3,5]; This would seem to conflict with the meaning of '..' outlined in RFC 231. Sorry, I see no conflict. (Assuming that ternary '..' is allowed, the token tie::multi::range() would be followed by 3 numbers, not 2.) These calls will result in tied(@a)->FETCH_RANGE(tie::multi::range(), 1, 5, 3) tied(@a)->FETCH_RANGE(1, 3, 3) If FETCH_RANGE uses tie::multi::inline() to preprocess the keys, this *by definition* will result in the same array of keys. If not, it is the responsibility of FETCH_RANGE to ensure the equivalence. And $a[ 1..5e6 ] would not need to create 5e6 Perl objects the only purpose of which is to inform the range extractor that it needs to create an object representing the slice. From RFC 81: quote When a lazy list is passed to a function it is not evaluated. The function can then access only the elements it needs, which are calculated as required.
Furthermore, the arguments that generated the list are available as attributes of the list, and can therefore be used directly without actually accessing the list /quote It is not necessary to create 5e6 objects. Furthermore, RFC 81 proposes syntax beyond just ($start..$stop: $step). Implementing it using tie::multi::range() followed by 3 numbers would not be enough. Anyway, we're defining a language interface here, not an implementation, so we don't really need to nail this down immediately. When we first discussed ';' on the list, we looked at making it special in an index only. But the more generic approach of making it a cartesian product operator seems cleaner--it avoids 'special' meanings in favour of providing a generic operator. No, it is not a generic operator. Its behavior depends on whether it is used *inside parens*, or not. Additionally, the behaviour of cartesian product makes very little sense: if you did not want it 3 times, you should not insert it into the language. I'm not wedded to allowing ';' outside of a list index. However, it does lead to both consistency and convenience with how list slicing is done in Perl 5: # Perl 5 behaviour @indices = (1,3); @list = (3,4,5,6); @list[@indices] = (1,2); # (3,1,5,2) # Multidim extension @2d_indices = ([0,0],[1,1]); @2d_arr = ([3,4,5],[6,7,8]); @2d_arr[@2d_indices] = (1,2); # ([1,4,5],[6,2,8]) # Slice syntax extension @2d_slice = (0..1 ; 0..1); # ([0,0],[0,1],[1,0],[1,1]) @2d_arr = ([3,4,5],[6,7,8]); @2d_arr[@2d_slice] = ([0,1],[0,1]); # ([0,1,5],[0,1,8]) The implementation of ';' when used as a list index and then thrown away clearly should not create an actual list of lists, for efficiency reasons. I don't see why this case can't be dealt with appropriately. Maybe. But it is not defined in the corresponding RFC nevertheless. At least: all I could deduce was that the following constructs are made synonymous: @a = ($a .. $b); tie @a, Array::Range, $a, $b; No other usage of .. is covered. 
RFC 81 defines 4 uses of C<..>. It does not propose a specific implementation in terms of C<tie>, or anything else--it simply defines a language interface. *You do not want to create new values unnecessarily*. This is too slow. Quick operations should reuse already available values instead. See how scratchpads work... Agreed. RFC 81 proposes that generated lists be memoized, and that new values are only created when required. Even if it is creation of a "streamlined" array, creation will still take much more time than operation dispatch - which is in turn painfully slow. We should optimise special cases when we know which are causing problems. Perl 5 may or may not provide useful experience here--the operation dispatch approach in Perl 6 may be quite different, given how the -internals discussions are progressing. RFC 204: Isn't it fairly
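The cartesian-product reading of ';' and the multidimensional index assignment shown earlier in this message can be sketched in Python, with itertools.product standing in for the proposed operator. The assign() helper is a hypothetical name, and the sketch says nothing about how an implementation would avoid building the intermediate lists.

```python
from itertools import product

# (0..1 ; 0..1) read as a cartesian product of the two ranges
slice_2d = [list(index) for index in product(range(0, 2), range(0, 2))]
print(slice_2d)


def assign(array, indices, values):
    """Assign values[k] to the element addressed by the list indices[k]."""
    for index, value in zip(indices, values):
        target = array
        for i in index[:-1]:  # walk down to the innermost list
            target = target[i]
        target[index[-1]] = value


arr_2d = [[3, 4, 5], [6, 7, 8]]
assign(arr_2d, [[0, 0], [1, 1]], [1, 2])  # like @2d_arr[@2d_indices] = (1,2)
print(arr_2d)
```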
Update: Wrapping up -data RFCs
Adam Turoff wrote: I didn't use Date::Parse, but I did look for all RFCs still sitting at v1 status. Since they're numbered chronologically, I cut off the bottom (anything submitted after 9/7). There are 100 RFCs in the list that follows. Code and data upon request. Thanks Ziggy--very handy! There's only 3 from the list I'm looking after (-data), here's their status: RFC : 148 v1; Developing Title: Add reshape() for multi-dimensional array reshaping Maint: Nathan Wiger [EMAIL PROTECTED] List : [EMAIL PROTECTED] Date : 24 Aug 2000 Nate is redrafting this as we speak--we're probably just about ready to freeze this. RFC : 191 v1; Developing Title: smart container slicing Maint: David Nicol [EMAIL PROTECTED] List : [EMAIL PROTECTED] Date : 1 September 2000 There was quite a bit of discussion of various alternatives to this on list. David--could you incorporate these ideas into the RFC and see if we can get consensus. RFC : 196 v1; Developing Title: More direct syntax for hashes Maint: Nathan Torkington [EMAIL PROTECTED] List : [EMAIL PROTECTED] Date : 5 Sep 2000 Discussion on this one has died down... Nat--could you incorporate the suggestions from the list and see if we can get this frozen? All other -data RFCs are still under active development, but I've asked all maintainers to have them frozen by next Wed (20/9) so that Larry has time to think about them before his release of the draft language spec.
Re: reshape() (was Re: Fw: Wrapup time)
Nathan Wiger wrote: Jeremy Howard wrote: RFC 203 defines a :bounds attribute that defines the maximum index of each dimension of an array. RFC 206 provides the syntax @#array which returns these maximum indexes. For consistency, the arguments to reshape() should be the maximum index of each dimension. A maximum index of '0' would mean that that dimension is 1 element wide. Therefore '0' can not be special in reshape(). Therefore we should use '-1'. I agree with Christian, if you're going to use bounds(), this should be equal to the number of elements, NOT the number of the last element. So you would say "3" for 3 elements, even though they're numbered 0..2. This is the way other similar Perl ops work already: $size = @a; # 3 $last = $#a; # 2 OK, I'm convinced. I'll change :bounds. I agree with Christian that it should be renamed :shape in that case.
Re: RFC 225 (v1) Data: Superpositions
Perl6 RFC Librarian (aka Damian Conway) wrote: This RFC (seriously) proposes Perl 6 provide C<any> and C<all> operators, and, thereby, conjunctive and disjunctive superpositional types. Great to see this RFC'd--this will make lots of data crunching code _way_ easier. Now, I haven't quite finished reading all the references yet ;-) but I'm wondering about the efficiency of any() and all(). I've heard it said that they work in constant time, but I assume that would only be if they are implemented on a quantum computer (which are currently existentially-challenged ;-) Do any() and all() have some magic around how they are implemented in von Neumann computers that make them faster than standard CS searching techniques? The RFC mentions the opportunity to parallelise these operators, if they are included in the core. The same is true of a number of other -data RFCs, such as element-wise array operations, and implicit loops. Is there a generic approach we could propose that would allow parallelising of user algorithms without having to rely only on a subset of 'parallel-enabled' builtins in the core?
Fw: Wrapup time
Forwarded from perl6-meta Nathan Torkington wrote: Larry's going to release a draft of his language decisions on the 1st of October. My plan to prevent a flood of 100 new RFCs on September 30: - deadline for new RFCs of Sep 25. After that, only discussion of old ones. - send mail to existing authors of "developing" RFCs telling them this Given that Larry is making decisions in 2.5 weeks, I figure that we've got 1 week to clarify our RFCs (so Larry has some time to digest them). Here's some thoughts about where our RFCs stand: 115: Change motivation to better recognise RFC 205. Motivation should probably be more generic. Consider incorporating into RFC 159. 116: Refocus. Remove sections that are now served by the array RFCs. Separately RFC anything missing. Make the RFC an informative RFC about how PDL solves implementation problems 169: Withdraw 204: Clarify behaviour when specifying less indexes than there are dimensions. Clarify relationship between $a[$i] and @a[[$i]] on lists of lists 207: Decide whether we really want this. If so, add a motivation section as to what we're really winning. Define the width of the loop, and clarify its behaviour. Resolve whether we want |i notation or $INDEX::i notation 82: Summarise argument against RFC 45 90/91: Resolve whether alias or copy returned. Clarify how recursion avoided, if alias returned 148: Change to Numeric Python semantics of reshape(), or write counter-RFC specifying these semantics (preferably renaming this RFC's 'reshape' to something else) New: transpose (or similar); ufuncs (like in NumPy) I'm happy to work on 204, 82, 90/91, and 148 (Nate--I don't think we've resolved this one yet...). I think Buddha is best placed to do 169 (easy!) and 207 (Buddha--I think it's mainly a case of summarising the emails you and I have recently written about this). Can someone volunteer to look at 115 and 116? Any other changes or new RFCs we need?
Have a look through any PDL/NumPy/whatever code you've got lying around, and see if there's anything that our proposed Perl can't handle nicely. Feel free to post any such code and ask for suggestions about how we could implement these bits.
reshape() (was Re: Fw: Wrapup time)
Nathan Wiger wrote: Jeremy Howard wrote: 148: Change to Numeric Python semantics of reshape(), or write counter-RFC specifying these semantics (preferably renaming this RFC's 'reshape' to something else) There are a couple things that the NumPy one lacks that RFC 148 currently has: 1. Arbitrary Interleaving 2. A way to specify multiple @arrays, i.e. @new = reshape $x,$y,$i, @a, @b, @c; # RFC 148 Now, if we're looking for a new, more compact syntax, let's make arg one an arrayref of dimensions: @new = reshape [$x,$y,$i], @a, @b, @c; That looks remarkably similar to NumPy's, plus it can take multiple arrays, even defaulting to @_. And I can change the "wildcard" from 0 to -1, just like NumPy's. That looks good, except that I'd remove the interleaving. Currently, it's not clear how to reshape() to more than 2 dimensions, because the third argument of the first list ref is the interleave flag. We should be able to reshape to any number of dimensions: @new = reshape [$w,$x,$y,$z], @a, @b, @c; Furthermore, it's not clear how interleaving would work on 2d arrays. I'd rather use a transpose() function for this that can transpose across a given axis. I don't think we need to define the ability to work on multiple lists as special behaviour. Perl knows how to flatten lists, so any syntax we define will allow multiple lists simply by letting Perl join and flatten them. Finally, I think the dimensions specified by reshape() should be the maximum index of the axis, not the number of elements, since this way it matches the :bounds semantics. In this case, the wildcard would clearly need to be -1. Then I'd be happy. ;-)
Re: Please take RFC 179 discussion to -data
[EMAIL PROTECTED] wrote: Could we please take discussion of 179 to -data? I think that's where it should be. K. Personally, I don't see any objection to this. If everybody is ok, why not? How should I proceed? Submit again the proposal with a modified mailing-list email? Gael, Yes. If you do this, I suggest you take the opportunity to fill out RFC 179 with more detail. In particular: - Why you think set operations should work on arrays rather than hashes - In what way the current Set:: modules are insufficient - Why set operations should be added to the core rather than a module That way the list will be able to understand the reasoning behind the RFC better.
Re: logical ops on arrays and hashes
Dan Sugalski wrote: ...would anyone object to the _binary_ operators being used instead? They don't have short-circuit semantics, and generally don't have any reasonable meanings for hashes and arrays. With that, instead of writing the above code, you'd write: @a = @b | @c; nothing short-circuits but then you don't expect it to, and that's more or less OK. The and operation would likely return the left-hand value if both are true, and xor would return whichever of the two were true, or undef if both (or neither) were true. Of course they have reasonable meanings for arrays--element-wise operations (RFC 82): http://tmtowtdi.perl.org/rfc/82.html Any operation you can do on a scalar you should be able to do element-wise on a list, and certainly it's not hard to come up with situations where this is useful for non-short-circuiting bitwise operators. Bit vectors and associated masks may well be stored in lists, for instance. This discussion should probably be on -data, BTW.
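As an illustration of the bit-vector case mentioned above, here is what element-wise bitwise operators on two lists would compute, spelled out in Python with explicit comprehensions. The variable names are made up for this example, and the values are arbitrary.

```python
vectors = [0b1100, 0b1010]  # a list of bit vectors
masks = [0b0101, 0b0110]    # the associated masks

# What @a = @b | @c would mean element-wise: no short-circuiting,
# every pair of elements is combined.
ored = [v | m for v, m in zip(vectors, masks)]
anded = [v & m for v, m in zip(vectors, masks)]
xored = [v ^ m for v, m in zip(vectors, masks)]

print([bin(x) for x in ored])
print([bin(x) for x in anded])
print([bin(x) for x in xored])
```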
Re: RFC 207 (v1) Array: Efficient Array Loops
[EMAIL PROTECTED] wrote: Reading through the examples left me wondering about some technicalities: @t[|i;|j] = @a[|j;|i]; # transpose 2-d @a Written like this it would require that @a is exactly 2-dim, i.e. it would not just swap the first two dims of any n-dim array? I suppose if I'd want that I'd write @t[|i;|j;] = @a[|j;|i;]; # trailing ';' implies there might be trailing dims Not necessary. Since arrays support all the syntax of a plain old list of lists, and the |i syntax just creates an implicit loop, the example quoted from the RFC will work with other sized arrays. In fact, if it was only 2d, it would more properly be: $t[|i;|j] = $a[|j;|i]; # transpose 2-d @a With three dimensions, each implicit loop in @t[|i;|j] = @a[|j;|i]; # transpose 2-d @a is assigning the _list_ (or 1d array) at @a[|j;|i] to the appropriate index in @t. Ditto for 4 dimensions, except that it is a 2d array (or LOL) that is being assigned at each index through the implicit loop.
Re: RFC 90 (v3) Arrays: Builtins: merge() and demerge()
Christian Soeller wrote: Jeremy Howard wrote: However I like the Numeric Python reshape() semantics better: http://starship.python.net/~da/numtut/array.html Is that in any significant way different from PDL's reshape? http://pdl.sourceforge.net/PDLdocs/Core.html#reshape The Numeric Python version lets you use '-1' for any one dimension, which sets the size of that dimension to make the array fill the new shape as completely as possible.
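The '-1' wildcard described above can be sketched roughly as follows. This is Python rather than Perl 6 (whose reshape() was still only a proposal), the function name and nested-list representation are illustrative assumptions, and this version assumes the element count must divide exactly, which may be stricter than what the thread has in mind.

```python
def reshape(flat, dims):
    """Reshape a flat sequence into nested lists.

    At most one entry of dims may be -1; its size is inferred from the
    number of elements (assumption: the division must be exact).
    """
    flat = list(flat)
    dims = list(dims)
    if dims.count(-1) > 1:
        raise ValueError("only one dimension may be -1")

    known = 1
    for d in dims:
        if d != -1:
            known *= d

    if -1 in dims:
        if known <= 0 or len(flat) % known != 0:
            raise ValueError("cannot infer the wildcard dimension")
        dims[dims.index(-1)] = len(flat) // known
    elif known != len(flat):
        raise ValueError("shape does not match the number of elements")

    if any(d <= 0 for d in dims):
        raise ValueError("all dimensions must be positive")

    def build(items, shape):
        # Split items into shape[0] equal chunks and recurse on each.
        if len(shape) == 1:
            return list(items)
        step = len(items) // shape[0]
        return [build(items[i * step:(i + 1) * step], shape[1:])
                for i in range(shape[0])]

    return build(flat, dims)
```

For example, reshape(range(6), [2, -1]) infers a second dimension of 3.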
Re: RFC 90 (v3) Arrays: Builtins: merge() and demerge()
Removed perl6-announce x-post Chaim Frenkel wrote: "DC" == Damian Conway [EMAIL PROTECTED] writes: DC I *still* think it should be "unmerge"! ;-) Hrmpf. It should be reshape. (Which would be its own inverse and saves a keyword.) reshape() has already been proposed (RFC 148): http://dev.perl.org/rfc/148.html However I like the Numeric Python reshape() semantics better: http://starship.python.net/~da/numtut/array.html Nate (RFC 148 maintainer) and I are trying to find a way of getting the greater flexibility of his proposal with the more intuitive semantics of the Numeric Python version. Hopefully we'll have an RFC or 2 on this shortly. PS: How does one pronounce 'hrmpf'?
Re: RFC 82 (v3) Arrays: Apply operators element-wise in a list context
Nathan Torkington wrote: Jeremy Howard writes: No, there's no arbitrary decision. *Every* operator is component wise on lists. It is internally consistent, and consistent with most other languages that provide array/list operators. It's easy to get stuck on the '*' example, because different mathematicians have different feelings about what matrix operation should map to '*'. However, there is no consistent and meaningful definition of array operations (for _all_ operators) other than that defined in RFC 82. Actually, the only refinement I'd like to see is that boolean operators (==, &&, ||) be excepted from the distributive rule. This is to permit: if (@a == @b) # shallow comparison Already works under the RFC (scalar context). and @a = @b || @c; # @a=@b or @a=@c; # ish Doesn't work in P5 (try it!) The math operations are fine to apply to each element. I have no problem with those being distributive, but I think || for default values and == for comparison are too ingrained and they'd be too useful (as opposed to a distributive || or &&, which is much less useful). == is applied in a scalar context--fine. || as you show it can not be ingrained because it doesn't currently work this way!
Re: RFC 90 (v3) Arrays: Builtins: merge() and demerge()
Matthew Wickline wrote: (not on list, just tossing this in for discussion) OK--we'll keep you cc'd in on this discussion. RFC 90 (v3) wrote: - Both merge and demerge do not make - a copy of the elements of their arguments; - they simply create an alias to them: - 1 @a = (1,3,5); 2 @b = (2,4,6); 3 @merged_list = merge(@a,@b); # (1,2,3,4,5,6) 4 $merged_list[1] = 0; 5 @b == (0,4,6); # True ... In order for the aliasing thing to work, perl would have to keep track of all these aliases through a large number of possible operations. Suppose that in the above code, after line 3 I decided to say @a = (); @merged_list == @b; Now @merged_list must be seriously adjusted as well. What if I did @a = (@merged_list, @b, @a, reverse @merged_list); Not only is that likely to be a great deal of work, but I can't even begin to think about what the result should be. I don't think that it can be defined. On assignment to another list, an array copy would have to occur. In this case the result of this operation is pretty obvious. The problem in that case is that you've got a recursive definition with no base case. So, I would say that an easy solution is to return a non-aliased list. If you need the aliasing effect, then you need a way to avoid recursion problems (and you probably just have to bite the bullet with respect to all the extra bookkeeping work perl would have to do). You're right that there's a lot of bookkeeping, and I'm not sure it's worth it either. Perl's current slicing operation @a[$x1, $x2] only aliases when used in an lvalue context: @a = (4,5,6); @b = @a[1,2]; $b[0] = 9; # @a == (4,5,6) @a[1,2] = (8,9); # @a == (4,8,9) We could do the same for merge(). The downside is that: @transpose = part( # Find the size of each column scalar @list_of_lists, # Interleave the rows merge(@list_of_lists) ); and similar expressions would do an awful lot of copying. 
Ideally if merge() didn't alias in an rvalue context, Perl would still optimise away multiple merge()s, part()s, slices, and so forth so that only one copy occurred. I'd want to stuff it in a module, but then I'd want to do the same with any other core feature I don't use much, so that opinion ain't really worth all that much. ;) This was discussed quite a bit when v1 of this RFC was released. It's a common operation for working with multidimensional arrays, and for efficient looping through multiple lists, both of which are important in data crunching. If it were in a module, then the elements required to make its complex optimization behaviour work would have to be definable in a module. I haven't seen an RFC that provides such a capability.
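To make the copying version of the transpose idiom above concrete, here is a rough Python sketch. It is not the RFC's implementation: both functions return fresh copies (no aliasing), and the reading of part() as "split into chunks of a given size" is an assumption inferred from how it is used in the message.

```python
def merge(*lists):
    """Interleave equal-length lists: merge([1,3,5], [2,4,6]) -> [1,2,3,4,5,6].

    Returns a new list, so changing the result never touches the inputs.
    """
    if len({len(lst) for lst in lists}) > 1:
        raise ValueError("all lists must have the same length")
    return [item for group in zip(*lists) for item in group]


def part(size, flat):
    """Split a flat list into consecutive chunks of `size` elements
    (hypothetical semantics, assumed for this sketch)."""
    if size <= 0:
        raise ValueError("chunk size must be positive")
    return [flat[i:i + size] for i in range(0, len(flat), size)]


def transpose(rows):
    """Transpose via interleave-then-partition, as in the message above."""
    return part(len(rows), merge(*rows))
```

Every call here copies its data, which is exactly the cost the message worries about. An optimiser that fused the merge and the partition would need only one pass.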
Re: RFC 82 (v3) Arrays: Apply operators element-wise in a list context
[EMAIL PROTECTED] Nathan Wiger wrote: This RFC proposes that operators in a list context should be applied element-wise to the elements of their arguments: @d = @b * @c; # Returns (2,8,18) If the lists are not of equal length, an error is raised. I've been watching this RFC for a while. I would hesitate to change the default behavior of * and other operators in so radical a sense, especially since it would create unexpected error conditions. I think these operations should remain scalar. I disagree. So do I. ... The idea would be operator and data overloading is completely integrated with tie, based on the concept of polymorphic objects. As such, you could create native PDL classes that would allow @a * @b to do what you want. Not only do you lose consistency here (as Christian already pointed out), but also speed. Array functions and operations would be tightly optimised loops, and furthermore multiple operations would avoid redundant loops and copies. Good luck finding a way of getting tie to do this, RFC 200 notwithstanding. Otherwise, you have to decide if @a * @b should be element-wise, a cross-product, a vector, or ??? I just think it's a can of worms that's going to result in a set of arbitrary decisions, which in the end makes less sense than having these contexts remain scalar by default. No, there's no arbitrary decision. *Every* operator is component wise on lists. It is internally consistent, and consistent with most other languages that provide array/list operators. It's easy to get stuck on the '*' example, because different mathematicians have different feelings about what matrix operation should map to '*'. However, there is no consistent and meaningful definition of array operations (for _all_ operators) other than that defined in RFC 82. This RFC is absolutely fundamental to providing numeric programming capabilities in Perl 6, and it happens to make a lot of other stuff simpler besides (e.g. the text processing examples in the RFC). 
It would improve the speed and clarity of code, and Perl 5 scripts can be converted with a simple scalar(). I can't really see the argument for not doing this.
Re: $a in @b
Jonas Liljegren wrote: Does any other RFC give the equivalent to an 'in' operator? I have a couple of times noticed that beginners in programming want to write if( $a eq ($b or $c or $d)){...} and expect it to mean if( $a eq $b or $a eq $c or $a eq $d ){...}. I think it's a natural human reaction to not be repetitive. An 'in' operator will help here. It could be something like this: $a in @b; # Has @b any element exactly the same as $a $a == in @b; # Is any element numerically the same as $a $a eq in @b; $a > in @b; # Is $a bigger than any element in @b? $a not in @b; # Yes. Make 'not' context dependent modifier for in. ... You could even use it as a version of the ||| operator. :-) $a = in @b; # Assign the first defined value to $a Quantum::Superpositions provides this in a more flexible way by adding the 'any' and 'all' keywords. http://search.cpan.org/doc/DCONWAY/Quantum-Superpositions-1.03/lib/Quantum/Superpositions.pm One of Damian Conway's many promised RFCs will cover incorporating these ideas into Perl 6.
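The beginner's mistake and the 'any'/'all' alternative can be illustrated with a short Python sketch. Python's `or` short-circuits to the first true operand much as Perl's does, so the same surprise appears. The helper names below are made up for illustration and are not the Quantum::Superpositions API.

```python
def equals_any(value, candidates):
    """True if value equals at least one candidate (the proposed 'in')."""
    return any(value == c for c in candidates)


def greater_than_any(value, candidates):
    """True if value is bigger than at least one candidate."""
    return any(value > c for c in candidates)


def greater_than_all(value, candidates):
    """True if value is bigger than every candidate."""
    return all(value > c for c in candidates)


a, b, c, d = "z", "x", "y", "z"

# What beginners write: (b or c or d) collapses to the first true
# operand, "x", so only one comparison is ever made.
naive = a == (b or c or d)

# What they mean: compare against each candidate in turn.
intended = equals_any(a, [b, c, d])
```

Here `naive` is false while `intended` is true, which is the gap an 'in' operator would close.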
List generation (was Re: PDL-P: No status field for Perl6 RFC 115 )
Moved to perl6-language-data from PDL Porters Robin Williams wrote: "Jeremy Howard" writes:- The first version of this RFC had a @start..$end:gen but it just seems too dangerous, so I removed it. I'm still willing to be convinced though... as well as @start..gen:$num_steps ? Yes, I'd agree you'd have to choose one!! No, *instead* of. But `@start..$end:gen' still seems the more natural choice to me -- easier to remember, maybe even easier to parse, a scalar in any position always means the same thing... Yes, I like it too, but the problem is that $end may not be reached: @weird = (0..5: ^0 mod 2); is an infinite loop under this proposal. That's not necessarily a dead-end, but it seems pretty dangerous. The $num_steps proposal could be a lot faster too, because a fast loop could be generated (particularly if we define some functions that are easily optimised, like ufuncs in Numeric Python): http://starship.python.net/~da/numtut/array.html#SEC13 Any other ideas? o Use the `:' as a kind of `modifier' syntax (cf shell environment variables) 2..10:(n,gen) -- ugly, I still don't like the 10 being $num_steps, but might allow some kind of regexp-like optimization hints. o Just allow gen to do what it likes with the 2nd argument to `..' and return undefined at termination -- probably just _too_ laissez faire. Or, along similar lines, allow the second argument to be a test, so @start : test : inc is more like for (i=1; i<=10; i++) -- with some way of accessing element number as well as values. Yes, when I had the $end notation, I was planning to provide a test syntax too--I think if you had one you'd want both. BTW, the element number is always available as the first argument to the list generation function. o Or even 2 x 10 : gen -- at least, I'd _expect_ to get 10 things here! 2x10 = 2..sub{0}:10 as TMTOWTDI seems a tad curious, on reflection. Hmmm... an interesting one. 
Starting to look more Perlish and less Matlabish, which may be a good or a bad thing depending on where you're coming from...
Re: Change ($one, $two)= behavior for optimization? (was Re: RFC 175 (v1) Add list keyword to force list context (like scalar))
Nathan Wiger wrote: Tom Christiansen wrote: Ever consider then having ($a, $b, $c) = <FH>; or @a[4,1,5] = <FH>; only read three lines? I mean, how many if any builtins would it make sense to make aware of this, and do something "different"? Personally, I think this would be really cool; stuff like this is what I was trying to poke at. Lots more power and flexibility. I could name lots of builtins this potentially makes sense for: ($one, $two) = grep /pat/, @data; ($k1, $k2) = keys %hash; # leave index at $k3? @a[6,5,4]= map { split ' ' } @line; ($last) = reverse @array; And then there's splice, sort, and any and every user-defined sub too. The only problem is when grep and map are used to change values on the fly...this will have to be addressed. But actually, the behavior could potentially be quite cool - maybe only the number requested back are changed. Hmmm. The problem with making these builtins respect the number of return values context in want() is that, as Nate mentions, the expressions may have side-effects that are desired for the whole list. An alternative approach is to make these builtins respect lazy(), as defined by RFC 123: <quote> What if adding laziness to a list context was up to the programmer and passed through functions that can support it: for (lazy(grep {$h{$_}->STATE eq 'NY'} keys %h)){ $h{$_}->send_advertisement(); }; would cause a lazy list to be passed to for, and increment of the object's "letters_sent_total" field might break the iteration. for (grep {$h{$_}->STATE eq 'NY'} lazy(keys %h)){ $h{$_}->send_advertisement(); }; causes a lazy list to be passed to our filter function grep, saving us from allocating the entire keys array. grep is still in the default busy context, so it returns a busy array, which for can iterate over safely. </quote> By returning a lazy list, elements that are never used are never calculated. 
That way the programmer could decide whether or not they want the Perl 5 list-gobbling behaviour, or lazy behaviour, as they require.
Re: Change ($one, $two)= behavior for optimization? (was Re: RFC 175 (v1) Add list keyword to force list context (like scalar))
Tom Hughes wrote: For example, in Perl you have for a long time been able to do this: ($one, $two) = grep /$pat/, @data; However, what currently happens is grep goes to completion, then discards possibly huge amounts of data just to return the first two matches. For example, if @data was 20,000 elements long, you could potentially save a good chunk of time if you only had to return the first and/or second match, rather than finding 1000 only to throw 998 away. This could fall out of using iterators in the core but without grep itself having to know anything about the left hand side. ... The only problem with this scheme (and indeed I suspect with yours) is if the match expression has a side effect. This is even more of a problem when trying to apply the same optimisation to map because of the widespread use of map in a void context to apply a side effect to the elements. RFC 123 'Builtin: lazy' describes a syntax for explicitly stating that your operation does not have a side effect, and requests that a 'lazy list'/iterator be used. It mentions grep as an example: <quote> What if adding laziness to a list context was up to the programmer and passed through functions that can support it: for (lazy(grep {$h{$_}->STATE eq 'NY'} keys %h)){ $h{$_}->send_advertisement(); }; would cause a lazy list to be passed to for, and increment of the object's "letters_sent_total" field might break the iteration. for (grep {$h{$_}->STATE eq 'NY'} lazy(keys %h)){ $h{$_}->send_advertisement(); }; causes a lazy list to be passed to our filter function grep, saving us from allocating the entire keys array. grep is still in the default busy context, so it returns a busy array, which for can iterate over safely. </quote>
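The saving from a lazy grep can be sketched with a Python generator, which plays the role of the 'lazy list'/iterator here. This is only an analogy to RFC 123, not its implementation. The names are illustrative, and the call counter exists purely to show how much work is skipped.

```python
from itertools import islice


def lazy_grep(predicate, items):
    """Filter lazily: the predicate only runs when a match is requested."""
    return (item for item in items if predicate(item))


# Count predicate calls so the early exit is visible.
calls = {"count": 0}


def is_match(n):
    calls["count"] += 1
    return n % 10 == 0


data = range(20000)

# Equivalent in spirit to ($one, $two) = grep /pat/, @data; under a
# lazy evaluation scheme: stop as soon as two matches are found.
first_two = list(islice(lazy_grep(is_match, data), 2))
```

Only the elements up to the second match are ever tested. The side-effect caveat from the message applies here too: if is_match() did more than count, the skipped elements would never see that effect.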
Re: RFC 179 (v1) More functions from set theory to manipulate arrays
Gael Pegliasco wrote: First is the choice of arrays versus hashes as the choice for set storage. Arrays are obviously easier to construct, but hashes are both faster implementations, and easier to determine membership. Well in fact I'm interested by such functions in order to manipulate lists of scalars (1, 'toto') and to manipulate lists of hash table references ( { name => 'joe', age => 21 }, { name => 'sam', age => 27 } ) and I'd like to use these functions independently of the array content type. The point is that a hash of booleans (not a list of hashes) is a more direct way to implement a set. A set is unordered, and does not have duplicates. This is also true of hash keys. Furthermore, the nature of a hash makes it faster and easier to check for the existence of a key, which is the fundamental operation of a set (test for membership).
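The 'hash of booleans' representation can be sketched in Python with a dict standing in for a Perl hash. The function names are illustrative only. Python also has a built-in set type, but the dict version mirrors the argument being made here.

```python
def make_set(items):
    """Build a set as a mapping from element to True.

    Duplicates collapse automatically because keys are unique.
    """
    return {item: True for item in items}


def is_member(s, item):
    """Membership is a single key lookup, not a scan of a list."""
    return item in s


def union(a, b):
    """Every key that appears in either set."""
    return {**a, **b}


def intersection(a, b):
    """Only the keys that appear in both sets."""
    return {key: True for key in a if key in b}
```

With an array-based set, is_member() would have to walk the array. With hash keys it is one lookup regardless of the set's size.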
Re: RFC 56 (v3) Optional 2nd argument to pop() and shift()
Michael G Schwern wrote: If pop @array, -1 == shift @array, 1 and shift @array, -1 == pop @array, 1, and if both Ways To Do It are almost exactly the same, then there's no value to allowing negative numbers. In most cases I'd expect passing a negative number to be a mistake on the programmer's part. I'd like to see negative numbers work. Otherwise the programmer would have to explicitly check whether an index into a string was positive or negative, take the absolute value, and use pop() or shift() as appropriate.
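The symmetry Schwern describes (pop with -1 behaving like shift with 1, and vice versa) can be sketched as below. This is a Python illustration, not RFC 56 itself. Returning the removed elements in their original array order is an assumption, since the message does not say what order the RFC specifies.

```python
def shift(array, count=1):
    """Remove and return `count` elements from the front of the list.

    A negative count removes from the back instead.
    """
    if count < 0:
        return pop(array, -count)
    taken = array[:count]
    del array[:count]
    return taken


def pop(array, count=1):
    """Remove and return `count` elements from the back of the list.

    A negative count removes from the front instead.
    """
    if count < 0:
        return shift(array, -count)
    if count == 0:
        return []  # guard: array[-0:] would select the whole list
    taken = array[-count:]
    del array[-count:]
    return taken
```

With this in place a caller holding a signed offset can pass it straight to one function, which is the convenience being argued for in the reply.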
Re: n-dim matrices
Christian Soeller wrote: No, at least 18. One more piece of semantics that would be appreciated is optional omission of trailing dimensions in slices, e.g. for a 3-dim @a: @a[0:1] == @a[0:1;] == @a[0:1;;] I'd rather see the ';' be required, but the '(0..)' not be required, so you could still say: @a[0:1;;] == @a[0:1; (0..); (0..)]; @a[;0:1;] == @a[(0..); 0:1; (0..)]; @a[;;0:1] == @a[(0..); (0..); 0:1]; Note that @a[;0:1;] _does_ have a useful meaning--it is the 1st and 2nd planes cut across the 2nd dimension of a cube. Think of the actual cartesian product created by [(0..); 0:1; (0..)] and it's pretty clear. I've been doing lots of thinking over the last couple of days and I think I've got all the notation issues sorted out, except for the notation for index iterators (which we talked about using @* for in some way). On Sunday I'll try and churn out RFCs for all of them (they are all the ideas we've already talked about, but with ambiguities and implications sorted out). Under one of these RFCs @a[0:1] will have a useful meaning, so I don't want it to be equivalent to @a[0:1;;]. The RFCs I envisage are: - Overview of matrix RFCs - Notation for declaring and creating matrices - Notation for declaring sparse matrices - Notation for indexing matrices with a LOL as an index - ';' for slicing matrices - @#mat for getting the dimensions of a matrix - Extension of standard LOL notation to multiple elements - Automatic dereferencing of 1d matrix slices - @* as a magic iterator
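The idea that an omitted dimension means 'every index along that axis' can be sketched in Python over nested lists. The select() helper and its use of None for an omitted dimension are illustrative assumptions, not part of any RFC. The 0:1 range is treated as inclusive, following the "1st and 2nd planes" wording above.

```python
def select(array, indices):
    """Slice a nested list with one index list per dimension.

    None stands for an omitted dimension, i.e. (0..) in the message.
    """
    if not indices:
        return array
    first, rest = indices[0], indices[1:]
    chosen = range(len(array)) if first is None else first
    return [select(array[i], rest) for i in chosen]


# A 2 x 3 x 2 cube whose values encode their own coordinates.
cube = [[[100 * i + 10 * j + k for k in range(2)]
         for j in range(3)]
        for i in range(2)]

# Analogue of @a[;0:1;]: everything along dimensions 1 and 3,
# but only indices 0 and 1 along dimension 2.
planes = select(cube, [None, [0, 1], None])
```

Thinking of the index lists as a cartesian product, as the message suggests, gives the same result: every (i, j, k) with j drawn from 0 and 1.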
Re: Designing Perl 6 data crunching (was Re: n-dim matrices)
Christian Soeller wrote: There might still be a need for something for those people who need FFTs and work on really large blocks of data. The hope would be that a perl6 PDL would fill such a gap and be more perlish than it is now. But again concrete syntax ideas are needed along with a clear statement of current weaknesses... We absolutely should cater to those needing big number crunching power. All I meant is that we shouldn't assume everyone who does data crunching is from the world of sophisticated mathematical software. Even those who do use such software have very different ideas on syntax depending on whether their background is SAS, Mathematica, SPSS, FORTRAN, or whatever... Our goal should be to make data crunching fast and easy, and to Perl users it should be intuitive how the syntax works. This means consistency with the rest of Perl. Hopefully we won't need a "Perl Data Language" anymore, since Perl 6 will be a great data language itself. Instead, we'll have modules to link in SLATEC, do statistical tests, implement FFTs, etc, using the powerful Perl data structures, functions, and operators that we define here. The 1st implementation of Perl 6 may not provide all the optimisations we've come to expect from our data crunching language of choice. For this reason maybe PDL will continue to exist independently in Perl 6 at least for a while, although a fair bit of rewriting will be required for the new XS, and to take advantage of the new syntax.
Re: RFC 177 (v1) A Natural Syntax Extension For Chained References
Buddha Buck wrote: At 05:35 PM 8/31/00 +, David L. Nicol wrote: Buddha Buck wrote: The array syntax would also be useful in multi-dimensional arrays. That is if multi-dimensional arrays are implemented as lists-of-lists, which they might not be. Even if they aren't implemented as lol, they may appear as lol to the programmer The language-data group is working on multi-dimensional array syntax. I would like to have (in multidimensional arrays) the various dimensions be equally easy to access. Using lol syntax imposes a hierarchy on the dimensions, which makes equal access hard. Not necessarily. _Allowing_ LOL syntax does *not* rule out providing other syntax which allows direct access to slices of additional dimensions. Providing both approaches means that all the language constructs and modules that already support LOLs will continue to work, while n-dim access works as well. It also avoids creating a new data type. Anyhow, let's keep this discussion to -data from here... RFC 177 should probably make -data its home.
Re: New variable type: matrix
Karl Glazebrook wrote: There is a difference between a List of Lists and a multi-dimensional array - the latter is rectangular, e.g. the rows are all the same size so you don't have to store the sizes of individual ones. So the latter needs much less storage overhead. How would you propose this be handled transparently? esp. if calling external C routines. This is only a difference in how P5 implements LOLs vs how PDL implements n-dim arrays. P6 need not have this difference. A LOL of a simple type, perhaps with the addition of an attribute (':compact'?) would be stored as an n-dim array. Non-uniform sized rows would be padded. The dimensions of the LOL would be stored somewhere just like with a PDL array. Reduce functionality equivalent is already available in PDL modules, so I don't see that as terribly urgent. RFC 76 proposes a reduce() builtin that is quite flexible. If we have something better we should propose that instead, or get Damian to incorporate the additional ideas into the existing RFC. I think it would be unfortunate if RFC 76 turned out to be incompatible with the n-dim array notation we come up with.
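The 'padded LOL stored as an n-dim array' idea can be sketched for the 2-d case as follows. This is a Python illustration under assumptions of my own: the fill value (None) and the function names are invented, and nothing here reflects how PDL or a ':compact' attribute would really lay out memory.

```python
def compact(rows, fill=None):
    """Store a possibly ragged list of lists as one flat list plus a shape.

    Short rows are padded with `fill`, so only the overall dimensions
    need to be kept, not the length of each row.
    """
    ncols = max((len(row) for row in rows), default=0)
    flat = []
    for row in rows:
        flat.extend(row)
        flat.extend([fill] * (ncols - len(row)))
    return flat, (len(rows), ncols)


def element_at(flat, shape, i, j):
    """Look up element (i, j) by computing its offset in the flat store."""
    nrows, ncols = shape
    if not (0 <= i < nrows and 0 <= j < ncols):
        raise IndexError("index out of bounds")
    return flat[i * ncols + j]
```

The flat list plus a shape is also the kind of layout an external C routine would expect, which speaks to the concern raised in the quoted question.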
Re: A thought concerning matrix index variables...
Buddha Buck wrote: RFC 169 says it would be nice if: @a[^i;^j] = @b[^j;^i]; did a transpose operation. Should the syntax also allow: # fill a 10x10 array with 0-99 my @table: bounds(10,10); @table[^i;^j] = ^i*10 + ^j; I think it should--it seems a natural extension.
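Spelled out as explicit loops, the two implicit-loop statements above would behave roughly like this Python sketch. It uses nested lists in place of a bounded Perl 6 array, and the function names are illustrative only.

```python
def fill_table(rows, cols):
    """Analogue of @table[^i;^j] = ^i*10 + ^j over a rows x cols array."""
    return [[i * 10 + j for j in range(cols)] for i in range(rows)]


def transpose(b):
    """Analogue of @a[^i;^j] = @b[^j;^i]: a[i][j] takes its value from b[j][i]."""
    if not b:
        return []
    return [[b[j][i] for j in range(len(b))] for i in range(len(b[0]))]


table = fill_table(10, 10)  # values 0 through 99
flipped = transpose(table)
```

Each ^i or ^j in the proposed notation corresponds to one of the `for` clauses here, so the notation saves writing the loop bounds by hand.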
Re: Multiple for loop variables
Eric Roode wrote: Also the ability to traverse multiple lists at once for ($a,$b,$c) (zip(@a,@b,@c)) { ... } I don't get it. This is a great advantage over: @looparray = zip(@a,@b,@c); while ( ($a,$b,$c) = splice (@looparray, 0, 3)) ? Because splice() is destructive, the 1st of your 2 lines would have to do a full copy (otherwise based on the zip() RFC it would be lazily evaluated without a copy). Doing a full copy would destroy the efficiency of such an algorithm. I'm thinking that if I were implementing such a loop, I'd probably have set up my data structures so that instead of three arrays, I'd have one array of three hash elements, and iterate over it: for $iter (@data) { foo ($iter->{a}, $iter->{b}, $iter->{c}); } I wouldn't have thought this would be a good idea. The for (...) zip(...) construct may be used to transpose a matrix stored as an array, or maybe the lists are slices of other lists. It's the kind of construct one would see in numeric programming all the time. Using a hash would both destroy the efficiency of the program, and would also make other transposes, slices, diagonals, etc very unwieldy.
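Python's built-in zip() happens to behave the way the reply hopes the proposed zip() would: it walks the lists in step and yields one tuple at a time without building a merged copy first. The sketch below contrasts that with the copy-and-consume loop. The helper name is made up for illustration.

```python
def consume_in_threes(merged):
    """The splice-style loop: repeatedly remove three items from the front.

    It destroys its argument, so the caller must hand it a full copy.
    """
    groups = []
    while merged:
        groups.append(tuple(merged[:3]))
        del merged[:3]
    return groups


xs, ys, zs = [1, 2, 3], [10, 20, 30], [100, 200, 300]

# Lazy traversal: no intermediate list, inputs left untouched.
totals = [x + y + z for x, y, z in zip(xs, ys, zs)]

# Copying traversal: build the interleaved list, then eat it.
looparray = [item for group in zip(xs, ys, zs) for item in group]
groups = consume_in_threes(looparray)
```

Both give the same groupings, but the second allocates a list three times the size of the inputs and empties it again.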
Re: multidim. containers
X-posted to [EMAIL PROTECTED] David L. Nicol wrote: If arrays as we know them implement by using a key space restricted to integers, I think a reasonable way to get matrices would be to open up their key space to lists of integers. I've been thinking along exactly the same lines. There's a lot of language issues to consider to get this to work consistently, such as interaction with reduce(), notation for slices across a dimension (and generalised slices such as diagonals), and so forth. I'm thinking that a n-dim array could just be a list of lists (of lists of lists of...) with the n-dim notation just being syntactic sugar (and perhaps helping with optimisation too). BTW, these kinds of issues are well suited to the perl6-language-data list: mailto:[EMAIL PROTECTED] I suggest that we cover some 'internals' on that list as well as 'language', since implementation efficiency is so fundamental in this area.
Re: implied pascal-like with or express
Ken Fox wrote: Dave Storrs wrote: On Thu, 17 Aug 2000, Jonathan Scott Duff wrote: BTW, if we define with to map keys of a hash to named place holders in a curried expression, this might be a good thing: with %person { print "Howdy, ", ^firstname, " ", ^lastname; } # becomes sub { print "Howdy, ", $person{$_[0]}, " ", $person{$_[1]}; }->('firstname', 'lastname'); You're breaking the halting rules for figuring out the bounds of a curried expression. Your original code should have become: with %person { print "Howdy, ", sub { $_[0] }, " ", sub { $_[0] }; } I don't believe so. The rule at issue here is probably: <quote> =item Sub called in void context Currying halts in the argument list of a subroutine (or method) that is called in a void context. The tree traversal example given above shows a method in a void context (any return value from $root->traverse is being ignored). Therefore just its argument is curried, rather than the whole call expression. </quote> I say 'probably' because it depends how 'with' is defined. Assuming that there are no explicit curry prototypes or sub prototypes floating around in the declaration of 'with', the commas do not limit the currying context. It gets worse with longer examples because each line is a separate statement that defines a boundary for the curry. IMHO, curries have nothing to do with this. All "with" really does is create a dynamic scope from the contents of the hash and evaluate its block in that scope. my %person = ( name => 'John Doe', age => 47 ); with %person { print "$name is $age years old\n"; } becomes { my $env = $CORE::CURRENT_SCOPE; while (my($k, $v) = each(%person)) { $env->bind_scalar($k, $v); } print "$name is $age years old\n"; } The thing I don't like about either of these suggestions is that the local scope is hidden. 
In *cough* VB, you can say: dim height as double dim ws as new Excel.worksheet // 'worksheet' has a 'height' property with ws print .height // Accesses ws.height print height // Accesses me.height end with In Pascal, this is not possible. As a result, I find myself rarely using 'with' in Pascal, since it's rare that you do not need to access any of the local variables within a block.
Re: RFC 76 (v1) Builtin: reduce
Bart Lateur wrote: On Thu, 17 Aug 2000 07:44:03 +1000, Jeremy Howard wrote: $a and $b were done for speed: quicker to set up those global variables than to pass values through the stack. The solution is to pass args in as $_[0] and $_[1]. sort { $_[0] <=> $_[1] } @list is very ugly. I *like* the syntax of sort { $a <=> $b } @list My original post actually said that the reason for this is that you can then write: sort { ^0 <=> ^1 } @list; ...which is pretty Perlish.
Re: RFC 76 (v1) Builtin: reduce
Array and placeholder indices both start at *zero*! Array and placeholder indices both start at *zero*! Array and placeholder indices both start at *zero*! Array and placeholder indices both start at *zero*! - Original Message - From: "Damian Conway" [EMAIL PROTECTED] To: "Jarkko Hietaniemi" [EMAIL PROTECTED]; "Larry Wall" [EMAIL PROTECTED]; "Jeremy Howard" [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Saturday, August 19, 2000 3:22 PM Subject: Re: RFC 76 (v1) Builtin: reduce Except that Perl 6 people will know all about numbered parameters, so they will write: @out = sort ^2 cmp ^1, @in; and it will work just as they expect! As long as they expect it to fail miserably! :-( Now, go home and write it out 100 times: "Array and placeholder indices both start at *zero*!" Damian
Re: Extended Regexs
James Mastros wrote: On Fri, Aug 18, 2000 at 08:46:17PM +0100, Richard Proctor wrote: There is one significant area of perl that has very little attention here (other than one of my RFCs) that is regexs. Perl has very powerful regexs - but what other features might be desirable? Well, one thing that has actually come up (on -language-io, BCCed) is less powerful regexes. Specifically, making an additional /f modifier that told the compiler to use a faster DFA matcher instead of the default super-powerful, but slow one. (/d for DFA has been proposed, but I rather like f for Finite and Fast.) The choice of algorithms is a great idea, but why do we need a modifier? Isn't it a pretty straightforward set of rules that allow us to decide if a DFA matcher will work? It would be a lot nicer if Perl could just notice that the regex could be handled by its DFA matcher, all by itself.
Re: RFC 76 (v1) Builtin: reduce
Larry Wall wrote: Jarkko Hietaniemi writes: : (Yes, there is a small aesthetic edge in using $a vs $_[0], but I still : consider the $a and $b to be warts.) : : And anyhow, this will work just fine (see RFC 23): : :$sum = reduce ^a + ^b, @numbers; : : I have been amply reminded of this, thanks :-) (Too little time : to spend on RFCs...) Yes, but has anyone pointed out that @out = sort ^b cmp ^a, @in; won't do what people will certainly think it ought to? Except that Perl 6 people will know all about numbered parameters, so they will write: @out = sort ^2 cmp ^1, @in; and it will work just as they expect! (See RFC 23 for information about how numbered, named, and anonymous placeholders get filled in).
Re: implied pascal-like with or express
David L. Nicol wrote: Yes, absolutely, about the semantics. About the syntax, how about just in a block behind %HASHNAME? (as long as it doesn't use $a and $b, of course ) (or if the insta-sort thing needs "sort" written in and this doesn't) %record{ $something_new = 3; # just set $record{something_new} to 3 }; I abhor 'with' in Pascal/Delphi, because within the block I can't distinguish between two properties with the same name, where one is in the current 'with' scope, and one is in the default scope. This is one of those few cases where VB has nicer syntax--within a 'with' block you have to precede a property name with '.' to get the with block scope: dim height as double dim ws as new Excel.worksheet // 'worksheet' has a 'height' property with ws print .height // Accesses ws.height print height// Accesses me.height end with Whatever syntax we go with to get 'with' type scoping, please let's make sure that we can still access the default scope within the block.
Uses for array notation (was Re: RFC 104 (v1) Backtracking)
Mark Cogan wrote: At 12:39 PM 8/16/00 +1000, Jeremy Howard wrote: It seems obvious that @a should be the whole array @a, not the size of the array. If I want to check the size of @a, I should have to do so explicitly, with scalar or $#. This is non-obvious if you think that || is a flow control statement, so think about * for a moment: @c = @b * @a; But, to me at least, arrays aren't something you multiply. Multiplication applies to numbers (scalars), and returns a scalar, so when you see @c = @b * @a it should be clear that something funny is going on. Well, they're not something you multiply in Perl now. But there's plenty of languages where you can, and it's ever so convenient. And, really, what is wrong with: @c = map {$a[$_] * $b[$_]} (0..$#a); Numerical programming is all about manipulating arrays (or n-dim tensors, but they're just order array slices really). Writing such programs in a language that requires explicit loops is not only a pain, but creates code that bears no resemblance to the numeric algorithm it's implementing. Furthermore, it's out of the question in anything other than low-level languages, because the looping and array dereferencing is much too slow. It also makes easy things easy: @full_names = @first_names . @surnames; and with the extension to scalars as one operand (coming in v2!): @quoted_lines = ' ' . @raw_lines; or @histogram = '#' x @num_recs; It's pretty clear what working on 'the whole array' means here, I think. I disagree. In particular, think of it from the point of view of someone who hasn't studied computer science. What should: @a = defined @a; return? defined() is a function. Under the proposed extension of RFC 82 from operators to functions, this example should return a list of booleans, where each is true if the corresponding element of the original list was defined. Under Perl 5.6, 'defined @a' is deprecated. This RFC would give it a useful meaning (in a list context). 
Treating || as a special case is asking for trouble. If you want a flow control statement, use one: @result = @b unless @result = @a; || may be a suboptimal example, but I think the idea that a @-variable without an iteration function refers to the array as a whole, and not its elements is an intuitive one, and having array iteration magically happen when you're not looking is dangerous. It seems funny if you're not used to it. But that's because we learn that lists are hard, and computers have to loop. After just a little unlearning it actually becomes quite intuitive, and is generally picked up with no additional trouble by new students. Programmers shouldn't have to know how a computer implements things behind the scenes--which is really what requiring explicit looping forces.
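As a rough Perl 5 illustration of what the element-wise notation above would replace, here is a minimal sketch. The helper name `elementwise` is hypothetical (it is not part of RFC 82 or any module), and the space inserted between the names is an assumption made for readable output.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any RFC): apply a two-argument operation
# to corresponding elements of two equal-length arrays.
sub elementwise {
    my ($op, $x, $y) = @_;
    die "length mismatch" unless @$x == @$y;
    return map { $op->($x->[$_], $y->[$_]) } 0 .. $#$x;
}

my @a = (1, 2, 3);
my @b = (4, 5, 6);

# Roughly what '@c = @b * @a' would mean under the proposal.
my @c = elementwise(sub { $_[0] * $_[1] }, \@b, \@a);

my @first_names = ('Larry', 'Damian');
my @surnames    = ('Wall', 'Conway');

# Roughly what '@full_names = @first_names . @surnames' would mean,
# with a separating space added here for readability.
my @full_names = elementwise(sub { $_[0] . ' ' . $_[1] },
                             \@first_names, \@surnames);

print "@c\n";
print join(', ', @full_names), "\n";
```

The explicit index loop is hidden inside the helper, which is the most Perl 5 can offer here; the proposal would move that loop into the operator itself.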
Array notation (was Re: RFC 104 (v1) Backtracking)
Mark Cogan wrote:
At 11:11 PM 8/15/00 -0400, Chaim Frenkel wrote:
You are missing the beauty of vector/matrix operations.
No, I'm not; I'm opining that the vast majority of Perl users don't need to do vector/matrix ops, and that they don't belong in the core.
The vast majority of Perl 5 users don't, because it's a pain in Perl 5. They use other languages instead, but those other languages aren't nearly as nice as Perl in almost every other way. Why not open up Perl for a whole new class of folks?
The math folks really would like to be able to describe the operation wanted and have perl do the optimization.
Then maybe they should use a dedicated math extension to Perl, like PDL.
You've got to be joking! You think PDL is a simple convenient add-on to Perl that provides these features? It's not! PDL is an astonishing hack by some incredibly clever people, but it still _does_not_ support array notation, and optimised loops have to be written in yet *another* language named PP, which is compiled into C by a special compiler, and linked into the program. This is the absolute best that can be done right now, and the PDL folks deserve a medal for what they've achieved. But really, no-one should have to jump through these kinds of hoops!
Would adding another character be helpful @result = @a x|| @b? @result = @a ||| @b? or perhaps a modifier? @result = @a || @b forall; (Blech, that just doesn't read right.)
Perhaps we just need a two-list map, which aliases $a and $b like sort(): @result = map {$a || $b} @a,@b;
No. It would need to be n-list, and it would have to handle slices and lazily generated lists incredibly cleverly. map is too generic to easily make the kinds of optimisations required for array notation.
I don't know how to convince you here. There's a feature that, if added to Perl, would make my life and many of my colleagues' lives easier. I know this for a fact because I've used it elsewhere.
It doesn't break much that can't fairly easily be incorporated into P52P6, it's a generic abstraction that applies to many things other than numeric programming (it's great for string manipulation!), and it's proved itself elsewhere. I'm not ruling out a modifier, as Chaim outlined, although I agree that the word 'blech' comes to mind. But this is too good an opportunity to see it just disappear. Have a look at the examples in RFCs 23, 76, 81, 82, 90, and 91, which combine together to provide as strong a data manipulation language as you'll find anywhere. Over the next few days I'll be updating these to include more examples, particularly non-numeric ones.
Re: RFC 104 (v1) Backtracking
raptor wrote: ]- I tried minimalistic approach as small as possible additions to the Perl language, we get only the "backtrack" mechanism i.e. something that is harder or slower to be done outside of the perl core. The rest should be done outside . (I too want all in the core) I don't know if you noticed the earlier thread, but there's a few people (including myself) who are still having trouble understanding how your proposal works. Could you please provide a brief idiomatic example of some code where this proposal would provide a real practical benefit--enough that slow learners like myself can really understand what it makes easier, and how? It would be great if you could provide the equivalent Perl 5 code too. TIA, Jeremy
Re: Array storage (was Re: RFC 84 (v1) Replace => (stringifying comma) with =>)
Stephen P. Potter wrote:
Lightning flashed, thunder crashed and "Jeremy Howard" [EMAIL PROTECTED] whispered:
| No, neither proposal makes sense. Arrays can be stored compactly and
|
| $a[1_000_000_000] = 'oh, really?' # :-)
|
| my int @a: sparse;
| $a[1_000_000_000] = 'Yes, really!' # :P
|
| OK, so I cheated... I haven't submitted my RFC for a 'sparse' attribute yet.
| My point is that arrays *can* be stored compactly, not that they always
| *are*. Another type of array storage is that required for lazily generated
Isn't this just as true, maybe even moreso for hashes? How much storage is taken up by $a{1_000_000_000} = "Sparse, without any special code!";
Yes, in this case (it wasn't my example!). But I would hope that we provide a way of providing compact array storage where we need it. This would be useful for lists of floats or ints that will be frequently iterated through and directly manipulated. Image processing and statistical analysis come to mind as obvious applications. Anyway, this is hard to discuss without an RFC... I'm sure I've got some time to write one somewhere around here...
Re: RFC 104 (v1) Backtracking
Johan Vromans wrote: Damian Conway [EMAIL PROTECTED] writes: As I understand things: BLOCK1 andthen BLOCK2 evaluates BLOCK1 and then if BLOCK1 evaluates to "true" evaluates BLOCK2. If BLOCK2 evaluates to "true" we're done. If BLOCK2 evaluates to "false", then BLOCK1 is re-evaluated. So how is that different from: do BLOCK1 until do BLOCK2 It's the same. But the real fun starts when blocks and functions can suspend and resume. There's already an RFC for coroutines that proposes allowing suspension and resumption of subroutines. Do the proposed backtracking operators buy anything that do/until/coroutines don't provide?
Re: RFC 76 (v1) Builtin: reduce
Nathan Torkington wrote: Piers Cawley writes: The $a and $b of the sort comparator were A Bad Idea to begin with. Ditto. Can we ditch these in Perl 6? Don't see why $_[0] and $_[1] can't be used, or even a more standard $1 and $2. Either one makes it more obvious what's being operated on. $1 $2 could be somewhat dangerous in a sub that might have regexen in it... $1 and $2 are a poor choice because of regexps. $a and $b were done for speed: quicker to set up those global variables than to pass values through the stack. The documentation for perl5's sort function says that passing as arguments is considerably slower. I don't think you can handwave and say "oh, that's an implementation detail". I think it's an implementation detail that's bloody hard to fix, especially for things like code references passed to sort: sort $some_ref @unordered Perl can't do anything at compile-time to tell sort where lexicals in the sort sub are. So I don't have a solution, I just have more detail on the problem. The solution is to pass args in as $_[0] and $_[1]. Using higher-order function notation allows these args to be named however you like: sort ^left cmp ^right, @list; sort ^1 cmp ^2, @list; sort ^_ cmp ^_, @list;
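For comparison, a reduce that passes its operands through @_ can already be written in plain Perl 5. The following is only a sketch: the name `reduce_args` is made up for this example, and it does nothing about the speed concern raised above, since each step is an ordinary sub call.

```perl
use strict;
use warnings;

# Hypothetical reduce() that hands each pair of values to the callback
# through @_ ($_[0], $_[1]) instead of the package variables $a and $b.
sub reduce_args(&@) {
    my ($code, @list) = @_;
    return unless @list;            # empty list: nothing to reduce
    my $acc = shift @list;          # first element seeds the accumulator
    $acc = $code->($acc, $_) for @list;
    return $acc;
}

my $sum = reduce_args { $_[0] + $_[1] } 1, 2, 3, 4;
my $max = reduce_args { $_[0] > $_[1] ? $_[0] : $_[1] } 3, 9, 2;

print "sum=$sum max=$max\n";
```

Because the callback only touches @_, a user's own variables named $a or $b are left alone, which is the special-case problem the thread is complaining about.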
Re: RFC 82 (v2) Apply operators component-wise in a list context
Nathan Torkington wrote: Perl6 RFC Librarian writes: It is proposed that in a list context, operators are applied component-wise to their arguments. Furthermore, it is proposed that this behaviour be extended to functions that do not provide a specific list context. I don't mind making Perl's builtins do this. Making my own functions do it gives me the willies. You'd require subroutine authors to label their subroutines as capable of having this rule applied. Subroutine users are still going to have to read the docs to work out how to use the subroutine. You're not really making anything automatic, just trading one set of typing for a different set. Context is weird enough without trying to add more magic to it. Stick with "make Perl's builtin operators apply themselves to list elements when in list context". I know what you mean. I'm not crazy about it myself. However, it seems like it shouldn't be necessary to explicitly incorporate this feature into all Perl functions. Maybe we could add an attribute: 'listable', 'loop', ...? This attribute would be required to get the implicit looping behaviour in a list context. Then Perl functions could use this attribute to get the behaviour automatically, as could module authors who took the time to think about it. This would follow the mind-set of the lvalue attribute quite closely.
Re: RFC 91 (v1) Builtin: partition
Stephen P. Potter wrote:
Lightning flashed, thunder crashed and Perl6 RFC Librarian [EMAIL PROTECTED] whispered:
| =head1 TITLE
|
| Builtin: partition
|
| =head1 ABSTRACT
|
| It is proposed that a new function, C<partition>, be added to Perl.
| C<partition($partition_size, \@list)> would return @list broken into
| references to sub-lists, each one $list_size in size.
This is very similar to what unzip does.
Yes. The new version of the partition RFC (posted overnight) makes the distinction clear.
Would it be better to combine them into a single function that could do both operations (and possibly others) with a flag?
Nathan Wiger is currently following up this path. If he can find acceptable syntax we'll consider it. I'm worried about making this functionality more confusing than it has to be though, so we'll have to see what it looks like.
In fact, couldn't all this (zip, unzip, partition) be handled as part of pack and unpack?
No. They are lazily evaluated and require special optimisations to allow them to work like iterators on subsets of sparse, masked, or sliced arrays. They can all be handled (sort of) by using lazily evaluated list generation functions (RFC 81), but these three functions are so fundamental to using 1d arrays as n-dim matrices that requiring complex code to roll your own would be less than ideal. The narrow scope of these functions also makes them much easier to optimise. They are likely to end up in a module, however.
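To make the abstract above concrete, here is an eager Perl 5 sketch of partition. It is an assumption-laden stand-in: the proposed builtin would be lazily evaluated, whereas this version copies the list and builds every sub-list up front.

```perl
use strict;
use warnings;

# Eager sketch of the proposed partition(): break a list into references
# to sub-lists of at most $size elements each. The real proposal is lazy;
# this version is not.
sub partition {
    my ($size, $list) = @_;
    die "partition size must be positive" unless $size > 0;
    my @copy = @$list;              # leave the caller's array untouched
    my @parts;
    push @parts, [ splice(@copy, 0, $size) ] while @copy;
    return @parts;
}

my @parts = partition(3, [1 .. 7]);
print join(' | ', map { "@$_" } @parts), "\n";   # prints "1 2 3 | 4 5 6 | 7"
```

The last sub-list is simply shorter when the list length is not a multiple of the partition size.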
Hashes vs arrays (was Re: RFC 84 (v1) Replace => (stringifying comma) with =>)
Chaim Frenkel wrote:
"KH" == Kai Henningsen [EMAIL PROTECTED] writes:
KH Hashes and arrays, OTOH, really aren't different for people. The concept
KH of an index needing to be a nonnegative number is a computer concept.
I don't know about that. Good old PL/I had arbitrary ranges for array indices. Hmm, I feel an RFC coming on
my @arr :low(-32000) :high(+32000);
my @population :low(1900) :high(2039);
$population[1923] = 323000;
How about
my @population[1900:2039];
Looks funny though.
I'm quite fond of the idea of negative array indices. But don't you touch my precious C<:> operator, or I'll send in the hounds! I want its sanctity recognised for the purpose of the notation in RFC 81: http://tmtowtdi.perl.org/rfc/81.pod
PS: Do you ever get the impression that some people will take every chance possible to segue all threads into a discussion of their own RFCs? ;-)
Re: RFC 99 (v1) Maintain internal time in Modified Julian (not epoch)
On Tue, Aug 15, 2000 at 09:25:34AM -0700, Larry Wall wrote:
[EMAIL PROTECTED] writes:
: Yep. Or more generally "Standardize Perl on all platforms to one
: common time epoch" and recommend the Unix epoch since it's so
: widespread. :-)
Oh, gee, where's your sense of history? (As in creating our own. :-) Maybe we should invent our own epoch, like the year 2000. Or use a really standard one, like the year 0 AD (aka 1 BC). I have this horror that people will still be using 1970 as the epoch in the year 31,536.
Actually, Damian Conway came up with a good one last night. His suggestion was that the time at which we decide what to use as the start of our epoch, should be the time that we use as the start of our epoch. Now that's creating our own history!
Re: RFC 76 (v1) Builtin: reduce
Jarkko Hietaniemi wrote:
On Tue, Aug 15, 2000 at 11:31:50AM -0500, Adam Krolnik wrote:
Following the lead of the sort operator, it would be a little simpler to see reduce expressions use $a and $b instead of $_[0], $_[1].
The $a and $b of the sort comparator were A Bad Idea to begin with. There's nothing wrong with using the standard @_. The $a and $b just introduce yet another special case, which among other things makes it very hard to warn about dubious uses of users' variables named $a or $b. (Yes, there is a small aesthetic edge in using $a vs $_[0], but I still consider the $a and $b to be warts.)
And anyhow, this will work just fine (see RFC 23):
$sum = reduce ^a + ^b, @numbers;
Re: RFC 83 (v1) Make constants look like variables
Jonathan Scott Duff wrote:
On Mon, Aug 14, 2000 at 06:28:23PM -0700, Nathan Wiger wrote:
Well, just to counter argue, I feel exactly the opposite way. I'd like the keyword to be "constant" instead of "const". I've always thought "const" was a needless save of 3 characters. Constants should be obvious to pick out. The inventors of UNIX, when asked "What was your biggest mistake?" replied "Spelling creat() without the 'e'". Ditto here, IMO.
Amen.
Which is the easiest for anyone to tell what's going on?
my num $PI : constant = 3.1415926;
my num $PI : const = 3.1415926;
my num $PI =| 3.1415926;
Admittedly, "const" is pretty darn close to "constant", so tolerable. But =| is way too obscure, I think.
Not only obscure but backwards IMHO. Rather than using some weird assignment operator to modify the attributes of a scalar (after all, constancy is a property of the scalar), better the attributes should be verbose and explicit.
Yes, I agree too. I'm going to submit a v2 of this RFC shortly, which will clarify a few points about the use of const with lists and hashes and make the change to attribute notation, but otherwise will be pretty much the same. Since there hasn't really been consensus on this issue, those interested in alternative notation, or a wider array of scenarios where constant can be used, should submit a counter-RFC.
Re: RFC 84 (v1) Replace => (stringifying comma) with =>
Stephen P. Potter wrote: Lightning flashed, thunder crashed and John Porter [EMAIL PROTECTED] whispered : | Here's a counter-proposal: throw out hashes as a separate internal | data type, and in its place define a set of operators which treat | (properly constructed) arrays as associative arrays. It's the Doesn't it make more sense to get rid of arrays and just use hashes? No, neither proposal makes sense. Arrays can be stored compactly and accessed and iterated through quickly, because they can take advantage of the fact that they are always indexed by an integer. You could remove the array/list data type and rely on Perl to try and implement hashes indexed by integers as a list, but that would introduce a lot of complexity for little real benefit.
Array storage (was Re: RFC 84 (v1) Replace => (stringifying comma) with =>)
Jarkko Hietaniemi wrote: On Wed, Aug 16, 2000 at 08:37:21AM +1000, Jeremy Howard wrote: Stephen P. Potter wrote: Lightning flashed, thunder crashed and John Porter [EMAIL PROTECTED] whispered : | Here's a counter-proposal: throw out hashes as a separate internal | data type, and in its place define a set of operators which treat | (properly constructed) arrays as associative arrays. It's the Doesn't it make more sense to get rid of arrays and just use hashes? No, neither proposal makes sense. Arrays can be stored compactly and $a[1_000_000_000] = 'oh, really?' # :-) my int @a: sparse; $a[1_000_000_000] = 'Yes, really!' # :P OK, so I cheated... I haven't submitted my RFC for a 'sparse' attribute yet. My point is that arrays *can* be stored compactly, not that they always *are*. Another type of array storage is that required for lazily generated lists (see RFC 81) http://tmtowtdi.perl.org/rfc/81.pod
Component wise || and RFC 82 (was Re: RFC 104 (v1) Backtracking)
Nathan Torkington wrote:
Jeremy Howard writes:
@result = @a || @b;
Which applies '||' component-wise to elements of @a and @b, placing the result in @result.
*Ptui* That's not how *I* want || to behave on lists/arrays. I want
@result = @a || @b;
to be like:
(@result = @a) or (@result = @b);
That's what all my students keep expecting it to mean.
Note that RFC 82 (http://tmtowtdi.perl.org/rfc/82.pod) proposes that _all_ operators behave the same way in a list context. To me, this consistency would be a real win.
The behaviour you're discussing is described in RFC 45. RFC 45 proposes a special case for the || operator, which is only using its short-circuiting functionality, not its 'or' functionality. In this case I would think that using an explicit 'if' would be more appropriate. I'm working with the maintainer of RFC 45 at the moment to see if we can get the consistency of component-wise behaviour in a list context without losing the ability to short-circuit in an assignment.
A potential compromise would be to say that RFC 82 only applies to operators that do not short-circuit. However, my view is that the loss of consistency through such a step would be a substantial source of confusion. It would also mean that overloaded operators with a lazily evaluated parameter would not work in a consistent manner.
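The competing readings of `@result = @a || @b` can be spelled out in today's Perl 5, which may help when comparing the two RFCs. This is only a sketch of the semantics under discussion, written with explicit constructs; the variable names are chosen for this example.

```perl
use strict;
use warnings;

my @a = (0, 'x', '');
my @b = ('p', 'q', 'r');

# What Perl 5 actually does today: || puts @a in scalar context,
# so the result is the element count of @a.
my @perl5 = @a || @b;

# RFC 82 reading: apply || to corresponding elements.
my @componentwise = map { $a[$_] || $b[$_] } 0 .. $#a;

# RFC 45 reading: use @a as a whole if it is non-empty, otherwise @b.
my @whole_array = @a ? @a : @b;

print scalar(@perl5), " element(s): @perl5\n";
print "component-wise: @componentwise\n";
print "whole array:    ", join(',', map { "'$_'" } @whole_array), "\n";
```

With these inputs the three readings give three different results, which is the ambiguity the thread is arguing over.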
Re: Array storage (was Re: RFC 84 (v1) Replace => (stringifying comma) with =>)
Dan Sugalski wrote: At 05:55 PM 8/15/00 -0500, Jarkko Hietaniemi wrote: No, neither proposal makes sense. Arrays can be stored compactly and $a[1_000_000_000] = 'oh, really?' # :-) my int @a: sparse; I see: you have a time machine and I don't. So very unfair... Need to upgrade to that new machine with the Pentium UltraMegaPro IV 2000 processor. (Now with thiotimoline! :) $a[1_000_000_000] = 'Yes, really!' # :P OK, so I cheated... I haven't submitted my RFC for a 'sparse' attribute yet. My point is that arrays *can* be stored compactly, not that they always I smell...n-level bitmaps? Nah, you smell vapor. The shapes in the vapor may well be n-level bitmaps, though. (Or possibly hashes with fixed keys optimized for integer key hashing for *real* sparse arrays...) The shape and feel of the vapor will be described in the RFC. The actual chemical makeup of the vapor will not.
Re: RFC 90 (v1) Builtins: zip() and unzip()
Ariel Scolnicov wrote: Damian Conway [EMAIL PROTECTED] writes: Just to point out that the standard CS term is "merge". `merge' produces a list of items from 2 (or more) lists of items; `zip' produces a list of pairs (or tuples) of items from 2 (or more) lists of items. So in a language like Haskell which uses square brackets for lists and round for tuples (and `==' for equality, etc.): merge [1,2,3,4],[5,6,7,8] == [1,5,2,6,3,7,4,8] and zip [1,2,3,4],[5,6,7,8] == [(1,5),(2,6),(3,7),(4,8)] This brings up an interesting question... which behaviour would we prefer? Currently the RFC defines zip() as producing a flat list, rather than a list of references to arrays. Of course, you can always say: $haskell_zip = partition (zip @^listOfLists, scalar @^listOfLists); which is why I figured the flat-by-default version would be more useful. If we created the partitioned version by default, then the other version would be: $haskell_merge = map @^, @listOfLists; which seems a little harder to evaluate lazily (for an individual item in a tuple, that is--evaluating a whole tuple lazily would be straightforward). Assuming that the current definition remains, 'merge' does seem more appropriate (and less offensive to the 'functionally challenged' ;-)
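The two behaviours contrasted above are easy to pin down with eager Perl 5 versions. These are sketches only: the names `zip_flat` and `zip_tuples` are invented for this example, both assume equal-length lists, and neither is lazy.

```perl
use strict;
use warnings;

# Flat interleave: what RFC 90 currently calls zip() and what the
# Haskell-style terminology above calls "merge".
sub zip_flat {
    my @lists = @_;
    return unless @lists;
    my $len = @{ $lists[0] };
    return map { my $i = $_; map { $_->[$i] } @lists } 0 .. $len - 1;
}

# List of tuples (array references): the Haskell-style zip.
sub zip_tuples {
    my @lists = @_;
    return unless @lists;
    my $len = @{ $lists[0] };
    return map { my $i = $_; [ map { $_->[$i] } @lists ] } 0 .. $len - 1;
}

my @flat   = zip_flat([1, 2, 3, 4], [5, 6, 7, 8]);
my @tuples = zip_tuples([1, 2, 3, 4], [5, 6, 7, 8]);

print "@flat\n";                                      # 1 5 2 6 3 7 4 8
print join(' ', map { "(@$_)" } @tuples), "\n";       # (1 5) (2 6) (3 7) (4 8)
```

The tuple form is just the flat form partitioned by the number of input lists, which is the relationship the message relies on.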
Re: RFC 90 (v1) Builtins: zip() and unzip()
Nathan Wiger wrote:
"David L. Nicol" wrote:
These things sound like perfectly reasonable CPAN modules. What's the block preventing their implementation w/in the perl5 framework?
Jeremy and I are working on a general purpose matrix/unmatrix function that may well be core-worthy. This would allow arbitrary reshaping of 2d (Nd?) arrays into any form imaginable.
Actually, I still remain to be convinced that RFC 81 (Lazily evaluated list generation functions) isn't already this generic tool (when used as an index to another list). When you've got some examples of using your proposed 'reshape' (or whatever it'll be called), I'll see what the same code looks like with RFC 81 notation...
However, I would probably argue that zip/unzip/merge/unmerge/whatever go into a module (Math::Matrix?) since they'll probably just be specialized calling forms of matrix/unmatrix. I think the trend is to put a lot of formerly-core functions and features in modules, especially if subs get fast enough (and it sounds like they're going to).
Definitely, if the generic foundation for them (lazily generated lists, reshape, ...) is there. But to answer Nick's question, the reason they're not in Perl 5 in this way at the moment is that Perl 5 doesn't provide the foundation required for them. Although it's easy enough to write a zip or partition function in Perl 5, it can't be evaluated lazily and would therefore be useless for any real numeric programming. Also there's no use in having just array reshaping functions if the rest of the baggage required to avoid explicit loops isn't in the language. In general, if array notation (i.e. working with lists without explicit loops) isn't reliably efficient, I would always use explicit loops instead (since the loss of clarity is more than outweighed by the increased speed and lower memory use).
Re: RFC 89 (v2) Controllable Data Typing
=head1 TITLE
Controllable Data Typing
=head1 VERSION
Maintainer: Syloke Soong [EMAIL PROTECTED]
Mailing List: [EMAIL PROTECTED]
...
Retain current flexibility of Perl liberal variables. Provide a new form of declaring variables:
scope cast-type $varname:constraint;
...
=head2 Constant constraint
A constant conferred by the keyword const creates a final value; That value stays immutably the same throughout the scope of its existence.
I don't think this RFC is the place to try and cover all of the 'constraints' that might be in perl 6. Sure, this RFC may as well be the one to formalise the notation that attributes of variables are defined in this way, but also trying to enumerate and describe them would make this RFC very bloated.
On the 'const' issue specifically, there's already an RFC for that (RFC 83). Syloke--if you really want to incorporate this RFC then please let me know, and I'll send you all of the suggested changes I've received in feedback to RFC 83. However, I think it would be much easier if we kept discussion of the actual 'constraints' in separate RFCs, allowing debate on each issue to be kept separate.
Furthermore, using the term 'constraint' is misleading--I think 'attribute' is better. Another to-be-proposed attribute is 'sparse', which gives Perl information about how to store a list. And then of course there's character set type attributes (eg 'utf8')... These are not constraints, but they all use the same notation.
Finally, the attribute notation needs a way of taking parameters. For instance, the 'sparse' attribute needs a default value, and an optional 'sparsity index'. We need syntax that allows something like:
my int @sparse_array : sparse(0,0.99) = ((0) x 5 , 1);
Re: RFC 90 (v1) Builtins: zip() and unzip()
Nathan Wiger wrote: With zip/unzip/partition I really gotta say, those functions *need* to be renamed, for a variety of reasons. First, they have well-established computer meanings (compression, disks). Second, "partition" is too long anyways. I've seen numerous emails from other people saying the same thing. If other languages name these functions zip/unzip I'd argue they're wrong. "mop", "cleave", "weave", "mix", or any other term that doesn't already have well-established computer meaning is acceptable. Jeremy, in the next version of the RFC's would you be willing to suggest some alternatives? Yes, of course! I do read every message posted regarding the RFCs I'm maintaining, and in the 2nd version I will incorporate the suggestions that are made. Where the community hasn't reached consensus, I'll propose a solution I think is appropriate (based on the on-list debate), and include a discussion section mentioning other options--after all, in the end it's up to Larry to decide, and my view is that my role as an RFC maintainer is to summarise the combined wisdom of the Perl community to help him do that. In this case, I've got no particular feeling of ownership over the function naming I proposed--I just stole them from the names of the same functions in widely used functional languages. Personally, I like 'weave' rather than 'zip'. I'm happy with 'unweave' too--although I'm still unsure about that one... BTW, I've seen no discussion of RFC 82 (Make operators behave consistently in a list context), so I'm not sure what to do with it... Is that because everyone thinks it's great, or that it's stupid, or just that no-one's got any idea what I'm trying to say?
Re: RFC 90 (v1) Builtins: zip() and unzip()
Jarkko Hietaniemi wrote:
I simply can't get over the feeling that the proposed zip/unzip/partition functions are far too specialized/simple,
That's certainly a possibility. They are such common operations though, it might be a win to build them in. With zip/unzip/partition and good array slicing syntax it is possible to construct many n-dim matrix transforms and functions.
and that something more general-purpose in the order of pack/unpack (with the transformation spec encoded in a template) for lists would be preferable.
That's one of the things RFC 81--Lazily evaluated list generation functions, covers. Using a generated list as the indexes to an array provides completely flexible array transformations.
When someone said that matrix/unmatrix would be better I did not find that to be a joke: on the contrary, what we are talking here would be a mapping from n-dim arrays to p-dim arrays. Just simply thinking in 1-dim lists/arrays doesn't cut it.
Of course. But RFCs 81, 82, 90, and 91 provide between them all the parts required for matrix operations over any number of dimensions. Even though the basic platform is a 1d array, n-dim operations are provided for through generated lists, zip, unzip, and partition.
For instance, let's take the example from RFC 91 and modify it to calculate column sums from a 2d matrix:

    # Add all the elements of a list together, returning the result
    $sum = reduce (^total + ^element, @^elements);

    # Swap the rows and columns of a list of lists
    $transpose = partition(
        # Find the size of each column
        scalar @^list_of_lists,
        # Interleave the rows
        zip(@^list_of_lists)
    );

    # Take a list of references to lists, and return an array of each
    # sub-list's sum
    $sum_cols = reduce (
        push (@^total, $sum->( @^next_list )),
        $transpose->(^list_of_lists),
    );

    # Example usage of $sum_cols
    @a = (1,3,5);
    @b = (2,4,6);
    @c = (-1,1,-1);
    @answer = @{$sum_cols->(\@a, \@b, \@c)};  # 1+2-1,3+4+1,5+6-1=(2,8,10)

Mind you, I don't think your average Perl hacker should have to worry about all this--it would be nice if Perl also provided some easy way to use n-dim arrays directly. However, with the building blocks I've described the n-dim stuff could be written in pure Perl. I'm not sure that is the best way--but it's certainly one way (and the way C++ took, when it introduced the 1d valarray--see Stroustrup, "The C++ Programming Language, 3rd Edition", pp662-679).
I'm still trying to work out what the alternative might look like--a set of language constructs that operated on n-dim arrays directly. This is really hard, but there are some good starting points in:
- PDL: pdl.perl.org
- Blitz++: http://oonumerics.org/blitz/
- POOMA: http://www.acl.lanl.gov/pooma/
PDL uses a special language ('PP') that lets the programmer explicitly specify loops over specific dimensions. Blitz++ and POOMA are more adventurous, providing advanced iterator/index classes that operate over n-dim arrays in defined ways, but require much more work from the compiler. Of course, if we go down this route, we would need to ensure that related RFCs (like 'reduce') can handle using these kinds of arrays and iterators.
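The column-sum calculation described above can be checked against an eager Perl 5 version. This is a sketch with invented helper names (`sum_list`, `transpose`, `sum_cols`); it uses explicit index loops, so it shows the intended result rather than the proposed notation or its lazy evaluation.

```perl
use strict;
use warnings;

# Add all the elements of a list together.
sub sum_list {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Swap the rows and columns of a rectangular list of array references.
sub transpose {
    my @rows = @_;
    return unless @rows;
    my $ncols = @{ $rows[0] };
    return map { my $c = $_; [ map { $_->[$c] } @rows ] } 0 .. $ncols - 1;
}

# Sum each column of the matrix whose rows are passed as references.
sub sum_cols {
    return map { sum_list(@$_) } transpose(@_);
}

my @a = (1, 3, 5);
my @b = (2, 4, 6);
my @c = (-1, 1, -1);
my @answer = sum_cols(\@a, \@b, \@c);

print "@answer\n";   # 2 8 10
```

Each column is summed separately: 1+2-1, 3+4+1, and 5+6-1.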
Re: Improving tie() (Re: RFC 15 (v1) Stronger typing through tie.)
Dan Sugalski writes:
I don't mind if someone overrides the vtable functions for a variable of a built-in type--a standard declaration of: my $foo; is really shorthand for: my generic_scalar $foo; more or less. If a variable gets its vtable functions messed with, well, that's OK. If + doesn't actually add, well, no biggie. I'd like to have the optimizer not assume functionality on variables that have been overridden somehow. (So if $foo gets tied we stop assuming we know what + does, for example) The bigger thing I worry about is if someone does something odd like make + short-circuit, or not short-circuit. That's the sort of base behaviours I'd like written in stone someplace.
Would that really be a problem? Damian is writing an RFC that will propose allowing '?' in a function prototype to indicate a lazily evaluated parameter. If someone overloaded C<+> with a function that had a prototype where the 2nd parameter was evaluated lazily, couldn't Perl notice this and do the right thing?
Re: the currying operator
Piers Cawley wrote:
Graham Barr [EMAIL PROTECTED] writes:
On Fri, Aug 11, 2000 at 01:47:12PM +0100, Piers Cawley wrote:
/^_/
What is that matching ?
We've done this. It's matching a string that begins with '_'. Which is why, if you want to disambiguate you do /^{_}/ just like you do with variables.
No that won't work either. That matches the string {_}
Damian and I put this example into the RFC explicitly. I like what I wrote the first time, so I'll just repeat it:
=head2 Resolving ambiguity
The following is ambiguous:
$check_start = $somestring =~ /^_foobar/;
This should be interpreted as an immediate pattern match for '_foobar' at the start of a string. To cause this to be interpreted as a higher order function, the ambiguity must be resolved through using braces:
$check_start = $somestring =~ /^{_}foobar/;
which creates a higher order function testing for its argument, followed by 'foobar', anywhere in $somestring. That is:
$check_start = sub { $somestring =~ /$_[0]foobar/ };
It wouldn't be too hard to get P52P6 to recognise ^{something} in a regex and quote it, would it? (And to make the quoting work properly...)
I was hoping that the use of {} would seem reasonably intuitive given the similarity to their use in resolving ambiguity in dereferencing and interpolating variables.
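The difference between the immediate match and the higher-order form can be seen by running the Perl 5 expansion quoted above. In this sketch the test string is arbitrary, and two things are additions for the example rather than part of the RFC text: copying the argument into a lexical, and quoting it with \Q...\E so it is matched literally.

```perl
use strict;
use warnings;

my $somestring = 'xx_foobar';

# Immediate match: literal '_foobar' anchored at the start of the string.
my $immediate = ($somestring =~ /^_foobar/) ? 1 : 0;

# Perl 5 spelling of the proposed /^{_}foobar/: a closure that takes the
# text to look for as its argument and tests for it, followed by
# 'foobar', anywhere in $somestring.
my $check_start = sub {
    my ($prefix) = @_;
    return $somestring =~ /\Q$prefix\Efoobar/ ? 1 : 0;
};

my $with_underscore = $check_start->('_');
my $with_other      = $check_start->('zz');

print "immediate=$immediate underscore=$with_underscore other=$with_other\n";
```

The immediate form yields a match result straight away, while the closure form yields something that must still be called with an argument.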
Re: Data type and attribute syntax (was Re: RFC 89 (v1) Controllable Data Typing)
Chaim Frenkel wrote:
A nice way of making a value read-only is lovely. And let it be a runtime error to modify it. The caller can easily do a foo eval{$const_item} to remove the read-only attribute. Hmm, perhaps we should rename the attribute :read-only
Can't we make a value 'truly constant', as inlined subs do now? It would be nice to declare data 'truly constant' and 'never aliased' to enable appropriate optimisations (these are two separate attributes, BTW). Modifying a constant value using eval should give the same errors as modifying a literal--and I would have thought this would be straightforward, if the constant is just replaced with its value at compile time. Aliasing a 'never aliased' value should also provide a meaningful error.
Currently, of course, another problem with constants as inlined subs is that the errors are horrible:
use constant AA => 6;
AA = 5; # Can't modify non-lvalue subroutine call in scalar assignment
Ick!
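For reference, Perl 5 already has two ways to get something constant-like, and they fail differently when modified. The sketch below shows the inlined-sub form next to a read-only scalar made by aliasing a typeglob to a literal; the names `PI` and `$E` are chosen only for this example.

```perl
use strict;
use warnings;

# Inlined-sub constant, as discussed above.
use constant PI => 3.1415926;

# Read-only scalar: alias the scalar slot of the glob to a literal.
our $E;
*E = \2.7182818;

# Trying to modify the read-only scalar is a run-time error, so it can
# be trapped with eval and inspected.
my $ok    = eval { $E = 3; 1 };
my $error = $@;

print "PI is ", PI, "\n";
print "E is still $E\n";
print "modification failed: $error" unless $ok;
```

The read-only scalar behaves like a variable in string interpolation, which the inlined sub does not, and that is part of what the RFC is after.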
Re: RFC 83 (v1) Make constants look like variables
Steve Simmons wrote:
I really like the idea of constants in perl, but think the RFC should go a lot further. C/C++ has solved this problem; we should follow in their footsteps. ...
I desperately _don't_ want to follow the horrible mess that is const in C++. The enormous hassle in trying to do any sensible optimisation on C++'s consts has been a real nightmare. Trying to find a way of adding a 'restrict' keyword which actually does something useful has been just as bad. For why guarantees of aliasing and constness are important, see for example: http://oonumerics.org/oon/oon-list/archive/0328.html
The same sort of thought process should be applied to lists, arrays, references, and objects. IMHO the RFC should be updated to reflect all these issues before we can properly consider it. Sorry to rain all over your parade, Jeremy. The core idea is great, but the implementation section really needs expansion.
It's not like that! An RFC is the start of a discussion, not the last word... Your interesting ideas are not 'raining on my parade'--they're very constructive thoughts and I really appreciate them. This is certainly an area I've been thinking about--as I said in an earlier post:
<quote>
Yes. But what about types and attributes within complex types?
- Constant refs vs refs to constants?
- Types of hash (or 'pair') keys and elements?
- Attributes (e.g. constantness) of hash keys and elements?
- Ditto for arrays/lists...
I left this out of v1 of the RFC because I wanted to get some feedback on syntax. If we can flesh this out I'll incorporate it into v2.
</quote>
My current thinking is that a ref to a constant should only be possible through creating a constant first, and then creating a reference to that separately. I'm still unsure of array and hash elements, however. Using the ':' in this way has the potential to cause ambiguity with list generation syntax (RFC 81), and add complexity where there may not be much payoff.
I really need to see some examples of code where this would be of practical benefit, I think.
Re: RFC 90 (v1) Builtins: zip() and unzip()
Philip Newton wrote: Would it not be more natural to pass the *number* of lists to unzip, rather than the desired length? This way, unzip() would know to pick off elements two-at-a-time, three-at-a-time, etc., rather than having to go through the zipped list, count the elements, divide by $list_size, etc. Could be. It's a bit more intuitive too, isn't it? (The 2nd param is the 'step size'). Unless I misunderstood the example and you wanted the result to be ([1,2,3], [4,5,6]) in which case unzip would not have to do nearly as much work. But then (1..7) would unzip(3) into ([1,2,3], [4,5,6], [7]). No, you didn't misunderstand. That's partition(), which is RFC 91.
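The suggestion above, passing the number of lists rather than their length, is simple to sketch eagerly in Perl 5. The signature used here (count first, then the flat list) is an assumption for illustration and not the RFC's final interface.

```perl
use strict;
use warnings;

# Eager sketch of unzip() taking the *number* of lists: element i of the
# flat input goes to output list (i mod n).
sub unzip {
    my ($n, @flat) = @_;
    die "number of lists must be positive" unless $n > 0;
    my @out;
    push @out, [] for 1 .. $n;
    push @{ $out[$_ % $n] }, $flat[$_] for 0 .. $#flat;
    return @out;
}

my @lists = unzip(2, 1, 4, 2, 5, 3, 6);
print join(' ', map { "[@$_]" } @lists), "\n";   # [1 2 3] [4 5 6]
```

With the count as the parameter, the function never needs to know the total length in advance, which is the point being made about picking elements off two or three at a time.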