Re: RFC 90 (v1) Builtins: zip() and unzip()
Lightning flashed, thunder crashed and Perl6 RFC Librarian [EMAIL PROTECTED] whispered:
| =head1 TITLE
|
| Builtins: zip() and unzip()
|
[snip]
|
| its arguments. C<unzip($list_size, \@list)> would reverse this operation.
|
[snip]
|
| If the list to be unzipped is not an exact multiple of the partition size,
| the final list references are not padded--their length is one less than
| the list size. For example:
|
|   @list = (1..7);
|   @unzipped_list2 = unzip(3, \@list);   # ([1,4,7], [2,5], [3,6])

This wording is confusing. Is $list_size or "the partition size" supposed to be the length of each list, or the number of lists? The way it is described leads me to think it should be the length of each list, but this example shows it being the number of lists. I would expect the @unzipped_list2 would return ([X,Y,Z], [A,B,C], [M]), although I can't wrap my mind around which values should go where yet.

It makes more sense for it to be the number of lists, in which case @unzipped_list should be ([1,4], [2,5], [3,6]) not ([1,3,5], [2,4,6]).

-spp
Re: RFC 90 (v1) Builtins: zip() and unzip()
Lightning flashed, thunder crashed and "Jeremy Howard" [EMAIL PROTECTED] whispered: | @unzipped_list2 should not be([X,Y,Z], [A,B,C], [M]). The RFC's proposed | behaviour makes it work as the inverse of zip(), which is the desired | behaviour. The reason I used letters instead of the actual values is because I couldn't make it make any sense when it meant the length of each list. When it means the number of lists to break into, ([1,4,7], [2,5], [3,6]) makes perfect sense. -spp
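The reading settled on here (the first argument is the number of lists to break into) can be sketched in plain Perl 5. This is only an illustration under that assumption, not the RFC's implementation; the name unzip_by_count is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: deal the elements of @$list round-robin into
# $count sublists, i.e. read the first argument as "number of lists".
sub unzip_by_count {
    my ($count, $list) = @_;
    my @out = map { [] } 1 .. $count;
    for my $i (0 .. $#$list) {
        push @{ $out[$i % $count] }, $list->[$i];
    }
    return @out;
}

my @list     = (1 .. 7);
my @unzipped = unzip_by_count(3, \@list);
print join(' ', map { '[' . join(',', @$_) . ']' } @unzipped), "\n";
# prints: [1,4,7] [2,5] [3,6]
```

With this reading the short sublists simply end one element early, which matches the "not padded" wording in the RFC text quoted above.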
Re: RFC 90 (v1) Builtins: zip() and unzip()
In message [EMAIL PROTECTED] Graham Barr [EMAIL PROTECTED] wrote:

> On Fri, Aug 11, 2000 at 03:30:28PM -, Perl6 RFC Librarian wrote:
> > In order to reverse this operation we need an C<unzip> function:
> >
> >   @zipped_list = zip(\@a,\@b);   # (1,2,3,4,5,6)
> >   @unzipped_list = unzip(3, \@zipped_list);   # ([1,3,5], [2,4,6])
>
> Is unzip used that often ?

I wondered the same thing. As far as I can tell from a quick perusal of my copy of "Introduction to Functional Programming" there isn't a direct inverse of zip in Miranda.

Of course if the array slicing RFC goes through you could always extract the original lists from a zipped list using array slices.

> > =head1 IMPLEMENTATION
> >
> > The C<zip> and C<unzip> functions should be evaluated lazily.
>
> lazily ? why, no other operator does by default (I am assuming here)

Currently... I thought one idea for perl6 was to make more things use iterators instead of creating large temporary lists.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
...I'm so close to hell I can almost see Vegas!
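The iterator idea can be approximated in Perl 5 with a closure. The sketch below is an assumption about what "lazy" could look like, not anything the RFC specifies: zip_iter is a hypothetical name, it expects at least one list, all lists of equal length, and no undef elements.

```perl
use strict;
use warnings;

# Hypothetical lazy zip: returns a closure that yields one interleaved
# element per call, and undef (in scalar context) once exhausted.
sub zip_iter {
    my @lists = @_;                      # array references
    my ($row, $col) = (0, 0);
    return sub {
        return if $row > $#{ $lists[0] };
        my $value = $lists[$col][$row];
        if (++$col == @lists) { $col = 0; $row++; }
        return $value;
    };
}

my @a    = (1, 3, 5);
my @b    = (2, 4, 6);
my $next = zip_iter(\@a, \@b);

my $sum = 0;
while (defined(my $x = $next->())) {
    $sum += $x;                          # the interleaved list is never built
}
print "$sum\n";   # prints: 21
```

The reduction at the end only ever holds one element at a time, which is the saving over a large temporary list.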
Re: RFC 90 (v1) Builtins: zip() and unzip()
Damian Conway [EMAIL PROTECTED] writes: Just to point out that the standard CS term is "merge". `merge' produces a list of items from 2 (or more) lists of items; `zip' produces a list of pairs (or tuples) of items from 2 (or more) lists of items. So in a language like Haskell which uses square brackets for lists and round for tuples (and `==' for equality, etc.): merge [1,2,3,4],[5,6,7,8] == [1,5,2,6,3,7,4,8] and zip [1,2,3,4],[5,6,7,8] == [(1,5),(2,6),(3,7),(4,8)] (note: `merge' is often also used to denote producing a list which respects ordering; then the above merge would produce [1,2,3,4,5,6,7,8]). [...] It's called `zip'. Really. -- Ariel Scolnicov|"GCAAGAATTGAACTGTAG"| [EMAIL PROTECTED] Compugen Ltd. |Tel: +972-2-6795059 (Jerusalem) \ We recycle all our Hz 72 Pinhas Rosen St.|Tel: +972-3-7658514 (Main office)`- Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555http://3w.compugen.co.il/~ariels
Re: RFC 90 (v1) Builtins: zip() and unzip()
Ariel Scolnicov wrote: Damian Conway [EMAIL PROTECTED] writes: Just to point out that the standard CS term is "merge". `merge' produces a list of items from 2 (or more) lists of items; `zip' produces a list of pairs (or tuples) of items from 2 (or more) lists of items. So in a language like Haskell which uses square brackets for lists and round for tuples (and `==' for equality, etc.): merge [1,2,3,4],[5,6,7,8] == [1,5,2,6,3,7,4,8] and zip [1,2,3,4],[5,6,7,8] == [(1,5),(2,6),(3,7),(4,8)] This brings up an interesting question... which behaviour would we prefer? Currently the RFC defines zip() as producing a flat list, rather than a list of references to arrays. Of course, you can always say: $haskell_zip = partition (zip @^listOfLists, scalar @^listOfLists); which is why I figured the flat-by-default version would be more useful. If we created the partitioned version by default, then the other version would be: $haskell_merge = map @^, @listOfLists; which seems a little harder to evaluate lazily (for an individual item in a tuple, that is--evaluating a whole tuple lazily would be straightforward). Assuming that the current definition remains, 'merge' does seem more appropriate (and less offensive to the 'functionally challenged' ;-)
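The two behaviours being compared can be put side by side in Perl 5. This is a rough sketch with hypothetical names (zip_flat, zip_tuples); it takes array references and assumes all lists have the same length.

```perl
use strict;
use warnings;

# RFC 90 style ("merge" in Ariel's terms): one flat, interleaved list.
sub zip_flat {
    my @lists = @_;
    my $last  = $#{ $lists[0] };
    return map { my $i = $_; map { $_->[$i] } @lists } 0 .. $last;
}

# Haskell style: one array reference (tuple) per position.
sub zip_tuples {
    my @lists = @_;
    my $last  = $#{ $lists[0] };
    return map { my $i = $_; [ map { $_->[$i] } @lists ] } 0 .. $last;
}

my @x = (1, 2, 3, 4);
my @y = (5, 6, 7, 8);

print join(',', zip_flat(\@x, \@y)), "\n";
# prints: 1,5,2,6,3,7,4,8

print join(' ', map { '(' . join(',', @$_) . ')' } zip_tuples(\@x, \@y)), "\n";
# prints: (1,5) (2,6) (3,7) (4,8)
```

The only difference between the two subs is the pair of square brackets around the inner map, which is roughly the point made above: either form is easy to get from the other.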
Re: RFC 90 (v1) Builtins: zip() and unzip()
Nathan Wiger wrote:
> "David L. Nicol" wrote:
> > These things sound like perfectly reasonable CPAN modules. What's the block preventing their implementation w/in the perl5 framework?
>
> Jeremy and I are working on a general purpose matrix/unmatrix function that may well be core-worthy. This would allow arbitrary reshaping of 2d (Nd?) arrays into any form imaginable.

Actually, I still remain to be convinced that RFC 81 (Lazily evaluated list generation functions) isn't already this generic tool (when used as an index to another list). When you've got some examples of using your proposed 'reshape' (or whatever it'll be called), I'll see what the same code looks like with RFC 81 notation...

> However, I would probably argue that zip/unzip/merge/unmerge/whatever go into a module (Math::Matrix?) since they'll probably just be specialized calling forms of matrix/unmatrix. I think the trend is to put a lot of formerly-core functions and features in modules, especially if subs get fast enough (and it sounds like they're going to).

Definitely, if the generic foundation for them (lazily generated lists, reshape, ...) is there.

But to answer Nick's question, the reason they're not in Perl 5 in this way at the moment is that Perl 5 doesn't provide the foundation required for them. Although it's easy enough to write a zip or partition function in Perl 5, it can't be evaluated lazily and would therefore be useless for any real numeric programming. Also there's no use in having just array reshaping functions if the rest of the baggage required to avoid explicit loops isn't in the language.

In general, if array notation (i.e. working with lists without explicit loops) isn't reliably efficient, I would always use explicit loops instead (since the loss of clarity is more than outweighed by the increased speed and lower memory use).
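The "generated list used as an index to another list" idea already works eagerly in Perl 5 through array slices. The sketch below only illustrates that; the index lists here are built in full with grep, whereas RFC 81 proposes generating them lazily, and the variable names are made up.

```perl
use strict;
use warnings;

my @zipped = (1, 2, 3, 4, 5, 6);   # interleaving of (1,3,5) and (2,4,6)

# Build the index lists up front (eagerly).
my @even_idx = grep { $_ % 2 == 0 } 0 .. $#zipped;   # 0, 2, 4
my @odd_idx  = grep { $_ % 2 == 1 } 0 .. $#zipped;   # 1, 3, 5

# An array slice with a generated index list undoes the interleaving.
my @first  = @zipped[@even_idx];
my @second = @zipped[@odd_idx];

print "@first | @second\n";   # prints: 1 3 5 | 2 4 6
```

For a long @zipped each index list is as large as the result it selects, which is the cost the lazy version is meant to avoid.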
Re: RFC 90 (v1) Builtins: zip() and unzip()
Nathan Wiger wrote: With zip/unzip/partition I really gotta say, those functions *need* to be renamed, for a variety of reasons. First, they have well-established computer meanings (compression, disks). Second, "partition" is too long anyways. I've seen numerous emails from other people saying the same thing. If other languages name these functions zip/unzip I'd argue they're wrong. "mop", "cleave", "weave", "mix", or any other term that doesn't already have well-established computer meaning is acceptable. Jeremy, in the next version of the RFC's would you be willing to suggest some alternatives? Yes, of course! I do read every message posted regarding the RFCs I'm maintaining, and in the 2nd version I will incorporate the suggestions that are made. Where the community hasn't reached consensus, I'll propose a solution I think is appropriate (based on the on-list debate), and include a discussion section mentioning other options--after all, in the end it's up to Larry to decide, and my view is that my role as an RFC maintainer is to summarise the combined wisdom of the Perl community to help him do that. In this case, I've got no particular feeling of ownership over the function naming I proposed--I just stole them from the names of the same functions in widely used functional languages. Personally, I like 'weave' rather than 'zip'. I'm happy with 'unweave' too--although I'm still unsure about that one... BTW, I've seen no discussion of RFC 82 (Make operators behave consistently in a list context), so I'm not sure what to do with it... Is that because everyone thinks it's great, or that it's stupid, or just that no-one's got any idea what I'm trying to say?
Re: RFC 90 (v1) Builtins: zip() and unzip()
* Jeremy Howard ([EMAIL PROTECTED]) [13 Aug 2000 17:28]: [...] Personally, I like 'weave' rather than 'zip'. I'm happy with 'unweave' too--although I'm still unsure about that one... Weave is too much like Knuth's tangle and weave pair of programs for his WEB idea. *sigh* All the good names are taken =) BTW, I've seen no discussion of RFC 82 (Make operators behave consistently in a list context), so I'm not sure what to do with it... Is that because everyone thinks it's great, or that it's stupid, or just that no-one's got any idea what I'm trying to say? I glanced through it and thought it seemed fine. If people think something is stupid, they'll email. If people want something changed, they'll email. If something is good, they won't =) cheers, -- iain truskett, aka Koschei.http://eh.org/~koschei/ Emacs is a nice OS - but it lacks a good text editor. That's why I am using Vim. -- Anonymous.
Re: RFC 90 (v1) Builtins: zip() and unzip()
On Sun, Aug 13, 2000 at 06:54:10PM +1000, iain truskett wrote: * Jeremy Howard ([EMAIL PROTECTED]) [13 Aug 2000 17:28]: [...] Personally, I like 'weave' rather than 'zip'. I'm happy with 'unweave' too--although I'm still unsure about that one... Weave is too much like Knuth's tangle and weave pair of programs for his WEB idea. *sigh* All the good names are taken =) That, however, is nowhere as well known (=confusion causing) as 'zip'. Pretty much every English verb must have by now been taken as a name of a piece of software, we have to draw the line somewhere... -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: RFC 90 (v1) Builtins: zip() and unzip()
* Jarkko Hietaniemi ([EMAIL PROTECTED]) [14 Aug 2000 00:15]: On Sun, Aug 13, 2000 at 06:54:10PM +1000, iain truskett wrote: * Jeremy Howard ([EMAIL PROTECTED]) [13 Aug 2000 17:28]: [...] Personally, I like 'weave' rather than 'zip'. I'm happy with 'unweave' too--although I'm still unsure about that one... Weave is too much like Knuth's tangle and weave pair of programs for his WEB idea. *sigh* All the good names are taken =) That, however, is nowhere as well known (=confusion causing) as 'zip'. Pretty much every English verb must have by now been taken as a name of a piece of software, we have to draw the line somewhere... True, but we can still look. qw/fuse unite spin zigzag entwine/ etc. cheers, -- iain truskett, aka Koschei.http://eh.org/~koschei/ Xander: But we were going to have a romantic evening! Anya: We were going to light lots of candles and have sex near them!
Re: RFC 90 (v1) Builtins: zip() and unzip()
Just to point out that the standard CS term is "merge". I guess the opposite would be..."emerge"??? Damian
Re: RFC 90 (v1) Builtins: zip() and unzip()
Subject: RFC 90 (v1) Builtins: zip() and unzip()

I just don't like the names zip() and unzip(); they should be saved for something that will really do compression.

Variants: combine. I like merge too.

On that note, it would be good if there were some sort of compression done internally by perl, say for the data structures... I'm not sure how easily this can be done, but it would be a big win, especially when working on big text files or arrays (RLE is enough in most cases). Everyone knows the BLOATED httpd's under mod_perl. For example, Interbase DB uses RLE compression at the record level... this is a big saving...

= iVAN [EMAIL PROTECTED] =
Re: RFC 90 (v1) Builtins: zip() and unzip()
what about (not zip() of course :")):

  @a = (1,2,3);
  @b = (4,5,6);
  @c = (7,8,9);
  zip (how,@a,@b,@c);

i.e.

  @list = zip (0,@a,@b,@c);   # straight; result (1,2,3,4,5,6,7,8,9)
  @list = zip (1,@a,@b,@c);   # reverse; result (7,8,9,4,5,6,1,2,3)
  @list = zip (2,@a,@b,@c);   # all first elems, then all second, etc.; result (1,4,7,2,5,8,3,6,9)
  @list = zip (3,@a,@b,@c);   # and reverse; result (7,4,1,8,5,2,9,6,3)

Also we can tell:

  @list = zip(1,@a,@b,reverse @c);

= iVAN [EMAIL PROTECTED] =
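One possible reading of the four modes, sketched in Perl 5. Everything here is an assumption made for illustration: the name combine, the use of array references (plain arrays would flatten in a Perl 5 argument list), equal-length lists, and the guess that "reverse" means the lists are taken in reverse order.

```perl
use strict;
use warnings;

# Hypothetical mode-driven combiner:
#   0 = concatenate, 1 = concatenate with the lists in reverse order,
#   2 = interleave,  3 = interleave with the lists in reverse order.
sub combine {
    my ($how, @lists) = @_;
    @lists = reverse @lists if $how == 1 || $how == 3;
    return map { @$_ } @lists if $how < 2;
    my $last = $#{ $lists[0] };
    return map { my $i = $_; map { $_->[$i] } @lists } 0 .. $last;
}

my @a = (1, 2, 3);
my @b = (4, 5, 6);
my @c = (7, 8, 9);

print join(',', combine(0, \@a, \@b, \@c)), "\n";   # prints: 1,2,3,4,5,6,7,8,9
print join(',', combine(1, \@a, \@b, \@c)), "\n";   # prints: 7,8,9,4,5,6,1,2,3
print join(',', combine(2, \@a, \@b, \@c)), "\n";   # prints: 1,4,7,2,5,8,3,6,9
print join(',', combine(3, \@a, \@b, \@c)), "\n";   # prints: 7,4,1,8,5,2,9,6,3
```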
Re: RFC 90 (v1) Builtins: zip() and unzip()
I simply can't get over the feeling that the proposed zip/unzip/partition functions are far too specialized/simple, and that something more general-purpose in the order of pack/unpack (with the transformation spec encoded in a template) for lists would be preferable. When someone said that matrix/unmatrix would be better I did not find that to be a joke: on the contrary, what we are talking here would be a mapping from n-dim arrays to p-dim arrays. Just simply thinking in 1-dim lists/arrays doesn't cut it. -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: RFC 90 (v1) Builtins: zip() and unzip()
Jarkko Hietaniemi wrote: I simply can't get over the feeling that the proposed zip/unzip/partition functions are far too specialized/simple, That's certainly a possibility. They are such common operations though, it might be a win to build them in. With zip/unzip/partition and good array slicing syntax it is possible to construct many n-dim matrix transforms and functions. and that something more general-purpose in the order of pack/unpack (with the transformation spec encoded in a template) for lists would be preferable. That's one of the things RFC 81--Lazily evaluated list generation functions, covers. Using a generated list as the indexes to an array provides completely flexible array transformations. When someone said that matrix/unmatrix would be better I did not find that to be a joke: on the contrary, what we are talking here would be a mapping from n-dim arrays to p-dim arrays. Just simply thinking in 1-dim lists/arrays doesn't cut it. Of course. But RFCs 81, 82, 90, and 91 provide between them all the parts required for matrix operations over any number of dimensions. Even although the basic platform is a 1d array, n-dim operations are provided for through generated lists, zip, unzip, and partition. 
For instance, let's take the example from RFC 91 and modify it to calculate column sums from a 2d matrix:

  # Add all the elements of a list together, returning the result
  $sum = reduce (^total + ^element, @^elements);

  # Swap the rows and columns of a list of lists
  $transpose = partition(
    # Find the size of each column
    scalar @^list_of_lists,
    # Interleave the rows
    zip(@^list_of_lists)
  );

  # Take a list of references to lists, and return an array of each
  # sub-list's sum
  $sum_cols = reduce (
    push (@^total, $sum->( @^next_list )),
    $transpose->(^list_of_lists),
  );

  # Example usage of $sum_cols
  @a = (1,3,5);
  @b = (2,4,6);
  @c = (-1,1,-1);
  @answer = @{$sum_cols->(\@a, \@b, \@c)};   # 1+2-1, 3+4+1, 5+6-1 = (2,8,10)

Mind you, I don't think your average Perl hacker should have to worry about all this--it would be nice if Perl also provided some easy way to use n-dim arrays directly. However, with the building blocks I've described the n-dim stuff could be written in pure Perl. I'm not sure that is the best way--but it's certainly one way (and the way C++ took, when it introduced the 1d valarray--see Stroustrup, "The C++ Programming Language, 3rd Edition", pp662-679).

I'm still trying to work out what the alternative might look like--a set of language constructs that operated on n-dim arrays directly. This is really hard, but there are some good starting points in:

 - PDL: pdl.perl.org
 - Blitz++: http://oonumerics.org/blitz/
 - POOMA: http://www.acl.lanl.gov/pooma/

PDL uses a special language ('PP') that lets the programmer explicitly specify loops over specific dimensions. Blitz++ and POOMA are more adventurous, providing advanced iterator/index classes that operate over n-dim arrays in defined ways, but require much more work from the compiler.

Of course, if we go down this route, we would need to ensure that related RFCs (like 'reduce') can handle using these kinds of arrays and iterators.
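For comparison, the same column-sum calculation can be written eagerly in today's Perl 5. This is a sketch rather than a translation of the proposed syntax: transpose is a hypothetical helper, rows are assumed to be equal length, and sum comes from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: swap rows and columns of a list of array refs.
sub transpose {
    my @rows     = @_;
    my $last_col = $#{ $rows[0] };
    return map { my $c = $_; [ map { $_->[$c] } @rows ] } 0 .. $last_col;
}

my @a = (1, 3, 5);
my @b = (2, 4, 6);
my @c = (-1, 1, -1);

# One sum per column: 1+2-1, 3+4+1, 5+6-1
my @col_sums = map { sum(@$_) } transpose(\@a, \@b, \@c);
print "@col_sums\n";   # prints: 2 8 10
```

Unlike the lazy version discussed above, this builds the whole transposed matrix in memory before summing.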
Re: RFC 90 (v1) Builtins: zip() and unzip()
With zip/unzip/partition I really gotta say, those functions *need* to be renamed, for a variety of reasons. First, they have well-established computer meanings (compression, disks). Second, "partition" is too long anyways. I've seen numerous emails from other people saying the same thing. If other languages name these functions zip/unzip I'd argue they're wrong. "mop", "cleave", "weave", "mix", or any other term that doesn't already have well-established computer meaning is acceptable. Jeremy, in the next version of the RFC's would you be willing to suggest some alternatives? Alternatively, pick a different set, but I really think zip/unzip/partition are (a) confusing and (b) not obviously list manipulation functions. -Nate
Re: RFC 90 (v1) Builtins: zip() and unzip()
On Fri, Aug 11, 2000 at 03:30:28PM -, Perl6 RFC Librarian wrote:
> =head1 ABSTRACT
>
> It is proposed that two new functions, C<zip>, and C<unzip>, be added to Perl. C<zip(\@list1, \@list2, ...)> would return a list that interleaved its arguments. C<unzip($list_size, \@list)> would reverse this operation.

I know other languages call it zip, but personally I dislike that name as zip() is commonly used with reference to compression. Although I do not have a good alternative.

>   @a = (1,3,5);
>   @b = (2,4,6);
>   @zipped_list = zip(\@a,\@b);   # (1,2,3,4,5,6)

No need for the \; other builtin operators like shift, pop, splice etc. don't need them, so zip should not either. Its prototype would be (\@\@;\@\@\@\@\@\@)

> In order to reverse this operation we need an C<unzip> function:
>
>   @zipped_list = zip(\@a,\@b);   # (1,2,3,4,5,6)
>   @unzipped_list = unzip(3, \@zipped_list);   # ([1,3,5], [2,4,6])

Is unzip used that often ?

> =head1 IMPLEMENTATION
>
> The C<zip> and C<unzip> functions should be evaluated lazily.

lazily ? why, no other operator does by default (I am assuming here)

> Effectively, C<zip> creates an iterator over multiple lists. If used as part of a reduction, the actual interleaved list need never be created.

Yes it should return an iterator in an iterator context. An example I would use is

  for my($a,$b) (zip(@a,@b)) {
    # code
  }

which would loop through both arrays together.

Graham.
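The prototype suggested above can be tried on an ordinary Perl 5 sub. The following is a sketch only: interleave is a stand-in name (to avoid implying a real builtin), it is eager rather than an iterator, it assumes equal-length arrays, and the pairwise walk at the end uses an index loop in place of the proposed for my($a,$b) syntax.

```perl
use strict;
use warnings;

# Hypothetical sub using the suggested prototype: callers pass arrays
# without a backslash, and the sub receives a reference to each one.
sub interleave(\@\@;\@\@\@\@\@\@) {
    my @lists = @_;
    my $last  = $#{ $lists[0] };
    return map { my $i = $_; map { $_->[$i] } @lists } 0 .. $last;
}

my @a = (1, 3, 5);
my @b = (2, 4, 6);

my @zipped = interleave(@a, @b);     # no \@a, \@b needed
print "@zipped\n";                   # prints: 1 2 3 4 5 6

# Walk the interleaved list two elements at a time.
my @pairs;
for (my $i = 0; $i < @zipped; $i += 2) {
    my ($x, $y) = @zipped[$i, $i + 1];
    push @pairs, "$x-$y";
}
print "@pairs\n";                    # prints: 1-2 3-4 5-6
```

Note that the prototype only takes effect for calls compiled after the sub is declared, and it caps the call at eight arrays.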
Re: RFC 90 (v1) Builtins: zip() and unzip()
I know other languages call it zip, but personally I dislike that name as zip() is commonly used with reference to compression. Ditto, I really dislike zip() and unzip(). They're PC and even UNIX commands on several platforms now, increasing confusion. Here's two names: mix() and unmix(). It's what's happening, right? Just as short too. No need for the \ other builtin operators like shift,pop,splice etc dont need them, zip should not either. Agreed. -Nate
Re: RFC 90 (v1) Builtins: zip() and unzip()
On Fri, Aug 11, 2000 at 10:06:38AM -0700, Nathan Wiger wrote: I know other languages call it zip, but personally I dislike that name as zip() is commonly used with reference to compression. Ditto, I really dislike zip() and unzip(). They're PC and even UNIX commands on several platforms now, increasing confusion. Here's two names: mix() and unmix(). It's what's happening, right? Just as short too. mix() sounds awfully disorderly. interleave()? No need for the \ other builtin operators like shift,pop,splice etc dont need them, zip should not either. splice() would be fine, but... -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: RFC 90 (v1) Builtins: zip() and unzip()
I know other languages call it zip, but personally I dislike that name as zip() is commonly used with reference to compression. Although I do not have a good alternative. fold() and unfold()? merge() and cleave()? A
Re: RFC 90 (v1) Builtins: zip() and unzip()
On 11 Aug 2000, Perl6 RFC Librarian wrote:
> its arguments. C<unzip($list_size, \@list)> would reverse this operation.
[...]
> In order to reverse this operation we need an C<unzip> function:
>
>   @zipped_list = zip(\@a,\@b);   # (1,2,3,4,5,6)
>   @unzipped_list = unzip(3, \@zipped_list);   # ([1,3,5], [2,4,6])

Would it not be more natural to pass the *number* of lists to unzip, rather than the desired length? This way, unzip() would know to pick off elements two-at-a-time, three-at-a-time, etc., rather than having to go through the zipped list, count the elements, divide by $list_size, etc.

Unless I misunderstood the example and you wanted the result to be ([1,2,3], [4,5,6]) in which case unzip would not have to do nearly as much work. But then (1..7) would unzip(3) into ([1,2,3], [4,5,6], [7]).

Cheers,
Philip
-- 
Philip Newton [EMAIL PROTECTED]
Re: RFC 90 (v1) Builtins: zip() and unzip()
Andy Wardley wrote: cleave()? Note that cleave is its own antonym! :-) -- John Porter
Re: RFC 90 (v1) Builtins: zip() and unzip()
On Fri, Aug 11, 2000 at 06:25:07PM +0100, Andy Wardley wrote:
> > I know other languages call it zip, but personally I dislike that name as zip() is commonly used with reference to compression. Although I do not have a good alternative.
>
> fold() and unfold()?

People would confuse that with fold() in other languages, which is like reduce()

> merge() and cleave()?

I think I like interleave() best, but it's too long. thesaurus.com returns

  [Verbs] lie between, come between, get between; intervene, slide in, interpenetrate, permeate. put between, introduce, import, throw in, wedge in, edge in, jam in, worm in, foist in, run in, plow in, work in; interpose, interject, intercalate, interpolate, interline, interleave, intersperse, interweave, interlard, interdigitate; let in, dovetail, splice, mortise; insinuate, smuggle; infiltrate, ingrain. interfere, put in an oar, thrust one's nose in; intrude, obtrude; have a finger in the pie; introduce the thin end of the wedge; thrust in (insert) [more].

I think I like plow() or maybe just weave()

Graham.
Re: RFC 90 (v1) Builtins: zip() and unzip()
Andy Wardley wrote: I know other languages call it zip, but personally I dislike that name as zip() is commonly used with reference to compression. Although I do not have a good alternative. fold() and unfold()? merge() and cleave()? A collate() and ...?
Re: RFC 90 (v1) Builtins: zip() and unzip()
Damian Conway wrote: Note that cleave is its own antonym! :-) I can see it now: @interspersed = cleave(@list1, @list2, @list3) @separated= cleave(3,@interspersed); Now *that's* DWIM! ;-) In fact, perl really only needs one OP: @results = dwim $stuff, @args, %hey; (Well, I guess that's two: the assignment is an op also.) -- John Porter
Re: RFC 90 (v1) Builtins: zip() and unzip()
Note that cleave is its own antonym! :-) I can see it now: @interspersed = cleave(@list1, @list2, @list3) @separated= cleave(3,@interspersed); Now *that's* DWIM! ;-) Damian
Re: RFC 90 (v1) Builtins: zip() and unzip()
In fact, perl really only needs one OP: @results = dwim $stuff, @args, %hey; (Well, I guess that's two: the assignment is an op also.) dwim @results, dwim $stuff, @args, %hey; Can you say 'Lisp'? Damian
Re: RFC 90 (v1) Builtins: zip() and unzip()
On Sat, Aug 12, 2000 at 07:22:01AM +1000, Damian Conway wrote: dwim @results, dwim $stuff, @args, %hey; Can you say 'Lisp'? Lithp Michael (who couldn't resist) -- Administrator www.shoebox.net Programmer, System Administrator www.gallanttech.com --
Re: RFC 90 (v1) Builtins: zip() and unzip()
Philip Newton wrote:
> Would it not be more natural to pass the *number* of lists to unzip, rather than the desired length? This way, unzip() would know to pick off elements two-at-a-time, three-at-a-time, etc., rather than having to go through the zipped list, count the elements, divide by $list_size, etc.

Could be. It's a bit more intuitive too, isn't it? (The 2nd param is the 'step size').

> Unless I misunderstood the example and you wanted the result to be ([1,2,3], [4,5,6]) in which case unzip would not have to do nearly as much work. But then (1..7) would unzip(3) into ([1,2,3], [4,5,6], [7]).

No, you didn't misunderstand. That's partition(), which is RFC 91.
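The behaviour attributed to partition() here (consecutive runs rather than round-robin dealing) can be sketched in Perl 5 as follows. The name chunk and its argument order are assumptions made for the example, not RFC 91's interface, and $size is assumed to be at least 1.

```perl
use strict;
use warnings;

# Hypothetical helper: cut a list into consecutive chunks of $size
# elements; the final chunk may be shorter.
sub chunk {
    my ($size, $list) = @_;
    my @copy = @$list;               # leave the caller's array untouched
    my @out;
    push @out, [ splice(@copy, 0, $size) ] while @copy;
    return @out;
}

my @parts = chunk(3, [1 .. 7]);
print join(' ', map { '[' . join(',', @$_) . ']' } @parts), "\n";
# prints: [1,2,3] [4,5,6] [7]
```

Compare this with round-robin unzipping of the same list into three lists, which gives ([1,4,7], [2,5], [3,6]) as in the RFC's own example.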