Re: relational language extension (was Re: request new Mapping|Hash operators)
Good luck w/ your studies. Viable alternatives to SQL are always welcome. ;-) On 3/23/07, Darren Duncan <[EMAIL PROTECTED]> wrote: At 7:15 PM -0700 3/23/07, John Beppu wrote: >You might find Dee interesting: > >http://www.quicksort.co.uk/ This Dee project in Python is a worthy thing to study, and it does represent a major part of what I believe Perl 6 should elegantly support, if not bundle. And while my own efforts with either Set::Relation or QDRDBMS (to rename) can not actually be used yet (but hopefully soon), Dee has actually been released, and AFAIK, works right now. In the short term, looking at that project will help to explain a lot of what I'm trying to get at more than my own explanations, probably. -- Darren Duncan
relational language extension (was Re: request new Mapping|Hash operators)
At 7:15 PM -0700 3/23/07, John Beppu wrote: You might find Dee interesting: http://www.quicksort.co.uk/ A relational language extension for Python Inspired by 'The Third Manifesto', a book by Chris Date and Hugh Darwen, we're putting forward an implementation of a truly relational language using Python (Dee). We address two problems: 1. The impedance mismatch between programming languages and databases 2. The weakness and syntactic awkwardness of SQL Actually, John, you raise a good point. This Dee project in Python is a worthy thing to study, and it does represent a major part of what I believe Perl 6 should elegantly support, if not bundle. And while my own efforts with either Set::Relation or QDRDBMS (to rename) can not actually be used yet (but hopefully soon), Dee has actually been released, and AFAIK, works right now. In the short term, looking at that project will help to explain a lot of what I'm trying to get at more than my own explanations, probably. -- Darren Duncan
Re: request new Mapping|Hash operators
You might find Dee interesting: http://www.quicksort.co.uk/ A relational language extension for Python Inspired by 'The Third Manifesto', a book by Chris Date and Hugh Darwen, we're putting forward an implementation of a truly relational language using Python (Dee). We address two problems: 1. The impedance mismatch between programming languages and databases 2. The weakness and syntactic awkwardness of SQL On 2/27/07, Darren Duncan <[EMAIL PROTECTED]> wrote: All, I believe that there is some room for adding several new convenience operators or functions to Perl 6 that are used with Mapping and Hash values. Or getting more to the point, I believe that the need for the relational data model concept of a tuple (a "tuple" where elements are addressed by name not position) would be satisfied by the existing Perl 6 data types of Mapping (immutable variant) and Hash (mutable variant), but that some common relational operations would be a lot easier to express if Perl 6 had a few more operators that make them concise. Below I will name some of these operators that, AFAIK, don't exist yet in some form; since they are all pure functions, I will use the Mapping type in their pseudo-Perl-6 signatures, but Hash versions should exist too. Or specifically, these should be part of the Mapping role, so anything that .does Mapping, such as a Hash, does them too? Some of these operators are like those for sets, but aren't exactly the same due to plain set ops not working for mappings or hashes as a whole. I want to emphasize that the operator names are those that are used in DBMS contexts, but you can of course name them something else in order for them to fit better into Perl 6; the importance is having some concise way to get the desired semantics. Also, this functionality doesn't have to be with new operators, but could utilize existing ones if there is a concise way to do so. Likewise, some could conceivably be macros, if it wouldn't impair performance. 
I also want to emphasize that I see this functionality being generally useful, and that it shouldn't just be shunted off to a third-party module. 1. join() aka natural_join(): function join of Mapping (Mapping $m1, Mapping $m2) { ... } This binary operator is conceptually like a set-union operator, in that it derives a Mapping that has all of the distinct keys and values of its 2 arguments, assuming any matching keys also have matching values. (Note that "matching" specifically means that === returns true, or if users get a choice, then that is its default meaning.) But if there are any matching keys with mismatching values, then this is a failure condition (they are incompatible), and the function returns undef instead (or fail, though given the anticipated use case, undef is more appropriate). It is only possible for 2 arguments to be incompatible if they have any keys in common; if they have none, the result is guaranteed to be defined/successful. If the 2 arguments have all keys in common, they must be equal, and the result is also equal to either. This join() function is both commutative and associative, and can generalize to N arguments. Any equal arguments are redundant and so duplicates can be ignored. Given 2 or more arguments, each is unioned pairwise until 1 remains. Given 1 argument, the result is that argument. Given zero arguments, the result is a Mapping with zero elements. A zero-element Mapping is its identity value. So join() can be used as a reduction operator, with identity of the empty Mapping, but that it can return undef (or fail) instead if any 2 arguments have the same keys but different associated values. 
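As a rough illustration of the tuple-level join() semantics described above, here is a minimal sketch in Python (chosen only because Dee, mentioned elsewhere in this thread, is Python). The names tuple_join and join_all are made up for this sketch, Python's == stands in for Perl 6's ===, and None stands in for undef; none of this is proposed Perl 6 syntax.

```python
from functools import reduce


def tuple_join(m1, m2):
    """Union of two mappings, or None if a shared key has different values.

    None plays the role of undef; it is also propagated if either input
    is already None, so a failed join stays failed inside a reduction.
    """
    if m1 is None or m2 is None:
        return None
    for key in m1.keys() & m2.keys():
        if m1[key] != m2[key]:  # == is an assumption standing in for ===
            return None
    return {**m1, **m2}


def join_all(*mappings):
    """N-ary join as a reduction whose identity is the empty mapping."""
    return reduce(tuple_join, mappings, {})


print(tuple_join({"a": 1, "b": 2}, {"b": 2, "c": 3}))  # {'a': 1, 'b': 2, 'c': 3}
print(tuple_join({"a": 1, "b": 2}, {"b": 4, "c": 3}))  # None
print(join_all())                                      # {}
```

Because the empty mapping is the identity value, join_all() with zero or one argument behaves as the message describes, and the order of the arguments does not change the result.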
For examples:

  join( { a<1>, b<2> }, { b<2>, c<3> } ) # returns { a<1>, b<2>, c<3> }
  join( { a<1>, b<2> }, { b<4>, c<3> } ) # returns undef
  join( { a<1>, b<2> }, { c<3>, d<4> } ) # returns { a<1>, b<2>, c<3>, d<4> }
  join( { a<1>, b<2> }, { a<1> } ) # returns { a<1>, b<2> }
  join( { a<1> } ) # returns { a<1> }
  join( { a<1> }, {} ) # returns { a<1> }
  join() # returns {}

In practice, if a relation were implemented, say, as a set of Mapping, then the relational (natural) join could then be implemented sort of like this:

  function join of Relation (Relation $r1, Relation $r2) {
    return Relation( grep <-- $r1.values XjoinX $r2.values );
  }

That is, the relational (natural) join could then simply be implemented as a pairwise invocation of the tuple join between every tuple in each relation, keeping only the results that are defined. In this wider sense, a relational (natural) join is both an intersection in one dimension and a union in the other dimension.

Now, I'm not currently asking for Relation to be implemented as a Perl 6 feature (it is
Re: request new Mapping|Hash operators
On 3/18/07, Darren Duncan wrote:
> On Sun, 18 Mar 2007, Aaron Crane wrote:
> > That's easy even in Perl 5. This modifies %hash in-place:
> >
> >   my @values = delete @hash{@old_names};
> >   @hash{@new_names} = @values;
>
> [...] If %hash contained keys a,b,c and @old_names was a and @new_names
> was b, then the above code would overwrite the existing b element, and
> leave 2 elements total. The operation I proposed needs to fail when one
> requests a colliding element, such as that situation; [...]

Yes, and more than that, deleting an old key and adding a new one isn't strictly the same as renaming a key. Consider what would happen if the value were a funny object that had side-effects every time you evaluated it.

As for the name, I don't think it's a problem for hashes and IOs to both have "rename" methods, but I do like Uri's "rekey" suggestion.

-David
Re: request new Mapping|Hash operators
On Sun, 18 Mar 2007, Uri Guttman wrote:
> as for rename on hash keys, why not call it rekey? also even if it is
> called rename as a hash method it is different than rename as a function
> to rename a file so there is no ambiguity.

The name "rekey" or some such sounds like a reasonable name, and it's more descriptive considering the context than "rename" is, since Hash|Mapping element names are called "keys" in Perl.

> you can even do:
>
>   @hash{@new_names} = delete @hash{@old_names};
>
> and i am sure a simple p6 thing can be written which takes pairs of
> old/new names and loops over them. someone wanna rewrite that here in p6
> and we can see how complex it would be? a rename method might not be so
> important then.

While you're at it, provide a concise-as-possible example that creates a new Hash container rather than mutating the existing container, because then that example would work with Mapping too.

I would consider both mutating and non-mutating versions to be equally useful, but each in different contexts; the non-mutating version is for use by people writing pure functions|expressions and such.

But regardless, the system needs to be designed, as an option if not always, such that the operation would fail if any @old_keys don't match a pre-change Hash|Mapping element or if any @new_keys match some other element not being rekeyed. The rationale would be to catch common errors, similar to referencing an undeclared symbol name. Or maybe have separate versions for strict or non-strict behaviour, or a pragma that toggles such, as is appropriate.

-- Darren Duncan
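The non-mutating, strict variant asked for above could look roughly like the following Python sketch. The function name rekeyed and the choice of exceptions are hypothetical, and rejecting two old keys that map to the same new key is an extra assumption that the message does not spell out.

```python
def rekeyed(mapping, renames):
    """Return a new dict with keys renamed per `renames` (old -> new).

    The input mapping is never modified, so this also suits immutable
    mappings. Fails if an old key is absent or a new key would collide.
    """
    missing = [old for old in renames if old not in mapping]
    if missing:
        raise KeyError(f"no such key(s): {missing}")

    new_keys = list(renames.values())
    if len(set(new_keys)) != len(new_keys):
        # Assumption: two old keys may not be renamed to the same new key.
        raise ValueError("two old keys map to the same new key")

    # A new key may only reuse a name that is itself being renamed away.
    untouched = mapping.keys() - renames.keys()
    clashes = [new for new in new_keys if new in untouched]
    if clashes:
        raise ValueError(f"new key(s) already in use: {clashes}")

    return {renames.get(key, key): value for key, value in mapping.items()}


original = {"a": 1, "b": 2, "c": 3}
print(rekeyed(original, {"a": "x"}))            # {'x': 1, 'b': 2, 'c': 3}
print(rekeyed(original, {"a": "b", "b": "a"}))  # {'b': 1, 'a': 2, 'c': 3}
print(original)                                 # {'a': 1, 'b': 2, 'c': 3}
```

Building the result in a single pass is what lets a swap such as a<->b succeed while a plain a->b rename, which would clobber the existing b, is rejected.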
Re: request new Mapping|Hash operators
> "AC" == Aaron Crane <[EMAIL PROTECTED]> writes:

AC> That's easy even in Perl 5. This modifies %hash in-place:
AC>   my @values = delete @hash{@old_names};
AC>   @hash{@new_names} = @values;

you can even do:

  @hash{@new_names} = delete @hash{@old_names};

and i am sure a simple p6 thing can be written which takes pairs of old/new names and loops over them. someone wanna rewrite that here in p6 and we can see how complex it would be? a rename method might not be so important then.

as for rename on hash keys, why not call it rekey? also even if it is called rename as a hash method it is different than rename as a function to rename a file so there is no ambiguity.

uri

--
Uri Guttman -- [EMAIL PROTECTED] http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs http://jobs.perl.org
Re: request new Mapping|Hash operators
On Sun, 18 Mar 2007, Aaron Crane wrote:
> David Green writes:
> > In the meantime, Darren's proposal still raises a lot of interesting
> > language questions. For example, how *do* you rename a hash key?
>
> That's easy even in Perl 5. This modifies %hash in-place:
>
>   my @values = delete @hash{@old_names};
>   @hash{@new_names} = @values;
>
> While there's certainly motivation to wrap this up in a function or
> operator, it doesn't strike me as something particularly difficult, or
> necessarily more worthy of inclusion in Perl 6.0.0 than anything else.

Actually, what I proposed is more complicated than that. Any key-renames such as I propose need to be non-colliding.

If %hash contained keys a,b,c and @old_names was a and @new_names was b, then the above code would overwrite the existing b element, and leave 2 elements total.

The operation I proposed needs to fail when one requests a colliding element, such as that situation; a successful operation will leave a Hash that is identical but for the element key changes; there would be the same number of elements and they have the same values.

So if a short-hand syntax existed, it would be replacing more complicated code than that.

-- Darren Duncan
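The collision case described in this message can be reproduced with a small Python sketch. naive_rename is only a Python analogue of the quoted delete-then-assign idiom, and strict_rename is a hypothetical in-place variant that checks everything before touching the dict; both names, and the requirement that the name lists be duplicate-free, are assumptions of this sketch.

```python
def naive_rename(h, old_names, new_names):
    """Delete the old entries, then store their values under the new names."""
    values = [h.pop(old, None) for old in old_names]
    for new, value in zip(new_names, values):
        h[new] = value


def strict_rename(h, old_names, new_names):
    """In-place rename that refuses to lose or overwrite any element.

    All checks run before any change, so a failed call leaves h untouched.
    """
    if (len(old_names) != len(new_names)
            or len(set(old_names)) != len(old_names)
            or len(set(new_names)) != len(new_names)):
        raise ValueError("old and new names must pair up one-to-one")
    missing = [old for old in old_names if old not in h]
    if missing:
        raise KeyError(missing)
    kept = set(h) - set(old_names)  # keys that are not being renamed
    clashes = [new for new in new_names if new in kept]
    if clashes:
        raise ValueError(f"new key(s) already in use: {clashes}")
    naive_rename(h, old_names, new_names)


h = {"a": 1, "b": 2, "c": 3}
naive_rename(h, ["a"], ["b"])
print(h)  # {'b': 1, 'c': 3}  (the old b element is gone, 2 elements left)

h = {"a": 1, "b": 2, "c": 3}
strict_rename(h, ["a"], ["d"])
print(h)  # {'b': 2, 'c': 3, 'd': 1}  (same element count, same values)
```

With the strict version, the a-to-b request raises instead of silently shrinking the hash from three elements to two.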
Re: request new Mapping|Hash operators
David Green writes:
> In the meantime, Darren's proposal still raises a lot of interesting
> language questions. For example, how *do* you rename a hash key?

That's easy even in Perl 5. This modifies %hash in-place:

  my @values = delete @hash{@old_names};
  @hash{@new_names} = @values;

While there's certainly motivation to wrap this up in a function or operator, it doesn't strike me as something particularly difficult, or necessarily more worthy of inclusion in Perl 6.0.0 than anything else.

-- Aaron Crane
Re: request new Mapping|Hash operators
On 3/16/07, Darren Duncan wrote: On Wed, 7 Mar 2007, Smylers wrote: >[...] Perl is a better language than SQL, in general, [...] Likewise, we shouldn't have to write in SQL, or in pseudo-Perl-SQL, but just write in Perl. A database is supposed to be a base for *data*, after all. I'd love to be able to do all the coding in native Perl. All this said, I recognize that a lot of details can be provided in a Perl 6 module rather than being in the core language specification, and maybe that very well may be the best option. But I still can't help feeling that some of this naturally fits in the core language as much as Perl 6's replacement for PDL does. I'm not really that concerned about what's "CORE(TM)" or not; P6 minimises the distinction between them, and CP6AN will make it even less worth worrying about. Perhaps the most salient point is how well some proposed syntax would play with normal, "core" syntax. And that seems a most suitable thing to discuss in p6-*language* -- modules need a language, so we can discuss what makes good syntax for them too. (We've all seen modules that, while useful, don't feel very perlish (which in P5 is admittedly often a result of limitations in how much you can warp the language to your own ends).) > DBI is a module, not a core part of the Perl language. I think it would be odd for Perl 6 to have core support for these operations when most > users would use DBI instead. Maybe if it were built-in, a lot more people might discover they have a use for the new way... at any rate, it's a moot point until someone actually produces some code. In the meantime, Darren's proposal still raises a lot of interesting language questions. For example, how *do* you rename a hash key? (That doesn't seem like such a strange or unlikely need.) It's not something you'd use every day, but then again I can't remember the last time I renamed files in Perl. If we get a "rename" method for hashes, should it be called something else? 
Or should file-renaming be called something else? Or should both 'rename's be methods only, and not have any "rename" function/multi? [I think that might already be the case, although I can't put my finger on it at the moment.] ... -David
Re: request new Mapping|Hash operators
P.S. Sorry for not replying to this for so long, but I have been without a computer for the last week ... and possibly for the next week too ... right now, I'm on someone else's machine.

--

On Wed, 7 Mar 2007, Smylers wrote:
> On February 27th Darren Duncan writes:
> > One common usage scenario, of relational-join in particular, is doing
> > operations on tabular data, where you want to know something that
> > involves matching up columns in several tables.
>
> I've seen lots of programs that do things like that. But by a long way
> it's far more common to have the data tables in a database and use SQL
> for the joining than for the data to be elsewhere, such that SQL can't
> be used and it has to be read into Perl data structures for doing the
> joining.

My main point here is that it should be easy to code any common programming task in Perl itself, rather than coding it in some other language like SQL.

Perl is a better language than SQL, in general, and we should be able to express all database operations using Perl alone, and it should be possible for that Perl to be cleaner and at least as concise as SQL; moreover, SQL isn't capable of cleanly expressing a lot of concepts that Perl can express, so better to avoid the baggage.

While there is a strong benefit to having a dedicated database engine, there is no reason that this can't be written in Perl, rather than some other language, and some support in the language itself for fundamental concepts will make it easier to write and use such Perl native engines.

Also, a lot of people work with smaller or medium-sized data sets for which there isn't that great a performance hit for processing the data with more simple and naive implementations that have all the data in Perl data structures. One may, for example, be processing data from several text files, or that was recently input from the user.

I see the situation here as being somewhat analogous to the situation with people writing code using inlined C because Perl couldn't express what they wanted in an efficient manner; Perl 6 added native support for things like native numeric types, and other things that PDL was used for, so people wouldn't have to write in a non-Perl language, and keep things simpler. Likewise, we shouldn't have to write in SQL, or in pseudo-Perl-SQL, but just write in Perl.

This isn't to say that external database engines can't still be used. But it should be easier to use them transparently, such that you can write your relational operations in pure Perl regardless of whether your data is in Perl variables or in data structures tied to external engines.

All this said, I recognize that a lot of details can be provided in a Perl 6 module rather than being in the core language specification, and maybe that very well may be the best option. But I still can't help feeling that some of this naturally fits in the core language as much as Perl 6's replacement for PDL does.

> DBI is a module, not a core part of the Perl language. I think it would
> be odd for Perl 6 to have core support for these operations when most
> users would use DBI instead.

To this I say there is room for both. Ideally, Perl 6 would have bundled an ideal native interface for RDBMSs, and a default built-in implementation that just uses Perl data structures and is simple. But CPAN authors can .does() that interface in their own modules, such that theirs provide bridges to other engines, often removing the need for people to use DBI-as-it-is directly, if at all. Of course, DBI-as-it-is still has its uses, just as there are still always uses for writing inlined C.

Larry's Synopsis docs *do* include example roles or traits called "Database" do they not? What could that refer to if not something similar to what I'm talking about?

> Are there Cpan modules in existence for doing this kind of thing in Perl
> 5?

I will note that there is a very new distro called DBIx::Perlish that is probably worth looking at. I didn't make it, but it presents some very good ideas that would be good to have. Take it as some inspiration.

-- Darren Duncan

P.S. My lack of a computer has set back my progress, but once I get it back, I still hope to have working prototypes of my proposals within a few weeks.
Re: request new Mapping|Hash operators
On February 27th Darren Duncan writes:

> At 4:45 PM + 2/27/07, Nicholas Clark wrote:
>
> > > 4. rename():
> >
> > rename is a Perl 5 builtin.
>
> I see this situation as being similar to Dog.bark() vs Tree.bark();

The difference is that those are methods. Having different objects which have identically named methods is very different from having a built-in function which performs multiple, but very different, tasks based on its arguments.

> One common usage scenario, of relational-join in particular, is doing
> operations on tabular data, where you want to know something that
> involves matching up columns in several tables.

I've seen lots of programs that do things like that. But by a long way it's far more common to have the data tables in a database and use SQL for the joining than for the data to be elsewhere, such that SQL can't be used and it has to be read into Perl data structures for doing the joining.

DBI is a module, not a core part of the Perl language. I think it would be odd for Perl 6 to have core support for these operations when most users would use DBI instead.

For people doing data processing that requires this functionality I don't see why loading a module would be too much of a burden.

> In conclusion, I consider functionality like relational-join to
> provide considerable conciseness to very common data processing
> operations

Are there Cpan modules in existence for doing this kind of thing in Perl 5?

Smylers
Re: request new Mapping|Hash operators
At 4:45 PM + 2/27/07, Nicholas Clark wrote:
> > 4. rename():
>
> rename is a Perl 5 builtin. I didn't think that it had been dropped for
> Perl 6.

At 6:22 PM + 2/27/07, Smylers wrote:
> > 1. join() aka natural_join():
>
> Remember that Perl already has a join function, for joining strings.

To both of these comments, first I want to repeat that we don't have to use the names I provided if there are other names or syntax that would work better.

As for join(), it already has multiple meanings in Perl. Not only is join() used for joining strings, but also for joining threads.

Regardless, I see this situation as being similar to Dog.bark() vs Tree.bark(); the operators I described take a Mapping or Hash as their primary argument, while any other join() or rename() do not, so they are very easy to distinguish using normal multi semantics, and they don't even look the same visually.

But once again, the functions|operators can have different names.

At 6:22 PM + 2/27/07, Smylers wrote:
> Darren Duncan writes:
> > I believe that ... some common relational operations would be a lot
> > easier to express if Perl 6 had a few more operators that make them
> > concise.
>
> I am prepared to believe that. But what I'm unclear on is when I'd want
> to perform a common relational operation. Please could you give an
> example of something which is useful -- that is useful as a means to some
> other end, not merely useful to somebody who has an interest in
> relational theory -- but which is currently awkward, and then give the
> same example again showing how much better it is with your proposed
> functions?

At 12:51 PM + 2/27/07, Aaron Crane wrote:
> (As it happens, I'm not entirely convinced that these operations are
> generally useful in the same way as, say, multiplication, or string
> concatenation, or cross hypering, but I think that's a side issue.)

I would say that relational operations, in usefulness, place around the order of cross hypering and/or set operations, or just at the next level down.
In functionality, I see relational operations as being like slightly more complicated set operations, in that each set element has multiple significant parts and the set-like operations can be looking at just parts of each element rather than the whole element when querying membership, and that the elements of derived sets can have different elements than either of the set operation arguments. It is convenient for elements to be represented using Mappings|Hashes.

One common usage scenario, of relational-join in particular, is doing operations on tabular data, where you want to know something that involves matching up columns in several tables.

For example, say you have data tables {suppliers,foods,shipments} and you want to know which suppliers, along with their countries, you have received orange-coloured foods from. A country of residence is an attribute of a supplier, and colour is an attribute of a food. Your data, which could come from anywhere, could look like this:

  $suppliers = Set(
    { farm<'Hodgesons'>, country<'Canada'> },
    { farm<'Beckers'>, country<'England'> },
    { farm<'Wickets'>, country<'Canada'> },
  );

  $foods = Set(
    { food<'Bananas'>, colour<'yellow'> },
    { food<'Carrots'>, colour<'orange'> },
    { food<'Oranges'>, colour<'orange'> },
    { food<'Kiwis'>, colour<'green'> },
    { food<'Lemons'>, colour<'yellow'> },
  );

  $shipments = Set(
    { farm<'Hodgesons'>, food<'Kiwis'>, qty<100> },
    { farm<'Hodgesons'>, food<'Lemons'>, qty<130> },
    { farm<'Hodgesons'>, food<'Oranges'>, qty<10> },
    { farm<'Hodgesons'>, food<'Carrots'>, qty<50> },
    { farm<'Beckers'>, food<'Carrots'>, qty<90> },
    { farm<'Beckers'>, food<'Bananas'>, qty<120> },
    { farm<'Wickets'>, food<'Lemons'>, qty<30> },
  );

If the join() and semijoin() operators that I described existed, then the query could look like this:

  $supp_of_oran_food = Set( $suppliers.values XsemijoinX ($shipments.values XjoinX $foods.values >>join<< { colour<'orange'> }) );

Or if higher-level operators were made that worked on entire sets of mappings, or relations (which can have multiple indexes) instead, the above query could look more like this instead:

  $supp_of_oran_food = $suppliers semijoin ($shipments join $foods join Set( { colour<'orange'> } ) );

The result is then:

  Set(
    { farm<'Hodgesons'>, country<'Canada'> },
    { farm<'Beckers'>, country<'England'> },
  );

Without any join etc operators, you would have to explicitly iterate over each element of each Mapping|Hash and do comparisons between keys and values|elements, which is considerably more verbose.

In conclusion, I consider functionality like relational-join to provide considerable conciseness to very common data processing operations, which given their nature, would likely get a speed benefit from being implemented at the same low level that set operations in general are. And the operators could have
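For readers who want to check the suppliers/foods/shipments example by hand, here is a minimal Python sketch of the same query. Relations are modelled as plain lists of dicts, and the helper names tuple_join, rel_join and semijoin are hypothetical. semijoin is assumed to have its usual relational meaning (keep the tuples of the first relation that join with at least one tuple of the second), since its definition does not appear in this part of the thread.

```python
def tuple_join(t1, t2):
    """Union of two dicts, or None if a shared key has different values."""
    for key in t1.keys() & t2.keys():
        if t1[key] != t2[key]:
            return None
    return {**t1, **t2}


def rel_join(r1, r2):
    """Natural join: pairwise tuple joins, keeping only the defined results."""
    result = []
    for t1 in r1:
        for t2 in r2:
            joined = tuple_join(t1, t2)
            if joined is not None and joined not in result:
                result.append(joined)
    return result


def semijoin(r1, r2):
    """Tuples of r1 that join with at least one tuple of r2 (assumed meaning)."""
    return [t1 for t1 in r1
            if any(tuple_join(t1, t2) is not None for t2 in r2)]


suppliers = [
    {"farm": "Hodgesons", "country": "Canada"},
    {"farm": "Beckers", "country": "England"},
    {"farm": "Wickets", "country": "Canada"},
]
foods = [
    {"food": "Bananas", "colour": "yellow"},
    {"food": "Carrots", "colour": "orange"},
    {"food": "Oranges", "colour": "orange"},
    {"food": "Kiwis", "colour": "green"},
    {"food": "Lemons", "colour": "yellow"},
]
shipments = [
    {"farm": "Hodgesons", "food": "Kiwis", "qty": 100},
    {"farm": "Hodgesons", "food": "Lemons", "qty": 130},
    {"farm": "Hodgesons", "food": "Oranges", "qty": 10},
    {"farm": "Hodgesons", "food": "Carrots", "qty": 50},
    {"farm": "Beckers", "food": "Carrots", "qty": 90},
    {"farm": "Beckers", "food": "Bananas", "qty": 120},
    {"farm": "Wickets", "food": "Lemons", "qty": 30},
]

orange_shipments = rel_join(rel_join(shipments, foods), [{"colour": "orange"}])
supp_of_oran_food = semijoin(suppliers, orange_shipments)
print(supp_of_oran_food)
# [{'farm': 'Hodgesons', 'country': 'Canada'}, {'farm': 'Beckers', 'country': 'England'}]
```

The nested loops here are the naive quadratic approach; they are meant only to make the semantics concrete, not to suggest how a built-in operator would be implemented.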
Re: request new Mapping|Hash operators
Darren Duncan writes:
> I believe that ... some common relational operations would be a lot
> easier to express if Perl 6 had a few more operators that make them
> concise.

I am prepared to believe that. But what I'm unclear on is when I'd want to perform a common relational operation. Please could you give an example of something which is useful -- that is useful as a means to some other end, not merely useful to somebody who has an interest in relational theory -- but which is currently awkward, and then give the same example again showing how much better it is with your proposed functions?

> I also want to emphasize that I see this functionality being generally
> useful, and that it shouldn't just be shunted off to a third-party
> module.

Why is being in a module being "shunted off"? You could put everything in the main namespace but that way PHP, ahem I mean madness, lies.

Nicholas already pointed out that in Perl 5 rename exists, as an operation on files. That shows the problem with using generic function names for quite specific operations without there being any surrounding context. Many people rarely use rename, because they happen to be using Perl for things other than dealing with the filesystem, but the existence of that function clobbers a useful name.

Rather than fighting over it, it strikes me as much more sensible to have a module for filesystem operations and another for relational operations, then users can import the functions that they actually use.

Note that being in a module doesn't (necessarily) mean 'not distributed with core Perl'.

> 1. join() aka natural_join():

Remember that Perl already has a join function, for joining strings.

Smylers
Re: request new Mapping|Hash operators
On Tue, Feb 27, 2007 at 12:18:20AM -0800, Darren Duncan wrote:
> 4. rename():
>
> function rename of Mapping (Mapping $m, Str $old_k, Str $new_k) {
> ... }
>
> This operator takes one Mapping argument and derives another

rename is a Perl 5 builtin. I didn't think that it had been dropped for Perl 6.

Nicholas Clark
Re: request new Mapping|Hash operators
Darren Duncan writes: > I believe that there is some room for adding several new convenience > operators or functions to Perl 6 that are used with Mapping and Hash > values. > I also want to emphasize that I see this functionality being generally > useful, and that it shouldn't just be shunted off to a third-party > module. Um, why not? Or rather, why do these need to be part of standard Perl 6.0.0? Even assuming that they are "generally useful", the volunteers donating their time to implement Perl 6.0.0 shouldn't feel compelled to build every last feature that might be considered generally useful. (As it happens, I'm not entirely convinced that these operations are generally useful in the same way as, say, multiplication, or string concatenation, or cross hypering, but I think that's a side issue.) I think it would be reasonable for someone who believes that these operations are generally useful to attempt to write a Perl 6 module that provides them. If that effort goes well, maybe the module will be included in Perl 6.0.0. If the module can't be written, or can't be made efficient, that is presumably interesting to the designers of the language and of its implementation(s). But "I need these operations" does not imply "Perl 6.0.0 needs these operations". -- Aaron Crane