relational data models and Perl 6

Darren Duncan Wed, 14 Dec 2005 17:19:48 -0800

All,

P.S. What follows is rough and will be smoothed out or reworked.

I propose, perhaps redundantly, that Perl 6 include a complete set of native language constructs for a relational data model, akin to that introduced in E. F. Codd's classic paper, "A Relational Model of Data for Large Shared Data Banks" (a copy of which is at http://www.acm.org/classics/nov95/toc.html ), and also discussed at length in such books as C. J. Date's "Database in Depth" (O'Reilly, 2005). Codd's paper itself (see 1.5) says that the necessary pieces are good candidates for a sub-language of any typical programming language.

The actual relational data model (which is not the same as SQL per se) is expressible in terms of mathematics, such as sets and predicate calculus, and therefore I believe that Perl 6 already has most of what is needed in the language.


Essentially it comes down to better handling of data sets.

It is very possible, then, that all that is necessary is an extension of the standard data types, operators, or built-in functions, and/or utilization of the Perl 6 object model.

What I would like, for example, are standard data types which are akin to Relations/RelVars/etc (tables/rowsets), Tuples (rows), Attributes (fields), Sets (enums), Domains (data types) and such. Largely these already map to existing Perl 6 entities:

* a Domain is like a class that defines a set of possible values, and each value can be multi-part; equal to a Perl Class

* an Attribute stores a value which is a Perl Object

* a Tuple is an associative array having one or more Attributes, and each Attribute has a name or ordinal position and is typed according to a Domain; this is like a restricted Hash in a way, where each key has a specific type

* a Relation is an unordered set of Tuples, where every Tuple has the same definition, as if the Relation were akin to a specific Perl class and every Tuple in it were akin to a Perl object of that class
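
The mapping above can be sketched in a few lines. This is only an illustration, written in Python rather than Perl 6 so that it can be run as-is; the class name Relation, its heading and body fields, and the sample EMP values are all assumptions made for the example, not part of any proposed Perl 6 interface. Domains are stood in for by plain Python types.

```python
class Relation:
    """Hypothetical sketch: a heading (attribute name -> domain)
    plus an unordered, duplicate-free set of tuples."""

    def __init__(self, heading, rows=()):
        # Attribute name -> domain; a domain is just a Python type here.
        self.heading = dict(heading)
        body = set()
        for row in rows:
            # Every tuple must have exactly the attributes of the heading.
            if set(row) != set(self.heading):
                raise ValueError(f"attributes {sorted(row)} do not match heading")
            # Every attribute value must belong to its declared domain.
            for name, domain in self.heading.items():
                if not isinstance(row[name], domain):
                    raise TypeError(f"{name}={row[name]!r} is not a {domain.__name__}")
            # Freeze the tuple so it is hashable; duplicates collapse in the set.
            body.add(frozenset(row.items()))
        self.body = frozenset(body)

    def __len__(self):
        return len(self.body)

    def tuples(self):
        """Return the tuples as plain dicts, in no particular order."""
        return [dict(t) for t in self.body]


# Sample data invented for illustration only.
EMP = Relation(
    {"ENO": str, "DNO": str, "SALARY": int},
    [
        {"ENO": "E1", "DNO": "D1", "SALARY": 40000},
        {"ENO": "E2", "DNO": "D2", "SALARY": 42000},
        {"ENO": "E2", "DNO": "D2", "SALARY": 42000},  # exact duplicate of the previous tuple
    ],
)
```

Because the body is a set, the repeated E2 tuple is stored once, which is the main behavioural difference from an ordinary array of hashes.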


Fairly standard so far.

Specifically, what I would like to see added to Perl, if it doesn't already exist, is a set of operators that work on Relations, like set operations, such as these (these bulleted definitions are from "Database in Depth", 1.3.3, some context excluded):

* Restrict - Returns a relation containing all tuples from a specified relation that satisfy a specified condition. For example, we might restrict relation EMP to just the tuples where the DNO value is D2.

* Project - Returns a relation containing all (sub)tuples that remain in a specified relation after specified attributes have been removed. For example, we might project relation EMP on just the ENO and SALARY attributes.

* Product - Returns a relation containing all possible tuples that are a combination of two tuples, one from each of two specified relations. Product is also known variously as cartesian product, cross product, cross join, and cartesian join (in fact, it is just a special case of join, as we'll see in Chapter 5).

* Intersect - Returns a relation containing all tuples that appear in both of two specified relations. (Actually, intersect also is a special case of join.)

* Union - Returns a relation containing all tuples that appear in either or both of two specified relations.

* Difference - Returns a relation containing all tuples that appear in the first and not the second of two specified relations.

* Join - Returns a relation containing all possible tuples that are a combination of two tuples, one from each of two specified relations, such that the two tuples contributing to any given result tuple have a common value for the common attributes of the two relations (and that common value appears just once, not twice, in that result tuple). NOTE: This kind of join was originally called the natural join. Since natural join is far and away the most important kind, however, it's become standard practice to take the unqualified term join to mean the natural join specifically, and I'll follow that practice in this book.

* Divide - Takes two relations, one binary and one unary, and returns a relation consisting of all values of one attribute of the binary relation that match (in the other attribute) all values in the unary relation.
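
To make the eight definitions above concrete, here is one possible sketch of them as plain functions. It is in Python rather than Perl 6, purely so it can be run today, and every name in it (rel, attrs, restrict, project and so on) is an assumption chosen for the example. A relation is modelled as a frozenset of tuples, each tuple being a frozenset of (attribute, value) pairs; one known simplification is that an empty relation carries no heading. The EMP and DEPT values are invented sample data.

```python
def rel(*rows):
    """Build a relation: an unordered, duplicate-free set of tuples.
    Each tuple is a dict frozen into a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)

def attrs(r):
    """Attribute names used in r (an empty relation carries no heading here)."""
    return {name for t in r for name, _ in t}

def restrict(r, condition):
    """Keep the tuples for which condition(tuple_as_dict) is true."""
    return frozenset(t for t in r if condition(dict(t)))

def project(r, names):
    """Keep only the named attributes; duplicate (sub)tuples collapse."""
    keep = set(names)
    return frozenset(frozenset(p for p in t if p[0] in keep) for t in r)

def join(r, s):
    """Natural join: combine tuple pairs that agree on every common attribute."""
    out = set()
    for t in r:
        for u in s:
            a, b = dict(t), dict(u)
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.add(frozenset({**a, **b}.items()))
    return frozenset(out)

def product(r, s):
    """Cartesian product: the special case of join with no common attributes."""
    if attrs(r) & attrs(s):
        raise ValueError("product requires disjoint attribute names")
    return join(r, s)

def union(r, s):
    return r | s

def intersect(r, s):
    return r & s

def difference(r, s):
    return r - s

def divide(r, s, x, y):
    """r has attributes x and y, s has only y: return the x values
    that are paired in r with every y value found in s."""
    wanted = {dict(u)[y] for u in s}
    paired = {}
    for t in r:
        d = dict(t)
        paired.setdefault(d[x], set()).add(d[y])
    return rel(*({x: xv} for xv, ys in paired.items() if wanted <= ys))


# Sample data invented for illustration only.
EMP = rel(
    {"ENO": "E1", "DNO": "D1", "SALARY": 40000},
    {"ENO": "E2", "DNO": "D2", "SALARY": 42000},
    {"ENO": "E3", "DNO": "D2", "SALARY": 30000},
)
DEPT = rel(
    {"DNO": "D1", "DNAME": "Marketing"},
    {"DNO": "D2", "DNAME": "Development"},
)

d2_staff = restrict(EMP, lambda t: t["DNO"] == "D2")   # the book's Restrict example
pay = project(EMP, ["ENO", "SALARY"])                  # the book's Project example
emp_dept = join(EMP, DEPT)                             # matched on the common DNO
```

Union, intersect and difference fall straight out of the host language's set operators, which is in line with the point that most of the machinery is already present; restrict, project, join and divide are the parts that need to know about attribute names.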

Now, all that I'm describing could be implemented as a Perl 6 module, and if necessary I can do this for illustrative purposes, but I believe that this is essentially simple and that something analogous should be included in the core language, for reasons similar to why junctions and PDL are.

I also want to make clear that this functionality is entirely about better support for data processing with Perl native variables, and has nothing to do with external data repositories such as SQL databases. That said, I anticipate that one could extend or override the built-ins so that they interact with remote databases instead of internal variables, such as through sub-classing, role reuse, or tying.
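
The sub-classing idea in the last sentence might look roughly like this. Again this is Python rather than Perl 6, and the classes MemoryRelation and SqlRelation, the restrict method, and the emp table are all hypothetical names invented for the sketch; the standard sqlite3 module with an in-memory database stands in for a remote database.

```python
import sqlite3

class MemoryRelation:
    """Hypothetical in-memory relation: tuples are dicts held in a list."""

    def __init__(self, rows):
        self.rows = list(rows)

    def restrict(self, attr, value):
        """Return the tuples whose attribute `attr` equals `value`."""
        return [r for r in self.rows if r[attr] == value]


class SqlRelation(MemoryRelation):
    """Same interface, but restrict() is answered by a SQL database."""

    def __init__(self, conn, table, attrs):
        # `table` and `attrs` are assumed to come from trusted code,
        # because identifiers cannot be bound as SQL parameters.
        self.conn, self.table, self.attrs = conn, table, tuple(attrs)

    def restrict(self, attr, value):
        if attr not in self.attrs:
            raise KeyError(attr)
        cols = ", ".join(self.attrs)
        sql = f"SELECT DISTINCT {cols} FROM {self.table} WHERE {attr} = ?"
        return [dict(zip(self.attrs, row)) for row in self.conn.execute(sql, (value,))]


# Sample data invented for illustration only.
rows = [
    {"eno": "E1", "dno": "D1", "salary": 40000},
    {"eno": "E2", "dno": "D2", "salary": 42000},
    {"eno": "E3", "dno": "D2", "salary": 30000},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eno TEXT, dno TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(r["eno"], r["dno"], r["salary"]) for r in rows],
)

in_memory = MemoryRelation(rows)
in_database = SqlRelation(conn, "emp", ["eno", "dno", "salary"])
```

Calling code uses restrict("dno", "D2") on either object without knowing where the tuples live, which is the kind of substitution the paragraph anticipates.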


Thank you. -- Darren Duncan
