Re: relational data models and Perl 6

Darren Duncan Thu, 15 Dec 2005 00:13:03 -0800

At 2:54 AM +0000 12/15/05, Luke Palmer wrote:

On 12/15/05, Darren Duncan <[EMAIL PROTECTED]> wrote:

 I propose, perhaps redundantly, that Perl 6 include a complete set of
 native


Okay, I'm with you here.  Just please stop saying "native" and "core".
 Everyone.

Yes, of course. What I meant was that I considered relational data important enough for common programming to be considered by the Perl 6 language designers, so that the language allows for it to be elegantly represented and processed. The implementation details aren't that important.

I would like to hear from Ovid and Dave Rolsky on this issue too, as
they seem to have been researching pure relational models.

As am I now. My own database access framework in development is evolving to be centered more around an ideal relational model rather than simply what SQL or existing databases define. It does any serious database developer good to be familiar with what the relational model actually says, and not just what tangential things have actually been implemented by various vendors. The sources I cited are good reference and/or explanatory materials.

 > Essentially it comes down to better handling of data sets.

Cool.  I've recently been taken by list comprehensions, and I keep
seeing "set comprehensions" in my math classes.  Maybe we can steal
some similar notation.

You probably could; the terms used in relational theory are mostly or entirely from mathematics. (I could stand to learn more about those maths too.)

Hmm.  I would say it's a hash not so much.  For instance, the
difference between an array and a tuple in many languages is that an
array is homogeneously-typed--that's what allows you to access it
using runtime values (integers).  Tuples are heterogeneously-typed, so
you can't say

    my $idx = get_input();
    say $tuple[$idx];

(Pretend that Perl 6 is some other language :-), because the compiler
can't know what type it's going to say.

In the same way, I see a hash as homogeneously-typed, because you can
index it by strings.  What you're referring to as a tuple here would
be called a "record" or a "struct" in most languages.

Yes, you are right; a Tuple is very much a "record" or a "struct"; I just didn't use those because Perl doesn't have them per se; the closest thing that Perl has is the "object", which you could say is exactly equivalent.

 >   * a Relation is an unordered set of Tuples, where every Tuple has

 the same definition, as if the Relation were akin to a specific Perl
 class and every Tuple in it were akin to a Perl object of that class


When you say "unordered set" (redundantly, of course), can this set be
infinite?  That is, can I consider this relation (using made-up set
comprehension notation):

    { ($x,$y) where $x & $y (in) Int, $x <= $y }

And do stuff with it?

Yes you can. A set can be infinite. For example, the set of INTEGER contains every whole number from negative infinity to positive infinity. At the same time, this set excludes all fractional numbers and all data that is not a number, such as characters. This only becomes finite when you place bounds on the range, such as saying it has to be between +/- 2 billion.
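
To make the infinite case concrete, here is a minimal sketch in Python rather than Perl 6. It assumes an infinite relation is held as a membership test plus a lazy enumerator instead of as stored tuples; the names in_relation and pairs_from_zero are made up for illustration, and the enumerator covers only the non-negative part of the relation.

```python
from itertools import count, islice

def in_relation(x, y):
    """Membership test for { (x, y) : x, y in Int, x <= y }."""
    return isinstance(x, int) and isinstance(y, int) and x <= y

def pairs_from_zero():
    """Lazily yield the non-negative pairs of the relation.

    Pairs are visited in order of increasing x + y, so every pair is
    reached after finitely many steps even though the set is infinite.
    """
    for total in count(0):
        for x in range(total + 1):
            y = total - x
            if x <= y:
                yield (x, y)

# Only the requested prefix is ever computed.
first_five = list(islice(pairs_from_zero(), 5))
```

Nothing is materialized until a consumer asks for tuples, which is one way "doing stuff" with an unbounded relation could stay tractable.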

 > Specifically what I would like to see added to Perl, if that doesn't

 already exist, is a set of operators that work on Relations, like set
 operations, such as these (these bulleted definitions from "Database
 in Depth", 1.3.3, some context excluded):

   * Restrict - Returns a relation containing all tuples from a
 specified relation that satisfy a specified condition. For example,
 we might restrict relation EMP to just the tuples where the DNO value
 is D2.

Well, if we consider a relation to be a set, then we can use the set operations:


    my $newrel = $emp.grep: { .DNO === 'D2' };

I don't know what EMP, DNO, and D2 are...

Part of the context I excluded before, from section 1.3.1, is that the author is talking about hypothetical DEPT (Department) and EMP (Employee) relations (tables); DEPT has the attributes [DNO, DNAME, BUDGET], and EMP has the attributes [ENO, ENAME, DNO, SALARY]; DEPT.DNO is referenced by EMP.DNO; DEPT.DNO and EMP.ENO are primary keys in their respective relations.


So the restrict example is like, as you said, but with EMP an object:

  my $NEWREL = $EMP.grep:{ $_.DNO eq 'D2' };

A SQLish equivalent would be:

  INSERT INTO NEWREL SELECT * FROM EMP WHERE DNO = 'D2';
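
For comparison, a minimal sketch of Restrict in Python rather than Perl 6, assuming a relation is modelled as a set of named tuples. The helper name restrict and the sample rows are made up for illustration; they are not from the book.

```python
from collections import namedtuple

# EMP heading as described in the thread: [ENO, ENAME, DNO, SALARY]
Emp = namedtuple("Emp", ["ENO", "ENAME", "DNO", "SALARY"])

# Hypothetical sample rows, invented for this sketch
EMP = {
    Emp("E1", "Ann", "D1", 40000),
    Emp("E2", "Bob", "D1", 42000),
    Emp("E3", "Cid", "D2", 30000),
    Emp("E4", "Dee", "D2", 35000),
}

def restrict(relation, condition):
    """Return the tuples of `relation` that satisfy `condition`."""
    return {t for t in relation if condition(t)}

NEWREL = restrict(EMP, lambda t: t.DNO == "D2")
```

The result is itself a set of tuples with the same heading, so it can be fed straight into further relational operators.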

 >   * Project - Returns a relation containing all (sub)tuples that

 remain in a specified relation after specified attributes have been
 removed. For example, we might project relation EMP on just the ENO
 and SALARY attributes.


Hmm...  Well, if we pretend that records and hashes are the same thing
for the moment, then:

    my $newrel = $emp.map: { .:<ENO SALARY> };

(See the new S06 for a description of the .: syntax)


Or with EMP an object:

  my $NEWREL = $EMP.map:{ $_.class.new( ENO => $_.ENO, SALARY => $_.SALARY ) };

SQLish:

  INSERT INTO NEWREL (ENO, SALARY) SELECT ENO, SALARY FROM EMP;
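
A minimal sketch of Project in Python rather than Perl 6, again assuming a relation is a set of named tuples. The helper name project and the sample rows are made up for illustration. Because the result is a set, duplicate subtuples collapse automatically, which a plain SQL SELECT without DISTINCT does not do.

```python
from collections import namedtuple

Emp = namedtuple("Emp", ["ENO", "ENAME", "DNO", "SALARY"])

# Hypothetical sample rows, invented for this sketch
EMP = {
    Emp("E1", "Ann", "D1", 40000),
    Emp("E2", "Bob", "D1", 42000),
    Emp("E3", "Cid", "D2", 30000),
    Emp("E4", "Dee", "D2", 35000),
}

def project(relation, attrs):
    """Keep only the attributes named in `attrs`, dropping duplicates."""
    Sub = namedtuple("Sub", attrs)
    return {Sub(*(getattr(t, a) for a in attrs)) for t in relation}

NEWREL = project(EMP, ["ENO", "SALARY"])
DEPTS = project(EMP, ["DNO"])  # four input tuples collapse to two
```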

 >   * Join - Returns a relation containing all possible tuples that are

 a combination of two tuples, one from each of two specified
 relations, such that the two tuples contributing to any given result
 tuple have a common value for the common attributes of the two
 relations (and that common value appears just once, not twice, in
 that result tuple).  NOTE, This kind of join was originally called
 the natural join. Since natural join is far and away the most
 important kind, however, it's become standard practice to take the
 unqualified term join to mean the natural join specifically, and I'll
 follow that practice in this book.


I would totally tell you what the best way to do this was if I could
understand what the hell it was talking about.  Examples?  Proposals?


Here's the example in the book (section 1.3.3):

The natural join of ...

 +--+--+
 |a1|b1|
 |a2|b1|
 |a3|b2|
 +--+--+

... and ...

 +--+--+
 |b1|c1|
 |b2|c2|
 |b3|c3|
 +--+--+

... is ...

 +--+--+--+
 |a1|b1|c1|
 |a2|b1|c1|
 |a3|b2|c2|
 +--+--+--+

SQLish is:

  INSERT INTO <foo> SELECT * FROM <bar> NATURAL INNER JOIN <baz>;
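
The same example as a minimal sketch in Python rather than Perl 6. The book's tables carry no attribute names here, so the names A, B and C are assumptions, as are the helper names rel and natural_join. Each tuple is modelled as a frozenset of (attribute, value) pairs so that tuples are hashable and attribute order does not matter.

```python
def rel(attrs, rows):
    """Build a relation: a set of tuples, each a frozenset of (attr, value)."""
    return {frozenset(zip(attrs, row)) for row in rows}

def natural_join(r, s):
    """Combine every pair of tuples that agree on all common attributes."""
    result = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            common = dt.keys() & du.keys()
            if all(dt[a] == du[a] for a in common):
                # Merging the dicts keeps each common attribute just once.
                result.add(frozenset({**dt, **du}.items()))
    return result

R = rel(("A", "B"), [("a1", "b1"), ("a2", "b1"), ("a3", "b2")])
S = rel(("B", "C"), [("b1", "c1"), ("b2", "c2"), ("b3", "c3")])
J = natural_join(R, S)
```

This nested-loop version is quadratic in the input sizes; it is meant to show the definition, not an efficient implementation.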

 >   * Divide - Takes two relations, one binary and one unary, and

 returns a relation consisting of all values of one attribute of the
 binary relation that match (in the other attribute) all values in the
 unary relation.


Again, I don't follow.

First of all, for language, an n-ary relation is a relation having n attributes; a binary relation is like a table with 2 columns, and a unary relation is like a table with 1 column.


This is the example the book gives (section 1.3.3):

If you take ...

 +-+-+
 |a|x|
 |a|y|
 |a|z|
 |b|x|
 |c|y|
 +-+-+

... and divide it by ...

 +-+
 |x|
 |z|
 +-+

... the result is ...

 +-+
 |a|
 +-+

I'm not sure if Divide has an equivalent in SQL.
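
As far as I know, SQL has no dedicated divide operator, and the usual workaround is a doubly nested NOT EXISTS query. Here is a minimal sketch of the operation itself in Python rather than Perl 6, assuming the binary relation is a set of pairs and the unary relation is a set of plain values; the helper name divide is made up for illustration.

```python
def divide(binary, unary):
    """Return each first-attribute value paired with every value in `unary`."""
    candidates = {a for (a, _) in binary}
    return {a for a in candidates if all((a, v) in binary for v in unary)}

# The example above: only 'a' is paired with both 'x' and 'z'.
pairs = {("a", "x"), ("a", "y"), ("a", "z"), ("b", "x"), ("c", "y")}
divisor = {"x", "z"}
quotient = divide(pairs, divisor)
```

Note that with an empty divisor this sketch returns every first-attribute value, since the "matches all values" condition is then vacuously true.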

-- Darren Duncan
