At 2:54 AM +0000 12/15/05, Luke Palmer wrote:
On 12/15/05, Darren Duncan <[EMAIL PROTECTED]> wrote:
 I propose, perhaps redundantly, that Perl 6 include a complete set of
 native

Okay, I'm with you here.  Just please stop saying "native" and "core".
 Everyone.

Yes, of course. What I meant was that I considered relational data important enough for common programming to be considered by the Perl 6 language designers, so that the language allows for it to be elegantly represented and processed. The implementation details aren't that important.

I would like to hear from Ovid and Dave Rolsky on this issue too, as
they seem to have been researching pure relational models.

As am I now. My own database access framework in development is evolving to be centered more around an ideal relational model rather than simply what SQL or existing databases define. It does any serious database developer good to be familiar with what the relational model actually says, and not just what tangential things have actually been implemented by various vendors. The sources I cited are good reference and/or explanatory materials.

 > Essentially it comes down to better handling of data sets.

Cool.  I've recently been taken by list comprehensions, and I keep
seeing "set comprehensions" in my math classes.  Maybe we can steal
some similar notation.

You probably could; the terms used in relational theory are mostly or entirely from mathematics. (I could stand to learn more about those maths too.)

Hmm.  I would say it's a hash not so much.  For instance, the
difference between an array and a tuple in many languages is that an
array is homogeneously-typed--that's what allows you to access it
using runtime values (integers).  Tuples are heterogeneously-typed, so
you can't say

    my $idx = get_input();
    say $tuple[$idx];

(Pretend that Perl 6 is some other language :-), because the compiler
can't know what type it's going to say.

In the same way, I see a hash as homogeneously-typed, because you can
index it by strings.  What you're referring to as a tuple here would
be called a "record" or a "struct" in most languages.

Yes, you are right; a Tuple is very much a "record" or a "struct"; I just didn't use those because Perl doesn't have them per se; the closest thing that Perl has is the "object", which you could say is exactly equivalent.

 >   * a Relation is an unordered set of Tuples, where every Tuple has
 the same definition, as if the Relation were akin to a specific Perl
 class and every Tuple in it were akin to a Perl object of that class

When you say "unordered set" (redundantly, of course), can this set be
infinite?  That is, can I consider this relation (using made-up set
comprehension notation):

    { ($x,$y) where $x & $y (in) Int, $x <= $y }

And do stuff with it?

Yes, you can. A set can be infinite. For example, the set INTEGER contains every whole number, with no lower or upper bound. At the same time, this set excludes all fractional numbers and all data that is not a number, such as characters. The set only becomes finite when you place bounds on the range, such as saying values have to be between +/- 2 billion.
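To make "do stuff with it" a little more concrete, here is a rough sketch in Python (chosen only because it runs as-is; it is not a proposal for Perl 6 syntax). It assumes an infinite relation is represented by a membership test plus a lazy enumeration. The names `in_relation` and `enumerate_nonneg` are made up for illustration, and the enumeration only walks the non-negative part of your example relation.

```python
from itertools import count, islice

def in_relation(pair):
    """Membership test for { (x, y) : x, y in Int, x <= y }."""
    x, y = pair
    return isinstance(x, int) and isinstance(y, int) and x <= y

def enumerate_nonneg():
    """Lazily yield the pairs with 0 <= x <= y, ordered by y, then x.

    The generator never terminates on its own; callers take a finite slice.
    """
    for y in count(0):
        for x in range(y + 1):
            yield (x, y)

# Take a finite prefix of the infinite enumeration.
first_six = list(islice(enumerate_nonneg(), 6))
```

The point is only that an infinite relation can be tested for membership, or consumed lazily, without ever being materialized in full.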

 > Specifically what I would like to see added to Perl, if that doesn't
 already exist, is a set of operators that work on Relations, like set
 operations, such as these (these bulleted definitions from "Database
 in Depth", 1.3.3, some context excluded):

   * Restrict - Returns a relation containing all tuples from a
 specified relation that satisfy a specified condition. For example,
 we might restrict relation EMP to just the tuples where the DNO value
 is D2.

Well, if we consider a relation to be a set, then we can use the set operations:

    my $newrel = $emp.grep: { .DNO === 'D2' };

I don't know what EMP, DNO, and D2 are...

Part of the context I excluded before, from section 1.3.1, is that the author is talking about hypothetical DEPT (Department) and EMP (Employee) relations (tables); DEPT has the attributes [DNO, DNAME, BUDGET], and EMP has the attributes [ENO, ENAME, DNO, SALARY]; DEPT.DNO is referenced by EMP.DNO; DEPT.DNO and EMP.ENO are primary keys in their respective relations.

So the restrict example is as you wrote it, but with EMP as an object:

  my $NEWREL = $EMP.grep:{ $_.DNO eq 'D2' };

A SQLish equivalent would be:

  INSERT INTO NEWREL SELECT * FROM EMP WHERE DNO = 'D2';
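
For anyone who wants to run something, here is a minimal Python sketch of Restrict under the assumption that a relation is a frozenset of named tuples. The attribute names come from the book's EMP relation as described above; the sample rows and the `restrict` helper are invented for illustration.

```python
from collections import namedtuple

# Attribute names follow the EMP relation described above.
Emp = namedtuple("Emp", ["ENO", "ENAME", "DNO", "SALARY"])

# Invented sample rows, for illustration only.
EMP = frozenset({
    Emp("E1", "Ann", "D1", 40000),
    Emp("E2", "Bob", "D1", 42000),
    Emp("E3", "Cat", "D2", 30000),
    Emp("E4", "Dan", "D2", 35000),
})

def restrict(relation, predicate):
    """Return the tuples of `relation` that satisfy `predicate`."""
    return frozenset(t for t in relation if predicate(t))

# Keep only the employees in department D2.
NEWREL = restrict(EMP, lambda t: t.DNO == "D2")
```

Because the result is again a frozenset of the same tuple type, it can be fed straight into another relational operator.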

 >   * Project - Returns a relation containing all (sub)tuples that
 remain in a specified relation after specified attributes have been
 removed. For example, we might project relation EMP on just the ENO
 and SALARY attributes.

Hmm...  Well, if we pretend that records and hashes are the same thing
for the moment, then:

    my $newrel = $emp.map: { .:<ENO SALARY> };

(See the new S06 for a description of the .: syntax)

Or with EMP an object:

  my $NEWREL = $EMP.map:{ $_.class.new( ENO => $_.ENO, SALARY => $_.SALARY ) };

SQLish:

  INSERT INTO NEWREL (ENO, SALARY) SELECT DISTINCT ENO, SALARY FROM EMP;
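
One thing worth noting about Project: because a relation is a set, the result cannot contain duplicate tuples, so projecting away a key can shrink the number of tuples. A plain map over a list, or a SQL SELECT without DISTINCT, does not do that by itself. Here is a rough Python sketch, assuming a tuple is a frozenset of (attribute, value) pairs; the `tup` and `project` helpers and the sample rows are invented for illustration.

```python
def tup(**attrs):
    """Build a relational tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

# Invented sample rows, for illustration only.
EMP = frozenset({
    tup(ENO="E1", ENAME="Ann", DNO="D1", SALARY=40000),
    tup(ENO="E2", ENAME="Bob", DNO="D1", SALARY=42000),
    tup(ENO="E3", ENAME="Cat", DNO="D2", SALARY=30000),
    tup(ENO="E4", ENAME="Dan", DNO="D2", SALARY=35000),
})

def project(relation, attrs):
    """Keep only the named attributes; duplicates collapse in the result set."""
    return frozenset(
        frozenset((a, v) for a, v in t if a in attrs) for t in relation
    )

# Four distinct (ENO, SALARY) tuples survive, since ENO is a key.
eno_salary = project(EMP, {"ENO", "SALARY"})

# Only two distinct DNO tuples survive out of four source tuples.
departments = project(EMP, {"DNO"})
```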

 >   * Join - Returns a relation containing all possible tuples that are
 a combination of two tuples, one from each of two specified
 relations, such that the two tuples contributing to any given result
 tuple have a common value for the common attributes of the two
 relations (and that common value appears just once, not twice, in
 that result tuple).  NOTE, This kind of join was originally called
 the natural join. Since natural join is far and away the most
 important kind, however, it's become standard practice to take the
 unqualified term join to mean the natural join specifically, and I'll
 follow that practice in this book.

I would totally tell you what the best way to do this was if I could
understand what the hell it was talking about.  Examples?  Proposals?

Here's the example in the book (section 1.3.3):

The natural join of ...

 +--+--+
 |a1|b1|
 |a2|b1|
 |a3|b2|
 +--+--+

... and ...

 +--+--+
 |b1|c1|
 |b2|c2|
 |b3|c3|
 +--+--+

... is ...

 +--+--+--+
 |a1|b1|c1|
 |a2|b1|c1|
 |a3|b2|c2|
 +--+--+--+

SQLish is:

  INSERT INTO <foo> SELECT * FROM <bar> NATURAL INNER JOIN <baz>;
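
The same example can be sketched in Python. The tables above have no column names, so I am assuming the attributes are called A, B and C, with B as the common attribute; the `tup` and `natural_join` helpers are invented for illustration and use a naive nested loop, not anything a real engine would do.

```python
def tup(**attrs):
    """Build a relational tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())

def natural_join(r, s):
    """Combine every pair of tuples that agree on all attributes they share."""
    result = set()
    for t in r:
        for u in s:
            td, ud = dict(t), dict(u)
            common = td.keys() & ud.keys()
            if all(td[a] == ud[a] for a in common):
                # Merging the dicts keeps each common attribute just once.
                result.add(frozenset({**td, **ud}.items()))
    return frozenset(result)

# Attribute names A, B, C are assumed; the values are from the example above.
R = frozenset({tup(A="a1", B="b1"), tup(A="a2", B="b1"), tup(A="a3", B="b2")})
S = frozenset({tup(B="b1", C="c1"), tup(B="b2", C="c2"), tup(B="b3", C="c3")})

joined = natural_join(R, S)
```

Note that the (b3, c3) tuple drops out because nothing on the left has B equal to b3.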

 >   * Divide - Takes two relations, one binary and one unary, and
 returns a relation consisting of all values of one attribute of the
 binary relation that match (in the other attribute) all values in the
 unary relation.

Again, I don't follow.

First of all, as for terminology, an n-ary relation is a relation having n attributes; a binary relation is like a table with 2 columns, and a unary relation is like a table with 1 column.

This is the example the book gives (section 1.3.3):

If you take ...

 +-+-+
 |a|x|
 |a|y|
 |a|z|
 |b|x|
 |c|y|
 +-+-+

... and divide it by ...

 +-+
 |x|
 |z|
 +-+

... the result is ...

 +-+
 |a|
 +-+

I'm not sure if Divide has an equivalent in SQL.
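
For what it's worth, Divide is easy to express directly over sets. Here is a minimal Python sketch of the example above, assuming the binary relation is a set of pairs and the unary relation is a set of plain values; the `divide` helper is invented for illustration.

```python
def divide(binary, unary):
    """Return each left-hand value that is paired with every value in `unary`."""
    candidates = {v for v, _ in binary}
    return frozenset(
        v for v in candidates if all((v, w) in binary for w in unary)
    )

# The values are from the example above.
dividend = {("a", "x"), ("a", "y"), ("a", "z"), ("b", "x"), ("c", "y")}
divisor = {"x", "z"}

quotient = divide(dividend, divisor)
```

Only "a" is paired with both "x" and "z"; "b" is missing "z" and "c" has neither, so they are left out.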

-- Darren Duncan
