On Sun, Nov 01, 2009 at 01:23:52PM -0800, CityDev scratched on the wall: > > Darren Duncan wrote: > > > > Or at least it is in the version of the relational model > > that allows non-scalar attribute values, but that is the one that Chris > > Date et al, as well as myself ascribe to. > > I didn't read this through but I recall Chris Date defining a relational > database as formed from relations normalised to 1NF. First Normal Form > basically means no substructure ie each column is a single value from a > scalar domain.
> To say it would be nice to have structured data is ignoring > two things: i) we've got this for just fine without it, and ii) SQL is a > first-order language so it can't easily handle substructure. Let me address these two points first. As for "i", then please explain to my why nearly every SQL database supports array values? I'm not sure it is in the standard, but it is so common it might as well be (for those wondering, no, SQLite does not support arrays). The answer for "ii" is easy: SQL is not the Relational Model, so the only point you might be making is that SQL makes for a poor relational language. Now back to your first paragraph. This is the more interesting one. Executive Summary: Multi-valued attributes (column types) are not allowed in the Relational Model. Relational-Valued Attributes (tables in tables) are OK, however, because they don't count. Now a few pages on why.... The term used by Date (and Codd) is "atomic", not scalar. Of course, this is generally meaningless, as it is very hard to define what makes a value "atomic." Are text values atomic? The SQL standard includes a number of functions to extract, match, and manipulate sub-strings. What about integers? SQLite includes bit operations that can be applied to integer values. Does that make them non-atomic? In "Database In Depth", Date eventually just gives up trying to define "atomic" in a formal sense, says it cannot and should not be done. In his writings, Date points out that even atoms are not "atomic," and can be broken down into quarks-- but that the idea of atomic values is still a good and powerful one, even if it is inexact. Given all that, it might come as a surprise that Codd and Date *do* support the idea that RVA (Relation-Valued Attribute) fit into the Relational Model. A few points on that. First, the whole concept of 1NF is from the SQL world. To put a table into 1NF is basically to make it mimic the requirements of a relation. Under the Relational Model, it is basically impossible to have a valid Relation that is not in 1NF. If it isn't in 1NF, it isn't a relation. So why allow RVAs? Well, to start with, under the Relational Model relations are atomic. Yes, really. Under the formal Model you're not allowed to change or alter any sub-value of a relation. A relational value is what it is, and the whole state of the relation is what represents its value. By taking the product of the column type domains, you can compute a known, finite domain for every possible value for a given relation (relations don't allow duplicate rows, so the number of rows is bound). That's typically a *very* large domain, but since it is known and finite, it is a valid type domain. As an extension of that idea, when you alter a column or row under the Model you're not changing a sub-part of the relation, you're replacing the whole relational value with a new value. INSERT doesn't add a new row to an existing relation, it computes a new relation value that is the UNION of the old relation and the new row. Similar manipulations can apply the UPDATE and DELETE, but in all cases you're not altering a relation, you're computing a new one. So, in terms of the model that defines both what a relation is and what it can contain, relations are atomic-- and from that RVAs are naturally allowed. At this point, if you're raising an eyebrow and thinking that sounds like a pretty questionable slight of hand in the formal definition of atomic values, I'd be quick to agree. But here's another point that is, perhaps, more important: Any relation with a RVA can be decomposed into a non-RVA relation. And, just as importantly, can be re-composed back into the original RVA. That is, you can "round trip" in and out of a RVA representation <IMPORTANT!>without any loss of data or meaning</IMPORTANT!>. One can make a strong argument that RVAs are just a different way to organize the data "on paper", but in the end the addition (or removal) of RVA doesn't alter what the Model is capable of representing or doing. This works the same way that any static multi-dimensional array type can be reduced to a one-dimensional array type without loss of functionality. And it is *this* point that separates RVAs from storing arrays, lists, or other "multi-value" containers as a row/column value. Exactly why this is true would take pages and pages to explain, but basically boils down to the idea that the only thing that can flawlessly represent a relation without adding or losing state or informational value is... (surprise!) another relation. On the details of this you're just going to have to trust me. Or, better yet: don't. Pick up a copy of "Database In Depth" yourself. You're likely to look at databases and database design in a whole new light. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users