On Tue, Nov 03, 2009 at 02:51:23AM -0800, CityDev scratched on the wall: > > Whilst it's true that SQL isn't essential for a relational database
More to the point, any database that supports SQL does not, and cannot, support the data typing and data manipulation rules set out by the Relational Model. The Relational Model defines a set of structure and manipulation rules. It is a mathematical system. It does not define any kind of query language or symbolic expression system-- that is outside of the scope of the Model. In fact, almost by definition a query language has to be outside of the scope of the Model in order to be practical (for example, some kind of ORDER BY option in a query). The SQL standard defines not only the language syntax, but also defines it's own data typing and data manipulation rules. It defines it's own mathematical system-- kind of. The data manipulation environment defined by SQL is not really rigid or self-consistent enough to call a "mathematical system." The SQL system is strongly based off the Relational Model, but it is fundamentally incompatible in many small but important ways. So even if we ignore "SQL the language", what we have left in "SQL the data manipulation environment" is not the Relational Model. Siblings, perhaps, but not the same. So when you said, "SQL is a first-order language so it can't easily handle substructure," as a point of argument, my feeling is that A) that might be true, B) but it doesn't matter, C) and it really doesn't matter. A) Sure, why not. SQL can't easily handle a lot of the things it tries to do. But since the computational environment is not defined by the language, it B) doesn't really matter. Just because my fancy calculator doesn't have a way to express and generate 4D graphs doesn't mean 4D geometry doesn't exist as a full and mature system, and isn't useful for somethings. It only means I need a better calculator (with a *way* better display...) or I need to do it "by hand" using the rules of the environment directly. But, C) it really doesn't matter, because the context of the discussion was the Relational Model, and you can't really make Relational Model arguments when thinking in SQL. Like one of my computational professors used to say, "When you argue in English, you're wrong." It's kind of the same thing. You can't really make Relational Model arguments in SQL. > I'm interested in your remark that relational databases now cope with > 'arrays'. Didn't say that. I said several SQL databases do. Although looking through some documentation, it looks like that number is a bit less than I first thought. > That just seems so contrary to the original idea of the relational > model that you shouldn't have any data whose meaning is not defined > by data (in the case of an array you need to understand the > significance of relative position - remember relations have no row > or column order to stop you playing that game). Yes, EXACTLY. And so it is for nearly any other compound datatype. Arrays have implied ordering. You can't JOIN against an array without extracting meta-data like that. Even if you do that, you have to pair that data together and carry it through any manipulation. If the data gets separated or lost, both sets of data becomes worthless. Not Relational. Same is true of any other container with an inherent ordering or listing (which is most of them). What about sets? The Relational Model has a lot of set theory in it, but it isn't "pure" set theory. For example, many "true" set operations will result in sets of sets. Those operations in the Relational Model (applied to the sub-sections of relations) will always be "flattened" to just sets of elements. So sets don't work within the Relational Model unless you modify their behavior. If you do that, you no longer have "real" sets. You basically have... a relation. Or something that acts like a relation in pretty much every way that counts. So now you have a RVA, and we know those don't count because they can be seamlessly transformed into a non-RVA representation without losing data or meaning (or generating meta-data, which is what I meant by "meaning"). And, BTW, SQL does consider table/result-set columns to be ordered. One more fundamental difference that causes all kinds of headaches for query optimizers, because of issues like technically A NATURAL JOIN B is NOT equal to B NATURAL JOIN A. > So I still go back to point i) - you don't need to do this. I hope not, as my main point was that I feel they aren't allowed under the Relational Model. I agree with your thoughts, I just think there is a more rigid way of showing it to be true. But that doesn't mean people don't want them. So why are they there? Most SQL people aren't all that aware of the theory behind database manipulation. That's fine, they're just trying to get work done. But it means they're not always "thinking in sets" so they're just trying to find some way to get what they need to get done in a way that they feel makes sense. Seriously, consider how many databases out there are nothing more than over-glorified CSV files. When all that is coupled with the fact that SQL has traditionally been an "after the fact" standard (e.g. it documents what the market has already been doing), most database companies are looking for features and technology to make their development customers happy (== life easier) and there isn't a huge concern if something is pure in theory or not. They only care that some database programmer might decide to buy their product because it supports some feature that fits the developer's way of thinking, like array attributes. Using array attributes screams, "I don't understand JOINs." And you see them, or some likeness of them (like CSV strings), all the time. That's part of the reason I find all these Relational Model discussions interesting, and why I hope people on the mailing list put up with them. I'm a full-time software engineer. I build products, I don't do research or theory. But I believe strongly in ripping boxes apart and understanding how they work before I start to build with them. If you're a budding database programmer, the best thing you can do is go out and learn some theory. It will help with your work, your data modeling, your query design. > All you are doing is collapsing a join and computers are so > powerful nowadays that you should treat that as an unusual exception. It doesn't matter how powerful computers are. What matters is that the mathematics support it as a valid operation. The Relational Model supports a lot of things that border on computationally impossible. Practical implementation details are the source of many of the differences between the Model and SQL. The point you're getting at is still valid, however. One aspect of the Model is that a system can *physically* store the data in all kinds of interesting ways because all the transforms are 100% reversible. You don't really need to store the table, as defined, in any physical way as long as you can compute it from the data you've stored. RVAs, as a data modeling idea, fit right into that because the only real difference between an RVA and a non-RVA is how you write the data out on the display. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users