I left the discussion alone until some people started talking about both directions, although we do help people go both ways.
I actually had a customer at one time who had migrated off a mainframe. They had so much data that when you looked at the disk-head movement, average access time, and volume of data that they had to process, they were able to prove mathematically that they could not, with the technology that was then available (and I think still not) get away with a FNF relational setup. They had to use a MV approach, and decided on Unidata. I recall having to do indexed SELECT statements against 30 GB files and doing intersect merges to combine indexes because LIST/SORT/SELECT/SSELECT would use the first indexed field you referenced, and then ignore all other indexes, brute forcing the rest of the selection criteria. With a 30 GB file, that was never an option. And yes, if you throw enough joins at SQL Server it gives up optimizing, but lately, you have to really make it complex before it gives up. I haven't tried it recently, so maybe Unidata handles indexes better these days. Maybe Universe does. But the last time I tried doing anything significant using indexes on large U2 files, it got tricky. My point is that an application using SQL Server, Oracle or some other relational engine, often uses indexes in a way that may not port easily to MV. They also work differently. One designs Relational data differently than one designs MV data. So maybe instead of saying of indexes on MV that "they don't work as well", I should have said "they solve different problems in different ways". I agree with your assessment of the Relational approach having issues with "reality". They intentionally abstract out reality, and the programmer gets to reconstitute it. One of the things that I've noticed is that when the data that describes an object is in multiple tables, as with a FNF relational setup, as soon as you want to work with it in an application, you effectively have to create internal structures to work with the "whole item" anyways. But now you're making a programmer do it, using variables, instead of allowing the database to do it for you. On the flip side, doing data mining is much easier when the data is fully normalized and duplicates are eliminated... Having worked on both sides of the house, there are decidedly times when each one excels, and tasks for which each model is best. I will, in the same day, program PICK/BASIC, Java, C#, C++ and JavaScript and probably at least one other language. The power of MV for rapidly working with data is amazing. The fact that I can create a file and start to use it (it's my problem to make sure I use it consistently) without having to jump through hurdles to configure everything I need in it is both inspiring and a bit scary. Two programmers can decide to grab the same next field and start using it and if they don't happen to compare notes, they'll get away with it, with devastating results when they deploy their changes. This doesn't happen in the relational world, but then, the hurdles you go through to work with data are considerably more daunting. In the end, where databases are concerned, there is no substitute for good architecture, design and planning. And while you're at it, design for flexibility: You'll almost certainly get some things *wrong* the first time around! -----Original Message----- From: [email protected] [mailto:[email protected]] On Behalf Of Wols Lists Sent: Thursday, December 23, 2010 4:28 PM To: [email protected] Subject: Re: [U2] Migration On 24/12/10 00:07, Robert Houben wrote: > I've been watching this thread with some interest. Because I'm going to > reference our product, I'm putting th [AD] marker on this. > > One of our best-selling products assists our customers in rapid > migration/data warehousing of Multivalued and Subvalued data to either SQL > Server, Oracle, DB2, Progress, MySQL and other relational databases. > > We actually have a whole suite of tools to assist our customers in solving > these issues. Except that's not the issue that the OP posed :-) He wanted to migrate AWAY from SQL server etc :-) > In order to go the other direction, there are some inescapable questions that > have to be answered: > > MV can only support 2 levels of nesting. If you have 3 or more 1-to-many > relationships, you have to decide at some point to keep a set of keys > (item-ids) in a multivalued or subvalued cross-reference and use another > file. Deciding where and when to do this becomes the tricky thing. > This, imho, is where the relational guys have a problem with reality. Yes I know that decision is HARD. So the relational guys abstract it away. The cost is that relational databases are not deterministic. It becomes a case of "yes there is definitely a solution, and I will definitely find it, but it may take longer than the life of the universe to get there" :-) (And I'm only *half* joking). THAT is the reason that MV engines are *always* faster than relational engines. Einstein said "don't make things *too* simple". The relational guys have done exactly that! > SQL uses indexes. MV uses cross references to item-ids (MV sometimes > supports indexes, but they don't always work as well as in the > relational world.) > I don't know as that is true ... or are you using the word "index" to mean something completely different to me? I'll agree the implementation of indices can be buggy, but surely that's true of relational engines too? > There are other issues, but that's a good starting point. If you really want > to explore the concept there is no substitute for playing with it, and you'll > discover there are more issues lurking there... Yup ... Cheers, Wol _______________________________________________ U2-Users mailing list [email protected] http://listserver.u2ug.org/mailman/listinfo/u2-users _______________________________________________ U2-Users mailing list [email protected] http://listserver.u2ug.org/mailman/listinfo/u2-users
