On Tue, Oct 05, 2010 at 03:44:38PM +0100, Ian Hardingham scratched on the wall: > Hey Jay, thanks for your feedback. > > I am indeed using (several) delineated lists. I would very much > appreciate your input into how bad a decision this is.
I hesitate to call it a bad decision, so much as it is rather un-database like, and from that standpoint it is a poor database design. It might be an acceptable application design, however. There are kind of two camps on this. If you're more comfortable doing everything in a scripting language or client environment, and your data needs are fairly simple, then there isn't a big concern about the database design. You're more or less using the database as nothing more than a fancy CSV container. While this is, in a sense, missing the point on what a relational database gives you and throwing away a huge amount of expression and processing ability, sometimes that's all you need-- and if you need to make frequent updates and queries, there are still advantages to using a library like SQLite vs. an actual CSV or similar file. Then there is the feeling that if you're going to be using a database, you might as well "do it right" and actually use the database the way it was designed to be used. I tend to agree with that idea, especially when the data needs and structures are anything other than the most trivial one-table design. HOWEVER, I'm well aware that many programmers-- even seasoned, experienced developers-- don't have any kind of background with relational concepts and design. Without a basic understanding of what a database is good at doing (and how to get the database to do such tricks), it is very difficult to design toward the strengths of a relational system. You can't "do it right" if you're not aware what "right" is. I don't really have an answer to that, other than suggesting resources to help you learn. Maybe it's worth your time, maybe it isn't. There are many books and website tutorials on SQL and databases. Some of these are for the serious database professional, while some are very entry level. You might enjoy something like "Using SQLite" <http://oreilly.com/catalog/9780596521189/>, which was specifically written for the experienced developer that, until this point, has never had a need to play around with a database. While it covers the SQLite API and the SQL language, it also tries to provide a brief "get you started" introduction to database design and the basic "design patterns" used in most designs. (I should also point out that I'm somewhat biased in my preference for this book, given that I wrote it.) > So, I basically need to find entrys of Table B that do not appear in > that list. Obviously, it would be better to have a playedInf table and > do some kind of SELECT FROM B NOT IN A query. Returning to the problem at hand, one of the first principals of database design is that each attribute of each row (that is, a column/row "cell") should hold only one value. (There are some that will argue that this "value" can be an array, but that is a point for the theory people to argue over, and I'm not interested in having that discussion.) Having "singular" values is one of the rules for what is called the First Normal Form. You can Google that for more information, if you don't already know what the Normal Forms are all about. A big reason for singular values is that all the built-in techniques for cross indexing data in one table to another table depend on the single values, as that allows the database to match things back up. This is one of the more difficult aspects of database, however... The concept of JOINs and how to effectively use them tend to trip newcomers up a bit, and if you don't understand how to use JOINs to bring your data back together, it isn't a big surprise you might not want to split your data back up. So, in your case, if I can guess some of the details, you've got a Many-to-Many relationship (i.e. people in groups; people can belong to more than one group, groups have more than one member). This is typically done with a people table, a group table, and a "membership" table, known as a "link table" or "bridge table" that connects people to groups. You can then use JOINs to bring all that data back together and get the answers you need, even if you're just doing equi-JOIN filtering (filtering based off JOIN values that aren't actually returned). To find rows that aren't there, you can often use OUTER JOINs and search for NULLs where there shouldn't be any, or you can use other operations, such as NOT EXISTS( ). > However, I would rather not do the processing at all. I'm pushing off > the calculation to the client - and the client does not use a database > at all. I simply send the list as a text field and get the client to > sort it out, as he has plenty of processor time. It seems to me that > this is quicker than making the list on the fly by doing a SELECT and > then a concatination from another table... > > Am I completely off-base on this? Not at all. It depends a lot on the data design and the application requirements. Much of the big "No-SQL" movement does exactly this. The big secret is that most of these applications are still doing relational data operations, they're just pushing the manipulations (such as a JOIN) down into the client code, making the "database" a more simple level storage engine. Many of the end results are the same, however. It makes sense when performance is at an utmost premium, and allows greater scalability, but it also increases application complexity and tends to skirt some issues like transactional safety. Those types of systems also trade flexibility in the general case for complexity and speed in the common case. Even if that's an appropriate application design, you might still do well to use a somewhat traditional database structure. You still need to filter the data before you pass it off to the client, and generally the database is going to be better at that then you are. Having a more traditional design also simplifies insertions and updates, which can get clumsy and complex in a text field containing a CSV list, or something similar. A traditional design will also let you prototype and test using the database in a full SQL style, and then optimize the common-case queries, perhaps pulling some of the processing or business logic back down into the application. Which is "right" depends on what you're trying to do, and the compromises and motivations you need to make along the way. Kind of like all software engineering. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users