On Wed, Mar 06, 2002 at 04:45:23PM -0800, Peter Jay Salzman wrote:
> i have to admit, having a human readable flat file is kind of sexy.
Yes, it's nice sometimes.  It makes the most sense for data that has to be
a certain way, e.g. HTTP server logs, where the data will never be iterated
over repeatedly and individual lines will never be changed or deleted.

> my application is to store dates that certain work is done on a vehicle.
> like a list of all dates/odometer readings of oil changes, brake pad
> inspections, battery electrolyte checks, coolant flushes and stuff
> like that.

Have you studied relational databases yet?  If there are well-defined
relationships between the data you listed, then an RDBMS is probably a
better choice than so many flat files -- even if you index them.

> i don't think this can compare with what commerce sites use databases
> for, but after the years roll by, i'm sure the databases can reach a
> respectable size.  conceivably, someone could keep records on the same
> vehicle for a decade.

It won't take years.  If you do *any* sort of deleting, changing or aging
of data, you'll be reparsing and reindexing those flat files.  That will
start to noticeably slow you down with just thousands of records.  Note
that as your program develops you'll probably reference the dataset
repeatedly, meaning that a 0.1 second operation done 20 times to display a
single report page now takes 2 seconds.  If that's the "next page" option,
2 seconds is a long time to wait.

> what do you think?  i'd rather stay away from a server based DB system.
> would DBD::CSV be ok for this sort of thing?

DBD:: is nice.  I've used it against MS-SQL (think Sybase) and MySQL
servers quite a bit.  I don't know about DBD::CSV; if you can't run SQL
against it, I'd say no.

Real world: I worked on a home-grown accounting system for several years.
It started as unindexed flat files.  It became slow as we hit thousands of
records (totalled across the whole system), and unbearable as we hit
100,000 records.  Billing was taking nearly 24 hours to run on just 1,000
customer accounts -- and we'd already cached the files in memory arrays by
then...

Indexing the flat files would have helped.  However, since they were flat
files, the data still had to be cycled through to find the right line.  An
indexing scheme using Berkeley DB can hold only one index, and usually the
keys must be unique (neither is always true, but traditionally they are).
This means if the key is first name, you can't have two Joes.  If the key
is "FirstnameLastname", you can't have two John Smiths.  If the key is
phone number, you can't search on last name.  And so on.

Additionally, if you roll your own flat file and/or indexed file, all the
"logic" of gluing your program space to your data is on you.  Data
management is still your job.

Enter SQL.  Yes, the servers can be a problem to build and maintain
(Oracle, MS-SQL, Sybase), but they can also be quite easy, featureful, and
problem-free (MySQL).  With SQL, data storage and retrieval are abstracted
from your program.  You say "give me all the Smiths in Auburn who owe us
money for more than 60 days" and you get EXACTLY that data back.  You are
free to improve your code, UI, etc., and not worry about "how am I going to
get that data from my files quickly and efficiently?"

I migrated that accounting system from a flat-file data layer to MySQL.
Twenty-four hours of run time turned into 15 minutes.  All I did was
replace the data access module's function calls with appropriate, simple
SQL.  I could have gotten another order of magnitude improvement by
modifying the application code to ask more specific questions of the data
layer.

Use SQL.
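As a rough sketch (not from the accounting system above -- the table,
columns, and connection details are all made up for illustration), the
"Smiths in Auburn" question asked through Perl's DBI against MySQL looks
something like this:

    # Hypothetical example: schema and DSN are invented for illustration.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect(
        "DBI:mysql:database=billing;host=localhost",
        "someuser", "somepass",
        { RaiseError => 1 },
    );

    # "All the Smiths in Auburn who owe us money for more than 60 days."
    # The server does the searching; the program just asks the question.
    my $sth = $dbh->prepare(q{
        SELECT account_id, first_name, last_name, balance
          FROM accounts
         WHERE last_name = ?
           AND city      = ?
           AND balance   > 0
           AND due_date  < DATE_SUB(CURDATE(), INTERVAL 60 DAY)
    });
    $sth->execute("Smith", "Auburn");

    while (my $row = $sth->fetchrow_hashref) {
        printf "%s %s owes %.2f\n",
            $row->{first_name}, $row->{last_name}, $row->{balance};
    }

    $dbh->disconnect;

The whole data layer is one prepare/execute; there's no file parsing or
index maintenance left in your code.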
Use DBI so you can change SQL servers later if you find the need (your app
gets bought by a dotcom).  You won't regret it.

I'm off the soapbox.  Hope it helps.

-- 
Ted Deppner
http://www.psyber.com/~ted/

_______________________________________________
vox-tech mailing list
[EMAIL PROTECTED]
http://lists.lugod.org/mailman/listinfo/vox-tech
