On 2006.07.02, Andrew Piskorski <[EMAIL PROTECTED]> wrote:
> Choosing BerkeleyDB for its *scalability* vs. MySQL seems... odd.
> They are very different tools.
If all you're using MySQL (or any SQL-fronted data persistence mechanism) for is key-value lookups, then something like BDB ought to win, because SQL parse time overhead is not zero. SQL is not "free" in that regard.

Often I get into discussions with folks about why I prefer not to use stored procedures in a DBMS. My biggest argument against them is code versioning: I prefer to version my SQL closely with the application code. Ever write a scheme to support versioned stored procedures inside the DBMS? (I've done it, for both Oracle and Sybase.) It's hard to get right, and the DBMS doesn't make it easy on you.

Of course, the *ONLY* compelling reason to push SQL into the DBMS as stored procedures is the optimization around SQL parse time: parse it once, store it inside the DBMS, and remove that cost from each query execution. This makes sense when you have queries that are short-lived and executed rapidly, which is often the case in web-based applications.

Sidenote: the argument "we push SQL into stored procedures so the developers don't have to learn SQL and/or can't muck with the database" implies a culture of knowledge hiding and hoarding (at best) or distrust (at worst). I dislike working in those environments.

Sorry, let's bring this back around. In the scenario of simple key-value lookups, even if you push your code into a stored procedure to pre-parse as much of your query as possible, there's still some non-zero parse time for the code invoking the stored procedure. Suppose your requirement is 400 queries/second, and suppose your parse time is 1 msec -- probably much shorter than the real cost, given all the other overhead involved (network I/O, etc.). That means every second you're spending 400 msec, or 40% of your time, parsing! Wouldn't going to BDB, which eliminates that parsing entirely, be a measurable 40% gain in performance?
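To make the two points above concrete, here's a rough sketch in Python: the arithmetic behind the 40% figure, followed by a parse-free key-value lookup using the stdlib `dbm` module (a BDB-style hash file -- the file and key names are just made up for illustration):

```python
# Back-of-the-envelope math from the post: at 400 queries/sec with
# 1 msec of SQL parse overhead per query, parsing alone consumes
# 400 msec out of every second -- 40% of your capacity.
queries_per_sec = 400
parse_ms_per_query = 1
overhead = queries_per_sec * parse_ms_per_query / 1000.0
print(f"{overhead:.0%} of each second spent parsing")  # 40%

# A parse-free key-value lookup, sketched with Python's stdlib dbm
# module (hash-file storage in the BDB family; no query language,
# so there is nothing to parse per lookup).
import dbm

with dbm.open("example.db", "c") as db:
    db[b"user:42"] = b"dossy"   # store
    value = db[b"user:42"]      # direct hash lookup, no SQL to parse
```

The point isn't this particular module; it's that a store with no query language has no per-query parse step to pay for.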
Of course, this only starts to make a difference at the very high end of performance requirements. Today's approach is "just scale horizontally" and have 10 servers that each only have to handle 40 q/s. Sure, that's great: it's what most of us do. However, as folks continue to push cost containment and operational optimization, floor tiles in data centers are a big expense. Sun is banking on the fact that hosting resellers will look to their products to compete on price, because the ceiling is HVAC and auxiliary power capability: eventually, you run out of physical space to build backup power generators and cooling capacity.

Again, the problem in the small (10 servers doing 40 q/s each) is easy to solve. But suppose you had to handle, say, 1MM q/s. Would you rather have a solution that required 2,500 servers (at 400 q/s each) or 25,000 (at 40 q/s each)? That factor of 10 gets disproportionately harder to support given the physical requirements.

-- Dossy

-- 
Dossy Shiobara              | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  "He realized the fastest way to change is to laugh at your own
    folly -- then you can let go and quickly move on." (p. 70)

--
AOLserver - http://www.aolserver.com/