On 2006.07.02, Andrew Piskorski <[EMAIL PROTECTED]> wrote:
> Choosing BerkeleyDB for its *scalability* vs. MySQL seems...  odd.
> They are very different tools.

If all you're using MySQL (or any SQL-fronted data persistence
mechanism) for is key-value lookups, then something like BDB ought to
win because SQL parse time overhead is not zero.  SQL is not "free" in
that regard.
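As a sketch of the difference in access pattern (using Python's stdlib
dbm module as a stand-in for a BerkeleyDB-style store and sqlite3 for
the SQL path; the key/value names are illustrative, not from any real
system):

```python
import dbm.dumb  # stand-in for a BerkeleyDB-style key-value store
import os
import sqlite3
import tempfile

# Key-value path: no query language at all, just a keyed lookup.
path = os.path.join(tempfile.mkdtemp(), "kv")
with dbm.dumb.open(path, "c") as db:
    db[b"user:42"] = b"dossy"
with dbm.dumb.open(path, "r") as db:
    kv_result = db[b"user:42"]

# SQL path: the same lookup, but every statement must first be
# parsed and planned (or fetched from a statement cache).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("user:42", "dossy"))
sql_result = conn.execute(
    "SELECT v FROM kv WHERE k = ?", ("user:42",)).fetchone()[0]

print(kv_result, sql_result)
```

Which side is faster in practice depends on the store and the driver;
the point is only that the SQL path carries parse/plan work the
key-value path never pays.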

Often, I get into discussions with folks about why I prefer not to use
stored procedures in a DBMS.  My biggest argument against them is code
versioning: I prefer to version my SQL closely with the application
code.  Ever write a scheme to support versioned stored procedures inside
the DBMS?  (I've done it, both for Oracle and Sybase.)  It's hard to get
right, and the DBMS doesn't make it easy on you.

Of course, the *ONLY* compelling reason to push SQL into the DBMS using
stored procedures is the optimization around SQL parse time: parse it
once, store it inside the DBMS, and remove that cost from each query
execution.  This makes sense when you have queries that are
short-lived and are executed rapidly, which can often be the case in
web-based applications.
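The parse-once effect is visible even without stored procedures.  A
quick sketch (sqlite3 here stands in for the DBMS, not anyone's actual
stack): a reused parameterized statement gets compiled once and served
from the driver's statement cache, while freshly-built SQL text forces
a re-parse on every call.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO kv VALUES (?, ?)",
                 [(f"user:{i}", f"name{i}") for i in range(100)])

def cached():
    # Identical SQL text on every call: the sqlite3 driver reuses
    # the already-compiled statement from its internal cache.
    return conn.execute(
        "SELECT v FROM kv WHERE k = ?", ("user:42",)).fetchone()

_counter = iter(range(10**9))

def reparsed():
    # Unique SQL text on every call defeats the statement cache,
    # so the query is parsed and planned from scratch each time.
    return conn.execute(
        f"SELECT v FROM kv WHERE k = 'user:42' /* {next(_counter)} */"
    ).fetchone()

t_cached = timeit.timeit(cached, number=5_000)
t_reparsed = timeit.timeit(reparsed, number=5_000)
print(f"cached:   {t_cached:.4f}s")
print(f"reparsed: {t_reparsed:.4f}s")
```

The exact ratio varies by engine and query complexity, but the parse
cost per call is the quantity the stored-procedure approach is trying
to amortize.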

Sidenote: the argument of "we push SQL into stored procedures so the
developers don't have to learn SQL and/or prevent them from mucking with
the database" implies a culture of knowledge hiding and hoarding (at
best) or distrust (at worst).  I dislike working in those environments.

Sorry, let's bring this back around.  In the scenario of simple key-value
based lookups, even if you push your code into a stored procedure to
pre-parse as much of your query as possible, there's still some non-zero
parse time for the code invoking the stored procedure.  Suppose your
requirement is 400 queries/second, and suppose your parse time is 1 msec
(probably well below the real cost, given all the other overhead:
network I/O, etc.).  That means every second you're spending 400 msec,
or 40% of your time, parsing!  Wouldn't going to BDB, which would
eliminate that parsing, be a measurable 40% gain in performance?
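The arithmetic above, written out (the 1 msec figure is this email's
assumption, not a measurement):

```python
queries_per_second = 400
parse_time_sec = 0.001  # assumed 1 msec of parse time per query

# Seconds spent parsing out of every 1-second interval.
parse_overhead = queries_per_second * parse_time_sec

print(f"{parse_overhead * 1000:.0f} msec/sec spent parsing "
      f"({parse_overhead:.0%} of the time)")
# 400 msec/sec spent parsing (40% of the time)
```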

Of course, this only starts to make a difference at the very high end of
performance requirements.  Today's approach is "just scale horizontally"
and have 10 servers that just have to handle 40 q/s.  Sure, that's
great: it's what most of us do.  However, as folks continue to push cost
containment and operational optimization, floor tiles in data centers
are a big expense.  Sun is banking on hosting resellers looking to its
products to compete on price, because the real ceiling is HVAC and
auxiliary power capacity: eventually, you run out of physical space to
build backup power generators and cooling.

Again, the problem in the small (10 servers doing 40 q/s) is easy to
solve.  But suppose you had to handle, say, 1MM q/s.  Would you rather
have a solution that required 2,500 servers, or 25,000?  That factor of
10 gets dramatically harder to support given the physical requirements.
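Concretely, at 1MM q/s (using the illustrative per-server rates from
the scenarios above):

```python
target_qps = 1_000_000

# Per-server capacity at the two rates discussed above.
servers_at_40 = target_qps // 40     # each server handles 40 q/s
servers_at_400 = target_qps // 400   # each server handles 400 q/s

print(servers_at_40, servers_at_400)
# 25000 2500
```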

-- Dossy

-- 
Dossy Shiobara              | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  "He realized the fastest way to change is to laugh at your own
    folly -- then you can let go and quickly move on." (p. 70)


--
AOLserver - http://www.aolserver.com/

To remove yourself from this list, simply send an email to
<[EMAIL PROTECTED]> with the body of "SIGNOFF AOLSERVER" in the email
message. You can leave the Subject: field of your email blank.