At 02:44 PM 8/1/2001 -0700, Jeffrey W. Baker wrote:
On Thu, 2 Aug 2001, Gunther Birznieks wrote:
When you've had your fill of wrestling over mySQL vs PostGres and stored
procs versus inline SQL (I know I have long ago)
You guys should definitely read the following:
http://www.ambysoft.com/persistenceLayer.html
One of my current coworkers turned me on to this. I have found it to be one
of the best series of articles related towards what it takes to abstract
database away from your object layer and the various levels at which it
makes sense to do so.
You may find the design a little complex, but Scott pretty explicitly
states that this is what is necessary for a *large* system. You can always
go down a less complex path by choice if you feel your programs aren't
complex enough to need the full Persistence Layer structure he advocates.
I've worked with Scott Ambler, and I could record everything Scott Ambler
knows about actually devleloping large systems on the head of a pin, using
a magic marker. That guy is a hopeless academic without the slightest
clue of how to actually make software happen.
I suppose I can't comment on your opinion as I do not personally know him.
But I find his statements to be worthy (as explained further below)
regardless of what you say about his real-world knowledge.
So I can only imagine that he has taken in many comments from users over
the years and made up his articles based on feedback since I think this one
is particular is reasonable. Although I've never had to implement all 6 or
so object abstractions in the ultimate persistence layer he recommends. :)
Here's the brutal truth about persistance abstractions using an RDBMS
backing store. At some point, your DBA is going to come up to you and
tell you that you code is too slow. You need to rewrite some SQL queries
to use a different index, or some sorting hints, or whatever. You will
realize that you need to pass some extra information down through your
abstraction layers to make it all happen. After that happens twice or
thrice, you will slowly come to realize that your abstraction is really no
abstraction at all: every time the schema changes, the top level interface
needs to change as well.
I can't say that I agree.
It depends on what you are coding for. Are you coding for performance or
are you coding for getting a product out there that is easy to maintain?
In many cases, these two requirements are quite at odds. This thread was
originally sparked by someone getting annoyed that SQL was embedded
throughout the code and finding it hard to grasp how to deal with this.
While it's true that the best performance comes from hand-coding the SQL,
and if you hand-code the SQL, it should arguably be close to the section of
code that requires this SQL, not all programs require this. In fact, very
few in my experience. Those that have required speed have required it for a
small subset of operations in a larger project.
I strongly believe many apps can get away without having SQL embedded. I've
been doing it for the last several years and definitely coding and
maintenance time improves with some persistence layer abstraction. But yes,
you run the risk of having to go back and code a SQL statement or two, and
you run the risk of somewhat lower performance, but as Scott mentions in
his article, these should be the well-documented exception, not the rule.
Nick Tonkin posted a very clear and well written post a few minutes ago
about embedding SQL close to the code which may demonstrate the opposite of
what I am trying to say. But on the other hand, I could understand that a
company such as ValueClick really have to make sure their databases and the
SQL that accesses them are completely tweaked.
So I think given speed requirements, making a HERE document and using other
clean-coding techniques to keep the SQL close to the code that needs it is
quite reasonable.
However, in my experience...
Of the things that are harder to duplicate in a persistence layer to one
degree or another...
Not all applications require transactions
Not all applications require aggregation beyond count
Not all applications require blinding speed (just decent speed)
Not all applications require joins
Not all applications require unions
Not all applications require subselects
And even if you would argue that taking into account a union of
probabilities an application may need at least one of the above, I have
found it simply is not true. Usually when an application has a fairly
complex data model then they need more than one of the above and that's
when you have to move to SQL.
In other words, if the probability that an app needs each of the features
above is 5%, then rather than the union of the probabilities being 5 + 5 +
5 + 5 + 5 + 5, it is really more like 8% where the majority of the 5% is
really in applications that needs more than one of the above advanced SQL
list.