Not embedding SQL in perl (was RE: [OT] Inspired by closing comments from the UBB thread.)

2001-08-01 Thread mgraham


Joe Breeden [mailto:[EMAIL PROTECTED]] wrote:
...
 wondering about alternatives to embedding SQL in to the code 
 of a program.
...
 It would be interesting to know how other people have solved 
 that problem.

One approach is to use something like Ima::DBI, which I'm currently toying
with.  With Ima::DBI, you still embed your SQL in your perl code, but at
least you put all of your SQL into a single module somewhere and you do so
in a very structured way.  

To access the database from the rest of your program, you call methods of
your database query object.  This is a lot cleaner than whipping up a query
string every time you want to hit the database.  It's also a lot more
flexible.  You could, for instance, create different database classes for
different database backends, and still keep the programming interface the
same.

Of course you could do all this without Ima::DBI; roll up your own custom
database wrapper classes.  But Ima::DBI also handles some mod_perl DBI
issues such as guaranteeing one DBI statement handle per process.  


Michael




Re: Not embedding SQL in perl (was RE: [OT] Inspired by closing comments from the UBB thread.)

2001-08-01 Thread Gunther Birznieks

At 02:44 PM 8/1/2001 -0700, Jeffrey W. Baker wrote:


On Thu, 2 Aug 2001, Gunther Birznieks wrote:

  When you've had your fill of wrestling over mySQL vs PostGres and stored
  procs versus inline SQL (I know I have long ago)
 
  You guys should definitely read the following:
 
  http://www.ambysoft.com/persistenceLayer.html
 
  One of my current coworkers turned me on to this. I have found it to be one
  of the best series of articles related towards what it takes to abstract
  database away from your object layer and the various levels at which it
  makes sense to do so.
 
  You may find the design a little complex, but Scott pretty explicitly
  states that this is what is necessary for a *large* system. You can always
  go down a less complex path by choice if you feel your programs aren't
  complex enough to need the full Persistence Layer structure he advocates.

I've worked with Scott Ambler, and I could record everything Scott Ambler
knows about actually devleloping large systems on the head of a pin, using
a magic marker.  That guy is a hopeless academic without the slightest
clue of how to actually make software happen.

I suppose I can't comment on your opinion as I do not personally know him. 
But I find his statements to be worthy (as explained further below) 
regardless of what you say about his real-world knowledge.

So I can only imagine that he has taken in many comments from users over 
the years and made up his articles based on feedback since I think this one 
is particular is reasonable. Although I've never had to implement all 6 or 
so object abstractions in the ultimate persistence layer he recommends. :)

Here's the brutal truth about persistance abstractions using an RDBMS
backing store.  At some point, your DBA is going to come up to you and
tell you that you code is too slow.  You need to rewrite some SQL queries
to use a different index, or some sorting hints, or whatever.  You will
realize that you need to pass some extra information down through your
abstraction layers to make it all happen.  After that happens twice or
thrice, you will slowly come to realize that your abstraction is really no
abstraction at all: every time the schema changes, the top level interface
needs to change as well.

I can't say that I agree.

It depends on what you are coding for. Are you coding for performance or 
are you coding for getting a product out there that is easy to maintain?

In many cases, these two requirements are quite at odds. This thread was 
originally sparked by someone getting annoyed that SQL was embedded 
throughout the code and finding it hard to grasp how to deal with this.

While it's true that the best performance comes from hand-coding the SQL, 
and if you hand-code the SQL, it should arguably be close to the section of 
code that requires this SQL, not all programs require this. In fact, very 
few in my experience. Those that have required speed have required it for a 
small subset of operations in a larger project.

I strongly believe many apps can get away without having SQL embedded. I've 
been doing it for the last several years and definitely coding and 
maintenance time improves with some persistence layer abstraction. But yes, 
you run the risk of having to go back and code a SQL statement or two, and 
you run the risk of somewhat lower performance, but as Scott mentions in 
his article, these should be the well-documented exception, not the rule.

Nick Tonkin posted a very clear and well written post a few minutes ago 
about embedding SQL close to the code which may demonstrate the opposite of 
what I am trying to say. But on the other hand, I could understand that a 
company such as ValueClick really have to make sure their databases and the 
SQL that accesses them are completely tweaked.

So I think given speed requirements, making a HERE document and using other 
clean-coding techniques to keep the SQL close to the code that needs it is 
quite reasonable.

However, in my experience...

Of the things that are harder to duplicate in a persistence layer to one 
degree or another...

Not all applications require transactions
Not all applications require aggregation beyond count
Not all applications require blinding speed (just decent speed)
Not all applications require joins
Not all applications require unions
Not all applications require subselects

And even if you would argue that taking into account a union of 
probabilities an application may need at least one of the above, I have 
found it simply is not true. Usually when an application has a fairly 
complex data model then they need more than one of the above and that's 
when you have to move to SQL.

In other words, if the probability that an app needs each of the features 
above is 5%, then rather than the union of the probabilities being 5 + 5 + 
5 + 5 + 5 + 5, it is really more like 8% where the majority of the 5% is 
really in applications that needs more than one of the above advanced SQL 
list.