Re: [sqlite] Database logic in TRIGGER and CONFLICT, or in software ?

Jay A. Kreibich Sun, 14 Jun 2009 20:45:00 -0700

On Sat, Jun 13, 2009 at 11:42:21PM +0100, Simon Slavin scratched on the wall:


> Do any of you have experience with doing this ?  Are there some  
> conclusive points which will make me decide immediately that I should  
> do it one way or the other ?  I accept reasoned argument, URLs,  
> anecdotes, or anything else relevant.

  Every system that uses a database must deal with this problem, and
  there are no real easy answers.  But here are a few deeper points to
  consider.

  From the view point of Relational Model, the formal mathematical
  model that most RDBMS engines are based off (in theory, anyways), you
  should push as many constraints, rules, and general enforcement into
  the database as possible.  The database is meant to represent "truth"
  and under the Model it should be impossible (or at least as difficult
  as possible) to make the database become inconsistent-- that is, be
  in a state that doesn't represent truth.

  There are two main issues with this.  First, an modern SQL database
  is not a Relational Model database.  SQL only offers a number of
  different constraints, but it is not as flexible or complete as the
  theoretical model presented by the Relational Model.

  Second, the Relational Model doesn't account for access methods or
  the concept of middle-ware or tiered applications.  It is assumed
  that anyone and everything from applications to command-line systems
  will be accessing the database.  If you have a user typing in raw SQL
  commands and you want your database to remain consistent, you have no
  choice but to push as much as possible into the database and do lots
  of audits.

  Thankfully, most real-world applications do not work this way.  You
  provide some piece of software that accesses the database, and (in
  general) the only people that have command line access are DBAs and
  other admin types.  In that case, you can relax some of the constraints
  or rules that are difficult to express in the database and move those
  out to the application.

  This can be good or bad.  In the case of an old-school mainframe
  application, where you logged into a central system via your VT100 or
  TN3270 terminal and ran the application on the mainframe, the
  software was centralized and easy to maintain.  A system-wide
  application upgrade consists of replacing one binary.  This made it
  relatively safe to push rules and constraints into the application, as
  it was easy to keep the end-user application and the database --
  including schema changes and format updates -- in sync.  After all,
  both the DB and the application were sitting on one machine.

  Then came the era of desktop systems.  Now it was common to have a
  GUI desktop application that was used to access and manipulate the
  database.  The problem is, it is nearly impossible to keep every
  desktop system in perfect sync.  If a schema change required an
  update to a query, that required a new version of the desktop
  application, which required a network-wide upgrade.  Really fast, you
  learned to either make your upgrades backwards compatible OR you had
  an extremely simply automatic update system.  Regardless, there were
  dangers associated with pushing too many high-level rules into the
  application.  If someone managed to access the database with an older
  client that had a different set of rules, bad things could happen.
  To get around these problems people learned to push more and more
  logic into the database, including complex stored functions and
  procedures to do just about any update or adjustment and views for
  all but the most basic filtering of a query.  This would allow for
  schema changes without client updates.

  As the web gained popularity, along with middle-ware and multi-tier
  architectures, we returned more or less to the model of "everything
  on one machine" or, at least, under one control.  If the application
  is completely web driven, the problem of upgrading the SQL access
  application (e.g. the web server code) and the database are greatly
  reduced compared to the desktop model, allowing more and more logic
  to return to the code.

  So much of the answer to your question depends on the environment
  you're trying to support and how much control you have over all the
  bits of code that have direct access to the database.  If you control
  the database AND the next tier down, pushing logic, rules, and
  constraints into that layer has fewer issues and ramifications than
  keeping them in the database.

  From a pure software engineering standpoint, the best solution is to
  keep the rules and constraints as close to the data as possible.
  Just like the Relational Model thinking, the fewer ways there are to
  screw up the data, the better.  If you have control over the next
  layer you can "let it hang out" a bit.  There is still the danger
  that some admin can screw something up with a command line tool, but
  in theory those people know what they're doing and either keep their
  mitts off, or are being constructive with their access.  Either way,
  they shouldn't be doing application tasks via the command line.

  Actually, that can sometimes be useful if you need to over-ride a
  point of policy or something similar.  That one exec that made an
  exception to the rules for themselves, and now that data needs to
  make it into the database somehow.  Sometimes having the rules higher
  up allows the command line tools to deal with those cases.

  There is also the question of cost of change.  If you need to alter a
  constraint, and that requires dropping and re-creating a whole table,
  that can be a big pain in the butt.  Most database changes are.  So
  if your application is still somewhat immature, or operates in an
  environment where the business rules and policies are in great flux,
  it is often better to push the rules out a level to where they are
  easier to audit/change/update.  The main issue with that is that you
  expose a level of access that can get around the rules.



  All in all, I try to balance things out in my own designs.  I try to
  push constraints and limitations that are inherent in the data model
  into the database.  Basically try to keep the database from modeling
  the impossible.  Business rules and policy usually gets implemented
  in the access layer.  For one thing, business rules and policy have a
  tendency to change, while (if you did it right) the data model should
  more or less remain constant.  The line is a bit more black and white
  if you're modeling some physical thing, like computers on a network,
  or items in a library.  They get a bit more gray when you're modeling
  something that's defined by the business rules-- like what *exactly*
  an invoice is, and what it can contain.  Still, you want to keep the
  absurd data out of the database and the fluid rules in more mailable
  code.
  
  It can still take a lot of auditing and control.  With a complex
  application, it is all too easy to have two different functions or
  update procedures that access the same data in slightly different
  ways that apply a different set of update rules or sanity checking.
  As backwards as it sounds, if the logic is pushed out of the
  database, it is often even more important that all the queries and
  updates be written by one person (or audited by one person).  If all
  the constraints are in the database, it keeps everyone -- including
  the application programmers -- from doing something silly.  If the
  rules are enforced by the programmer(s), we assume they know what
  they're doing.  This is often a weak assumption if you have a
  large(r) team working from a poorly documented data model (in other
  words, most software projects).

  SQLite fills a lot of roles that normal databases and DBAs don't have
  to deal with, which can present even more complex problems.  For example,
  if you're using SQLite as an application file format, you need to get
  the file format pretty darn tight.  If you find out version 2 of the
  application requires a new schema for the file format, not only do you
  need to write an automatic update and conversion function, you need to
  maintain that function for the rest of the application's lifetime.
  Same for every other schema change and update that changes the file
  version/format.



  If you're a professional software engineer, you've faced similar
  questions a hundred times before.  As usual, the big question isn't
  so much the code you want to write today, but the update, maintenance,
  and roll-out issues in the future.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Database logic in TRIGGER and CONFLICT, or in software ?

Reply via email to