RFC - SQL::ObjectModel - to SQL what a DOM is to XML

Darren Duncan Sat, 31 May 2003 17:48:53 -0700

Hello.  I would like to collect some advance feedback on a new module (or
small collection thereof) which I have already been working on as part of
a larger effort, but which I am considering releasing as a separate
project.  I had already brought up this issue on [EMAIL PROTECTED] and Tim
Bunce suggested that [EMAIL PROTECTED] would be the best place to ask
potential users about its viability.


You should read the email to the end before replying; the questions are at
the end.

Please reply both to [EMAIL PROTECTED] and to [EMAIL PROTECTED], as
I am not subscribed to the mailing list, and to keep everyone in the loop;
I would do the same.

The module I propose will implement structured container objects that
model or describe all kinds of de-serialized SQL schema and instructions.
Each object is equivalent to either a single SQL statement or an ordered
collection of them (such as a stored SQL procedure or a database schema).
This is similar to how an XML DOM is equivalent to an XML Document; an XML
Document is a serialized form of an XML DOM, and an XML DOM is a
de-serialized form of an XML Document.

The idea is that my objects can be used by other database-using
applications or modules in place of SQL strings.  An advantage of the
objects is that you can build SQL simply by plugging individual scalar
values (like names of table columns or names of functions or literal data
values) into slots, which are as easy to set as individual elements in
arrays or hashes.  Similarly, it is easy to read individual attributes
from SQL instructions, like reading array elements.  Using the objects
makes it a lot easier to, for example, analyze a select/view definition in
order to generate appropriate insert/update/delete instructions against
selected data.  You never have to parse or concatenate or escape strings
at any time.

Using the objects is especially well suited for applications that make use
of a data dictionary to describe what they do, as individual scalar values
in the data dictionary can just be straight copied between the data
dictionary and appropriate places in my object model.  This is a lot
easier than having to generate SQL yourself from said data dictionaries,
and you can save on common mistakes such as not escaping your data.

Of course, since many databases use serialized SQL as their standard input
(as does the DBI module), my objects would have to be serialized at an
appropriate time.  However, serializing and de-serializing/parsing SQL
is not and will never be handled by my module.  Rather, this can be done
by any third party generic modules of your choice, each of which would
specialize in some way or other.  Said modules would usually specialize in
what specific database they are relating to, which have their own special
ways of saying the same things.  (Outer join syntax, anyone?)  And to get
things rolling, I will probably make a serializing module myself, as an
example, unless someone else is quick to step up and do it.

Since SQL is not identical with each database, what my object
model would actually do is model a normalized superset of the features of
different databases.  When I say normalized, I mean that if there is a
single task that is known by several different names or syntaxes (such as
outer-joins), it will have exactly one name in the model.  When I say
superset, I mean that if a database supports a useful feature but other
databases don't, it will still be represented in the model so people using
that database can access it; but you can only use the feature with that
database, or alternately with other databases for which someone decided
to emulate the feature.  But so long as the actual functionality you
want is supported, your applications written against my object model will
never have to be changed for the small ideosyncosies (sp?) of various
databases that have features you use.

Sort of in the middle between those
two extremes is functionality which is not natively supported in some
databases, but that is simple enough to emulate using built-in
functionality (for example, emulating "left()" with "substr()"); in such
cases, it is expected that when you use such features they will work with
any database.  For example, if you want a model-specified boolean data
type, you will always get it, whether it is implemented  on a
per-database-basis as a "boolean" or an "int(1)" or a "number(1,0)".  Or a
model-specified "str" data type you will always get it, whether it is
called a "text" or a "varchar2" or a "sql_varchar".

Basically, my SQL object-model is tasked simply with describing, in a
non-ambiguous and structured manner, the schemas or instructions that you
want a database to do.  The object-model doesn't know anything specific to
a particular database; it doesn't even care whether the database uses SQL
as its input; an intermediary module can translate the object-model into
some other kind of non-SQL output for them.  The object-model simply
defines an ideal complete capabilities list and works towards those; and
the databases just implement whichever parts of that list they can.  More
complete databases like Oracle will probably implement almost all of the
list, while MySQL 3.x would implement much less of it natively.  This is
how I keep my module simple and easy to use.  Much like DBI itself is
fundamentally a simple and easy to use interface, while the separate DBD
plug-in modules take care of the implementation details.

I note that my object-model is intended to model all of SQL in a
deserialized form.  This is much unlike all of the other
SQL-generating or parsing modules I have seen on CPAN, which are mainly
limited to the most common tasks, which are select/insert/update/delete,
and creating/removing tables.  And even those usually don't support a lot
of SQL's features for selects.  And the few frameworks that I did see
which recognized all the other main parts of a database schema, they
simply extract or set serialized SQL which users have to write themselves.
My object model will represent all of these things, in a fully
deserialized form: table definitions (with multiple-column unique keys and
foreign keys), view definitions (also known as named select statements),
definitions for stored procedures, stored functions, triggers, sequences,
and Oracle's packages (do other databases have a concept like
'packages'?); select statements or views or cursor definitions can be
arbitrarily complex, including multiple-column table joins, outer joins,
unions, Oracle's start-with-connect-by, group-by, have subqueries in the
'from' or 'where' clauses, use various built-in functions (like 'substr',
'concat', '+','*','log','count','sum','and','or','=', '<', more) almost
anywhere, call user-defined stored functions (if the underlying database
allows it), and more.  I will note that an advantage of having these
definitions fully parsed is that, for example, if a database doesn't
support built-in stored procedures, a "sql generator" could alternately
generate an anonymous Perl function that does the same thing, to be called
instead.

Now, I will note that some of the features may not be important to many
people; eg: stored procedures.  However, if you are working on any
projects of significant size, you can get an idea that these things would
actually be a large speed, reliability, and security savings for you.
Look at many large corporate or government systems.  I believe that if
these advanced features can also be represented in a database-neutral
format, then users can get the full power of a database without being
locked into a single vendor due to all their investment in stored
procedures et al.  If customers can move a lot more easily, it will help
encourage database vendors to keep improving their products or lower
prices to keep their customers, and users in general benefit.  So I do
have reasons for trying to tackle the advanced database features in my
database abstraction effort.

So I think I would be bringing a lot of new functionality to CPAN, and not
be making "yet another SQL generator module".

What I am proposing is that my object-model module could be quite useful
to makers of other database-related modules, as sort of a useful (or,
dare I say it, "standard") interchange or intermediate format, becoming as
ubiquitous in the database area as basic arrays or hashes are elsewhere,
or such as XML DOMs are in many places.

And this isn't vaporware, per se.  I have the full intention of making
this and making it quickly, should there be the demand (or if nothing
else, to use myself).  That said, it will take more time or people to move
this beyond alpha or beta quality within a short time.  But it shouldn't
be long before you can play with it.  Only if I shall become incapacitated
soon shall this die.

Anyway, getting to the point, I wanted to solicit your opinions or
feedback, as users of DBI and related modules.

0. Does it look like I am blowing smoke or bull and am unlikely to deliver
on what I propose?

1. Does my module proposal sound like something you would want to use,
either in your applications or your own generic database modules?

2. Alternately, do you have trouble following what my proposal is, and so
wouldn't be able to judge whether it might be useful?

3. If such a module would be useful, what are some specific features or
priorities you would want it to have?

4. Are you aware of any existing modules or frameworks on CPAN that
already do what I am proposing?  If so, please point them out to me, as
all of the similar modules I found do not do some of the major features,
as I interpreted their documentation.  What are their names, and where are
they found.

5. While I will, of course, be putting out my module under an open-source
licence, is it important to you that a certain license or licenses are
used or not used?  For example, if I put it out under plain GPL, would
that provide any roadblocks to your use of the module?  Or would you only
use it if you had the choice between GPL and Artistic (as is Perl itself).
Alternately, would you use it if the licensing terms were like MySQL?

6. Does this module sound like something you would want to work on
yourself, either to get it done sooner or to ensure it has the features
you want?  Not that I expect anyone to, but I ask anyway.

So, I appreciate any feedback you can give me.  Thank you very much in
advance.

Please reply both to [EMAIL PROTECTED] and to [EMAIL PROTECTED], as
I am not subscribed to the mailing list, and to keep everyone in the loop;
I would do the same.

Have a good day. -- Darren Duncan

RFC - SQL::ObjectModel - to SQL what a DOM is to XML

Reply via email to