> Haha - nothing like some free advertising ;). I was hoping
> someone would share the details of their brilliant solution!
hehe :D.
Ok, I'll explain our way of doing it below.
> As far as O/R mapper goes though - would it work with our
> custom metadata table?
It depends; most mappers have their own source for the persistence
data, so it is very likely that your table with metadata is not
directly usable.
Abstracting away SQL is nothing more than having constructs in
another language to implement what you WANT, and feeding these
constructs to an engine which converts them to SQL.
SQL looks very simple, but it's actually very complex: it has a
lot of possible constructs, which can have different meanings.
Abstracting away SQL with a language similar to SQL is a waste of time;
you'd just be restating SQL with a different syntax. It's also
infeasible to embed every proprietary SQL element in your constructs.
After all, abstracting away SQL is mostly done to get some sort of
database independence: writing filters/queries without knowing the
syntactic details of the particular SQL dialect used.
O/R mappers deal with entities. I'll not go into details about
what is/isn't an entity, as that's a long story, but let's say for
simplicity, that an entity is roughly represented by a table. (with
inheritance, it can be more than 1 table, but for the explanation that's
not that important). So what you need is some sort of mechanism with
which you can tell the system that you want 1 or more entities of type T
when a given set of predicates is true. The predicates can target
multiple entities (give me all customers with orders in late June), and
the relationships between the targeted entities plus the predicates
combined are called a filter: after all, you're filtering on the
entities you want to retrieve, THAT's important.
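To make that concrete, here's a tiny sketch of what such a filter could look like as plain data. All class and property names below are invented for illustration; this is not the actual framework API.

```csharp
using System;
using System.Collections.Generic;

// Invented shapes: a filter is simply the set of predicates plus the
// relationships between the entities those predicates target.
public class Relation
{
    public string FromEntity, ToEntity;   // e.g. "Customer" -> "Order"
}

public class Predicate
{
    public string Field, Operator;
    public object Value;
}

public class Filter
{
    public List<Predicate> Predicates = new List<Predicate>();
    public List<Relation> Relations = new List<Relation>();
}

public class FilterDemo
{
    public static void Main()
    {
        // "Give me all customers with orders in late June":
        var filter = new Filter();
        filter.Relations.Add(new Relation { FromEntity = "Customer", ToEntity = "Order" });
        filter.Predicates.Add(new Predicate
            { Field = "Order.OrderDate", Operator = ">=", Value = new DateTime(2004, 6, 20) });
        filter.Predicates.Add(new Predicate
            { Field = "Order.OrderDate", Operator = "<", Value = new DateTime(2004, 7, 1) });
        Console.WriteLine(filter.Predicates.Count);   // prints 2
    }
}
```

The point is that nothing here is SQL text yet: it's a typed description of what you want, which an engine can later turn into a dialect-specific query.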
The SQL engine thus needs to have: (I'll explain SELECT, the
others work the same)
- names of entity fields (for field aliasing, see below)
- persistence info for each field (which table, which table field, field
type, type specifics, etc.)
- the predicates (part of filter)
- the relationships (part of filter)
- optionally, constructs like sort clauses, GROUP BY elements, etc.
Per database type we use a separate engine, so we can fully
tweak the SQL per database. This is required for the more advanced
features.
The predicates will become the WHERE clause. The relationships
will become the FROM clause. If inheritance is supported, the
persistence info for the fields will likely contain fields from more
than 1 table, so these tables will also be added to the FROM clause.
We use EntityField objects. These objects differ between the two
paradigms we support. In SelfServicing (which offers persistence logic
on the entity, like customer.Save()), the EntityField object contains
its persistence info as well; in Adapter (which uses a separate object
for persistence logic) it doesn't: separate persistence info objects
are used.
So the logic supplies a set of EntityField objects and
persistence info objects (for SelfServicing, it's just the same object
twice, cast to another interface). With this, the SELECT list can be
constructed.
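A minimal sketch of that SELECT-list construction step (again, the class and method names here are my inventions, not the real ones):

```csharp
using System;
using System.Collections.Generic;

// Invented names: an entity field plus its persistence info is enough to
// emit one SELECT-list element of the form "table.column AS fieldname".
public class FieldPersistenceInfo
{
    public string TableName, ColumnName;
}

public class EntityField
{
    public string Name;                       // entity field name, used as the alias
    public FieldPersistenceInfo Persistence;  // in SelfServicing: same object, other interface
}

public class SelectListDemo
{
    public static string BuildSelectList(IList<EntityField> fields)
    {
        var parts = new List<string>();
        foreach (var f in fields)
            parts.Add($"{f.Persistence.TableName}.{f.Persistence.ColumnName} AS {f.Name}");
        return "SELECT " + string.Join(", ", parts);
    }

    public static void Main()
    {
        var fields = new[]
        {
            new EntityField { Name = "CustomerId",
                Persistence = new FieldPersistenceInfo { TableName = "Customers", ColumnName = "cust_id" } },
            new EntityField { Name = "CompanyName",
                Persistence = new FieldPersistenceInfo { TableName = "Customers", ColumnName = "company_name" } },
        };
        Console.WriteLine(BuildSelectList(fields));
        // prints: SELECT Customers.cust_id AS CustomerId, Customers.company_name AS CompanyName
    }
}
```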
We use predicate classes, like FieldCompareValuePredicate. All
these classes implement IPredicate and can be grouped into
PredicateExpression objects (which can also be added to those objects,
for nesting). These classes perform the SQL generation for their own
WHERE clause element. However, to keep everything database generic,
every database has its own DatabaseSpecificGenerator object. So the SQL
engine receives a predicate expression and calls its ToQueryText
method, passing in the DatabaseSpecificGenerator object for that
database type. That generator object produces parameter objects,
subqueries, field names, object names, etc. Predicate classes can make
formulating a query a little more verbose (although IntelliSense helps
you a lot), but they do make the query system fully typed and
extensible: you can implement IPredicate yourself to create a predicate
class you'd like to use often but which isn't provided yet.
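Here's a stripped-down sketch of that scheme. The generator interface and the bracket-quoting SQL Server generator are my simplifications for the example; the real generator objects do far more (parameter objects, subqueries, object name resolution):

```csharp
using System;
using System.Collections.Generic;

// Simplified, invented version of the per-database generator idea.
public interface ISqlGenerator
{
    string QuoteField(string field);
    string NextParameterName();
}

public class SqlServerGenerator : ISqlGenerator
{
    int _counter;
    // "Order.OrderDate" -> "[Order].[OrderDate]"
    public string QuoteField(string field) => "[" + field.Replace(".", "].[") + "]";
    public string NextParameterName() => "@p" + (++_counter);
}

public interface IPredicate
{
    string ToQueryText(ISqlGenerator generator, IList<object> parameters);
}

public class FieldCompareValuePredicate : IPredicate
{
    public string Field, Operator;
    public object Value;

    public string ToQueryText(ISqlGenerator g, IList<object> parameters)
    {
        parameters.Add(Value);   // value travels as a parameter, not as SQL text
        return $"{g.QuoteField(Field)} {Operator} {g.NextParameterName()}";
    }
}

public class PredicateExpression : IPredicate
{
    readonly List<IPredicate> _elements = new List<IPredicate>();
    public void Add(IPredicate p) => _elements.Add(p);

    public string ToQueryText(ISqlGenerator g, IList<object> parameters)
    {
        var parts = new List<string>();
        foreach (var p in _elements)
            parts.Add(p.ToQueryText(g, parameters));
        return "(" + string.Join(" AND ", parts) + ")";
    }
}

public class PredicateDemo
{
    public static void Main()
    {
        var filter = new PredicateExpression();
        filter.Add(new FieldCompareValuePredicate
            { Field = "Order.OrderDate", Operator = ">=", Value = new DateTime(2004, 6, 20) });
        filter.Add(new FieldCompareValuePredicate
            { Field = "Order.OrderDate", Operator = "<", Value = new DateTime(2004, 7, 1) });

        var parameters = new List<object>();
        Console.WriteLine(filter.ToQueryText(new SqlServerGenerator(), parameters));
        // prints: ([Order].[OrderDate] >= @p1 AND [Order].[OrderDate] < @p2)
    }
}
```

Swapping in a generator for another database changes the emitted text without touching the predicate tree, which is exactly the point of the indirection.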
The string returned (together with the parameter set) is used by
the SQL engine as the WHERE clause. If no relations are specified and no
multi-table entities are used, the FROM clause will be built from the
first table object found in the persistence object; otherwise the
relation collection will be asked to build a FROM clause. This can be an
ANSI join, or a non-ANSI, database-specific join.
When these constructs are done, the query almost looks like an
SQL query. You then append the sort clauses, and group by clauses if
supplied.
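The assembly step itself is then little more than concatenation of the fragments built so far; something like this (simplified, of course):

```csharp
using System;

// Invented helper: glue the already-generated fragments into one statement,
// appending ORDER BY / GROUP BY only when they were supplied.
public class AssembleDemo
{
    public static string AssembleSelect(string selectList, string fromClause,
        string whereClause, string groupBy, string orderBy)
    {
        var sql = $"SELECT {selectList} FROM {fromClause}";
        if (!string.IsNullOrEmpty(whereClause)) sql += " WHERE " + whereClause;
        if (!string.IsNullOrEmpty(groupBy)) sql += " GROUP BY " + groupBy;
        if (!string.IsNullOrEmpty(orderBy)) sql += " ORDER BY " + orderBy;
        return sql;
    }

    public static void Main()
    {
        Console.WriteLine(AssembleSelect(
            "c.company_name",
            "Customers c INNER JOIN Orders o ON c.cust_id = o.cust_id",
            "o.order_date >= @p1",
            null,
            "c.company_name ASC"));
    }
}
```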
Because we use EntityField objects which are related to the
persistence info, we can do advanced things like building lists with
them in code, appending SQL expression objects to them, adding
aggregate functions and a GROUP BY, and creating a dynamic query which
groups on fields, uses expressions and aggregates, and produces a
resultset not related to the entity classes defined before but which
fully utilizes the power of the relational model.
This of course requires field aliasing, if you want the same
field multiple times (for some reason) or want to use the field in a
filter; that's why the names are important. You can use the same
SQL-producing routine to create these dynamic list queries, as the
input is the same as for an entity.
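For example, a dynamic list with an aggregate and a GROUP BY could be fed to the same kind of routine like this (ListField and BuildListQuery are invented names for the sketch):

```csharp
using System;
using System.Collections.Generic;

// Invented shape: a list field is an arbitrary expression plus an alias,
// so the resultset is not tied to any predefined entity class.
public class ListField
{
    public string Expression, Alias;   // e.g. "SUM(o.total)" aliased as "OrderTotal"
}

public class ListQueryDemo
{
    public static string BuildListQuery(IList<ListField> fields, string from, string groupBy)
    {
        var parts = new List<string>();
        foreach (var f in fields)
            parts.Add($"{f.Expression} AS {f.Alias}");
        return $"SELECT {string.Join(", ", parts)} FROM {from} GROUP BY {groupBy}";
    }

    public static void Main()
    {
        var fields = new[]
        {
            new ListField { Expression = "c.company_name", Alias = "Customer" },
            new ListField { Expression = "SUM(o.total)", Alias = "OrderTotal" },
        };
        Console.WriteLine(BuildListQuery(fields,
            "Customers c INNER JOIN Orders o ON c.cust_id = o.cust_id",
            "c.company_name"));
    }
}
```

Without the aliases, two expressions over the same column would collide in the resultset; that's the aliasing requirement mentioned above.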
When the query is built, it is stored in a RetrievalQuery or an
ActionQuery object, which is returned. The calling code can then simply
call Execute on these objects to execute the query. When a retrieval
query was built, Execute will return a datareader. The calling code
will use that datareader to fill in the fields of new entity objects.
Entity objects are produced by a passed-in factory. The logic then
steps through the datareader and uses the mapping information to place
each datareader cell into an entity's field. When all the data is read,
the object (or objects) created are returned.
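The fetch loop is roughly this (IRowReader and the in-memory implementation stand in for the ADO.NET datareader so the sketch runs without a database; the real code would read from an IDataReader):

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the relevant slice of an ADO.NET datareader.
public interface IRowReader
{
    bool Read();
    object GetValue(int ordinal);
}

public class InMemoryReader : IRowReader
{
    readonly object[][] _rows;
    int _index = -1;
    public InMemoryReader(object[][] rows) { _rows = rows; }
    public bool Read() => ++_index < _rows.Length;
    public object GetValue(int ordinal) => _rows[_index][ordinal];
}

// Invented, minimal entity: real entities would have typed fields.
public class Entity
{
    public Dictionary<string, object> Fields = new Dictionary<string, object>();
}

public class FetchDemo
{
    // Step through the reader; the mapping (field name -> resultset ordinal)
    // places each cell into a fresh entity produced by the factory.
    public static List<Entity> Fetch(IRowReader reader,
        IList<KeyValuePair<string, int>> mapping, Func<Entity> factory)
    {
        var result = new List<Entity>();
        while (reader.Read())
        {
            var entity = factory();
            foreach (var m in mapping)
                entity.Fields[m.Key] = reader.GetValue(m.Value);
            result.Add(entity);
        }
        return result;
    }

    public static void Main()
    {
        var reader = new InMemoryReader(new[]
        {
            new object[] { 1, "Acme" },
            new object[] { 2, "Contoso" },
        });
        var mapping = new[]
        {
            new KeyValuePair<string, int>("CustomerId", 0),
            new KeyValuePair<string, int>("CompanyName", 1),
        };
        var entities = Fetch(reader, mapping, () => new Entity());
        Console.WriteLine(entities.Count + " " + entities[1].Fields["CompanyName"]);
        // prints: 2 Contoso
    }
}
```

The factory indirection is what lets the same generic fetch logic produce entities of any concrete type.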
You can build all this with ADO.NET interfaces. There are some
details with transactions (SaveTransaction is not defined on an
interface :X) but these are minor. So the core logic is fully database
generic, and your code targeting that core is fully database generic
too; you just use a different database SQL engine to produce the
actual SQL based on the input the generic code gives it.
Don't think this is a walk in the park :). I'm currently at
1.6MB of C# code just for the db engines and the generic framework on
top of them to provide all the functionality. But you can have
something up and running pretty fast; it gets tough when advanced
features like prefetch paths (fetching graphs of objects with 1 query
per graph node and clientside merging), subqueries, SQL expressions and
server-side paging are added. :)
Frans.
===================================
This list is hosted by DevelopMentor: http://www.develop.com
View archives and manage your subscription(s) at http://discuss.develop.com