At 9:48 AM -0600 4/12/07, Dennis Cote wrote:
Yes I did assume no coupling because you didn't suggest any. If
there is coupling this is just another case of the second example.
While I didn't explicitly suggest coupling before, I was making my
arguments on the general case where actions against a database may
possibly be coupled, and my argument was towards solutions that work
for the general case. Sorry if I didn't communicate before that I
was speaking to the general case.
So put all the sub-steps in a subroutine and call it.
In the general case, I don't control what the sub-steps are, but I am
being a proxy for someone else, and I don't know in advance what they
would ask for. Also, as users may want data returned to them between
the sub-steps, their use for which could include determining what
sub-steps are, I can't just generate a subroutine at runtime to
execute, as then they wouldn't get anything back from their
intermediate queries on time. That said, I recognize that in some
situations it is possible for the stored procedure to embed all the
decision making logic necessary from the application, but this isn't
always true, as eg some user may be involved in intermediate steps.
I think that a SQLite pager-based mechanism for tracking child
transactions is quite a bit less complicated and more reliable than
using your workaround, since no details have to be remembered but
for the pages that changed.
That is not true and you know it.
On the contrary, I believe what I said.
You are just pushing the complexity back to Richard. He will have to
implement the changes to the parser, code generation, pager layers,
and test suite, as well as address the backwards compatibility
issues.
I don't see this as a problem. While it is true that a lot of
complexity can be layered on top of the DBMS rather than being
internal to the DBMS, I see child transactions as something that is
best implemented inside the DBMS.
Speaking in a very loose analogy, I see the complexity as SQLite is
now compared to with child transaction support to be like replacing:
foo();
foo();
With:
for ... {
foo();
}
That is, I see it as the difference between explicitly doing
something twice, and doing it once but inside a loop.
So as one can refactor code to use loops rather than explicit
repeating, I don't see the end result here being much larger or more
difficult to maintain. That is, we aren't just adding code, but also
taking away some that has become redundant, is how I conceptualize it.
So SQLite with child transactions is only trivially less lite than
it is now, which is still lite.
If it is a trivial as you suggest, then you should have already
prepared a patch. :-)
I wasn't saying that the patch itself was trivial (though I'm saying
it should be a simpler than the patches for many other requested
features), but rather that the measures of how "lite" SQLite is would
change a trivially small amount between before and after.
In fact, I propose moving rollbackable child transaction support to
the top of the todo list, rather than it being in the middle, given
that its presence can make a lot of other todo or wishlist items
much easier to implement, I believe.
And if it will make a difference, I will even make a monetary
donation (as I can afford to) in order to sponsor its development
(though I would like to think that the benefits are compelling on
their own).
You will have to discuss this with Richard Hipp.
Yes, of course. And I already did do that a few minutes after the list post.
How will nested transactions make creating a your wrapper easier?
Please be specific.
Well, to help people better understand this, I should start but
outlining my own connected work.
I am writing a free and open source RDBMS of my own, whose main
innovations relative to the general DBMS field are in the query
engine, namely the public face (programmatic API and query language)
that application developers and their users interact with. My RDBMS
has its own query language and feature set which overlaps with but
isn't the same as that of existing SQL DBMSs.
My RDBMS is structured as a framework with separate public interface
and backend implementation layers (called "Interface" and "Engine),
such that the backend is a swappable plugin-style component. The
"interface" or wrappers thereof handle parsing user queries into an
standardized AST format, which is what an "Engine" takes as input and
the engine implements the AST-defined query however it wants. The
native language and AST of my RDBMS define rigorous semantics which
users should be able to expect, and which an Engine is supposed to
comply with.
Note that a single query in my language is a full-blown routine
definition (which in the trivial case just contains a single
statement), so what it does and what format of data it can process
for input or output is arbitrarily complex. All routine arguments
are named, and they serve a purpose analagous to SQL's "host
parameters" (aka, "bind variables"), which can exchange input and
output with the application. A query-routine is separately
prepared/compiled and executed by the application, the latter
repeatable as expected.
An Engine can either be self-contained, natively implementing
everything the AST specifies by itself, or it can implement a glue
layer to some other DBMS, translating the AST into queries native to
the other DBMS. The features of the underlying DBMS will be
exploited as much as possible, so we get the best resource
efficiency, and only if the underlying DBMS can't do something
natively, does the Engine emulate the missing features (often a more
complicated affair) over top of something else that the underlying
does support, in so much as is possible, so the user still gets the
features.
In the general case, my own focus is on the Interface, which is the
framework core, and each Engine is made and distributed separately by
a third party, citing the core as its main external dependency.
However, there is one "Example" Engine that I bundle with the
interface so that it is possible to thoroughly test the core in
isolation from external dependencies.
The framework core will also have a "Validator" test suite, akin to
Sun's certification suite for Java implementations, that uses the
whole RDBMS API against a user-specified Engine to check that it
behaves to spec. Any third party Engine can generally just invoke
this test suite in 'make test' rather than having to write up its own
test suite. Validator itself is tested using Example.
The core-bundled Example Engine is naively implemented and focuses on
delivering the correct semantics for all language features, but
doesn't worry about being scalable; its main purpose is to serve as a
proof of concept to study from, but that it can actually be used as a
test platform when building applications over my RDBMS.
But I will also develop or co-develop several other Engines (not just
depending on third parties for said), to be distributed separately,
which are intended more for handling larger workloads in a typical
production environment; they are non-naive, at a cost of being more
complex.
Generally speaking, I leave the development of storage layers, and/or
modules that are more concerned with the low level details, to
others, as they are a lot more innovative, expert, and interested in
this area than I am, while I focus on the user experience side of
things.
So this is where SQLite comes in.
One of the, probably *the*, non-Example Engines will do its job using
SQLite, and it will try to do so in the most optimal way possible,
using all the native and high-level features that SQLite offers to
their full advantage, as is possible. In large part, it will
generate SQLite-flavor SQL from my language's AST, have SQLite
execute it, and package any bind variables or output. If a single
SQLite query
It stands to reason, then, that my job in implementing my RDBMS over
SQLite will be easier if SQLite natively supports the fundamental /
conceptually low-level features that my RDBMS does, so mapping will
be fairly straight forward.
For example, since SQLite 3 supports prepared statements, and named
bind variables, a prepare() against my DBMS turns into a
generate-and-prepare-SQL against SQLite, a variable-bind against my
DBMS turns into one against SQLite. An execute() against my DBMS
turns into one against SQLite. Assuming SQLite supports multiple
simultaneous prepared statements, then any multitude of SQL
statements that my routine may turn into (when I can't just do the
preferred action of using a single SQL statement defining a SQL
stored procedure or nested query), will be a multitude of SQLite
prepared statements, which are then simply executed in sequence by
execute(). This is more efficient than instead having to both
prepare+execute SQL against SQLite in my execute() were it not to
have its own prepared statement support.
Now, about child transactions.
Part of my own RDBMS' feature set is that users can define their own
arbitrarily complex data types, operators/routines, and database
constraints (common examples being unique key and foreign key
constraints), which then can live in the database and just be
available to its users like built-ins. But besides that, my RDBMS
has a native set of types and operators that are different from those
in SQL. It also supports multiple-assignment statements (eg, make
changes to multiple tables in a single statement), and updateable
views; insofar as possible, views are treated the same as base
tables. It also supports performing data-definition by performing
data-manipulation against the information schema (in fact, typical
DDL is short-hand for doing that).
All database constraints in my RDBMS are immediate, and are applied
between statement boundaries at all levels, so no statement will ever
see a version of the database that is inconsistent / violates any
constraints. And so then what, you may ask do we do if we have a
constraint saying one table must be credited while another debited,
and the constraint shouldn't apply between the first and second step;
well to that I say, is what single multi-update statements are for;
the 2 updates are conceptually happening "at the same time".
My RDBMS is ACID in the strictest sense. It uses implicit
transactions everywhere. Every operator/routine is implicitly atomic
and hence a transaction. Every statement at any level of the call
stack is atomic. Any explicit transaction within most types of
routines takes the form of a try-style code block. So generally, the
number of transaction layers is equal to the depth of the call stack.
There are no standalone "start/commit/rollback" statements except in
the parent-most anonymous routine in the callstack that the
application directly invokes; all "stored" routines use the block or
implicit form only, so we ensure no dangling transactions. Any
statement failure, which can be due to the statement violating a
database constraint, will throw an exception, which will rewind the
routine call stack (and transaction stack) one at a time until some
routine or block catches it. If a statement/routine completes/exits
normally, its implicit transaction commits; it implicitly rolls back
if an exception causes it to exit early. When an exception is
caught, the transaction layers of the catching statement and its
parents have not rolled back and can still be committed, and new
child transactions can still be started, such as a "try again
differently" on the failure.
Suffice it to say that it will be a lot easier for me if the
implementation of each operator and routine et al in my language
against SQLite can simply issue "start/commit/rollback" transaction
at its start/end as is appropriate, which would happen nicely if
SQLite has native child transaction support.
If SQLite doesn't support child transactions, I would have to add the
complexity of remembering everything that was done by the parent-most
routine and keeping track of level counts and what-have you, which is
a real pain.
Oh, and in case you say that I'm already managing it myself with
Example, then I would say you are right, however in that case,
Example's analogy to SQLite's pager layer is entirely built-in to it,
and so I am implementing the feature right where it should be, there,
and Example only has to track what pages changed, not a list of
executed statements, which is a lot simpler.
So what I'm proposing for SQLite is no less than what I would expect
to do myself.
How much more complicated is the nested transaction solution if
*you* have to implement it?
If you mean, in my own SQLite-using program, hopefully this has now
been explained between my various posts.
If you mean, my implementing it in SQLite itself, that is highly
impractical since I'm very poor at C and the regular maintainers of
SQLite would be able to do the job many orders of magnitude faster
than I could.
My contribution to the development of SQLite is mainly on the side of
design suggestions (which are generally programming language
agnostic) and what things are helpful from the users' perspective.
-- Darren Duncan
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------