At 9:48 AM -0600 4/12/07, Dennis Cote wrote:
Yes I did assume no coupling because you didn't suggest any. If there is coupling this is just another case of the second example.

While I didn't explicitly suggest coupling before, I was making my arguments on the general case where actions against a database may possibly be coupled, and my argument was towards solutions that work for the general case. Sorry if I didn't communicate before that I was speaking to the general case.

So put all the sub-steps in a subroutine and call it.

In the general case, I don't control what the sub-steps are, but I am being a proxy for someone else, and I don't know in advance what they would ask for. Also, as users may want data returned to them between the sub-steps, their use for which could include determining what sub-steps are, I can't just generate a subroutine at runtime to execute, as then they wouldn't get anything back from their intermediate queries on time. That said, I recognize that in some situations it is possible for the stored procedure to embed all the decision making logic necessary from the application, but this isn't always true, as eg some user may be involved in intermediate steps.

I think that a SQLite pager-based mechanism for tracking child transactions is quite a bit less complicated and more reliable than using your workaround, since no details have to be remembered but for the pages that changed.

That is not true and you know it.

On the contrary, I believe what I said.

You are just pushing the complexity back to Richard. He will have to implement the changes to the parser, code generation, pager layers, and test suite, as well as address the backwards compatibility issues.

I don't see this as a problem. While it is true that a lot of complexity can be layered on top of the DBMS rather than being internal to the DBMS, I see child transactions as something that is best implemented inside the DBMS.

Speaking in a very loose analogy, I see the complexity as SQLite is now compared to with child transaction support to be like replacing:

  foo();
  foo();

With:

  for ... {
    foo();
  }

That is, I see it as the difference between explicitly doing something twice, and doing it once but inside a loop.

So as one can refactor code to use loops rather than explicit repeating, I don't see the end result here being much larger or more difficult to maintain. That is, we aren't just adding code, but also taking away some that has become redundant, is how I conceptualize it.

So SQLite with child transactions is only trivially less lite than it is now, which is still lite.

If it is a trivial as you suggest, then you should have already prepared a patch. :-)

I wasn't saying that the patch itself was trivial (though I'm saying it should be a simpler than the patches for many other requested features), but rather that the measures of how "lite" SQLite is would change a trivially small amount between before and after.

In fact, I propose moving rollbackable child transaction support to the top of the todo list, rather than it being in the middle, given that its presence can make a lot of other todo or wishlist items much easier to implement, I believe.

And if it will make a difference, I will even make a monetary donation (as I can afford to) in order to sponsor its development (though I would like to think that the benefits are compelling on their own).

You will have to discuss this with Richard Hipp.

Yes, of course.  And I already did do that a few minutes after the list post.

How will nested transactions make creating a your wrapper easier? Please be specific.

Well, to help people better understand this, I should start but outlining my own connected work.

I am writing a free and open source RDBMS of my own, whose main innovations relative to the general DBMS field are in the query engine, namely the public face (programmatic API and query language) that application developers and their users interact with. My RDBMS has its own query language and feature set which overlaps with but isn't the same as that of existing SQL DBMSs.

My RDBMS is structured as a framework with separate public interface and backend implementation layers (called "Interface" and "Engine), such that the backend is a swappable plugin-style component. The "interface" or wrappers thereof handle parsing user queries into an standardized AST format, which is what an "Engine" takes as input and the engine implements the AST-defined query however it wants. The native language and AST of my RDBMS define rigorous semantics which users should be able to expect, and which an Engine is supposed to comply with.

Note that a single query in my language is a full-blown routine definition (which in the trivial case just contains a single statement), so what it does and what format of data it can process for input or output is arbitrarily complex. All routine arguments are named, and they serve a purpose analagous to SQL's "host parameters" (aka, "bind variables"), which can exchange input and output with the application. A query-routine is separately prepared/compiled and executed by the application, the latter repeatable as expected.

An Engine can either be self-contained, natively implementing everything the AST specifies by itself, or it can implement a glue layer to some other DBMS, translating the AST into queries native to the other DBMS. The features of the underlying DBMS will be exploited as much as possible, so we get the best resource efficiency, and only if the underlying DBMS can't do something natively, does the Engine emulate the missing features (often a more complicated affair) over top of something else that the underlying does support, in so much as is possible, so the user still gets the features.

In the general case, my own focus is on the Interface, which is the framework core, and each Engine is made and distributed separately by a third party, citing the core as its main external dependency. However, there is one "Example" Engine that I bundle with the interface so that it is possible to thoroughly test the core in isolation from external dependencies.

The framework core will also have a "Validator" test suite, akin to Sun's certification suite for Java implementations, that uses the whole RDBMS API against a user-specified Engine to check that it behaves to spec. Any third party Engine can generally just invoke this test suite in 'make test' rather than having to write up its own test suite. Validator itself is tested using Example.

The core-bundled Example Engine is naively implemented and focuses on delivering the correct semantics for all language features, but doesn't worry about being scalable; its main purpose is to serve as a proof of concept to study from, but that it can actually be used as a test platform when building applications over my RDBMS.

But I will also develop or co-develop several other Engines (not just depending on third parties for said), to be distributed separately, which are intended more for handling larger workloads in a typical production environment; they are non-naive, at a cost of being more complex.

Generally speaking, I leave the development of storage layers, and/or modules that are more concerned with the low level details, to others, as they are a lot more innovative, expert, and interested in this area than I am, while I focus on the user experience side of things.

So this is where SQLite comes in.

One of the, probably *the*, non-Example Engines will do its job using SQLite, and it will try to do so in the most optimal way possible, using all the native and high-level features that SQLite offers to their full advantage, as is possible. In large part, it will generate SQLite-flavor SQL from my language's AST, have SQLite execute it, and package any bind variables or output. If a single SQLite query

It stands to reason, then, that my job in implementing my RDBMS over SQLite will be easier if SQLite natively supports the fundamental / conceptually low-level features that my RDBMS does, so mapping will be fairly straight forward.

For example, since SQLite 3 supports prepared statements, and named bind variables, a prepare() against my DBMS turns into a generate-and-prepare-SQL against SQLite, a variable-bind against my DBMS turns into one against SQLite. An execute() against my DBMS turns into one against SQLite. Assuming SQLite supports multiple simultaneous prepared statements, then any multitude of SQL statements that my routine may turn into (when I can't just do the preferred action of using a single SQL statement defining a SQL stored procedure or nested query), will be a multitude of SQLite prepared statements, which are then simply executed in sequence by execute(). This is more efficient than instead having to both prepare+execute SQL against SQLite in my execute() were it not to have its own prepared statement support.

Now, about child transactions.

Part of my own RDBMS' feature set is that users can define their own arbitrarily complex data types, operators/routines, and database constraints (common examples being unique key and foreign key constraints), which then can live in the database and just be available to its users like built-ins. But besides that, my RDBMS has a native set of types and operators that are different from those in SQL. It also supports multiple-assignment statements (eg, make changes to multiple tables in a single statement), and updateable views; insofar as possible, views are treated the same as base tables. It also supports performing data-definition by performing data-manipulation against the information schema (in fact, typical DDL is short-hand for doing that).

All database constraints in my RDBMS are immediate, and are applied between statement boundaries at all levels, so no statement will ever see a version of the database that is inconsistent / violates any constraints. And so then what, you may ask do we do if we have a constraint saying one table must be credited while another debited, and the constraint shouldn't apply between the first and second step; well to that I say, is what single multi-update statements are for; the 2 updates are conceptually happening "at the same time".

My RDBMS is ACID in the strictest sense. It uses implicit transactions everywhere. Every operator/routine is implicitly atomic and hence a transaction. Every statement at any level of the call stack is atomic. Any explicit transaction within most types of routines takes the form of a try-style code block. So generally, the number of transaction layers is equal to the depth of the call stack. There are no standalone "start/commit/rollback" statements except in the parent-most anonymous routine in the callstack that the application directly invokes; all "stored" routines use the block or implicit form only, so we ensure no dangling transactions. Any statement failure, which can be due to the statement violating a database constraint, will throw an exception, which will rewind the routine call stack (and transaction stack) one at a time until some routine or block catches it. If a statement/routine completes/exits normally, its implicit transaction commits; it implicitly rolls back if an exception causes it to exit early. When an exception is caught, the transaction layers of the catching statement and its parents have not rolled back and can still be committed, and new child transactions can still be started, such as a "try again differently" on the failure.

Suffice it to say that it will be a lot easier for me if the implementation of each operator and routine et al in my language against SQLite can simply issue "start/commit/rollback" transaction at its start/end as is appropriate, which would happen nicely if SQLite has native child transaction support.

If SQLite doesn't support child transactions, I would have to add the complexity of remembering everything that was done by the parent-most routine and keeping track of level counts and what-have you, which is a real pain.

Oh, and in case you say that I'm already managing it myself with Example, then I would say you are right, however in that case, Example's analogy to SQLite's pager layer is entirely built-in to it, and so I am implementing the feature right where it should be, there, and Example only has to track what pages changed, not a list of executed statements, which is a lot simpler.

So what I'm proposing for SQLite is no less than what I would expect to do myself.

How much more complicated is the nested transaction solution if *you* have to implement it?

If you mean, in my own SQLite-using program, hopefully this has now been explained between my various posts.

If you mean, my implementing it in SQLite itself, that is highly impractical since I'm very poor at C and the regular maintainers of SQLite would be able to do the job many orders of magnitude faster than I could.

My contribution to the development of SQLite is mainly on the side of design suggestions (which are generally programming language agnostic) and what things are helpful from the users' perspective.

-- Darren Duncan

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to