Brian Aker wrote:
Hi!
On Dec 14, 2009, at 4:34 PM, Jay Pipes wrote:
Right, but this is NOT what Stewart has proposed for the AlterTable statement.
Stewart (and Stewart, correct me if I'm wrong) would like to send the *actions*
that the master executed for the Alter Table. I am opposed to this (see link
above for the ML post with my reasons why)
Agreed, I would not want to send the actions. That would be completely
unportable.
I was just commenting on the list of actions being accurate.
OK.
We have 3 types of statements:
DML update statements: INSERT, UPDATE, DELETE
DML read-only statements: SELECT.
DDL statements: CREATE TABLE, ALTER TABLE, etc.
This right here is the heart of the differences. STATEMENTs are not really what you want. You want to know the sort of action, aka read/write/reformat...
Actually, no, what we've been discussing is that the actions are exactly *not*
what Paul/Toru need. They need to know the start and end of the statements...
By Statement, I mean the object Statement. The start and end of an execution?
Sure, that makes sense, but you don't want people hardcoding in if/else for
Statement objects.
Ah, OK, that makes much more sense :) Yes, we are talking about the
Statement *messages* (i.e. GPB messages) to be sent to the engine which
describe the SQL statement which should be executed.
I'm thinking that you and Stewart would benefit from a summary,
including all the example code that Paul, Toru, and I have been working
through, so I will put that together shortly. Basically, what Paul,
Toru, and I have been putting together is a flexible API that would
allow the engine to handle the SQL statement entirely if it wanted to,
to let the kernel execute the SQL statement entirely, or a work-sharing
API that allows the engine to participate at various times in the
execution of the statement by the kernel...
Anyway, I'll write up the summary.
Cheers,
Jay
-jay
And assuming we have 2 sets of calls:
- beginTransaction, commitTransaction/rollbackTransaction
- startStatement, endStatement
We could say, all types of statements require a beginTransaction() and a
startStatement() (and the corresponding endStatement() and
commitTransaction/rollbackTransaction()).
But I don't think this is absolutely correct:
* DML update statements require both beginTransaction() and a startStatement().
* DML read-only statements only require a beginTransaction() call because a
SELECT does not need a statement level transaction (because they cannot be
rolled back).
* And DDL statements only require a startStatement() because it is up to the
engine to decide if this can be done within a transaction or not.
For example if beginTransaction() is called before startStatement() then
engines that do not handle DDL in transactions should return an error. In
addition, if an engine does atomic DDL, then it can use the startStatement() to
begin a transaction.
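
A minimal C++ sketch of the rules above, to make the three cases concrete. All names here (StatementClass, RequiredCalls, requiredCalls, onStartDdlStatement) are hypothetical and are not part of any existing Drizzle API; the sketch only encodes which call pairs each statement class would receive and the DDL error case.

```cpp
#include <cassert>

// Hypothetical statement classes, following the three groups listed above.
enum class StatementClass { DmlUpdate, DmlReadOnly, Ddl };

// Which of the two call pairs the kernel would issue for a statement.
struct RequiredCalls {
  bool begin_transaction;  // beginTransaction() ... commit/rollbackTransaction()
  bool start_statement;    // startStatement() ... endStatement()
};

// Encodes the proposal: updates need both pairs, SELECT only needs the
// transaction pair, DDL only needs the statement pair.
inline RequiredCalls requiredCalls(StatementClass c) {
  switch (c) {
    case StatementClass::DmlUpdate:   return {true, true};
    case StatementClass::DmlReadOnly: return {true, false};
    case StatementClass::Ddl:         return {false, true};
  }
  assert(false && "unknown StatementClass");
  return {false, false};
}

enum class DdlResult { Ok, ErrorDdlInTransaction };

// Engine-side check for the DDL case: an engine that cannot run DDL inside
// a transaction reports an error if beginTransaction() was already called.
inline DdlResult onStartDdlStatement(bool in_transaction,
                                     bool engine_supports_transactional_ddl) {
  if (in_transaction && !engine_supports_transactional_ddl) {
    return DdlResult::ErrorDdlInTransaction;
  }
  return DdlResult::Ok;
}
```

An engine with atomic DDL would pass true for the second argument and could begin its own transaction inside startStatement().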
With these calls the engine will have most of the information it needs.
There is some additional information which should be provided when a cursor is
used:
For example, PBXT needs to know:
- which columns will be accessed (an optimization so that not all need to be
loaded),
- whether rows retrieved will be updated or deleted,
- if the rows need to be locked (as in SELECT FOR UPDATE).
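
As a rough illustration, the three pieces of cursor information could be bundled into one structure handed to the engine when the cursor is opened. This is only a sketch: CursorHints, RowIntent, and every member name are assumptions made up for this example, not existing Drizzle or PBXT types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: what the statement intends to do with the rows it reads.
enum class RowIntent { ReadOnly, Update, Delete };

// Hypothetical per-cursor hints covering the three items listed above.
struct CursorHints {
  std::vector<bool> columns_accessed;  // one flag per column of the table
  RowIntent intent;                    // will retrieved rows be changed?
  bool lock_rows;                      // e.g. SELECT ... FOR UPDATE

  explicit CursorHints(std::size_t column_count)
      : columns_accessed(column_count, false),
        intent(RowIntent::ReadOnly),
        lock_rows(false) {}

  // Record that the statement touches the column at position idx.
  void markColumn(std::size_t idx) {
    assert(idx < columns_accessed.size());
    columns_accessed[idx] = true;
  }

  // Column positions the engine actually has to load.
  std::vector<std::size_t> columnsToLoad() const {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < columns_accessed.size(); ++i) {
      if (columns_accessed[i]) out.push_back(i);
    }
    return out;
  }
};
```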
Toru, what's your opinion?
-jay
And this is how the engine would handle "ADD INDEX", or "ENCRYPT TABLE":
startStatement("ENCRYPT TABLE", "t1") --> return: use custom method
doTableOperation("ENCRYPT TABLE", "t1")
endStatement()
The engine can write table operations to its transaction log, and in this way
it could ensure that the entire ALTER TABLE statement is atomic.
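
The call sequence above could look roughly like this on the engine side. This is a sketch under assumptions: SketchEngine, StartResult, and the in-memory "log" are invented for illustration, and the set of operations the engine claims is arbitrary. The point is only that operations are buffered during the statement and reach the log together at endStatement().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical return value of startStatement().
enum class StartResult { UseDefault, UseCustomMethod };

class SketchEngine {
 public:
  SketchEngine() : in_statement_(false) {}

  // The engine claims the operations it wants to perform itself.
  StartResult startStatement(const std::string& op, const std::string& table) {
    assert(!in_statement_);
    in_statement_ = true;
    current_table_ = table;
    pending_.clear();
    if (op == "ENCRYPT TABLE" || op == "ADD INDEX") {
      return StartResult::UseCustomMethod;
    }
    return StartResult::UseDefault;
  }

  // Buffer the operation; nothing reaches the log until endStatement().
  void doTableOperation(const std::string& op, const std::string& table) {
    assert(in_statement_ && table == current_table_);
    pending_.push_back(op + " " + table);
  }

  // Publish all buffered operations together, standing in for an atomic
  // write to the engine's transaction log.
  void endStatement() {
    assert(in_statement_);
    log_.insert(log_.end(), pending_.begin(), pending_.end());
    pending_.clear();
    in_statement_ = false;
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  bool in_statement_;
  std::string current_table_;
  std::vector<std::string> pending_;
  std::vector<std::string> log_;
};
```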
On Dec 7, 2009, at 4:10 PM, Jay Pipes wrote:
Paul McCullagh wrote:
Hi Toru,
On Dec 7, 2009, at 3:31 AM, Toru Maesaka wrote:
Great to hear another use-case where knowing a statement type in
advance is useful :)
Yes, generally I need to know the following:
- If I have an update-type statement (i.e. whether the statement modifies rows).
- Whether I need a table lock (examples: ALTER TABLE, TRUNCATE, CHECK).
But, Paul, doesn't this depend on the engine itself? I mean, some
engines can do (some types of) ALTER TABLE without taking a table lock.
So, is this request really for whether the kernel thinks a table-level
lock is necessary, or is it really just for a descriptor of the
statement type?
And, if it really does just boil down to the statement type, then how do
we deal with the reality that Brian speaks about -- that statement type
will be pluggable, and how do we deal with future statement types for
pluggable engines?
Is a reasonable solution to pass to engines a sort of "statement
traits"? So, instead of passing ALTER_TABLE, CREATE_TABLE, UPDATE,
DELETE, etc, we instead pass a std::bitset<> (or uint64_t for C folks)
containing traits of the statement such as:
MODIFIES_DATA
MODIFIES_DEFINITION
etc, etc
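
To make the traits idea concrete, here is a small sketch using std::bitset<>. Only MODIFIES_DATA and MODIFIES_DEFINITION come from the list above; the bit positions, the helper functions, and the engine-side decision are assumptions for illustration. The engine inspects traits rather than statement types, so a new statement type would not require an engine change as long as it sets the same traits.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Trait bit positions; the numbering is an assumption of this sketch.
enum StatementTrait : std::size_t {
  MODIFIES_DATA = 0,
  MODIFIES_DEFINITION = 1,
  TRAIT_COUNT = 64  // same width as the uint64_t alternative
};

using StatementTraits = std::bitset<TRAIT_COUNT>;

// Hypothetical kernel-side helpers that describe a statement by its traits.
inline StatementTraits traitsForUpdate() {
  StatementTraits t;
  t.set(MODIFIES_DATA);
  return t;
}

inline StatementTraits traitsForAlterTable() {
  StatementTraits t;
  t.set(MODIFIES_DEFINITION);
  return t;
}

// Hypothetical engine-side decision: an engine that cannot change a table
// definition online takes a table lock, another engine may not need one.
inline bool engineWantsTableLock(const StatementTraits& traits,
                                 bool supports_online_ddl) {
  return traits.test(MODIFIES_DEFINITION) && !supports_online_ddl;
}

// Usage example: the same traits lead to different decisions per engine.
inline void exampleUsage() {
  StatementTraits t = traitsForAlterTable();
  assert(engineWantsTableLock(t, false));
  assert(!engineWantsTableLock(t, true));
}
```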
And then to deal with transaction locking concerns, just add a method to Cursor:
void Cursor::setTransactionIsolationLevel(enum enum_tx_isolation);
Cheers!
Jay
- If we have a SELECT FOR UPDATE.
I was talking to Toru about this, and another possibility is that we have statements
declare a needed "lock type" that any plugin could then query. I outlined the
solution for Toru, but I don't know if he has written the patch yet :)
I've taken notes from our discussion the other day. I'm planning on
working on it when I finish testing through my current progress of
BlitzDB.
Great! :)
For now, I'm happy with Jay's advice of using
current_session().
Cheers,
Toru
On Sat, Dec 5, 2009 at 5:59 AM, Brian Aker wrote:
Hi!
On Dec 4, 2009, at 3:12 AM, Paul McCullagh wrote:
If we have a startStatement() call, then it could be used in place of
beginAlter(), assuming we can determine the statement type, and the tables
involved.
The problem with relying on statement type is that at some point statement type
will be pluggable... which means you would constantly need to update your
engine for new statements.
Yuck!
I was talking to Toru about this, and another possibility is that we have statements
declare a needed "lock type" that any plugin could then query. I outlined the
solution for Toru, but I don't know if he has written the patch yet :)
Then, when a handle is returned to the pool it is deleted, instead of adding it
back to the pool.
BTW very soon engines will own their Cursor objects and will be free to reuse
them.
The locking thread waits until all handles are returned and deleted before it
can proceed. The lock on the pool then prevents a new table handle from being
created while the locking thread is busy.
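
The pool behaviour described here can be sketched as follows. This is not PBXT's code: HandlePool, TableHandle, and all method names are hypothetical, and as a simplification acquire() returns a null pointer while the pool is locked rather than blocking the caller.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct TableHandle {};  // stand-in for an engine's open table handle

class HandlePool {
 public:
  HandlePool() : locked_(false), outstanding_(0) {}

  // Hands out a handle, or nullptr while the pool is locked.
  std::unique_ptr<TableHandle> acquire() {
    std::lock_guard<std::mutex> guard(mu_);
    if (locked_) return nullptr;
    ++outstanding_;
    if (!idle_.empty()) {
      std::unique_ptr<TableHandle> h = std::move(idle_.back());
      idle_.pop_back();
      return h;
    }
    return std::unique_ptr<TableHandle>(new TableHandle());
  }

  // While locked, a returned handle is deleted instead of being pooled.
  void release(std::unique_ptr<TableHandle> h) {
    std::lock_guard<std::mutex> guard(mu_);
    assert(outstanding_ > 0);
    --outstanding_;
    if (locked_) {
      h.reset();
      if (outstanding_ == 0) all_returned_.notify_all();
    } else {
      idle_.push_back(std::move(h));
    }
  }

  // Called by the locking thread (e.g. before a rename or delete): drops
  // idle handles, then waits until every outstanding handle has come back.
  void lockExclusive() {
    std::unique_lock<std::mutex> lk(mu_);
    locked_ = true;
    idle_.clear();
    all_returned_.wait(lk, [this] { return outstanding_ == 0; });
  }

  void unlockExclusive() {
    std::lock_guard<std::mutex> guard(mu_);
    locked_ = false;
  }

  std::size_t idleCount() const {
    std::lock_guard<std::mutex> guard(mu_);
    return idle_.size();
  }

 private:
  mutable std::mutex mu_;
  std::condition_variable all_returned_;
  bool locked_;
  std::size_t outstanding_;
  std::vector<std::unique_ptr<TableHandle>> idle_;
};
```

The sketch assumes a single locking thread at a time; supporting several concurrent lockers would need extra bookkeeping.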
Either way, it would be good if Drizzle closes all handlers/cursors before a
table is deleted or renamed.
I would say that long term this will be optional, based on what the engine
requires.
OK, this makes things a lot simpler! Indeed, if we don't need to support LOCK
TABLE then external_lock() can be removed altogether.
Have you tried removing external_lock() right now to see if any issues pop up?
Cheers,
-Brian
--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help : https://help.launchpad.net/ListHelp