Re: [Drizzle-discuss] Improving the Engine API

Jay Pipes Mon, 14 Dec 2009 16:35:11 -0800

Brian Aker wrote:

Hi!


On Dec 8, 2009, at 2:24 AM, Paul McCullagh wrote:

OK, I think we agree on this, but to sum up, here are the best arguments for 
providing both methods (create/copy/rename/drop method & the engine handles the 
entire operation itself):

- The create/copy/rename/drop method must be used to alter the table engine.
- It saves engine developers time because they only have to implement the basic 
operations to support ALTER TABLE.
- Engines can concentrate on optimizing certain operations (e.g. add index) 
without having to implement the entire ALTER TABLE.

No matter what method the engine chooses, the server should still just 
replicate the GPB statement data (as you suggest in 
https://lists.launchpad.net/drizzle-discuss/msg05305.html).

This basically means that the statement is replicated, and not the operations.


Right, which is the case today.

Right, but this is NOT what Stewart has proposed for the Alter Table statement. Stewart (and Stewart, correct me if I'm wrong) would like to send the *actions* that the master executed for the Alter Table. I am opposed to this (see link above for the ML post with my reasons why).

You would have the Statement type as listed in the transaction.proto:

message Statement
{
enum Type
{
  ROLLBACK = 0; /* A ROLLBACK indicator */
  INSERT = 1; /* An INSERT statement */
  DELETE = 2; /* A DELETE statement */
  UPDATE = 3; /* An UPDATE statement */
  TRUNCATE_TABLE = 4; /* A TRUNCATE TABLE statement */
  CREATE_SCHEMA = 5; /* A CREATE SCHEMA statement */
  ALTER_SCHEMA = 6; /* An ALTER SCHEMA statement */
  DROP_SCHEMA = 7; /* A DROP SCHEMA statement */
  CREATE_TABLE = 8; /* A CREATE TABLE statement */
  ALTER_TABLE = 9; /* An ALTER TABLE statement */
  DROP_TABLE = 10; /* A DROP TABLE statement */
  SET_VARIABLE = 98; /* A SET statement */
  RAW_SQL = 99; /* A raw SQL statement */
}
...
}

There is no SELECT in the list, but maybe this is correct. I am just thinking 
aloud...

We have 3 types of statements:
DML update statements:    INSERT, UPDATE, DELETE
DML read-only statements: SELECT.
DDL statements:           CREATE TABLE, ALTER TABLE, etc.

This right here is the heart of the differences. STATEMENTs are not really what you want. You want to know the sort of action, aka read/write/reformat...

Actually, no, what we've been discussing is that the actions are exactly *not* what Paul/Toru need. They need to know the start and end of the statements...


-jay

And assuming we have 2 sets of calls:
- beginTransaction, commitTransaction/rollbackTransaction
- startStatement, endStatement

We could say, all types of statements require a beginTransaction() and a 
startStatement() (and the corresponding endStatement() and 
commitTransaction/rollbackTransaction()).

But I don't think this is absolutely correct:

* DML update statements require both beginTransaction() and a startStatement().
* DML read-only statements only require a beginTransaction() call because a 
SELECT does not need a statement-level transaction (it cannot be rolled 
back).
* And DDL statements only require a startStatement() because it is up to the 
engine to decide if this can be done within a transaction or not.

For example, if beginTransaction() is called before startStatement(), then 
engines that do not handle DDL in transactions should return an error. In 
addition, if an engine does atomic DDL, then it can use the startStatement() to 
begin a transaction.
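
A minimal sketch of the rules above, to make the three cases concrete. The names StatementClass, RequiredCalls, DdlSupport, requiredCallsFor() and acceptDdlStart() are hypothetical and exist only for this illustration; they are not part of the Drizzle API.

```cpp
// Hypothetical classification, following the three statement groups above.
enum class StatementClass { DML_UPDATE, DML_READ_ONLY, DDL };

// Which of the two proposed call pairs the kernel would issue.
struct RequiredCalls {
  bool begin_transaction;  // beginTransaction() + commit/rollbackTransaction()
  bool start_statement;    // startStatement() + endStatement()
};

inline RequiredCalls requiredCallsFor(StatementClass c) {
  switch (c) {
    case StatementClass::DML_UPDATE:    return {true, true};
    case StatementClass::DML_READ_ONLY: return {true, false};
    case StatementClass::DDL:           return {false, true};
  }
  return {true, true};  // conservative default, not reached for the values above
}

// Hypothetical engine capability: can DDL run inside a transaction?
enum class DdlSupport { NON_TRANSACTIONAL, ATOMIC };

// Engine-side check when startStatement() arrives for a DDL statement.
// Returns false (an error) if a transaction is already open and the engine
// cannot run DDL inside one; an atomic-DDL engine accepts either way.
inline bool acceptDdlStart(DdlSupport support, bool transaction_open) {
  return !(support == DdlSupport::NON_TRANSACTIONAL && transaction_open);
}
```

Whether a rejected DDL statement is reported as a boolean or as an error code is left open here; the thread only says the engine "should return an error".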

With these calls the engine will have most of the information it needs.

There is some additional information which should be provided when a cursor is 
used:

For example, PBXT needs to know:

- which columns will be accessed (an optimization so that not all need to be 
loaded),
- whether rows retrieved will be updated or deleted,
- if the rows need to be locked (as in SELECT FOR UPDATE).
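
One way to picture the three items above is a small hints structure handed to the engine when a cursor is opened. This is only a sketch under assumptions: CursorHints and columnsToLoad() are made-up names, and nothing like them is confirmed to exist in Drizzle or PBXT.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-cursor hints, one field per item in the list above.
struct CursorHints {
  std::vector<bool> columns_accessed;  // one flag per table column
  bool will_modify_rows = false;       // rows retrieved will be updated/deleted
  bool lock_rows = false;              // as in SELECT FOR UPDATE
};

// Example engine-side use: how many columns actually need to be loaded.
inline std::size_t columnsToLoad(const CursorHints& hints) {
  std::size_t count = 0;
  for (bool used : hints.columns_accessed) {
    if (used) ++count;
  }
  return count;
}
```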

Toru, what's your opinion?

-jay

And this is how the engine would handle "ADD INDEX", or "ENCRYPT TABLE":
startStatement("ENCRYPT TABLE", "t1") --> return: use custom method
doTableOperation("ENCRYPT TABLE", "t1")
endStatement()
The engine can write table operations to its transaction log, and in this way 
it could ensure that the entire ALTER TABLE statement is atomic.
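
The call sequence above could be sketched roughly as follows. AlterMethod, SketchEngine and runAlter() are hypothetical names, the choice of which operations the engine handles itself is an assumption, and the in-memory log merely stands in for the engine's transaction log.

```cpp
#include <string>
#include <vector>

// Hypothetical answer from startStatement(): let the kernel fall back to
// create/copy/rename/drop, or let the engine run the operation itself.
enum class AlterMethod { USE_COPY, USE_CUSTOM };

class SketchEngine {
 public:
  AlterMethod startStatement(const std::string& op, const std::string& table) {
    log_.push_back("start " + op + " " + table);
    // Assumption: this engine only optimizes these two operations.
    bool custom = (op == "ENCRYPT TABLE" || op == "ADD INDEX");
    return custom ? AlterMethod::USE_CUSTOM : AlterMethod::USE_COPY;
  }
  void doTableOperation(const std::string& op, const std::string& table) {
    log_.push_back("op " + op + " " + table);
  }
  void endStatement() { log_.push_back("end"); }
  const std::vector<std::string>& log() const { return log_; }

 private:
  std::vector<std::string> log_;  // stand-in for the transaction log
};

// Kernel-side driver for one ALTER-style operation.
inline AlterMethod runAlter(SketchEngine& engine, const std::string& op,
                            const std::string& table) {
  AlterMethod method = engine.startStatement(op, table);
  if (method == AlterMethod::USE_CUSTOM) {
    engine.doTableOperation(op, table);
  }
  // USE_COPY: the kernel would run create/copy/rename/drop here (not shown).
  engine.endStatement();
  return method;
}
```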

On Dec 7, 2009, at 4:10 PM, Jay Pipes wrote:

Paul McCullagh wrote:

Hi Toru,
On Dec 7, 2009, at 3:31 AM, Toru Maesaka wrote:

Great to hear another use-case where knowing a statement type in
advance is useful :)

Yes, generally I need to know the following:
- If I have an update-type statement (i.e. whether the statement modifies rows).
- Whether I need a table lock (examples: ALTER TABLE, TRUNCATE, CHECK).

But, Paul, doesn't this depend on the engine itself?  I mean, some
engines can do (some types of) ALTER TABLE without taking a table lock.
So, is this request really for whether the kernel thinks a table-level
lock is necessary, or is it really just for a descriptor of the
statement type?

And, if it really does just boil down to the statement type, then how do
we deal with the reality that Brian speaks about -- that statement type
will be pluggable, and how do we deal with future statement types for
pluggable engines?

Is a reasonable solution to pass to engines a sort of "statement
traits"?  So, instead of passing ALTER_TABLE, CREATE_TABLE, UPDATE,
DELETE, etc, we instead pass a std::bitset<> (or uint64_t for C folks)
containing traits of the statement such as:

MODIFIES_DATA
MODIFIES_DEFINITION
etc, etc
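
A minimal sketch of the "statement traits" idea with std::bitset. Only MODIFIES_DATA and MODIFIES_DEFINITION come from this message; the bit positions, the 64-bit width (chosen to match the uint64_t alternative), and the helper names are assumptions, as is the choice of which traits each statement sets.

```cpp
#include <bitset>

// Trait bit positions; the numbering is an assumption for illustration.
enum StatementTrait {
  MODIFIES_DATA = 0,
  MODIFIES_DEFINITION = 1,
  TRAIT_COUNT = 64  // fits a uint64_t for C callers
};

typedef std::bitset<TRAIT_COUNT> StatementTraits;

// The kernel would build the traits once per statement...
inline StatementTraits traitsForUpdate() {
  StatementTraits traits;
  traits.set(MODIFIES_DATA);
  return traits;
}

inline StatementTraits traitsForAlterTable() {
  StatementTraits traits;
  traits.set(MODIFIES_DEFINITION);
  return traits;
}

// ...and an engine would test traits instead of switching on statement type,
// so a new pluggable statement type needs no engine change.
inline bool engineNeedsWriteSetup(const StatementTraits& traits) {
  return traits.test(MODIFIES_DATA);
}
```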

And then to deal with transaction locking concerns, just add a method to Cursor:

void Cursor::setTransactionIsolationLevel(enum enum_tx_isolation);

Cheers!

Jay

- If we have a SELECT FOR UPDATE.

I was talking to Toru about this, and another possibility is that we have statements 
declare a needed "lock type" that any plugin could then query. I outlined the 
solution for Toru, but I don't know if he has written the patch yet :)

I've taken notes from our discussion the other day. I'm planning on
working on it when I finish testing through my current progress of
BlitzDB.

Great! :)

For now, I'm happy with Jay's advice of using
current_session().

Cheers,
Toru

On Sat, Dec 5, 2009 at 5:59 AM, Brian Aker wrote:

Hi!

On Dec 4, 2009, at 3:12 AM, Paul McCullagh wrote:

If we have a startStatement() call, then it could be used in place of 
beginAlter(), assuming we can determine the statement type, and the tables 
involved.

The problem with relying on statement type is that at some point statement type 
will be pluggable... which means you would constantly need to update your 
engine for new statements.

Yuck!

I was talking to Toru about this, and another possibility is that we have statements 
declare a needed "lock type" that any plugin could then query. I outlined the 
solution for Toru, but I don't know if he has written the patch yet :)
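
A rough sketch of what a statement-declared lock type might look like. The actual design was only outlined to Toru and is not described in this thread, so LockType, StatementSketch and mustBlockOtherWriters() are invented names and the set of lock levels is an assumption.

```cpp
// Hypothetical lock levels a statement could declare.
enum class LockType { NONE, SHARED_ROW, EXCLUSIVE_ROW, TABLE };

// A statement declares what it needs once, when it is constructed.
class StatementSketch {
 public:
  explicit StatementSketch(LockType needed) : needed_(needed) {}
  LockType neededLock() const { return needed_; }

 private:
  LockType needed_;
};

// Any plugin can query the declared lock without knowing the statement type.
inline bool mustBlockOtherWriters(const StatementSketch& statement) {
  return statement.neededLock() == LockType::TABLE;
}
```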

Then, when a handle is returned to the pool it is deleted, instead of adding it 
back to the pool.

BTW very soon engines will own their Cursor objects and will be free to reuse 
them.

The locking thread waits until all handles are returned and deleted before it 
can proceed. The lock on the pool then prevents a new table handle from being 
created while the locking thread is busy.
Either way, it would be good if Drizzle closes all handlers/cursors before a 
table is deleted or renamed.
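
The pool behaviour described above can be sketched with a mutex and a condition variable. This is an illustration, not PBXT's code: HandlePoolSketch and its members are hypothetical, handles are reduced to counters, and acquire() refuses rather than blocks while the pool is locked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

class HandlePoolSketch {
 public:
  // Hand out a handle; refused while the pool is locked.
  bool acquire() {
    std::lock_guard<std::mutex> guard(mutex_);
    if (locked_) return false;
    if (pooled_ > 0) --pooled_;  // reuse an idle handle if one exists
    ++outstanding_;
    return true;
  }

  // Return a handle. While the pool is locked it is deleted, not pooled.
  void release() {
    std::lock_guard<std::mutex> guard(mutex_);
    assert(outstanding_ > 0);
    --outstanding_;
    if (!locked_) ++pooled_;
    if (outstanding_ == 0) drained_.notify_all();
  }

  // Locking thread: lock the pool, delete idle handles, then wait until
  // every outstanding handle has been returned (and thereby deleted).
  void lockAndDrain() {
    std::unique_lock<std::mutex> guard(mutex_);
    locked_ = true;
    pooled_ = 0;
    drained_.wait(guard, [this] { return outstanding_ == 0; });
  }

  void unlock() {
    std::lock_guard<std::mutex> guard(mutex_);
    locked_ = false;
  }

  int pooled() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return pooled_;
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable drained_;
  bool locked_ = false;
  int outstanding_ = 0;  // handles currently in use
  int pooled_ = 0;       // idle handles kept for reuse
};
```

After lockAndDrain() returns, no handle on the table exists, which is the state a delete or rename wants.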

I would say that long term this will be optional, based on what the engine 
requires.

OK, this makes things a lot simpler! Indeed, if we don't need to support LOCK 
TABLE then external_lock() can be removed altogether.

Have you tried removing external_lock() right now to see if any issues pop up?

Cheers,
    -Brian

--
Paul McCullagh
PrimeBase Technologies
www.primebase.org
www.blobstreaming.org
pbxt.blogspot.com



