Re: [Drizzle-discuss] Improving the Engine API (was Re: New PBXT Drizzle-specific storage engine...)

Jay Pipes Tue, 22 Dec 2009 14:11:14 -0800

Paul McCullagh wrote:

On Dec 16, 2009, at 11:58 PM, Jay Pipes wrote:
Paul McCullagh wrote:
Hi Toru,
On Dec 8, 2009, at 6:37 PM, Toru Maesaka wrote:
Hi Paul,
- If I have a update type statement (i.e. whether the statementmodifies
rows).
- Whether I need a table lock (examples: ALTER TABLE, TRUNCATE,CHECK).
- If we have a SELECT FOR UPDATE.
Agreed! Especially for the third point. For me this is what I need:
- Whether the statement will change the table state (so, updates ingeneral)
- Whether the entire table needs to be locked.
- Whether the statement only performs READ operations.

The third point I don't think needs to be explicitly defined but it
would be relieving for the engine to know (or be guaranteed) that the
table state will not be changed.

Whether the storage engine should obey the statement characteristics
or not is up to the engine developer I guess... Nonetheless it would
be brilliant as a "hint" for everyone I think.
The question here is whether the engine should lock the tables in thestartStatement() call, or when the cursor is used?
Lets look at an UPDATE statement:
UPDATE t1, t2 SET t1.c1=50 WHERE t1.id = t2.id and t2.c3='abc';
In this statement, t1 is being read and updated, and t2 is just beingread. Both tables are being scanned (lets assume there are no indexes).
Drizzle does not support multi-table UPDATEs in a single statement.For Drizzle, the above statement would have to be written like so:
UPDATE t1 SET t1.c1=50
WHERE t1.id IN (
 SELECT t2.id FROM t2 WHERE t2.c3='abc'
);
Yes, OK. I think this makes it even more obvious that t2 is being read,and t1 is being read and updated.
Here is some pseudo code for the execution of this statement:
engine->beginTransaction()
engine->startStatement(gpb_stat_info)
a = engine->getCursor("t1", WILL_UPDATE)
b = engine->getCursor("t2", READ_ONLY)
Actually, the above is not correct. Because the update to t1 isdependent on rows in a consistent snapshot of t2, you would actuallyhave to do (again, in pseudocode):
b= engine->getCursor("t2", REPEATABLE_READ);
a= engine->getCursor("t1", WILL_UPDATE);
Yes, generally I agree.
In fact, WILL_UPDATE may be: COMMITTED_READ | FOR_UPDATE (at least Ibelieve this is the case with InnoDB).
Unless the engine uses "optimistic locking" (which is the pure MVCCapproach), and then it would be: REPEATABLE_READ | FOR_UPDATE.
So this raises the question of whether the use of REPEATABLE_READ andCOMMITTED_READ is not telling the engine too much about how these thingsshould be implemented...?

No, I don't view it as implementation, really. The REPEATABLE_READ orCOMMITTED_READ is something the kernel has to communicate to the enginebecause it is what the *user* has asked for -- either explicitly or viaa configuration default. So, it's not really the kernel saying to theengine "do it this way". Instead, it's the kernel saying "this is whatthe user is requiring." So, in that sense, I believe it absolutelybelongs in the API.

I want to flesh out the example code we've been working on in thisthread some more. I will do this over the holidays and post the code tothe wiki I've been using. I'll take the approach of demonstrating howthe kernel would call the engine for various SQL scenarios. Hopefully,that will give us a rounded look at the way the kernel can interact withthe engine and the engine's cursors.


Cheers,

Jay

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Re: [Drizzle-discuss] Improving the Engine API (was Re: New PBXT Drizzle-specific storage engine...)

Reply via email to