Paul McCullagh wrote:
On Dec 16, 2009, at 11:58 PM, Jay Pipes wrote:
Paul McCullagh wrote:
Hi Toru,
On Dec 8, 2009, at 6:37 PM, Toru Maesaka wrote:
Hi Paul,
- If I have a update type statement (i.e. whether the statement
modifies
rows).
- Whether I need a table lock (examples: ALTER TABLE, TRUNCATE,
CHECK).
- If we have a SELECT FOR UPDATE.
Agreed! Especially for the third point. For me this is what I need:
- Whether the statement will change the table state (so, updates in
general)
- Whether the entire table needs to be locked.
- Whether the statement only performs READ operations.
The third point I don't think needs to be explicitly defined but it
would be relieving for the engine to know (or be guaranteed) that the
table state will not be changed.
Whether the storage engine should obey the statement characteristics
or not is up to the engine developer I guess... Nonetheless it would
be brilliant as a "hint" for everyone I think.
The question here is whether the engine should lock the tables in the
startStatement() call, or when the cursor is used?
Lets look at an UPDATE statement:
UPDATE t1, t2 SET t1.c1=50 WHERE t1.id = t2.id and t2.c3='abc';
In this statement, t1 is being read and updated, and t2 is just being
read. Both tables are being scanned (lets assume there are no indexes).
Drizzle does not support multi-table UPDATEs in a single statement.
For Drizzle, the above statement would have to be written like so:
UPDATE t1 SET t1.c1=50
WHERE t1.id IN (
SELECT t2.id FROM t2 WHERE t2.c3='abc'
);
Yes, OK. I think this makes it even more obvious that t2 is being read,
and t1 is being read and updated.
Here is some pseudo code for the execution of this statement:
engine->beginTransaction()
engine->startStatement(gpb_stat_info)
a = engine->getCursor("t1", WILL_UPDATE)
b = engine->getCursor("t2", READ_ONLY)
Actually, the above is not correct. Because the update to t1 is
dependent on rows in a consistent snapshot of t2, you would actually
have to do (again, in pseudocode):
b= engine->getCursor("t2", REPEATABLE_READ);
a= engine->getCursor("t1", WILL_UPDATE);
Yes, generally I agree.
In fact, WILL_UPDATE may be: COMMITTED_READ | FOR_UPDATE (at least I
believe this is the case with InnoDB).
Unless the engine uses "optimistic locking" (which is the pure MVCC
approach), and then it would be: REPEATABLE_READ | FOR_UPDATE.
So this raises the question of whether the use of REPEATABLE_READ and
COMMITTED_READ is not telling the engine too much about how these things
should be implemented...?
No, I don't view it as implementation, really. The REPEATABLE_READ or
COMMITTED_READ is something the kernel has to communicate to the engine
because it is what the *user* has asked for -- either explicitly or via
a configuration default. So, it's not really the kernel saying to the
engine "do it this way". Instead, it's the kernel saying "this is what
the user is requiring." So, in that sense, I believe it absolutely
belongs in the API.
I want to flesh out the example code we've been working on in this
thread some more. I will do this over the holidays and post the code to
the wiki I've been using. I'll take the approach of demonstrating how
the kernel would call the engine for various SQL scenarios. Hopefully,
that will give us a rounded look at the way the kernel can interact with
the engine and the engine's cursors.
Cheers,
Jay
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help : https://help.launchpad.net/ListHelp