On 2/5/07, A. Pagaltzis <[EMAIL PROTECTED]> wrote:
* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [2007-02-04 13:35]:
> "A. Pagaltzis" <[EMAIL PROTECTED]> wrote:
> > It's a pity that INSERT OR IGNORE (apparently?) does not set
> > last_insert_id properly regardless of outcome,
>
> Consider this case:
>
> CREATE TABLE t1(x INTEGER PRIMARY KEY, y UNIQUE, z UNIQUE);
> INSERT INTO t1 VALUES(1,'a','b');
> INSERT INTO t1 VALUES(2,'x','y');
>
> INSERT OR IGNORE INTO t1 VALUES(NULL, 'a','y');
>
> For the INSERT OR IGNORE statement, does the last_insert_rowid
> get set to 1 or 2?
Yeah, good point. Silly I didn't think that far as we've just had
this in another thread.
Out of curiosity, though – does SQLite find both rows in this
case, or does it abort as soon as it sees any one constraint
violation before it knows there would be more?
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
I do agree with DRH that it would be difficult to come up with a
consistent mechanism for returning the rowid for constraint
collisions, and that the benefit of this is negligible.
In my application, the benefit was reduced greatly with the addition
of some simple caching to make the likely hood of duplicate string
insertions even smaller.
However, in an effort to provide some useful performance data for
everyone, I'm making available the results of my testing with the
various suggested insertion algorithms below. The fastest was my
"Hack" version, followed closely by the "Insert/Select/Insert" version
(which DRH agreed would most likely be the fastest "approved"
mechanism), with the "Insert+Ignore/Insert+Select" version proposed by
Dennis Cote, while being simplest, was the slowest.
All the tests were performed with version 3.3.12 of SQLite on unloaded systems.
For the Windows version, I used the pre-compiled DLL version available here:
http://www.sqlite.org/sqlitedll-3_3_12.zip
For the embedded version, I used the "pure C" source version here:
http://www.sqlite.org/sqlite-source-3_3_12.zip
compiled with the following options:
-DOS_UNIX -DNO_TCL -DTHREADSAFE=1 -DNDEBUG -DTEMP_STORE=2 -DHAVE_USLEEP=1
The Windows test were done on a laptop running Windows XP SP2 with an
Intel Dual Core processor running at 1.83 GHz and 1GB of RAM. Under
Windows, the SetProcessAffinityMask() kernel API was used to force
single CPU/core usage to try and avoid multi-processor/core timing
issues. Additionally, the GetProcessTimes() kernel API was used to
get the total CPU usage time (rather than the "wall" time.)
The embedded tests were done on a proprietary board with a single CPU
and a 64MB RAM shared between application usage and a RAM based file
system (ie. no hard drive).
The full source code for the Windows version of my test is here:
http://www.mo-ware.com/shane/misc/insert_test.c
I'd appreciate any comments on how to improve it's performance
as at this point, that my key goal.
Schema:
CREATE TABLE Objects ( ObjectId INTEGER PRIMARY KEY, StringId INTEGER )
CREATE TABLE Strings ( StringId INTEGER PRIMARY KEY, Value VARCHAR(30) UNIQUE )
Version 0 (Hack):
INSERT INTO Strings (value) VALUES ('foo')
if string insert result is SQLITE_OK
rowid = last_insert_rowid
else if result is SQLITE_CONSTRAINT
rowid = get_rowid_from_VDBE_stack
end if
if rowid
INSERT INTO Objects (StringId) VALUES (rowid)
endif
Version 1 (Insert/Select/Insert):
INSERT INTO Strings (value) VALUES ('foo')
if string insert result is SQLITE_OK
rowid = last_insert_rowid
else if result is SQLITE_CONSTRAINT
rowid = SELECT StringId FROM Strings WHERE Value = 'foo'
end if
if rowid
INSERT INTO Objects (StringId) VALUES (rowid)
endif
Version 2 (Insert+Ignore/Insert+Select):
INSERT OR IGNORE INTO Strings (value) VALUES ('foo')
if string insert result is SQLITE_OK
INSERT INTO Objects (StringId) VALUES ((SELECT StringId FROM
Strings WHERE Value = 'foo'))
end if
Test procedure:
Perform 50 transactions of 1000 inserts each (50000 total).
The test was coded such that each insert has ~10% chance of being a
duplicate string insert. It should be noted that be using the same
random seed for all runs and versions, the exact same database (byte
for byte) is generated.
Timing data:
10 runs, throw out lowest and highest, and average the remaining 8.
Total run time reported in milliseconds (ms), with the percent slower
than the fastest.
Windows Embedded
(ms) (%) (ms) (%)
V0 2718 0 30233 0
V1 3014 10.89 32068 6.07
V2 3468 27.59 36275 19.98
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------