> > Op 17 sep. 2019, om 04:26 heeft Keith Medcalf <kmedc...@dessus.com> het > volgende geschreven: > > > On Monday, 16 September, 2019 14:22, E.Pasma <pasm...@concepts.nl> wrote: > >> Stop stop stop > > You are right. What a difference a spelling error makes ... No wonder it > took so long as it was doing table scans -- and the optimizer was doing a > jolly job in the other cases in dealing with it. > > Note that the sqlite3 wrapper cannot do .executemany() with SELECT statements > ... but it will do them with INSERT statements. Nevertheless, the results > are reasonably similar to these obtained with APSW ... > ... >> st 1000 > Creating db and sample keys: 1000000 rows; 1000 keys > Method 1: Individual Row 00:00:00.019247 > Method 2: Individual Row (Sorted) 00:00:00.017748 > Method 3: Rows with ExecMany 00:00:00.016084 > Method 3: Rows with ExecMany Sort 00:00:00.015766 > Method 4: Using IN temp 00:00:00.007528 > Method 5: Using IN temp (sorted) 00:00:00.007821 > Method 6: Using IN temp no rowid 00:00:00.007600 > Method 7: Using IN (dynamic) 00:00:00.005317 > Method 8: Using IN (sorted) 00:00:00.004884 > Method 9: Using IN CArray 00:00:00.005081 > Method A: Using IN CArray sorted 00:00:00.005190 ..
> Using this code: > > #! python3 > > import apsw > import datetime > import random > import sqlite3 > import sys > import time > > datasize = 1000000 > rows = int(sys.argv[1]) > > elapsed = lambda st, et: datetime.datetime.utcfromtimestamp((et - st)).time() > tuplize = lambda x: (x,) > > db = apsw.Connection(':memory:') > #db = sqlite3.connect(':memory:', isolation_level=None) > > print('Creating db and sample keys:', end=' ', flush=True) > db.executescript(''' > create table x > ( > id integer primary key, > data blob > ); > insert into x > with a(x) as ( > select 1 > union all > select x + 1 > from a > where x < %d > ) > select x, randomblob(30) > from a; > analyze; > ''' % (datasize,)) > print(db.execute('select count(*) from x').fetchone()[0], 'rows;', end=' ') > > rowset = [i for i in range(datasize)] > random.shuffle(rowset) > rowset = rowset[:rows] > print(len(rowset), 'keys') > > print('Method 1: Individual Row ', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > for key in rowset: > row = db.execute('select * from x where id=?', (key,)).fetchone() > db.commit() > print(elapsed(st, time.time())) > > print('Method 2: Individual Row (Sorted)', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > for key in sorted(rowset): > row = db.execute('select * from x where id=?', (key,)).fetchone() > db.commit() > print(elapsed(st, time.time())) > > print('Method 3: Rows with ExecMany ', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > for row in db.executemany('select * from x where id=?', list(map(tuplize, > rowset))): > pass > db.commit() > print(elapsed(st, time.time())) > > print('Method 3: Rows with ExecMany Sort', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > for row in db.executemany('select * from x where id=?', list(map(tuplize, > sorted(rowset)))): > pass > db.commit() > print(elapsed(st, time.time())) > > print('Method 4: Using IN temp ', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > db.executescript('create temporary table keys (key)') > db.executemany('insert into keys values (?)', list(map(tuplize, > sorted(rowset)))) > for row in db.execute('select * from x where id in temp.keys'): > pass > db.executescript('drop table temp.keys') > db.commit() > print(elapsed(st, time.time())) > > print('Method 5: Using IN temp (sorted) ', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > db.executescript('create temporary table keys (key)') > db.executemany('insert into keys values (?)', list(map(tuplize, > sorted(rowset)))) > for row in db.execute('select * from x where id in temp.keys'): > pass > db.executescript('drop table temp.keys') > db.commit() > print(elapsed(st, time.time())) > > print('Method 6: Using IN temp no rowid ', end=' ', flush=True) > st = time.time() > db.executescript('BEGIN') > db.executescript('create temporary table keys (key primary key) without > rowid') > db.executemany('insert or ignore into keys values (?)', list(map(tuplize, > sorted(rowset)))) > for row in db.execute('select * from x where id in temp.keys'): > pass > db.executescript('drop table temp.keys') > db.commit() > print(elapsed(st, time.time())) > > print('Method 7: Using IN (dynamic) ', end=' ', flush=True) > st = time.time() > for row in db.execute('select * from x where id in (' + ','.join(map(str, > rowset)) + ')'): > pass > print(elapsed(st, time.time())) > > print('Method 8: Using IN (sorted) ', end=' ', flush=True) > st = time.time() > for row in db.execute('select * from x where id in (' + ','.join(map(str, > sorted(rowset))) + ')'): > pass > print(elapsed(st, time.time())) > > print('Method 9: Using IN CArray ', end=' ', flush=True) > st = time.time() > for row in db.execute('select * from x where id in carray(:l_address, > :l_length, :l_type)', apsw.carray('l', rowset)): > pass > print(elapsed(st, time.time())) > > print('Method A: Using IN CArray sorted ', end=' ', flush=True) > st = time.time() > for row in db.execute('select * from x where id in carray(:l_address, > :l_length, :l_type)', apsw.carray('l', sorted(rowset))): > pass > print(elapsed(st, time.time())) > > -- > The fact that there's a Highway to Hell but only a Stairway to Heaven says a > lot about anticipated traffic volume. > I edited this second script to use plain apsw. In the vi editor: %s/db.execute/db.cursor().execute/ %s/executescript/execute/ %s/db.commit()/db.cursor().execute("COMMIT")/ /Method 9 .,$d (the carray tests are left out) My test output for 1000 keys is: $ python3 keith2b.py 1000 Creating db and sample keys: 1000000 rows; 1000 keys Method 1: Individual Row 00:00:00.003748 Method 2: Individual Row (Sorted) 00:00:00.003545 Method 3: Rows with ExecMany 00:00:00.003300 Method 3: Rows with ExecMany Sort 00:00:00.003088 Method 4: Using IN temp 00:00:00.003838 Method 5: Using IN temp (sorted) 00:00:00.003850 Method 6: Using IN temp no rowid 00:00:00.003941 Method 7: Using IN (dynamic) 00:00:00.003276 Method 8: Using IN (sorted) 00:00:00.003223 This is much different as in your output. - the test is two times faster here (on a moderate system) - there is no substantial difference any longer between individual tests (excluded carray) Any idea? Thanks for the inviting tests. E.Pasma _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users