>
> Op 17 sep. 2019, om 04:26 heeft Keith Medcalf <[email protected]> het
> volgende geschreven:
>
>
> On Monday, 16 September, 2019 14:22, E.Pasma <[email protected]> wrote:
>
>> Stop stop stop
>
> You are right. What a difference a spelling error makes ... No wonder it
> took so long as it was doing table scans -- and the optimizer was doing a
> jolly job in the other cases in dealing with it.
>
> Note that the sqlite3 wrapper cannot do .executemany() with SELECT statements
> ... but it will do them with INSERT statements. Nevertheless, the results
> are reasonably similar to these obtained with APSW ...
>
...
>> st 1000
> Creating db and sample keys: 1000000 rows; 1000 keys
> Method 1: Individual Row 00:00:00.019247
> Method 2: Individual Row (Sorted) 00:00:00.017748
> Method 3: Rows with ExecMany 00:00:00.016084
> Method 3: Rows with ExecMany Sort 00:00:00.015766
> Method 4: Using IN temp 00:00:00.007528
> Method 5: Using IN temp (sorted) 00:00:00.007821
> Method 6: Using IN temp no rowid 00:00:00.007600
> Method 7: Using IN (dynamic) 00:00:00.005317
> Method 8: Using IN (sorted) 00:00:00.004884
> Method 9: Using IN CArray 00:00:00.005081
> Method A: Using IN CArray sorted 00:00:00.005190
..
> Using this code:
>
> #! python3
>
> import apsw
> import datetime
> import random
> import sqlite3
> import sys
> import time
>
> datasize = 1000000
> rows = int(sys.argv[1])
>
> elapsed = lambda st, et: datetime.datetime.utcfromtimestamp((et - st)).time()
> tuplize = lambda x: (x,)
>
> db = apsw.Connection(':memory:')
> #db = sqlite3.connect(':memory:', isolation_level=None)
>
> print('Creating db and sample keys:', end=' ', flush=True)
> db.executescript('''
> create table x
> (
> id integer primary key,
> data blob
> );
> insert into x
> with a(x) as (
> select 1
> union all
> select x + 1
> from a
> where x < %d
> )
> select x, randomblob(30)
> from a;
> analyze;
> ''' % (datasize,))
> print(db.execute('select count(*) from x').fetchone()[0], 'rows;', end=' ')
>
> rowset = [i for i in range(datasize)]
> random.shuffle(rowset)
> rowset = rowset[:rows]
> print(len(rowset), 'keys')
>
> print('Method 1: Individual Row ', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> for key in rowset:
> row = db.execute('select * from x where id=?', (key,)).fetchone()
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 2: Individual Row (Sorted)', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> for key in sorted(rowset):
> row = db.execute('select * from x where id=?', (key,)).fetchone()
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 3: Rows with ExecMany ', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> for row in db.executemany('select * from x where id=?', list(map(tuplize,
> rowset))):
> pass
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 3: Rows with ExecMany Sort', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> for row in db.executemany('select * from x where id=?', list(map(tuplize,
> sorted(rowset)))):
> pass
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 4: Using IN temp ', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> db.executescript('create temporary table keys (key)')
> db.executemany('insert into keys values (?)', list(map(tuplize,
> sorted(rowset))))
> for row in db.execute('select * from x where id in temp.keys'):
> pass
> db.executescript('drop table temp.keys')
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 5: Using IN temp (sorted) ', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> db.executescript('create temporary table keys (key)')
> db.executemany('insert into keys values (?)', list(map(tuplize,
> sorted(rowset))))
> for row in db.execute('select * from x where id in temp.keys'):
> pass
> db.executescript('drop table temp.keys')
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 6: Using IN temp no rowid ', end=' ', flush=True)
> st = time.time()
> db.executescript('BEGIN')
> db.executescript('create temporary table keys (key primary key) without
> rowid')
> db.executemany('insert or ignore into keys values (?)', list(map(tuplize,
> sorted(rowset))))
> for row in db.execute('select * from x where id in temp.keys'):
> pass
> db.executescript('drop table temp.keys')
> db.commit()
> print(elapsed(st, time.time()))
>
> print('Method 7: Using IN (dynamic) ', end=' ', flush=True)
> st = time.time()
> for row in db.execute('select * from x where id in (' + ','.join(map(str,
> rowset)) + ')'):
> pass
> print(elapsed(st, time.time()))
>
> print('Method 8: Using IN (sorted) ', end=' ', flush=True)
> st = time.time()
> for row in db.execute('select * from x where id in (' + ','.join(map(str,
> sorted(rowset))) + ')'):
> pass
> print(elapsed(st, time.time()))
>
> print('Method 9: Using IN CArray ', end=' ', flush=True)
> st = time.time()
> for row in db.execute('select * from x where id in carray(:l_address,
> :l_length, :l_type)', apsw.carray('l', rowset)):
> pass
> print(elapsed(st, time.time()))
>
> print('Method A: Using IN CArray sorted ', end=' ', flush=True)
> st = time.time()
> for row in db.execute('select * from x where id in carray(:l_address,
> :l_length, :l_type)', apsw.carray('l', sorted(rowset))):
> pass
> print(elapsed(st, time.time()))
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says a
> lot about anticipated traffic volume.
>
I edited this second script to use plain apsw.
In the vi editor:
%s/db.execute/db.cursor().execute/
%s/executescript/execute/
%s/db.commit()/db.cursor().execute("COMMIT")/
/Method 9
.,$d
(the carray tests are left out)
My test output for 1000 keys is:
$ python3 keith2b.py 1000
Creating db and sample keys: 1000000 rows; 1000 keys
Method 1: Individual Row 00:00:00.003748
Method 2: Individual Row (Sorted) 00:00:00.003545
Method 3: Rows with ExecMany 00:00:00.003300
Method 3: Rows with ExecMany Sort 00:00:00.003088
Method 4: Using IN temp 00:00:00.003838
Method 5: Using IN temp (sorted) 00:00:00.003850
Method 6: Using IN temp no rowid 00:00:00.003941
Method 7: Using IN (dynamic) 00:00:00.003276
Method 8: Using IN (sorted) 00:00:00.003223
This is much different as in your output.
- the test is two times faster here (on a moderate system)
- there is no substantial difference any longer between individual tests
(excluded carray)
Any idea?
Thanks for the inviting tests.
E.Pasma
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users