On Monday, 16 September, 2019 14:22, E.Pasma <pasm...@concepts.nl> wrote:

>Stop stop stop

>> create table x
>> (
>>    id      integer primay key,
>>    data    blob
>> );

>I did not see this until searching for the word PRIMARY and not finding
>it. Thus id is not a primary key at all. Probably it is a good habit to
>always add WITHOUT ROWID when there is an explicit primary key. The SQL
>parser would then have reported an error.

>The tests with individual rows must definitely be repeated.

>In my tests the results are closer together:

>Method 1: Retrieve Individual Row 00:00:00.081151 10000
>Method 3: using dynamic in        00:00:00.060368 9995
>Method 5: using in carray         00:00:00.050884 9995
>Method 5: using carray join       00:00:00.043127 10000
>Method 6: Using temp table        00:00:00.060808 9995

>(for oarameter = 10000)

>I tuned the Python script, using fetchone() in the individual row test.
>And I added a temp table test. In Python this just uses executemany() to
>insert the rowset.

You are right.  What a difference a spelling error makes ... No wonder it took 
so long as it was doing table scans -- and the optimizer was doing a jolly job 
in the other cases in dealing with it.

Note that the sqlite3 wrapper cannot do .executemany() with SELECT statements 
... but it will do them with INSERT statements.  Nevertheless, the results are 
reasonably similar to these obtained with APSW ...

>st 1
Creating db and sample keys: 1000000 rows; 1 keys
Method 1: Individual Row          00:00:00.000150
Method 2: Individual Row (Sorted) 00:00:00.000079
Method 3: Rows with ExecMany      00:00:00.000063
Method 3: Rows with ExecMany Sort 00:00:00.000061
Method 4: Using IN temp           00:00:00.000323
Method 5: Using IN temp (sorted)  00:00:00.000266
Method 6: Using IN temp no rowid  00:00:00.000284
Method 7: Using IN (dynamic)      00:00:00.000078
Method 8: Using IN (sorted)       00:00:00.000061
Method 9: Using IN CArray         00:00:00.000195
Method A: Using IN CArray sorted  00:00:00.000075

>st 10
Creating db and sample keys: 1000000 rows; 10 keys
Method 1: Individual Row          00:00:00.000366
Method 2: Individual Row (Sorted) 00:00:00.000309
Method 3: Rows with ExecMany      00:00:00.000293
Method 3: Rows with ExecMany Sort 00:00:00.000305
Method 4: Using IN temp           00:00:00.000418
Method 5: Using IN temp (sorted)  00:00:00.000359
Method 6: Using IN temp no rowid  00:00:00.000430
Method 7: Using IN (dynamic)      00:00:00.000147
Method 8: Using IN (sorted)       00:00:00.000141
Method 9: Using IN CArray         00:00:00.000372
Method A: Using IN CArray sorted  00:00:00.000132

>st 100
Creating db and sample keys: 1000000 rows; 100 keys
Method 1: Individual Row          00:00:00.002314
Method 2: Individual Row (Sorted) 00:00:00.002144
Method 3: Rows with ExecMany      00:00:00.002010
Method 3: Rows with ExecMany Sort 00:00:00.001807
Method 4: Using IN temp           00:00:00.001042
Method 5: Using IN temp (sorted)  00:00:00.000963
Method 6: Using IN temp no rowid  00:00:00.001004
Method 7: Using IN (dynamic)      00:00:00.000573
Method 8: Using IN (sorted)       00:00:00.000548
Method 9: Using IN CArray         00:00:00.000671
Method A: Using IN CArray sorted  00:00:00.000588

>st 1000
Creating db and sample keys: 1000000 rows; 1000 keys
Method 1: Individual Row          00:00:00.019247
Method 2: Individual Row (Sorted) 00:00:00.017748
Method 3: Rows with ExecMany      00:00:00.016084
Method 3: Rows with ExecMany Sort 00:00:00.015766
Method 4: Using IN temp           00:00:00.007528
Method 5: Using IN temp (sorted)  00:00:00.007821
Method 6: Using IN temp no rowid  00:00:00.007600
Method 7: Using IN (dynamic)      00:00:00.005317
Method 8: Using IN (sorted)       00:00:00.004884
Method 9: Using IN CArray         00:00:00.005081
Method A: Using IN CArray sorted  00:00:00.005190

>st 10000
Creating db and sample keys: 1000000 rows; 10000 keys
Method 1: Individual Row          00:00:00.178937
Method 2: Individual Row (Sorted) 00:00:00.180979
Method 3: Rows with ExecMany      00:00:00.165302
Method 3: Rows with ExecMany Sort 00:00:00.163846
Method 4: Using IN temp           00:00:00.076111
Method 5: Using IN temp (sorted)  00:00:00.076974
Method 6: Using IN temp no rowid  00:00:00.077122
Method 7: Using IN (dynamic)      00:00:00.049132
Method 8: Using IN (sorted)       00:00:00.050656
Method 9: Using IN CArray         00:00:00.052837
Method A: Using IN CArray sorted  00:00:00.050192

>st 100000
Creating db and sample keys: 1000000 rows; 100000 keys
Method 1: Individual Row          00:00:01.777458
Method 2: Individual Row (Sorted) 00:00:01.708890
Method 3: Rows with ExecMany      00:00:01.676193
Method 3: Rows with ExecMany Sort 00:00:01.639589
Method 4: Using IN temp           00:00:00.756932
Method 5: Using IN temp (sorted)  00:00:00.742670
Method 6: Using IN temp no rowid  00:00:00.786706
Method 7: Using IN (dynamic)      00:00:00.504242
Method 8: Using IN (sorted)       00:00:00.503634
Method 9: Using IN CArray         00:00:00.546801
Method A: Using IN CArray sorted  00:00:00.510597

>st 1000000
Creating db and sample keys: 1000000 rows; 1000000 keys
Method 1: Individual Row          00:00:17.748779
Method 2: Individual Row (Sorted) 00:00:17.859159
Method 3: Rows with ExecMany      00:00:16.959563
Method 3: Rows with ExecMany Sort 00:00:16.261775
Method 4: Using IN temp           00:00:07.299388
Method 5: Using IN temp (sorted)  00:00:07.380643
Method 6: Using IN temp no rowid  00:00:07.543117
Method 7: Using IN (dynamic)      00:00:05.441127
Method 8: Using IN (sorted)       00:00:05.403368
Method 9: Using IN CArray         00:00:05.326564
Method A: Using IN CArray sorted  00:00:04.763342


Using this code:

#! python3

import apsw
import datetime
import random
import sqlite3
import sys
import time

datasize = 1000000
rows = int(sys.argv[1])

elapsed = lambda st, et: datetime.datetime.utcfromtimestamp((et - st)).time()
tuplize = lambda x: (x,)

db = apsw.Connection(':memory:')
#db = sqlite3.connect(':memory:', isolation_level=None)

print('Creating db and sample keys:', end=' ', flush=True)
db.executescript('''
create table x
(
    id      integer primary key,
    data    blob
);
insert into x
with a(x) as (
        select 1
     union all
        select x + 1
          from a
         where x < %d
             )
select x, randomblob(30)
  from a;
analyze;
''' % (datasize,))
print(db.execute('select count(*) from x').fetchone()[0], 'rows;', end=' ')

rowset = [i for i in range(datasize)]
random.shuffle(rowset)
rowset = rowset[:rows]
print(len(rowset), 'keys')

print('Method 1: Individual Row         ', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
for key in rowset:
    row = db.execute('select * from x where id=?', (key,)).fetchone()
db.commit()
print(elapsed(st, time.time()))

print('Method 2: Individual Row (Sorted)', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
for key in sorted(rowset):
    row = db.execute('select * from x where id=?', (key,)).fetchone()
db.commit()
print(elapsed(st, time.time()))

print('Method 3: Rows with ExecMany     ', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
for row in db.executemany('select * from x where id=?', list(map(tuplize, 
rowset))):
    pass
db.commit()
print(elapsed(st, time.time()))

print('Method 3: Rows with ExecMany Sort', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
for row in db.executemany('select * from x where id=?', list(map(tuplize, 
sorted(rowset)))):
    pass
db.commit()
print(elapsed(st, time.time()))

print('Method 4: Using IN temp          ', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
db.executescript('create temporary table keys (key)')
db.executemany('insert into keys values (?)', list(map(tuplize, 
sorted(rowset))))
for row in db.execute('select * from x where id in temp.keys'):
    pass
db.executescript('drop table temp.keys')
db.commit()
print(elapsed(st, time.time()))

print('Method 5: Using IN temp (sorted) ', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
db.executescript('create temporary table keys (key)')
db.executemany('insert into keys values (?)', list(map(tuplize, 
sorted(rowset))))
for row in db.execute('select * from x where id in temp.keys'):
    pass
db.executescript('drop table temp.keys')
db.commit()
print(elapsed(st, time.time()))

print('Method 6: Using IN temp no rowid ', end=' ', flush=True)
st = time.time()
db.executescript('BEGIN')
db.executescript('create temporary table keys (key primary key) without rowid')
db.executemany('insert or ignore into keys values (?)', list(map(tuplize, 
sorted(rowset))))
for row in db.execute('select * from x where id in temp.keys'):
    pass
db.executescript('drop table temp.keys')
db.commit()
print(elapsed(st, time.time()))

print('Method 7: Using IN (dynamic)     ', end=' ', flush=True)
st = time.time()
for row in db.execute('select * from x where id in (' + ','.join(map(str, 
rowset)) + ')'):
    pass
print(elapsed(st, time.time()))

print('Method 8: Using IN (sorted)      ', end=' ', flush=True)
st = time.time()
for row in db.execute('select * from x where id in (' + ','.join(map(str, 
sorted(rowset))) + ')'):
    pass
print(elapsed(st, time.time()))

print('Method 9: Using IN CArray        ', end=' ', flush=True)
st = time.time()
for row in db.execute('select * from x where id in carray(:l_address, 
:l_length, :l_type)', apsw.carray('l', rowset)):
    pass
print(elapsed(st, time.time()))

print('Method A: Using IN CArray sorted ', end=' ', flush=True)
st = time.time()
for row in db.execute('select * from x where id in carray(:l_address, 
:l_length, :l_type)', apsw.carray('l', sorted(rowset))):
    pass
print(elapsed(st, time.time()))

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to