I have elaborated the test-script - this time, with a customizable-number
of chained-tables.
I went all the way, to test the hypothesis that it is the JOINs that are
the culprit.
Lets see how we do with a *single *record, joining 50 tables of *1 field*each +
*foreign-key-field
*(except the first clumn):
C:\Python26\python.exe F:/Experimental/web2py_272/web2py/test_executesql_0.
3.py
Preparing....
Times are for 100 iterations, into 50 tables, of querying for 1 results,each
with 99 values
started bench
speed_executesql() total=0.019787 avg=0.000198
speed_normal_rows() total=0.731611 avg=0.007316
speed_rows_with_cacheable() total=0.669495 avg=0.006695
speed_executesql_with_colnames() total=0.016620 avg=0.000166
speed_rows_with_simple_processor() total=0.364101 avg=0.003641
Process finished with exit code 0
Again - a x44 times increase in performance - quite consistent.
Now let's see a more modest and realistic example - 5 tables of 10 fields
each - 1k records query:
C:\Python26\python.exe F:/Experimental/web2py_272/web2py/test_executesql_0.
3.py
Preparing....
Times are for 100 iterations, into 5 tables, of querying for 1000 results,each
with 54 values
started bench
speed_executesql() total=2.619404 avg=0.026194
speed_normal_rows() total=52.479113 avg=0.524791
speed_rows_with_cacheable() total=48.437764 avg=0.484378
speed_executesql_with_colnames() total=2.789842 avg=0.027898
speed_rows_with_simple_processor() total=2.865223 avg=0.028652
This is x19 times speed-improvement between speed_executesql_with_colnames
and speed_normal_rows
This is x17 times speed-improvement between ..simple_processor and ..
rows_with_cacheable
Here is the file:
On Wednesday, November 20, 2013 1:47:47 PM UTC+2, Niphlod wrote:
>
>
>
> On Tuesday, November 19, 2013 11:12:52 PM UTC+1, Arnon Marcus wrote:
>>
>> 10x Niphlod, I'll check that tomorrow...
>>
>> I can provide any code you like - this is solid and consistent.
>>
>
> Exactly what I'm asking. a self-contained testbase to reproduce (including
> data, models and such).
>
>
>>
>> The interesting (and somewhat disappointing) factoid I got from your
>> results, was that PyPy was just as "slow" in un-parsed mode... ;)
>>
>
> pypy isn't a silver bullet for everything.
>
> BTW, lots of the time is spent actually interacting with SQLite, so the
> fact that a pure python library on pypy catches up with the C library
> shipped with python is quite itself the proof that pypy is a reeeeaally
> good project.
>
> BTW2: look at the "with_simple_processor" code. It does what your patch,
> but in one line, and it works for any select().
>
--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.
import timeit
from gluon.dal import DAL, Field
tableCount = 50
fieldCount = 1
recordCount = 1
times = 100
##prepare a large dict
print 'Preparing....'
db = DAL('sqlite:memory')
tableNames = [('tb_'+str(i)) for i in range(tableCount)]
fieldNames = [('fl_'+str(i)) for i in range(fieldCount)]
#load 1k records into a test table
for t in range(tableCount):
fields = [Field(fieldName) for fieldName in fieldNames]
if t:
fields.append(Field(tableNames[t-1], db[tableNames[t-1]]))
db.define_table(tableNames[t], *fields)
db.commit()
for t in range(tableCount):
db(db[tableNames[t]].id>0).delete()
db.commit()
for r in range(recordCount):
ids = []
for t in range(tableCount):
tableName = tableNames[t]
values = dict([(fieldName,('%s_%s'%(tableName,fieldName))) for fieldName in fieldNames])
newId = db[tableName].insert(**values)
ids.append(newId)
for t in range(tableCount-1):
#row = db(db[tableNames[t+1]].id==ids[t+1]).select(db[tableNames[t+1]].ALL).as_dict()
#row[tableNames[t]] = ids[t]
#row.update()
db[tableNames[t+1]][ids[t+1]].update_record(**{tableNames[t]:ids[t]})
db.commit()
query = (db[tableNames[1]][tableNames[0]]==db[tableNames[0]].id)
for t in range(1, (tableCount-1)):
query = query & (db[tableNames[t+1]][tableNames[t]]==db[tableNames[t]].id)
selection = [db[tableName].ALL for tableName in tableNames]
sql = db(query)._select(*selection)
sql_colnames = sql.split(' FROM')[0][7:].split(', ')
print("Times are for %i iterations, into %i tables, of querying for %i results, each with %i values" % (times,tableCount,recordCount,((fieldCount*tableCount)+(tableCount-1))))
def bench(cmd, imprt):
t = timeit.Timer(cmd, imprt)
s = t.timeit(number=times)
print("%s total=%02f avg=%02f" % (cmd, s, (s/times)))
return s
def myprocessor(rows, fields, colnames, blob_decode=True, cacheable = False):
return [dict(zip(colnames,row)) for row in rows]
def speed_executesql():
rtn = db.executesql(sql, as_dict=True)
def speed_executesql_with_colnames():
rtn = db.executesql(sql, colnames=sql_colnames, as_dict=True)
def speed_normal_rows():
rtn = db(query).select()
def speed_rows_with_cacheable():
rtn = db(query).select(cacheable=True)
def speed_rows_with_simple_processor():
rtn = db(query).select(processor=myprocessor)
def full_bench():
print 'started bench'
b1 = bench('speed_executesql()', 'from __main__ import speed_executesql')
b2 = bench('speed_normal_rows()', 'from __main__ import speed_normal_rows')
b3 = bench('speed_rows_with_cacheable()', 'from __main__ import speed_rows_with_cacheable')
b4 = bench('speed_executesql_with_colnames()', 'from __main__ import speed_executesql_with_colnames')
b5 = bench('speed_rows_with_simple_processor()', 'from __main__ import speed_rows_with_simple_processor, myprocessor')
if __name__ == '__main__':
full_bench()