Everyone:
You may play with the numbers at the top in this version of the code -
everything is now customizable: you can generate any number of chain-joined
tables with any number of fields, and insert any number of records.
I find that the more joins you have, the more overhead you pay when
parsing into a Rows object with a regular .select().
It can be anywhere from a x15 slowdown with just 2 tables, up to x44 in
an extreme case of 50 tables.
In real-world use-cases it usually hovers around ~x20 or so.
But this is using just simple string-based fields... it goes way up from
there when you introduce other data-types,
as evidenced by our real-world example of an almost x100 penalty for a
regular .select() in a simple 2-table join of about 15 columns.
Caching gains maybe 5% improvement...
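The gap comes down to how much work happens per row: executesql() with
as_dict=True hands back plain dicts almost straight from the driver, while
.select() builds a full Rows object. A rough stand-in using the stdlib
sqlite3 module (table and column names are made up for the example, this is
not the web2py DAL itself) shows the shape of the cheap path:

```python
import sqlite3

# Illustrative only: what executesql(as_dict=True) roughly amounts to -
# fetch raw tuples and zip them with the column names, nothing more.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE tb_0 (fl_0 TEXT, fl_1 TEXT)')
conn.execute("INSERT INTO tb_0 VALUES ('a', 'b')")

cur = conn.execute('SELECT fl_0, fl_1 FROM tb_0')
colnames = [d[0] for d in cur.description]
# no per-row object construction, no reference resolution - just dicts
rows_as_dicts = [dict(zip(colnames, row)) for row in cur.fetchall()]
```

Everything .select() does on top of this (typed field parsing, record
objects, references) is where the extra time goes.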
I also found the .select(processor=<>) approach to be anywhere from 10%
slower to x6 slower than my .quickSelect() approach for the same queries.
The gap gets *worse* the *simpler* the use-case is.
For example:
C:\Python26\python.exe
F:/Experimental/web2py_272/web2py/test_executesql_0.3.py
Preparing....
Times are for 100 iterations, into 2 tables, of querying for 1 results,
each with 3 values
started bench
speed_executesql() total=0.002431 avg=0.000024
speed_normal_rows() total=0.028979 avg=0.000290
speed_rows_with_cacheable() total=0.029703 avg=0.000297
speed_executesql_with_colnames() total=0.002551 avg=0.000026
speed_rows_with_simple_processor() total=0.015669 avg=0.000157
Process finished with exit code 0
speed_rows_with_simple_processor() is x6 slower than
speed_executesql_with_colnames().
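To be clear about what that x6 is measuring: the processor passed to
.select() in that run is the simplest one possible - it ignores the field
metadata entirely and just zips each raw row with the column names (this is
the myprocessor from the full script at the end):

```python
def myprocessor(rows, fields, colnames, blob_decode=True, cacheable=False):
    # the simplest possible processor: pair column names with raw values
    return [dict(zip(colnames, row)) for row in rows]

# standalone example with made-up column names and raw rows
colnames = ['tb_0.fl_0', 'tb_1.fl_0']
raw = [('a', 'b'), ('c', 'd')]
out = myprocessor(raw, None, colnames)
```

So even the cheapest processor path still trails the raw executesql() path
by a wide margin in the simple case.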
This is with the script configured as follows:
import timeit
from gluon.dal import DAL, Field
tableCount = 2
fieldCount = 1
recordCount = 1
times = 100
##prepare a large dict
...
Granted, at these absolute times it's hardly an issue...
But there is a gradient between the simpler and the more complex cases.
Try it out...
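If you want to reuse the timing pattern without the DAL setup, the bench()
helper in the script boils down to plain timeit.Timer usage; here is a
minimal self-contained version (the sorted() workload is a made-up stand-in
for one of the speed_* functions):

```python
import timeit

def bench(cmd, setup, times=100):
    # time `cmd` for `times` iterations, report total and per-call average
    t = timeit.Timer(cmd, setup)
    s = t.timeit(number=times)
    print("%s total=%.6f avg=%.6f" % (cmd, s, s / times))
    return s

# hypothetical workload; in the real script this would be e.g.
# bench('speed_executesql()', 'from __main__ import speed_executesql')
total = bench('sorted(range(1000))', 'pass')
```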
--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.
import timeit
from gluon.dal import DAL, Field

tableCount = 5
fieldCount = 10
recordCount = 1000
times = 100

## prepare the test database
print 'Preparing....'
db = DAL('sqlite:memory')
tableNames = [('tb_'+str(i)) for i in range(tableCount)]
fieldNames = [('fl_'+str(i)) for i in range(fieldCount)]

# define the chained tables - each table references the previous one
for t in range(tableCount):
    fields = [Field(fieldName) for fieldName in fieldNames]
    if t:
        fields.append(Field(tableNames[t-1], db[tableNames[t-1]]))
    db.define_table(tableNames[t], *fields)
db.commit()

# clear any leftover records
for t in range(tableCount):
    db(db[tableNames[t]].id > 0).delete()
db.commit()

# load recordCount records into each table and link them in a chain
for r in range(recordCount):
    ids = []
    for t in range(tableCount):
        tableName = tableNames[t]
        values = dict([(fieldName, ('%s_%s' % (tableName, fieldName))) for fieldName in fieldNames])
        newId = db[tableName].insert(**values)
        ids.append(newId)
    for t in range(tableCount-1):
        db[tableNames[t+1]][ids[t+1]].update_record(**{tableNames[t]: ids[t]})
db.commit()

# build the join query across all tables
query = (db[tableNames[1]][tableNames[0]] == db[tableNames[0]].id)
for t in range(1, (tableCount-1)):
    query = query & (db[tableNames[t+1]][tableNames[t]] == db[tableNames[t]].id)
selection = [db[tableName].ALL for tableName in tableNames]
sql = db(query)._select(*selection)
# extract the column names from the generated SQL ("SELECT a, b, ... FROM ...")
sql_colnames = sql.split(' FROM')[0][7:].split(', ')

print("Times are for %i iterations, into %i tables, of querying for %i results, each with %i values" % (times, tableCount, recordCount, ((fieldCount*tableCount)+(tableCount-1))))

def bench(cmd, imprt):
    t = timeit.Timer(cmd, imprt)
    s = t.timeit(number=times)
    print("%s total=%.6f avg=%.6f" % (cmd, s, (s/times)))
    return s

def myprocessor(rows, fields, colnames, blob_decode=True, cacheable=False):
    # the simplest possible processor: zip each raw row with the column names
    return [dict(zip(colnames, row)) for row in rows]

def speed_executesql():
    rtn = db.executesql(sql, as_dict=True)

def speed_executesql_with_colnames():
    rtn = db.executesql(sql, colnames=sql_colnames, as_dict=True)

def speed_normal_rows():
    rtn = db(query).select()

def speed_rows_with_cacheable():
    rtn = db(query).select(cacheable=True)

def speed_rows_with_simple_processor():
    rtn = db(query).select(processor=myprocessor)

def full_bench():
    print 'started bench'
    b1 = bench('speed_executesql()', 'from __main__ import speed_executesql')
    b2 = bench('speed_normal_rows()', 'from __main__ import speed_normal_rows')
    b3 = bench('speed_rows_with_cacheable()', 'from __main__ import speed_rows_with_cacheable')
    b4 = bench('speed_executesql_with_colnames()', 'from __main__ import speed_executesql_with_colnames')
    b5 = bench('speed_rows_with_simple_processor()', 'from __main__ import speed_rows_with_simple_processor, myprocessor')

if __name__ == '__main__':
    full_bench()