You can also always do:

    db.executesql(db(query)._select(), as_dict=True)
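For readers following along, what `as_dict=True` returns can be sketched standalone: `executesql` zips the DB-API cursor's column names with each fetched row. The `description` and `data` values below are simulated stand-ins for illustration, not the web2py adapter itself:

```python
# Simulated DB-API cursor.description and fetched rows
# (stand-ins for illustration, not the web2py adapter itself):
description = (('id', None), ('name', None))
data = [(1, 'Alice'), (2, 'Bob')]

# This mirrors what executesql(..., as_dict=True) does internally:
fields = [f[0] for f in description]
records = [dict(zip(fields, row)) for row in data]
```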
to get just a list of dictionaries for any DAL select. Allowing custom column names, as you suggest, sounds like a good idea.

Anthony

On Tuesday, November 19, 2013 3:24:52 PM UTC-5, Anthony wrote:

Note, you can do:

    db(query).select(..., processor=myprocessor)

where myprocessor is a function that takes rows, fields, and colnames arguments. Perhaps, as you suggest, we could include a built-in quick processor that simply generates a list of dictionaries or Storage objects. We might also include options to turn various features on and off (e.g., turn on parsing of dates, but turn off generation of Reference objects for reference fields).

Note, you might also see if .select(..., cacheable=False) speeds things up a bit.

Anthony

On Tuesday, November 19, 2013 3:00:24 PM UTC-5, Arnon Marcus wrote:

We recently did some profiling on some slow queries, and found out something pretty surprising: the parsing of the result-set into a *Rows* object incurs an overhead that is many times the time of the actual query (including the entire round-trip to the database).

We are using web2py 2.7.2 and a PostgreSQL database (9.3 x64 on CentOS 6.4 x64), using Python 2.6 x64 (on Win7 x64), and profiling with cProfile and RunSnakeRun.

It's a moderately-complex query with a result-set of about 1,000 rows in a single query. Here is the result:

<https://lh6.googleusercontent.com/-ZHKeRiR6Ytw/Uou0mhcE76I/AAAAAAAAAGw/wHLjnuT6IwI/s1600/NormalSelect.jpg>

The time it takes the database to fetch them is 0.02s. The entire time of the select, including the parse, is about 0.79s (in other tests it was 1.07s). This is just one example, but we had queries that took around 5-9 seconds (!) in which the database round-trip was a mere 0.04s...

That is an order-of-magnitude difference between the parsing and the actual query, and the gap gets worse the more results there are in the result-set.
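A quick processor of the kind Anthony describes earlier in the thread might look like the sketch below. Only the argument order (rows, fields, colnames) is taken from the thread; the simulated input values are hypothetical:

```python
def dict_processor(rows, fields, colnames):
    # Pair each raw DB row with its 'table.field' column name and
    # return plain dicts, skipping Rows construction entirely.
    return [dict(zip(colnames, row)) for row in rows]

# Simulated raw cursor output (stand-ins, not web2py API):
raw = [(1, 'Rex'), (2, 'Fido')]
cols = ['dog.id', 'dog.name']
result = dict_processor(raw, None, cols)
```

With such a function defined, the idea is that `db(query).select(..., processor=dict_processor)` would bypass the expensive parse described above.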
Now, in any discussion about the performance of server-side web frameworks, you almost unanimously hear this argument: "...the database always ends up being the bottleneck anyways..." So this is not supposed to happen, right? Either we have a really slow framework, or a really fast database...

I then did a little digging, to see if we could use this to our advantage (since we rarely need the more complex usage-patterns of the *Rows* object anyways...).

First thing I did was to see if the *executesql* method could return something just a bit more useful than it usually does when using *as_dict=True* - I changed this:

    def executesql(self, query, placeholders=None, as_dict=False,
                   fields=None, colnames=None):
        ...
        if as_dict:
            if not hasattr(adapter.cursor, 'description'):
                raise RuntimeError("database does not support "
                                   "executesql(...,as_dict=True)")
            columns = adapter.cursor.description
            fields = [f[0] for f in columns]
            data = adapter._fetchall()
            return [dict(zip(fields, row)) for row in data]

into this:

    def executesql(self, query, placeholders=None, as_dict=False,
                   fields=None, colnames=None):
        ...
        if as_dict:
            if not hasattr(adapter.cursor, 'description'):
                raise RuntimeError("database does not support "
                                   "executesql(...,as_dict=True)")
            columns = adapter.cursor.description
            fields = colnames or [f[0] for f in columns]
            data = adapter._fetchall()
            return [dict(zip(fields, row)) for row in data]

1 line of code...

This single change makes the function *potentially* able to return a list of dictionaries, each with *practically-usable* keys. We went with key-naming of '<TableName>.<FieldName>', as in the actual SQL string. Then I thought, why not use the SQL string itself?
And thus the *quickSelect* method was born into the *Set* class:

    def select(self, *fields, **attributes):
        adapter = self.db._adapter
        tablenames = adapter.tables(self.query,
                                    attributes.get('join', None),
                                    attributes.get('left', None),
                                    attributes.get('orderby', None),
                                    attributes.get('groupby', None))
        fields = adapter.expand_all(fields, tablenames)
        return adapter.select(self.query, fields, attributes)

    ###############################################################
    def quickSelect(self, *fields, **attributes):
        query = self._select(*fields, **attributes)
        colnames = query.split(' FROM')[0][7:-1].split(', ')
        return self._db.executesql(query, colnames=colnames, as_dict=True)
    ###############################################################

    def nested_select(self, *fields, **attributes):
        return Expression(self.db, self._select(*fields, **attributes))

3 lines of code... could even be shortened to 2...

What was the outcome? Well, let's see:

<https://lh4.googleusercontent.com/-A_2xljL5gJQ/Uou5nSpH4lI/AAAAAAAAAHA/NtCjwJQxICc/s1600/QuickSelect.jpg>

Voila! 0.04s for a *practically-usable* list of dictionaries(!). An order-of-magnitude improvement in less than 5 lines of code...

I *thought* I could do *better*, and wrap the result-set myself with a simple class, injecting the dictionaries into the object.__dict__ attribute, so I could *emulate*, and perhaps even *replace-in-place*, the "real" *select()* method itself (including support for *JOIN*s). This way I could leave all our app-code unchanged:

    class SimpleObject:
        def __init__(self, attributes=None):
            self.__dict__ = attributes


    class Set(object):
        ...
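The column-name extraction inside quickSelect can be illustrated standalone. This sketch assumes the simple, unquoted 'SELECT a, b FROM ...' shape the thread relies on (without the trailing-character trim in the original slicing); the SQL string below is hypothetical:

```python
def colnames_from_select(sql):
    # Extract 'table.field' column names from a generated SELECT
    # statement. Sketch: assumes a simple, unquoted
    # 'SELECT a, b FROM ...' shape, as quickSelect does.
    select_part = sql.split(' FROM')[0]
    return select_part[len('SELECT '):].split(', ')

sql = 'SELECT dog.id, dog.name FROM dog WHERE dog.id > 0;'
cols = colnames_from_select(sql)
```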
    def select(self, *fields, **attributes):
        query = self._select(*fields, **attributes)
        colnames = query.split(' FROM')[0][7:-1].split(', ')
        tableNames = [colname.split('.')[0] for colname in colnames]
        uniqueTableNames = set(tableNames)
        records = self._db.executesql(query, colnames=colnames, as_dict=True)
        if len(uniqueTableNames) == 1:  # Single table
            tableNameLength = len(next(iter(uniqueTableNames))) + 1
            return [
                SimpleObject(dict(
                    [(key[tableNameLength:-1], value)
                     for key, value in record.iteritems()]
                )) for record in records
            ]
        else:  # Multiple tables (JOIN)
            # A dictionary mapping table-names to their length + 1,
            # for later "easy" splicing of the key-names
            tableNamesLength = dict(
                [(tableName, len(tableName) + 1)
                 for tableName in uniqueTableNames]
            )
            return [
                SimpleObject(dict(
                    [(tableName,
                      SimpleObject(dict(
                          [(key[tableNamesLength[tableName]:-1], value)
                           for key, value in record.iteritems()
                           if key[:tableNamesLength[tableName] - 1] == tableName]
                      )))
                     for tableName in tableNames]
                )) for record in records
            ]

But that didn't work as I expected... I got an even worse result than the original web2py *Rows* parsing... I tried removing the SimpleObject class from the equation and just returning nested dictionaries - no luck, same result... So I guess the *Rows* parsing isn't "slow" after all, as it's doing a better job than I can...

However, I still think there is room for having my *quickSelect()* method, in conjunction with the small addition to the *executesql()* method - for people who want to dramatically speed up their application, and are willing to pay the price of getting back a flat list of dictionaries with "table.field" key-names.
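The JOIN branch above is essentially grouping flat 'table.field' keys per table. A minimal dictionary-only sketch of that grouping (plain dicts, no wrapper class, no length-based key splicing; the sample record is hypothetical):

```python
def nest_record(record):
    # Group a flat {'table.field': value} record from a JOIN into
    # nested {'table': {'field': value}} dicts -- the same grouping
    # the select() override attempts, minus the wrapper class.
    nested = {}
    for key, value in record.items():
        table, field = key.split('.', 1)
        nested.setdefault(table, {})[field] = value
    return nested

flat = {'person.id': 1, 'person.name': 'Arnon', 'dog.name': 'Rex'}
nested = nest_record(flat)
```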
We're finding it potentially useful enough that we are prepared to experiment with this across our entire app... The speed-up is just too impressive (and tempting) to overlook...

What say you?

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)

---
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

