#830: dbquery: run_sql() should optionally return list-of-dictionaries
--------------------------+-----------------
  Reporter:  simko        |      Owner:
      Type:  enhancement  |     Status:  new
  Priority:  major        |  Milestone:
 Component:  MiscUtil     |    Version:
Resolution:               |   Keywords:
--------------------------+-----------------
Description changed by simko:

Old description:

> **0)** Prelude and motivation.  `run_sql()` currently returns tuple-of-
> tuples:
>
> {{{
> In [21]: res = run_sql("SELECT id,name,dbquery FROM collection WHERE
> id<3");
>
> In [22]: res
> Out[22]:
> ((1, 'Atlantis Institute of Fictive Science', None),
>  (2, 'Preprints', 'collection:PREPRINT'))
> }}}
>
> The returned values are usually transformed into //named symbols// in the
> business logic, as it would not be very readable to work with
> //position// of entities in the resulting tuple representing rows.
>
> One can transform tuple-of-tuples into list-of-dictionaries from the get
> go, via:
>
> {{{
> In [23]: resd = [dict(id=row[0],name=row[1],dbquery=row[2]) for row in
> run_sql("SELECT id,name,dbquery FROM collection WHERE id<3")]
>
> In [24]: resd
> Out[24]:
> [{'dbquery': None, 'id': 1, 'name': 'Atlantis Institute of Fictive
> Science'},
>  {'dbquery': 'collection:PREPRINT', 'id': 2, 'name': 'Preprints'}]
> }}}
>
> This format of SQL results is especially nice for the forthcoming
> migration to Jinja templates, because one will then be able to use
> symbols in templates easily, see `collection.name` and friends in the
> following example:
>
> {{{
> #!python
> from jinja2 import Template
> from invenio.dbquery import run_sql
>
> tmpl_collection_table = Template("""\
> <table border="1">
>   <thead>
>     <tr>
>       <th>ID</th>
>       <th>Name</th>
>       <th>Query Definition</th>
>     </tr>
>   </thead>
>   <tbody>
>   {% for collection in collections %}
>     <tr>
>       <td>{{ collection.id }}</td>
>       <td>{{ collection.name }}</td>
>       <td>{{ collection.dbquery }}</td>
>     </tr>
>   {% endfor %}
>   </tbody>
> </table>
> """)
>
> collections = [dict(id=row[0],name=row[1],dbquery=row[2]) \
>                for row in run_sql("SELECT id,name,dbquery FROM
> collection")]
>
> print tmpl_collection_table.render(collections=collections)
> }}}
>
> It is useful to generalise this technique of using SQL results throughout
> Invenio, but the `dict()` boilerplate code should then be eliminated.
>
> **1)** The goal of this ticket is therefore to enrich `run_sql()` with a
> new option called say `with_dict=True` that would return directly list-
> of-dictionaries instead of tuple-of-tuples.  This will enable programmers
> to write elegantly:
>
> {{{
> #!python
> collections = run_sql("SELECT id,name,dbquery FROM collection",
> with_dict=True)]
> }}}
>
> instead of currently convoluted:
>
> {{{
> #!python
> collections = [dict(id=row[0],name=row[1],dbquery=row[2]) \
>                for row in run_sql("SELECT id,name,dbquery FROM
> collection")]
> }}}
>
> The `with_dict` option would be set to `False` by default, for backwards
> compatibility, and for use cases when speed difference may be important.
> Otherwise, once introduced, the majority of `run_sql()` callers should
> probably switch to using it, for better code readability.
>
> Note that the new option `with_dict` would behave somewhat similarly to
> how current `with_desc` option behaves, but it would return more directly
> exploitable results.
>
> **2)** Beware of SQL queries like:
>
> {{{
> SELECT DATE_FORMAT(creation_date, '%%Y') FROM bibrec
> }}}
>
> or:
>
> {{{
> SELECT id+nbrecs,name FROM collection
> }}}
>
> or (even though the following technique is bad style):
>
> {{{
> SELECT * FROM collection
> }}}
>
> when constructing names of keys of the resulting dictionaries
> representing rows.
>
> **3)** Extensive regression test cases covering the above examples should
> be naturally added.
>
> **P.S.** We can also inspire ourselves from how `tornado.database` DB
> wrapper behaves in this respect, see
> [[wiki:Talk/WebFrameworks#a4.6.Tornado]].

New description:

 **0)** Prelude and motivation.  `run_sql()` currently returns tuple-of-
 tuples:

 {{{
 In [21]: res = run_sql("SELECT id,name,dbquery FROM collection WHERE
 id<3");

 In [22]: res
 Out[22]:
 ((1, 'Atlantis Institute of Fictive Science', None),
  (2, 'Preprints', 'collection:PREPRINT'))
 }}}

 The returned values are usually transformed into //named symbols// in the
 business logic, as it would not be very readable to work with //position//
 of entities in the resulting tuple representing rows.

 One can transform tuple-of-tuples into list-of-dictionaries from the get
 go, via:

 {{{
 In [23]: resd = [dict(id=row[0],name=row[1],dbquery=row[2]) for row in
 run_sql("SELECT id,name,dbquery FROM collection WHERE id<3")]

 In [24]: resd
 Out[24]:
 [{'dbquery': None, 'id': 1, 'name': 'Atlantis Institute of Fictive
 Science'},
  {'dbquery': 'collection:PREPRINT', 'id': 2, 'name': 'Preprints'}]
 }}}

 This format of SQL results is especially nice for the forthcoming
 migration to Jinja templates, because one will then be able to use symbols
 in templates easily, see `collection.name` and friends in the following
 example:

 {{{
 #!python
 from jinja2 import Template
 from invenio.dbquery import run_sql

 tmpl_collection_table = Template("""\
 <table border="1">
   <thead>
     <tr>
       <th>ID</th>
       <th>Name</th>
       <th>Query Definition</th>
     </tr>
   </thead>
   <tbody>
   {% for collection in collections %}
     <tr>
       <td>{{ collection.id }}</td>
       <td>{{ collection.name }}</td>
       <td>{{ collection.dbquery }}</td>
     </tr>
   {% endfor %}
   </tbody>
 </table>
 """)

 collections = [dict(id=row[0],name=row[1],dbquery=row[2]) \
                for row in run_sql("SELECT id,name,dbquery FROM
 collection")]

 print tmpl_collection_table.render(collections=collections)
 }}}

 It is useful to generalise this technique of using SQL results throughout
 Invenio, but the `dict()` boilerplate code should then be eliminated.

 **1)** The goal of this ticket is therefore to enrich `run_sql()` with a
 new option called say `with_dict=True` that would return directly list-of-
 dictionaries instead of tuple-of-tuples.  This will enable programmers to
 write elegantly:

 {{{
 #!python
 collections = run_sql("SELECT id,name,dbquery FROM collection",
 with_dict=True)
 }}}

 instead of currently convoluted:

 {{{
 #!python
 collections = [dict(id=row[0],name=row[1],dbquery=row[2]) \
                for row in run_sql("SELECT id,name,dbquery FROM
 collection")]
 }}}

 The `with_dict` option would be set to `False` by default, for backwards
 compatibility, and for use cases when speed difference may be important.
 Otherwise, once introduced, the majority of `run_sql()` callers should
 probably switch to using it, for better code readability.

 Note that the new option `with_dict` would behave somewhat similarly to
 how current `with_desc` option behaves, but it would return more directly
 exploitable results.

 **2)** Beware of SQL queries like:

 {{{
 SELECT DATE_FORMAT(creation_date, '%%Y') FROM bibrec
 }}}

 or:

 {{{
 SELECT id+nbrecs,name FROM collection
 }}}

 or (even though the following technique is bad style):

 {{{
 SELECT * FROM collection
 }}}

 when constructing names of keys of the resulting dictionaries representing
 rows.

 **3)** Extensive regression test cases covering the above examples should
 be naturally added.

 **P.S.** We can also inspire ourselves from how `tornado.database` DB
 wrapper behaves in this respect, see
 [[wiki:Talk/WebFrameworks#a4.6.Tornado]].

--

-- 
Ticket URL: <http://invenio-software.org/ticket/830#comment:1>
Invenio <http://invenio-software.org>

Reply via email to