[PyGreSQL] PyGreSQL Commit r683 - in trunk: docs module module/tests

darcy Fri, 01 Jan 2016 15:11:29 -0800

Author: cito
Date: Fri Jan  1 18:11:01 2016
New Revision: 683

Log:
Return rows as named tuples in pgdb


By default, we now return result rows as named tuples in pgdb.

Note that named tuples can be accessed like normal lists and tuples,
and easily converted to these. They can also be easily converted to
(ordered) dictionaries by calling row._asdict(). Therefore the need
for alternative Cursor types with different row types has been greatly
reduced, so I have simplified the implementation in the last revision
by removing the added Cursor classes and cursor() methods again,
leaving only the old row_factory method for customizing the returned
row types. I complemented this with a new build_row_factory method,
because different named tuple classes must be created for different
result sets, so a static row_factory method is not so appropriate.

Tests and documentation for these changes are included.

Modified:
   trunk/docs/changelog.rst
   trunk/docs/pgdb.rst
   trunk/module/pgdb.py
   trunk/module/tests/test_dbapi20.py

Modified: trunk/docs/changelog.rst
==============================================================================
--- trunk/docs/changelog.rst    Fri Jan  1 12:32:24 2016        (r682)
+++ trunk/docs/changelog.rst    Fri Jan  1 18:11:01 2016        (r683)
@@ -6,6 +6,9 @@
 - This version runs on both Python 2 and Python 3.
 - The supported versions are Python 2.6, 2.7, 3.3, 3.4 and 3.5.
 - The supported PostgreSQL versions are 9.0, 9.1, 9.2, 9.3 and 9.4.
+- The DB-API 2 module now always returns result rows as named tuples
+  instead of simply lists as before. The documentation explains how
+  you can restore the old behavior or use custom row objects instead.
 - The names of the various types supported by the classic and DB-API 2
   modules have been renamed to become simpler, more intuitive and in
   line with the names used in the DB-API 2 documentation.

Modified: trunk/docs/pgdb.rst
==============================================================================
--- trunk/docs/pgdb.rst Fri Jan  1 12:32:24 2016        (r682)
+++ trunk/docs/pgdb.rst Fri Jan  1 18:11:01 2016        (r683)
@@ -190,7 +190,7 @@
 cursor -- return a new cursor object
 ------------------------------------
 
-.. method:: Connection.cusor()
+.. method:: Connection.cursor()
 
     Return a new cursor object using the connection
 
@@ -200,6 +200,21 @@
 This method returns a new :class:`Cursor` object that can be used to
 operate on the database in the way described in the next section.
 
+Attributes that are not part of the standard
+--------------------------------------------
+
+.. note::
+
+   The following attributes are not part of the DB-API 2 standard.
+
+.. attribute:: cursor_type
+
+    The default cursor type used by the connection
+
+If you want to use your own custom subclass of the :class:`Cursor` class
+with he connection, set this attribute to you custom cursor class. You will
+then get your custom cursor whenever you call :meth:`Connection.cursor`.
+
 
 Cursor -- The cursor object
 ===========================
@@ -318,10 +333,12 @@
     Fetch the next row of a query result set
 
     :returns: the next row of the query result set
-    :rtype: tuple or None
+    :rtype: named tuple or None
 
-Fetch the next row of a query result set, returning a single tuple,
-or ``None`` when no more data is available.
+Fetch the next row of a query result set, returning a single named tuple,
+or ``None`` when no more data is available. The field names of the named
+tuple are the same as the column names of the database query as long as
+they are valid Python identifiers.
 
 An :exc:`Error` (or subclass) exception is raised if the previous call to
 :meth:`Cursor.execute` or :meth:`Cursor.executemany` did not produce
@@ -339,10 +356,12 @@
     :param keep: if set to true, will keep the passed arraysize
     :tpye keep: bool
     :returns: the next set of rows of the query result
-    :rtype: list of tuples
+    :rtype: list of named tuples
 
-Fetch the next set of rows of a query result, returning a list of tuples.
-An empty sequence is returned when no more rows are available.
+Fetch the next set of rows of a query result, returning a list of named
+tuples. An empty sequence is returned when no more rows are available.
+The field names of the named tuple are the same as the column names of
+the database query as long as they are valid Python identifiers.
 
 The number of rows to fetch per call is specified by the *size* parameter.
 If it is not given, the cursor's :attr:`arraysize` determines the number of
@@ -370,45 +389,95 @@
     Fetch all (remaining) rows of a query result
 
     :returns: the set of all rows of the query result
-    :rtype: list of tuples
+    :rtype: list of named tuples
+
+Fetch all (remaining) rows of a query result, returning them as list of
+named tuples. The field names of the named tuple are the same as the column
+names of the database query as long as they are valid Python identifiers.
 
-Fetch all (remaining) rows of a query result, returning them as list of tuples.
 Note that the cursor's :attr:`arraysize` attribute can affect the performance
 of this operation.
 
-row_factory -- process a row of the query result
-------------------------------------------------
+arraysize - the number of rows to fetch at a time
+-------------------------------------------------
+
+.. attribute:: Cursor.arraysize
+
+    The number of rows to fetch at a time
+
+This read/write attribute specifies the number of rows to fetch at a time with
+:meth:`Cursor.fetchmany`. It defaults to 1, meaning to fetch a single row
+at a time.
+
+Methods and attributes that are not part of the standard
+--------------------------------------------------------
+
+.. note::
+
+   The following methods and attributes are not part of the DB-API 2 standard.
 
 .. method:: Cursor.row_factory(row)
 
     Process rows before they are returned
 
     :param tuple row: the currently processed row of the result set
-    :returns: the transformed row that the cursor methods shall return
-
-Note that this method is not part of the DB-API 2 standard.
+    :returns: the transformed row that the fetch methods shall return
 
-You can overwrite this method with a custom row factory, e.g.
-if you want to return rows as dicts instead of tuples::
+This method is used for processing result rows before returning them through
+one of the fetch methods. By default, rows are returned as named tuples.
+You can overwrite this method with a custom row factory if you want to
+return the rows as different kids of objects. This same row factory will then
+be used for all result sets. If you overwrite this method, the method
+:meth:`Cursor.build_row_factory` for creating row factories dynamically
+will be ignored.
+
+Note that named tuples are very efficient and can be easily converted to
+dicts (even OrderedDicts) by calling ``row._asdict()``. If you still want
+to return rows as dicts, you can create a custom cursor class like this::
 
     class DictCursor(pgdb.Cursor):
 
         def row_factory(self, row):
-            return {desc[0]:value
-                for desc, value in zip(self.description, row)}
+            return {key: value for key, value in zip(self.colnames, row)}
 
-    cur = DictCursor(con)
+    cur = DictCursor(con)  # get one DictCursor instance or
+    con.cursor_type = DictCursor  # always use DictCursor instances
 
-arraysize - the number of rows to fetch at a time
--------------------------------------------------
 
-.. attribute:: Cursor.arraysize
+.. method:: Cursor.build_row_factory()
 
-    The number of rows to fetch at a time
+    Build a row factory based on the current description
 
-This read/write attribute specifies the number of rows to fetch at a time with
-:meth:`Cursor.fetchmany`. It defaults to 1 meaning to fetch a single row
-at a time.
+    :returns: callable with the signature of :meth:`Cursor.row_factory`
+
+This method returns row factories for creating named tuples. It is called
+whenever a new result set is created, and :attr:`Cursor.row_factory` is
+then assigned the return value of this method. You can overwrite this method
+with a custom row factory builder if you want to use different row factories
+for different result sets. Otherwise, you can also simply overwrite the
+:meth:`Cursor.row_factory` method. This method will then be ignored.
+
+The default implementation that delivers rows as named tuples essentially
+looks like this::
+
+    def build_row_factory(self):
+        return namedtuple('Row', self.colnames, rename=True)._make
+
+.. attribute:: Cursor.colnames
+
+    The list of columns names of the current result set
+
+The values in this list are the same values as the *name* elements
+in the :attr:`Cursor.description` attribute. Always use the latter
+if you want to remain standard compliant.
+
+.. attribute:: Cursor.coltypes
+
+    The list of columns types of the current result set
+
+The values in this list are the same values as the *type_code* elements
+in the :attr:`Cursor.description` attribute. Always use the latter
+if you want to remain standard compliant.
 
 
 Type -- Type objects and constructors
@@ -472,7 +541,9 @@
 
     Used to describe the ``oid`` column of PostgreSQL database tables
 
-The following more specific types are not part of the DB-API 2 standard:
+.. note::
+
+  The following more specific types are not part of the DB-API 2 standard.
 
 .. class:: BOOL
 

Modified: trunk/module/pgdb.py
==============================================================================
--- trunk/module/pgdb.py        Fri Jan  1 12:32:24 2016        (r682)
+++ trunk/module/pgdb.py        Fri Jan  1 18:11:01 2016        (r683)
@@ -228,16 +228,21 @@
 class Cursor(object):
     """Cursor object."""
 
-    row_factory = tuple  # the factory for creating result rows
-
     def __init__(self, dbcnx):
         """Create a cursor object for the database connection."""
         self.connection = self._dbcnx = dbcnx
         self._cnx = dbcnx._cnx
         self._type_cache = dbcnx._type_cache
         self._src = self._cnx.source()
-        self.colnames = self.coltypes = None
+        # the official attribute for describing the result columns
         self.description = None
+        # unofficial attributes for convenience and performance
+        self.colnames = self.coltypes = None
+        if self.row_factory is Cursor.row_factory:
+            # the row factory needs to be determined dynamically
+            self.row_factory = None
+        else:
+            self.build_row_factory = None
         self.rowcount = -1
         self.arraysize = 1
         self.lastrowid = None
@@ -302,8 +307,8 @@
     def close(self):
         """Close the cursor object."""
         self._src.close()
-        self.colnames = self.coltypes = None
         self.description = None
+        self.colnames = self.coltypes = None
         self.rowcount = -1
         self.lastrowid = None
 
@@ -325,11 +330,11 @@
         if not param_seq:
             # don't do anything without parameters
             return
-        self.colnames = self.coltypes = None
         self.description = None
+        self.colnames = self.coltypes = None
         self.rowcount = -1
         # first try to execute all queries
-        totrows = 0
+        rowcount = 0
         sql = "BEGIN"
         try:
             if not self._dbcnx._tnx:
@@ -347,7 +352,7 @@
                     sql = operation
                 rows = self._src.execute(sql)
                 if rows:  # true if not DML
-                    totrows += rows
+                    rowcount += rows
                 else:
                     self.rowcount = -1
         except DatabaseError:
@@ -366,8 +371,10 @@
             self.coltypes = [info[1] for info in description]
             self.description = description
             self.lastrowid = None
+            if self.build_row_factory:
+                self.row_factory = self.build_row_factory()
         else:
-            self.rowcount = totrows
+            self.rowcount = rowcount
             self.lastrowid = self._src.oidstatus()
         # return the cursor object, so you can write statements such as
         # "cursor.execute(...).fetchall()" or "for row in cursor.execute(...)"
@@ -434,60 +441,53 @@
         """Not supported."""
         pass  # unsupported, but silently passed
 
+    @staticmethod
+    def row_factory(row):
+        """Process rows before they are returned.
 
-CursorDescription = namedtuple('CursorDescription',
-    ['name', 'type_code', 'display_size', 'internal_size',
-     'precision', 'scale', 'null_ok'])
-
-
-class ListCursor(Cursor):
-    """Cursor object that returns rows as lists."""
-
-    row_factory = list
-
-
-class DictCursor(Cursor):
-    """Cursor object that returns rows as dictionaries."""
+        You can overwrite this statically with a custom row factory, or
+        you can build a row factory dynamically with build_row_factory().
 
-    def row_factory(self, row):
-        """Turn a row from a tuple into a dictionary."""
-        # not using dict comprehension to stay compatible with Py 2.6
-        return dict((key, value) for key, value in zip(self.colnames, row))
+        For example, you can create a Cursor class that returns rows as
+        Python dictionaries like this:
 
+            class DictCursor(pgdb.Cursor):
 
-class OrderedDictCursor(Cursor):
-    """Cursor object that returns rows as ordered dictionaries."""
+                def row_factory(self, row):
+                    return {desc[0]: value
+                        for desc, value in zip(self.description, row)}
 
-    def row_factory(self, row):
-        """Turn a row from a tuple into an ordered dictionary."""
-        return OrderedDict(
-            (key, value) for key, value in zip(self.colnames, row))
+            cur = DictCursor(con)  # get one DictCursor instance or
+            con.cursor_type = DictCursor  # always use DictCursor instances
 
+        """
+        raise NotImplementedError
 
-class NamedTupleCursor(Cursor):
-    """Cursor object that returns rows as named tuples."""
+    def build_row_factory(self):
+        """Build a row factory based on the current description.
 
-    @property
-    def colnames(self):
-        return self._colnames
+        This implementation builds a row factory for creating named tuples.
+        You can overwrite this method if you want to dynamically create
+        different row factories whenever the column description changes.
 
-    @colnames.setter
-    def colnames(self, value):
-        self._colnames = value
-        if value:
+        """
+        colnames = self.colnames
+        if colnames:
             try:
                 try:
-                    factory = namedtuple('Row', value, rename=True)._make
+                    return namedtuple('Row', colnames, rename=True)._make
                 except TypeError:  # Python 2.6 and 3.0 do not support rename
-                    value = [v if v.isalnum() else 'column_%d' % n
-                             for n, v in enumerate(value)]
-                    factory = namedtuple('Row', value)._make
-            except ValueError:
-                value = ['column_%d' % n for n in range(len(value))]
-                factory = namedtuple('Row', value)._make
-        else:
-            factory = tuple
-        self._colnames, self.row_factory = value, factory
+                    colnames = [v if v.isalnum() else 'column_%d' % n
+                             for n, v in enumerate(colnames)]
+                    return namedtuple('Row', colnames)._make
+            except ValueError:  # there is still a problem with the field names
+                colnames = ['column_%d' % n for n in range(len(colnames))]
+                return namedtuple('Row', colnames)._make
+
+
+CursorDescription = namedtuple('CursorDescription',
+    ['name', 'type_code', 'display_size', 'internal_size',
+     'precision', 'scale', 'null_ok'])
 
 
 ### Connection Objects
@@ -513,7 +513,6 @@
         self._tnx = False  # transaction state
         self._type_cache = TypeCache(cnx)
         self.cursor_type = Cursor
-        self.row_type = tuple
         try:
             self._cnx.source()
         except Exception:
@@ -579,36 +578,16 @@
         else:
             raise _op_error("connection has been closed")
 
-    def cursor(self, cls=None):
+    def cursor(self):
         """Return a new cursor object using the connection."""
         if self._cnx:
             try:
-                return (cls or self.cursor_type)(self)
+                return self.cursor_type(self)
             except Exception:
                 raise _op_error("invalid connection")
         else:
             raise _op_error("connection has been closed")
 
-    def list_cursor(self):
-        """Return a new tuple cursor object using the connection."""
-        return self.cursor(ListCursor)
-
-    def tuple_cursor(self):
-        """Return a new tuple cursor object using the connection."""
-        return self.cursor(Cursor)
-
-    def named_tuple_cursor(self):
-        """Return a new named tuple cursor object using the connection."""
-        return self.cursor(NamedTupleCursor)
-
-    def dict_cursor(self):
-        """Return a new dict cursor object using the connection."""
-        return self.cursor(DictCursor)
-
-    def ordered_dict_cursor(self):
-        """Return a new ordered dict cursor object using the connection."""
-        return self.cursor(OrderedDictCursor)
-
     if shortcutmethods:  # otherwise do not implement and document this
 
         def execute(self, operation, params=None):

Modified: trunk/module/tests/test_dbapi20.py
==============================================================================
--- trunk/module/tests/test_dbapi20.py  Fri Jan  1 12:32:24 2016        (r682)
+++ trunk/module/tests/test_dbapi20.py  Fri Jan  1 18:11:01 2016        (r683)
@@ -66,72 +66,105 @@
     def tearDown(self):
         dbapi20.DatabaseAPI20Test.tearDown(self)
 
-    def test_cursortype(self):
+    def test_cursor_type(self):
+
+        class TestCursor(pgdb.Cursor):
+            pass
+
         con = self._connect()
         self.assertIs(con.cursor_type, pgdb.Cursor)
         cur = con.cursor()
         self.assertIsInstance(cur, pgdb.Cursor)
-        self.assertNotIsInstance(cur, pgdb.ListCursor)
-        cur.close()
-        con.cursor_type = pgdb.ListCursor
+        self.assertNotIsInstance(cur, TestCursor)
+        con.cursor_type = TestCursor
         cur = con.cursor()
-        self.assertIsInstance(cur, pgdb.Cursor)
-        self.assertIsInstance(cur, pgdb.ListCursor)
-        cur.close()
+        self.assertIsInstance(cur, TestCursor)
         cur = con.cursor()
-        self.assertIsInstance(cur, pgdb.ListCursor)
-        cur.close()
-        con.close()
+        self.assertIsInstance(cur, TestCursor)
         con = self._connect()
         self.assertIs(con.cursor_type, pgdb.Cursor)
         cur = con.cursor()
         self.assertIsInstance(cur, pgdb.Cursor)
-        self.assertNotIsInstance(cur, pgdb.ListCursor)
-        cur.close()
-        con.close()
+        self.assertNotIsInstance(cur, TestCursor)
+
+    def test_row_factory(self):
+
+        class TestCursor(pgdb.Cursor):
+
+            def row_factory(self, row):
+                return dict(('column %s' % desc[0], value)
+                    for desc, value in zip(self.description, row))
 
-    def test_list_cursor(self):
         con = self._connect()
-        cur = con.list_cursor()
-        self.assertIsInstance(cur, pgdb.ListCursor)
-        cur.execute("select 1, 2, 3")
+        con.cursor_type = TestCursor
+        cur = con.cursor()
+        self.assertIsInstance(cur, TestCursor)
+        res = cur.execute("select 1 as a, 2 as b")
+        self.assertIs(res, cur, 'execute() should return cursor')
         res = cur.fetchone()
-        cur.close()
-        con.close()
+        self.assertIsInstance(res, dict)
+        self.assertEqual(res, {'column a': 1, 'column b': 2})
+        cur.execute("select 1 as a, 2 as b union select 3, 4 order by 1")
+        res = cur.fetchall()
         self.assertIsInstance(res, list)
-        self.assertEqual(res, [1, 2, 3])
+        self.assertEqual(len(res), 2)
+        self.assertIsInstance(res[0], dict)
+        self.assertEqual(res[0], {'column a': 1, 'column b': 2})
+        self.assertIsInstance(res[1], dict)
+        self.assertEqual(res[1], {'column a': 3, 'column b': 4})
+
+    def test_build_row_factory(self):
+
+        class TestCursor(pgdb.Cursor):
+
+            def build_row_factory(self):
+                keys = [desc[0] for desc in self.description]
+                return lambda row: dict((key, value)
+                    for key, value in zip(keys, row))
 
-    def test_tuple_cursor(self):
         con = self._connect()
-        cur = con.tuple_cursor()
-        self.assertIsInstance(cur, pgdb.Cursor)
-        cur.execute("select 1, 2, 3")
+        con.cursor_type = TestCursor
+        cur = con.cursor()
+        self.assertIsInstance(cur, TestCursor)
+        cur.execute("select 1 as a, 2 as b")
         res = cur.fetchone()
-        cur.close()
-        con.close()
-        self.assertIsInstance(res, tuple)
-        self.assertEqual(res, (1, 2, 3))
-        self.assertRaises(AttributeError, getattr, res, '_fields')
+        self.assertIsInstance(res, dict)
+        self.assertEqual(res, {'a': 1, 'b': 2})
+        cur.execute("select 1 as a, 2 as b union select 3, 4 order by 1")
+        res = cur.fetchall()
+        self.assertIsInstance(res, list)
+        self.assertEqual(len(res), 2)
+        self.assertIsInstance(res[0], dict)
+        self.assertEqual(res[0], {'a': 1, 'b': 2})
+        self.assertIsInstance(res[1], dict)
+        self.assertEqual(res[1], {'a': 3, 'b': 4})
 
-    def test_named_tuple_cursor(self):
+    def test_cursor_with_named_columns(self):
         con = self._connect()
-        cur = con.named_tuple_cursor()
-        self.assertIsInstance(cur, pgdb.NamedTupleCursor)
-        cur.execute("select 1 as abc, 2 as de, 3 as f")
+        cur = con.cursor()
+        res = cur.execute("select 1 as abc, 2 as de, 3 as f")
+        self.assertIs(res, cur, 'execute() should return cursor')
         res = cur.fetchone()
-        cur.close()
-        con.close()
         self.assertIsInstance(res, tuple)
         self.assertEqual(res, (1, 2, 3))
         self.assertEqual(res._fields, ('abc', 'de', 'f'))
         self.assertEqual(res.abc, 1)
         self.assertEqual(res.de, 2)
         self.assertEqual(res.f, 3)
+        cur.execute("select 1 as one, 2 as two union select 3, 4 order by 1")
+        res = cur.fetchall()
+        self.assertIsInstance(res, list)
+        self.assertEqual(len(res), 2)
+        self.assertIsInstance(res[0], tuple)
+        self.assertEqual(res[0], (1, 2))
+        self.assertEqual(res[0]._fields, ('one', 'two'))
+        self.assertIsInstance(res[1], tuple)
+        self.assertEqual(res[1], (3, 4))
+        self.assertEqual(res[1]._fields, ('one', 'two'))
 
-    def test_named_tuple_cursor_with_bad_names(self):
+    def test_cursor_with_unnamed_columns(self):
         con = self._connect()
-        cur = con.named_tuple_cursor()
-        self.assertIsInstance(cur, pgdb.NamedTupleCursor)
+        cur = con.cursor()
         cur.execute("select 1, 2, 3")
         res = cur.fetchone()
         self.assertIsInstance(res, tuple)
@@ -158,53 +191,6 @@
             self.assertEqual(res._fields, ('column_0', 'column_1'))
         else:
             self.assertEqual(res._fields, ('abc', '_1'))
-        cur.close()
-        con.close()
-
-    def test_dict_cursor(self):
-        con = self._connect()
-        cur = con.dict_cursor()
-        self.assertIsInstance(cur, pgdb.DictCursor)
-        cur.execute("select 1 as abc, 2 as de, 3 as f")
-        res = cur.fetchone()
-        cur.close()
-        con.close()
-        self.assertIsInstance(res, dict)
-        self.assertEqual(res, {'abc': 1, 'de': 2, 'f': 3})
-        self.assertRaises(TypeError, res.popitem, last=True)
-
-    def test_ordered_dict_cursor(self):
-        con = self._connect()
-        cur = con.ordered_dict_cursor()
-        self.assertIsInstance(cur, pgdb.OrderedDictCursor)
-        cur.execute("select 1 as abc, 2 as de, 3 as f")
-        try:
-            res = cur.fetchone()
-        except pgdb.NotSupportedError:
-            if OrderedDict is None:
-                return
-            self.fail('OrderedDict supported by Python, but not by pgdb')
-        finally:
-            cur.close()
-            con.close()
-        self.assertIsInstance(res, dict)
-        self.assertEqual(res, {'abc': 1, 'de': 2, 'f': 3})
-        self.assertEqual(res.popitem(last=True), ('f', 3))
-
-    def test_row_factory(self):
-
-        class DictCursor(pgdb.Cursor):
-
-            def row_factory(self, row):
-                # not using dict comprehension to stay compatible with Py 2.6
-                return dict(('column %s' % desc[0], value)
-                    for desc, value in zip(self.description, row))
-
-        con = self._connect()
-        cur = DictCursor(con)
-        ret = cur.execute("select 1 as a, 2 as b")
-        self.assertTrue(ret is cur, 'execute() should return cursor')
-        self.assertEqual(cur.fetchone(), {'column a': 1, 'column b': 2})
 
     def test_colnames(self):
         con = self._connect()
@@ -226,7 +212,7 @@
         self.assertIsInstance(types, list)
         self.assertEqual(types, ['int2', 'int4', 'int8'])
 
-    def test_description_named(self):
+    def test_description_fields(self):
         con = self._connect()
         cur = con.cursor()
         cur.execute("select 123456789::int8 as col")
@@ -235,6 +221,7 @@
         self.assertEqual(len(desc), 1)
         desc = desc[0]
         self.assertIsInstance(desc, tuple)
+        self.assertEqual(len(desc), 7)
         self.assertEqual(desc.name, 'col')
         self.assertEqual(desc.type_code, 'int8')
         self.assertIsNone(desc.display_size)
_______________________________________________
PyGreSQL mailing list
[email protected]
https://mail.vex.net/mailman/listinfo.cgi/pygresql

[PyGreSQL] PyGreSQL Commit r683 - in trunk: docs module module/tests

Reply via email to