Author: cito
Date: Sun Jan 31 16:18:28 2016
New Revision: 802

Log:
Make the unescaping of bytea configurable

By default, bytea is returned unescaped in 5.0, but the old
behavior can now be restored with set_bytea_escaped(True).

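Because the new switch is a module-wide setting, code that needs the old escaped behavior only briefly may want to restore the previous value afterwards. The following is a minimal sketch of that pattern, not part of this commit: the `bytea_escaped()` context manager and the `FakePg` stand-in are hypothetical names introduced here so the example runs without a database. Only `get_bytea_escaped()` and `set_bytea_escaped()` come from the change itself. With a real installation you would pass the imported `pg` module instead of `FakePg()`.

```python
from contextlib import contextmanager


@contextmanager
def bytea_escaped(module, on):
    """Temporarily set the module-wide bytea_escaped flag, then restore it."""
    previous = module.get_bytea_escaped()
    module.set_bytea_escaped(on)
    try:
        yield
    finally:
        # Restore the earlier setting even if the body raised an exception.
        module.set_bytea_escaped(previous)


class FakePg:
    """Hypothetical stand-in for the pg module (assumption, for illustration)."""

    def __init__(self):
        self._escaped = False  # mirrors the 5.0 default described in the log

    def get_bytea_escaped(self):
        return self._escaped

    def set_bytea_escaped(self, on):
        self._escaped = bool(on)


fake = FakePg()
with bytea_escaped(fake, True):
    inside = fake.get_bytea_escaped()   # flag is on inside the block
after = fake.get_bytea_escaped()        # previous value is back afterwards
```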
Modified:
   trunk/docs/contents/changelog.rst
   trunk/docs/contents/pg/module.rst
   trunk/pgmodule.c
   trunk/tests/test_classic_dbwrapper.py
   trunk/tests/test_classic_functions.py

Modified: trunk/docs/contents/changelog.rst
==============================================================================
--- trunk/docs/contents/changelog.rst   Sun Jan 31 14:51:16 2016        (r801)
+++ trunk/docs/contents/changelog.rst   Sun Jan 31 16:18:28 2016        (r802)
@@ -7,103 +7,104 @@
 - The supported versions are Python 2.6 to 2.7, and 3.3 to 3.5.
 - PostgreSQL is supported in all versions from 9.0 to 9.5.
 - Changes in the DB-API 2 module (pgdb):
-  - The DB-API 2 module now always returns result rows as named tuples
-    instead of simply lists as before. The documentation explains how
-    you can restore the old behavior or use custom row objects instead.
-  - The names of the various classes used by the classic and DB-API 2
-    modules have been renamed to become simpler, more intuitive and in
-    line with the names used in the DB-API 2 documentation.
-    Since the API provides only objects of these types through constructor
-    functions, this should not cause any incompatibilities.
-  - The DB-API 2 module now supports the callproc() cursor method. Note
-    that output parameters are currently not replaced in the return value.
-  - The DB-API 2 module now supports copy operations between data streams
-    on the client and database tables via the COPY command of PostgreSQL.
-    The cursor method copy_from() can be used to copy data from the database
-    to the client, and the cursor method copy_to() can be used to copy data
-    from the client to the database.
-  - The 7-tuples returned by the description attribute of a pgdb cursor
-    are now named tuples, i.e. their elements can be also accessed by name.
-    The column names and types can now also be requested through the
-    colnames and coltypes attributes, which are not part of DB-API 2 though.
-    The type_code provided by the description attribute is still equal to
-    the PostgreSQL internal type name, but now carries some more information
-    in additional attributes. The size, precision and scale information that
-    is part of the description is now properly set for numeric types.
-  - If you pass a Python list as one of the parameters to a DB-API 2 cursor,
-    it is now automatically bound using an ARRAY constructor. If you pass a
-    Python tuple, it is bound using a ROW constructor. This is useful for
-    passing records as well as making use of the IN syntax.
-  - Inversely, when a fetch method of a DB-API 2 cursor returns a PostgreSQL
-    array, it is passed to Python as a list, and when it returns a PostgreSQL
-    composite type, it is passed to Python as a named tuple. PyGreSQL uses
-    a new fast built-in parser to achieve this. Anonymous composite types are
-    also supported, but yield only an ordinary tuple containing text strings.
+    - The DB-API 2 module now always returns result rows as named tuples
+      instead of simply lists as before. The documentation explains how
+      you can restore the old behavior or use custom row objects instead.
+    - The names of the various classes used by the classic and DB-API 2
+      modules have been renamed to become simpler, more intuitive and in
+      line with the names used in the DB-API 2 documentation.
+      Since the API provides only objects of these types through constructor
+      functions, this should not cause any incompatibilities.
+    - The DB-API 2 module now supports the callproc() cursor method. Note
+      that output parameters are currently not replaced in the return value.
+    - The DB-API 2 module now supports copy operations between data streams
+      on the client and database tables via the COPY command of PostgreSQL.
+      The cursor method copy_from() can be used to copy data from the database
+      to the client, and the cursor method copy_to() can be used to copy data
+      from the client to the database.
+    - The 7-tuples returned by the description attribute of a pgdb cursor
+      are now named tuples, i.e. their elements can be also accessed by name.
+      The column names and types can now also be requested through the
+      colnames and coltypes attributes, which are not part of DB-API 2 though.
+      The type_code provided by the description attribute is still equal to
+      the PostgreSQL internal type name, but now carries some more information
+      in additional attributes. The size, precision and scale information that
+      is part of the description is now properly set for numeric types.
+    - If you pass a Python list as one of the parameters to a DB-API 2 cursor,
+      it is now automatically bound using an ARRAY constructor. If you pass a
+      Python tuple, it is bound using a ROW constructor. This is useful for
+      passing records as well as making use of the IN syntax.
+    - Inversely, when a fetch method of a DB-API 2 cursor returns a PostgreSQL
+      array, it is passed to Python as a list, and when it returns a PostgreSQL
+      composite type, it is passed to Python as a named tuple. PyGreSQL uses
+      a new fast built-in parser to achieve this. Anonymous composite types are
+      also supported, but yield only an ordinary tuple containing text strings.
 - Changes in the classic PyGreSQL module (pg):
-  - The classic interface got two new methods get_as_list() and get_as_dict()
-    returning a database table as a Python list or dict. The amount of data
-    returned can be controlled with various parameters.
-  - A method upsert() has been added to the DB wrapper class that utilitses
-    the "upsert" feature that is new in PostgreSQL 9.5. The new method nicely
-    complements the existing get/insert/update/delete() methods.
-  - When using insert/update/upsert(), you can now pass PostgreSQL arrays as
-    lists and PostgreSQL records as tuples in the classic module.
-  - Conversely, when the query method returns a PostgreSQL array, it is passed
-    to Python as a list. PostgreSQL records are converted to named tuples as
-    well, but only if you use one of the get/insert/update/delete() methods.
-    PyGreSQL uses a new fast built-in parser to achieve this.
-  - The pkey() method of the classic interface now returns tuples instead
-    of frozenset. The order of the tuples is like in the primary key index.
-  - Like the DB-API 2 module, the classic module now also returns bytea columns
-    fetched from the database as byte strings, so you don't need to call
-    unescape_bytea() any more.
-  - A method set_jsondecode() has been added for changing or removing the
-    function that automatically decodes JSON data coming from the database.
-    By default, decoding JSON is now enabled and uses the decoder function
-    in the standard library with its default parameters.
-  - The table name that is affixed to the name of the OID column returned
-    by the get() method of the classic interface will not automatically
-    be fully qualified any more. This reduces overhead from the interface,
-    but it means you must always write the table name in the same way when
-    you call the methods using it and you are using tables with OIDs.
-    Also, OIDs are now only used when access via primary key is not possible.
-    Note that OIDs are considered deprecated anyway, and they are not created
-    by default any more in PostgreSQL 8.1 and later.
-  - The internal caching and automatic quoting of class names in the classic
-    interface has been simplified and improved, it should now perform better
-    and use less memory. Also, overhead for quoting values in the DB wrapper
-    methods has been reduced and security has been improved by passing the
-    values to libpq separately as parameters instead of inline.
-  - It is now possible to use regular type names instead of the simpler
-    type names that are used by default in PyGreSQL, without breaking any
-    of the mechanisms for quoting and typecasting, which rely on the type
-    information. This is achieved while maintaining simplicity and backward
-    compatibility by augmenting the type name string objects with all the
-    necessary information under the cover. To switch regular type names on
-    or off (this is the default), call the DB wrapper method use_regtypes().
-  - A new method query_formatted() has been added to the DB wrapper class that
-    allows using the format specifications from Python.  A flag "inline"
-    can be set to specify whether parameters should be sent to the database
-    separately or formatted into the SQL.
-  - The methods for adapting and typecasting values pertaining to PostgreSQL
-    types have been refactored and swapped out to separate classes.
+    - The classic interface got two new methods get_as_list() and get_as_dict()
+      returning a database table as a Python list or dict. The amount of data
+      returned can be controlled with various parameters.
+    - A method upsert() has been added to the DB wrapper class that utilizes
+      the "upsert" feature that is new in PostgreSQL 9.5. The new method nicely
+      complements the existing get/insert/update/delete() methods.
+    - When using insert/update/upsert(), you can now pass PostgreSQL arrays as
+      lists and PostgreSQL records as tuples in the classic module.
+    - Conversely, when the query method returns a PostgreSQL array, it is passed
+      to Python as a list. PostgreSQL records are converted to named tuples as
+      well, but only if you use one of the get/insert/update/delete() methods.
+      PyGreSQL uses a new fast built-in parser to achieve this.
+    - The pkey() method of the classic interface now returns tuples instead
+      of frozenset. The order of the tuples is like in the primary key index.
+    - Like the DB-API 2 module, the classic module now also returns bytea
+      columns fetched from the database as byte strings, so you don't need to
+      call unescape_bytea() any more.  This has been made configurable though,
+      and you can restore the old behavior by calling set_bytea_escaped(True).
+    - A method set_jsondecode() has been added for changing or removing the
+      function that automatically decodes JSON data coming from the database.
+      By default, decoding JSON is now enabled and uses the decoder function
+      in the standard library with its default parameters.
+    - The table name that is affixed to the name of the OID column returned
+      by the get() method of the classic interface will not automatically
+      be fully qualified any more. This reduces overhead from the interface,
+      but it means you must always write the table name in the same way when
+      you call the methods using it and you are using tables with OIDs.
+      Also, OIDs are now only used when access via primary key is not possible.
+      Note that OIDs are considered deprecated anyway, and they are not created
+      by default any more in PostgreSQL 8.1 and later.
+    - The internal caching and automatic quoting of class names in the classic
+      interface has been simplified and improved, it should now perform better
+      and use less memory. Also, overhead for quoting values in the DB wrapper
+      methods has been reduced and security has been improved by passing the
+      values to libpq separately as parameters instead of inline.
+    - It is now possible to use regular type names instead of the simpler
+      type names that are used by default in PyGreSQL, without breaking any
+      of the mechanisms for quoting and typecasting, which rely on the type
+      information. This is achieved while maintaining simplicity and backward
+      compatibility by augmenting the type name string objects with all the
+      necessary information under the cover. To switch regular type names on
+      or off (this is the default), call the DB wrapper method use_regtypes().
+    - A new method query_formatted() has been added to the DB wrapper class
+      that allows using the format specifications from Python.  A flag "inline"
+      can be set to specify whether parameters should be sent to the database
+      separately or formatted into the SQL.
+    - The methods for adapting and typecasting values pertaining to PostgreSQL
+      types have been refactored and swapped out to separate classes.
 - Changes concerning both modules:
-  - The modules now provide get_typecast() and set_typecast() methods
-    allowing to control the typecasting on the global level.  The connection
-    objects have got type caches with the same methods which give control
-    over the typecasting on the level of the current connection.
-    See the documentation on details about the type cache and the typecast
-    mechanisms provided by PyGreSQL.
-  - PyGreSQL now supports the JSON and JSONB data types, converting such
-    columns automatically to and from Python objects. If you want to insert
-    Python objects as JSON data using DB-API 2, you should wrap them in the
-    new Json() type constructor as a hint to PyGreSQL.
-  - New type helpers Literal(), Json() and Bytea() have been added.
-  - Fast parsers for the input and output syntax for PostgreSQL arrays and
-    composite types have been added to the C module. Note that you can also
-    use multi-dimensional arrays with PyGreSQL.
-  - The tty parameter and attribute of database connections has been
-    removed since it is not supported any more since PostgreSQL 7.4.
+    - The modules now provide get_typecast() and set_typecast() methods
+      allowing to control the typecasting on the global level.  The connection
+      objects have got type caches with the same methods which give control
+      over the typecasting on the level of the current connection.
+      See the documentation on details about the type cache and the typecast
+      mechanisms provided by PyGreSQL.
+    - PyGreSQL now supports the JSON and JSONB data types, converting such
+      columns automatically to and from Python objects. If you want to insert
+      Python objects as JSON data using DB-API 2, you should wrap them in the
+      new Json() type constructor as a hint to PyGreSQL.
+    - New type helpers Literal(), Json() and Bytea() have been added.
+    - Fast parsers for the input and output syntax for PostgreSQL arrays and
+      composite types have been added to the C module. Note that you can also
+      use multi-dimensional arrays with PyGreSQL.
+    - The tty parameter and attribute of database connections has been
+      removed since it is not supported any more since PostgreSQL 7.4.
 
 Version 4.2 (2016-01-21)
 ------------------------

Modified: trunk/docs/contents/pg/module.rst
==============================================================================
--- trunk/docs/contents/pg/module.rst   Sun Jan 31 14:51:16 2016        (r801)
+++ trunk/docs/contents/pg/module.rst   Sun Jan 31 16:18:28 2016        (r802)
@@ -404,6 +404,39 @@
 
 .. versionadded:: 4.2
 
+get/set_bytea_escaped -- whether bytea values are returned escaped
+------------------------------------------------------------------
+
+.. function:: get_bytea_escaped()
+
+    Check whether bytea values are returned as escaped strings
+
+    :returns: whether or not bytea objects will be returned escaped
+    :rtype: bool
+
+This function checks whether PyGreSQL returns PostgreSQL ``bytea`` values in
+escaped form or in unescaped form as byte strings.  By default, bytea values
+will be returned unescaped as byte strings, but you can change this with the
+``set_bytea_escaped()`` method.
+
+.. versionadded:: 5.0
+
+.. function:: set_bytea_escaped(on)
+
+    Set whether bytea values are returned as escaped strings
+
+    :param on: whether or not bytea objects shall be returned escaped
+
+This function can be used to specify whether PyGreSQL shall return
+PostgreSQL ``bytea`` values in escaped form or in unescaped form as byte
+strings.  By default, bytea values will be returned unescaped as byte
+strings, but you can change this by calling ``set_bytea_escaped(True)``.
+
+.. versionadded:: 5.0
+
+.. versionchanged:: 5.0
+   Bytea values had been returned in escaped form in earlier versions.
+
 get/set_namedresult -- conversion to named tuples
 -------------------------------------------------
 

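To make the difference between "escaped" and "unescaped" concrete, here is a small pure-Python sketch that decodes the two textual bytea formats PostgreSQL can emit: the hex format (`\x` followed by two hex digits per byte) and the older escape format (`\\` for a backslash, `\ooo` for an octal byte value). This is an illustration only, written under the assumption that the escaped value arrives as a text string. `decode_bytea_text()` is a hypothetical helper and not how PyGreSQL does it; the module's own `unescape_bytea()` relies on libpq's `PQunescapeBytea()`.

```python
def decode_bytea_text(text):
    """Decode the textual form of a bytea value into bytes (illustrative only)."""
    if text.startswith("\\x"):
        # Hex format: "\x" followed by two hex digits per byte.
        return bytes.fromhex(text[2:])
    # Escape format: "\\" is a backslash, "\ooo" is an octal byte value,
    # and every other character stands for itself.
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))
            i += 1
        elif text[i + 1] == "\\":
            out.append(0x5C)
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
    return bytes(out)


hex_result = decode_bytea_text("\\x4142")         # hex form of b"AB"
esc_result = decode_bytea_text("a\\\\b\\000")     # escape form of b"a\\b\x00"
```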
Modified: trunk/pgmodule.c
==============================================================================
--- trunk/pgmodule.c    Sun Jan 31 14:51:16 2016        (r801)
+++ trunk/pgmodule.c    Sun Jan 31 16:18:28 2016        (r802)
@@ -97,6 +97,7 @@
                                *jsondecode = NULL; /* function for decoding json strings */
 static char decimal_point = '.'; /* decimal point used in money values */
 static int use_bool = 0; /* whether or not bool objects shall be returned */
+static int bytea_escaped = 0; /* whether bytea shall be returned escaped */
 
 static int pg_encoding_utf8 = 0;
 static int pg_encoding_latin1 = 0;
@@ -283,7 +284,7 @@
                        break;
 
                case BYTEAOID:
-                       t = PYGRES_BYTEA;
+                       t = bytea_escaped ? PYGRES_TEXT : PYGRES_BYTEA;
                        break;
 
                case JSONOID:
@@ -339,7 +340,7 @@
                        break;
 
                case BYTEAARRAYOID:
-                       t = PYGRES_BYTEA | PYGRES_ARRAY;
+                       t = (bytea_escaped ? PYGRES_TEXT : PYGRES_BYTEA) | PYGRES_ARRAY;
                        break;
 
                case JSONARRAYOID:
@@ -393,6 +394,7 @@
        char       *tmp_str;
        size_t          str_len;
 
+    /* this function should not be called when bytea_escaped is set */
        tmp_str = (char *)PQunescapeBytea((unsigned char*)s, &str_len);
        obj = PyBytes_FromStringAndSize(tmp_str, str_len);
        if (tmp_str)
@@ -412,6 +414,7 @@
        switch (type) /* this must be the PyGreSQL internal type */
        {
                case PYGRES_BYTEA:
+                   /* this type should not be passed when bytea_escaped is set */
                        /* we need to add a null byte */
                        tmp_str = (char *) PyMem_Malloc(size + 1);
                        if (!tmp_str) return PyErr_NoMemory();
@@ -5131,6 +5134,50 @@
        return ret;
 }
 
+/* check whether bytea values are returned escaped */
+static char pgGetByteaEscaped__doc__[] =
+"get_bytea_escaped() -- check whether bytea will be returned escaped";
+
+static PyObject *
+pgGetByteaEscaped(PyObject *self, PyObject * args)
+{
+       PyObject *ret = NULL;
+
+       if (PyArg_ParseTuple(args, ""))
+       {
+               ret = bytea_escaped ? Py_True : Py_False;
+               Py_INCREF(ret);
+       }
+       else
+               PyErr_SetString(PyExc_TypeError,
+                       "Function get_bytea_escaped() takes no arguments");
+
+       return ret;
+}
+
+/* set whether bytea values are returned escaped */
+static char pgSetByteaEscaped__doc__[] =
+"set_bytea_escaped(on) -- set whether bytea will be returned escaped";
+
+static PyObject *
+pgSetByteaEscaped(PyObject *self, PyObject * args)
+{
+       PyObject *ret = NULL;
+       int                     i;
+
+       /* gets arguments */
+       if (PyArg_ParseTuple(args, "i", &i))
+       {
+               bytea_escaped = i ? 1 : 0;
+               Py_INCREF(Py_None); ret = Py_None;
+       }
+       else
+               PyErr_SetString(PyExc_TypeError,
+                       "Function set_bytea_escaped() expects a boolean value as argument");
+
+       return ret;
+}
+
 /* get named result factory */
 static char pgGetNamedresult__doc__[] =
 "get_namedresult() -- get the function used for getting named results";
@@ -5675,6 +5722,10 @@
                        pgSetDecimal__doc__},
        {"get_bool", (PyCFunction) pgGetBool, METH_VARARGS, pgGetBool__doc__},
        {"set_bool", (PyCFunction) pgSetBool, METH_VARARGS, pgSetBool__doc__},
+       {"get_bytea_escaped", (PyCFunction) pgGetByteaEscaped, METH_VARARGS,
+               pgGetByteaEscaped__doc__},
+       {"set_bytea_escaped", (PyCFunction) pgSetByteaEscaped, METH_VARARGS,
+               pgSetByteaEscaped__doc__},
        {"get_namedresult", (PyCFunction) pgGetNamedresult, METH_VARARGS,
                        pgGetNamedresult__doc__},
        {"set_namedresult", (PyCFunction) pgSetNamedresult, METH_VARARGS,

Modified: trunk/tests/test_classic_dbwrapper.py
==============================================================================
--- trunk/tests/test_classic_dbwrapper.py       Sun Jan 31 14:51:16 2016        (r801)
+++ trunk/tests/test_classic_dbwrapper.py       Sun Jan 31 16:18:28 2016        (r802)
@@ -2837,11 +2837,15 @@
         self.assertEqual(len(r), 2)
         self.assertEqual(r[0], 3)
         r = r[1]
+        if pg.get_bytea_escaped():
+            self.assertNotEqual(r, s)
+            r = pg.unescape_bytea(r)
         self.assertIsInstance(r, bytes)
         self.assertEqual(r, s)
 
     def testInsertUpdateGetBytea(self):
         query = self.db.query
+        unescape = pg.unescape_bytea if pg.get_bytea_escaped() else None
         self.createTable('bytea_test', 'n smallint primary key, data bytea')
         # insert null value
         r = self.db.insert('bytea_test', n=0, data=None)
@@ -2857,6 +2861,9 @@
         self.assertEqual(r['n'], 0)
         self.assertIn('data', r)
         r = r['data']
+        if unescape:
+            self.assertNotEqual(r, s)
+            r = unescape(r)
         self.assertIsInstance(r, bytes)
         self.assertEqual(r, s)
         r = self.db.update('bytea_test', n=0, data=None)
@@ -2869,6 +2876,9 @@
         self.assertEqual(r['n'], 5)
         self.assertIn('data', r)
         r = r['data']
+        if unescape:
+            self.assertNotEqual(r, s)
+            r = unescape(r)
         self.assertIsInstance(r, bytes)
         self.assertEqual(r, s)
         # update as bytes
@@ -2879,6 +2889,9 @@
         self.assertEqual(r['n'], 5)
         self.assertIn('data', r)
         r = r['data']
+        if unescape:
+            self.assertNotEqual(r, s)
+            r = unescape(r)
         self.assertIsInstance(r, bytes)
         self.assertEqual(r, s)
         r = query('select * from bytea_test where n=5').getresult()
@@ -2887,6 +2900,9 @@
         self.assertEqual(len(r), 2)
         self.assertEqual(r[0], 5)
         r = r[1]
+        if unescape:
+            self.assertNotEqual(r, s)
+            r = unescape(r)
         self.assertIsInstance(r, bytes)
         self.assertEqual(r, s)
         r = self.db.get('bytea_test', dict(n=5))
@@ -2895,6 +2911,9 @@
         self.assertEqual(r['n'], 5)
         self.assertIn('data', r)
         r = r['data']
+        if unescape:
+            self.assertNotEqual(r, s)
+            r = pg.unescape_bytea(r)
         self.assertIsInstance(r, bytes)
         self.assertEqual(r, s)
 
@@ -2912,6 +2931,9 @@
         self.assertIn('n', r)
         self.assertEqual(r['n'], 7)
         self.assertIn('data', r)
+        if pg.get_bytea_escaped():
+            self.assertNotEqual(r['data'], s)
+            r['data'] = pg.unescape_bytea(r['data'])
         self.assertIsInstance(r['data'], bytes)
         self.assertEqual(r['data'], s)
         r['data'] = None
@@ -2920,7 +2942,7 @@
         self.assertIn('n', r)
         self.assertEqual(r['n'], 7)
         self.assertIn('data', r)
-        self.assertIsNone(r['data'], bytes)
+        self.assertIsNone(r['data'])
 
     def testInsertGetJson(self):
         try:
@@ -3161,6 +3183,7 @@
         self.assertIsNone(r['data'][2])
 
     def testArrayOfBytea(self):
+        unescape = pg.unescape_bytea if pg.get_bytea_escaped() else None
         self.createTable('arraytest', 'data bytea[]', oids=True)
         r = self.db.get_attnames('arraytest')
         self.assertEqual(r['data'], 'bytea[]')
@@ -3168,11 +3191,17 @@
                 b"It's all \\ kinds \x00 of\r nasty \xff stuff!\n"]
         r = dict(data=data)
         self.db.insert('arraytest', r)
+        if unescape:
+            self.assertNotEqual(r['data'], data)
+            r['data'] = [unescape(v) if v else v for v in r['data']]
         self.assertEqual(r['data'], data)
         self.assertIsInstance(r['data'][1], bytes)
         self.assertIsNone(r['data'][2])
         r['data'] = None
         self.db.get('arraytest', r)
+        if unescape:
+            self.assertNotEqual(r['data'], data)
+            r['data'] = [unescape(v) if v else v for v in r['data']]
         self.assertEqual(r['data'], data)
         self.assertIsInstance(r['data'][1], bytes)
         self.assertIsNone(r['data'][2])
@@ -3606,6 +3635,8 @@
         cls.set_option('decimal', float)
         not_bool = not pg.get_bool()
         cls.set_option('bool', not_bool)
+        not_bytea_escaped = not pg.get_bytea_escaped()
+        cls.set_option('bytea_escaped', not_bytea_escaped)
         cls.set_option('namedresult', None)
         cls.set_option('jsondecode', None)
         cls.regtypes = not DB().use_regtypes()
@@ -3617,6 +3648,7 @@
         cls.reset_option('jsondecode')
         cls.reset_option('namedresult')
         cls.reset_option('bool')
+        cls.reset_option('bytea_escaped')
         cls.reset_option('decimal')
 
     @classmethod

Modified: trunk/tests/test_classic_functions.py
==============================================================================
--- trunk/tests/test_classic_functions.py       Sun Jan 31 14:51:16 2016        (r801)
+++ trunk/tests/test_classic_functions.py       Sun Jan 31 16:18:28 2016        (r802)
@@ -732,6 +732,29 @@
         self.assertIsInstance(r, bool)
         self.assertIs(r, use_bool)
 
+    def testGetByteaEscaped(self):
+        r = pg.get_bytea_escaped()
+        self.assertIsInstance(r, bool)
+        self.assertIs(r, False)
+
+    def testSetByteaEscaped(self):
+        bytea_escaped = pg.get_bytea_escaped()
+        try:
+            pg.set_bytea_escaped(True)
+            r = pg.get_bytea_escaped()
+            pg.set_bytea_escaped(bytea_escaped)
+            self.assertIsInstance(r, bool)
+            self.assertIs(r, True)
+            pg.set_bytea_escaped(False)
+            r = pg.get_bytea_escaped()
+            self.assertIsInstance(r, bool)
+            self.assertIs(r, False)
+        finally:
+            pg.set_bytea_escaped(bytea_escaped)
+        r = pg.get_bytea_escaped()
+        self.assertIsInstance(r, bool)
+        self.assertIs(r, bytea_escaped)
+
     def testGetNamedresult(self):
         r = pg.get_namedresult()
         self.assertTrue(callable(r))
_______________________________________________
PyGreSQL mailing list
[email protected]
https://mail.vex.net/mailman/listinfo.cgi/pygresql