Author: cito
Date: Sun Jan 31 16:18:28 2016
New Revision: 802
Log:
Make the unescaping of bytea configurable
By default, bytea is returned unescaped in 5.0, but the old
behavior can now be restored with set_bytea_escaped(True).
Modified:
trunk/docs/contents/changelog.rst
trunk/docs/contents/pg/module.rst
trunk/pgmodule.c
trunk/tests/test_classic_dbwrapper.py
trunk/tests/test_classic_functions.py
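
To illustrate the two forms the log message refers to, here is a minimal Python sketch. It assumes the server sends bytea in PostgreSQL's hex output format (a backslash, an "x", then two hex digits per byte). The helpers unescape_hex_bytea() and bytea_to_bytes() are hypothetical names written for this illustration and are not part of PyGreSQL; real code would call pg.unescape_bytea() instead.

```python
import binascii


def unescape_hex_bytea(text):
    """Decode a bytea value in hex output format into bytes.

    Illustrative stand-in only (an assumption of this sketch);
    with PyGreSQL you would use pg.unescape_bytea() instead.
    """
    if isinstance(text, bytes):
        text = text.decode('ascii')
    if not text.startswith('\\x'):
        raise ValueError('value is not in hex bytea format')
    return binascii.unhexlify(text[2:])


def bytea_to_bytes(value, escaped, unescape=unescape_hex_bytea):
    """Return a fetched bytea value as bytes in either mode.

    'escaped' mirrors what pg.get_bytea_escaped() would report:
    if true, the value still needs unescaping; otherwise it is
    already a byte string and is passed through unchanged.
    """
    if value is None:  # SQL NULL stays None in both modes
        return None
    return unescape(value) if escaped else value


# Escaped mode: the caller receives the textual form and decodes it.
decoded = bytea_to_bytes('\\x4142', escaped=True)
# Unescaped mode (the 5.0 default): the value is already bytes.
passed_through = bytea_to_bytes(b'AB', escaped=False)
```

With the real module, the second argument would typically be the result of pg.get_bytea_escaped(), which is how the updated tests in this revision decide whether to unescape.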
Modified: trunk/docs/contents/changelog.rst
==============================================================================
--- trunk/docs/contents/changelog.rst Sun Jan 31 14:51:16 2016 (r801)
+++ trunk/docs/contents/changelog.rst Sun Jan 31 16:18:28 2016 (r802)
@@ -7,103 +7,104 @@
- The supported versions are Python 2.6 to 2.7, and 3.3 to 3.5.
- PostgreSQL is supported in all versions from 9.0 to 9.5.
- Changes in the DB-API 2 module (pgdb):
- - The DB-API 2 module now always returns result rows as named tuples
- instead of simply lists as before. The documentation explains how
- you can restore the old behavior or use custom row objects instead.
- - The names of the various classes used by the classic and DB-API 2
- modules have been renamed to become simpler, more intuitive and in
- line with the names used in the DB-API 2 documentation.
- Since the API provides only objects of these types through constructor
- functions, this should not cause any incompatibilities.
- - The DB-API 2 module now supports the callproc() cursor method. Note
- that output parameters are currently not replaced in the return value.
- - The DB-API 2 module now supports copy operations between data streams
- on the client and database tables via the COPY command of PostgreSQL.
- The cursor method copy_from() can be used to copy data from the database
- to the client, and the cursor method copy_to() can be used to copy data
- from the client to the database.
- - The 7-tuples returned by the description attribute of a pgdb cursor
- are now named tuples, i.e. their elements can be also accessed by name.
- The column names and types can now also be requested through the
- colnames and coltypes attributes, which are not part of DB-API 2 though.
- The type_code provided by the description attribute is still equal to
- the PostgreSQL internal type name, but now carries some more information
- in additional attributes. The size, precision and scale information that
- is part of the description is now properly set for numeric types.
- - If you pass a Python list as one of the parameters to a DB-API 2 cursor,
- it is now automatically bound using an ARRAY constructor. If you pass a
- Python tuple, it is bound using a ROW constructor. This is useful for
- passing records as well as making use of the IN syntax.
- - Inversely, when a fetch method of a DB-API 2 cursor returns a PostgreSQL
- array, it is passed to Python as a list, and when it returns a PostgreSQL
- composite type, it is passed to Python as a named tuple. PyGreSQL uses
- a new fast built-in parser to achieve this. Anonymous composite types are
- also supported, but yield only an ordinary tuple containing text strings.
+ - The DB-API 2 module now always returns result rows as named tuples
+ instead of simply lists as before. The documentation explains how
+ you can restore the old behavior or use custom row objects instead.
+ - The names of the various classes used by the classic and DB-API 2
+ modules have been renamed to become simpler, more intuitive and in
+ line with the names used in the DB-API 2 documentation.
+ Since the API provides only objects of these types through constructor
+ functions, this should not cause any incompatibilities.
+ - The DB-API 2 module now supports the callproc() cursor method. Note
+ that output parameters are currently not replaced in the return value.
+ - The DB-API 2 module now supports copy operations between data streams
+ on the client and database tables via the COPY command of PostgreSQL.
+ The cursor method copy_from() can be used to copy data from the database
+ to the client, and the cursor method copy_to() can be used to copy data
+ from the client to the database.
+ - The 7-tuples returned by the description attribute of a pgdb cursor
+ are now named tuples, i.e. their elements can be also accessed by name.
+ The column names and types can now also be requested through the
+ colnames and coltypes attributes, which are not part of DB-API 2 though.
+ The type_code provided by the description attribute is still equal to
+ the PostgreSQL internal type name, but now carries some more information
+ in additional attributes. The size, precision and scale information that
+ is part of the description is now properly set for numeric types.
+ - If you pass a Python list as one of the parameters to a DB-API 2 cursor,
+ it is now automatically bound using an ARRAY constructor. If you pass a
+ Python tuple, it is bound using a ROW constructor. This is useful for
+ passing records as well as making use of the IN syntax.
+ - Inversely, when a fetch method of a DB-API 2 cursor returns a PostgreSQL
+ array, it is passed to Python as a list, and when it returns a PostgreSQL
+ composite type, it is passed to Python as a named tuple. PyGreSQL uses
+ a new fast built-in parser to achieve this. Anonymous composite types are
+ also supported, but yield only an ordinary tuple containing text strings.
- Changes in the classic PyGreSQL module (pg):
- - The classic interface got two new methods get_as_list() and get_as_dict()
- returning a database table as a Python list or dict. The amount of data
- returned can be controlled with various parameters.
- - A method upsert() has been added to the DB wrapper class that utilizes
- the "upsert" feature that is new in PostgreSQL 9.5. The new method nicely
- complements the existing get/insert/update/delete() methods.
- - When using insert/update/upsert(), you can now pass PostgreSQL arrays as
- lists and PostgreSQL records as tuples in the classic module.
- - Conversely, when the query method returns a PostgreSQL array, it is passed
- to Python as a list. PostgreSQL records are converted to named tuples as
- well, but only if you use one of the get/insert/update/delete() methods.
- PyGreSQL uses a new fast built-in parser to achieve this.
- - The pkey() method of the classic interface now returns tuples instead
- of frozenset. The order of the tuples is like in the primary key index.
- - Like the DB-API 2 module, the classic module now also returns bytea columns
- fetched from the database as byte strings, so you don't need to call
- unescape_bytea() any more.
- - A method set_jsondecode() has been added for changing or removing the
- function that automatically decodes JSON data coming from the database.
- By default, decoding JSON is now enabled and uses the decoder function
- in the standard library with its default parameters.
- - The table name that is affixed to the name of the OID column returned
- by the get() method of the classic interface will not automatically
- be fully qualified any more. This reduces overhead from the interface,
- but it means you must always write the table name in the same way when
- you call the methods using it and you are using tables with OIDs.
- Also, OIDs are now only used when access via primary key is not possible.
- Note that OIDs are considered deprecated anyway, and they are not created
- by default any more in PostgreSQL 8.1 and later.
- - The internal caching and automatic quoting of class names in the classic
- interface has been simplified and improved, it should now perform better
- and use less memory. Also, overhead for quoting values in the DB wrapper
- methods has been reduced and security has been improved by passing the
- values to libpq separately as parameters instead of inline.
- - It is now possible to use regular type names instead of the simpler
- type names that are used by default in PyGreSQL, without breaking any
- of the mechanisms for quoting and typecasting, which rely on the type
- information. This is achieved while maintaining simplicity and backward
- compatibility by augmenting the type name string objects with all the
- necessary information under the cover. To switch regular type names on
- or off (this is the default), call the DB wrapper method use_regtypes().
- - A new method query_formatted() has been added to the DB wrapper class that
- allows using the format specifications from Python. A flag "inline"
- can be set to specify whether parameters should be sent to the database
- separately or formatted into the SQL.
- - The methods for adapting and typecasting values pertaining to PostgreSQL
- types have been refactored and swapped out to separate classes.
+ - The classic interface got two new methods get_as_list() and get_as_dict()
+ returning a database table as a Python list or dict. The amount of data
+ returned can be controlled with various parameters.
+ - A method upsert() has been added to the DB wrapper class that utilizes
+ the "upsert" feature that is new in PostgreSQL 9.5. The new method nicely
+ complements the existing get/insert/update/delete() methods.
+ - When using insert/update/upsert(), you can now pass PostgreSQL arrays as
+ lists and PostgreSQL records as tuples in the classic module.
+ - Conversely, when the query method returns a PostgreSQL array, it is passed
+ to Python as a list. PostgreSQL records are converted to named tuples as
+ well, but only if you use one of the get/insert/update/delete() methods.
+ PyGreSQL uses a new fast built-in parser to achieve this.
+ - The pkey() method of the classic interface now returns tuples instead
+ of frozenset. The order of the tuples is like in the primary key index.
+ - Like the DB-API 2 module, the classic module now also returns bytea
+ columns fetched from the database as byte strings, so you don't need to
+ call unescape_bytea() any more. This has been made configurable though,
+ and you can restore the old behavior by calling set_bytea_escaped(True).
+ - A method set_jsondecode() has been added for changing or removing the
+ function that automatically decodes JSON data coming from the database.
+ By default, decoding JSON is now enabled and uses the decoder function
+ in the standard library with its default parameters.
+ - The table name that is affixed to the name of the OID column returned
+ by the get() method of the classic interface will not automatically
+ be fully qualified any more. This reduces overhead from the interface,
+ but it means you must always write the table name in the same way when
+ you call the methods using it and you are using tables with OIDs.
+ Also, OIDs are now only used when access via primary key is not possible.
+ Note that OIDs are considered deprecated anyway, and they are not created
+ by default any more in PostgreSQL 8.1 and later.
+ - The internal caching and automatic quoting of class names in the classic
+ interface has been simplified and improved, it should now perform better
+ and use less memory. Also, overhead for quoting values in the DB wrapper
+ methods has been reduced and security has been improved by passing the
+ values to libpq separately as parameters instead of inline.
+ - It is now possible to use regular type names instead of the simpler
+ type names that are used by default in PyGreSQL, without breaking any
+ of the mechanisms for quoting and typecasting, which rely on the type
+ information. This is achieved while maintaining simplicity and backward
+ compatibility by augmenting the type name string objects with all the
+ necessary information under the cover. To switch regular type names on
+ or off (this is the default), call the DB wrapper method use_regtypes().
+ - A new method query_formatted() has been added to the DB wrapper class
+ that allows using the format specifications from Python. A flag "inline"
+ can be set to specify whether parameters should be sent to the database
+ separately or formatted into the SQL.
+ - The methods for adapting and typecasting values pertaining to PostgreSQL
+ types have been refactored and swapped out to separate classes.
- Changes concerning both modules:
- - The modules now provide get_typecast() and set_typecast() methods
- allowing to control the typecasting on the global level. The connection
- objects have got type caches with the same methods which give control
- over the typecasting on the level of the current connection.
- See the documentation on details about the type cache and the typecast
- mechanisms provided by PyGreSQL.
- - PyGreSQL now supports the JSON and JSONB data types, converting such
- columns automatically to and from Python objects. If you want to insert
- Python objects as JSON data using DB-API 2, you should wrap them in the
- new Json() type constructor as a hint to PyGreSQL.
- - New type helpers Literal(), Json() and Bytea() have been added.
- - Fast parsers for the input and output syntax for PostgreSQL arrays and
- composite types have been added to the C module. Note that you can also
- use multi-dimensional arrays with PyGreSQL.
- - The tty parameter and attribute of database connections has been
- removed since it is not supported any more since PostgreSQL 7.4.
+ - The modules now provide get_typecast() and set_typecast() methods
+ allowing to control the typecasting on the global level. The connection
+ objects have got type caches with the same methods which give control
+ over the typecasting on the level of the current connection.
+ See the documentation on details about the type cache and the typecast
+ mechanisms provided by PyGreSQL.
+ - PyGreSQL now supports the JSON and JSONB data types, converting such
+ columns automatically to and from Python objects. If you want to insert
+ Python objects as JSON data using DB-API 2, you should wrap them in the
+ new Json() type constructor as a hint to PyGreSQL.
+ - New type helpers Literal(), Json() and Bytea() have been added.
+ - Fast parsers for the input and output syntax for PostgreSQL arrays and
+ composite types have been added to the C module. Note that you can also
+ use multi-dimensional arrays with PyGreSQL.
+ - The tty parameter and attribute of database connections has been
+ removed since it is not supported any more since PostgreSQL 7.4.
Version 4.2 (2016-01-21)
------------------------
Modified: trunk/docs/contents/pg/module.rst
==============================================================================
--- trunk/docs/contents/pg/module.rst Sun Jan 31 14:51:16 2016 (r801)
+++ trunk/docs/contents/pg/module.rst Sun Jan 31 16:18:28 2016 (r802)
@@ -404,6 +404,39 @@
.. versionadded:: 4.2
+get/set_bytea_escaped -- whether bytea values are returned escaped
+------------------------------------------------------------------
+
+.. function:: get_bytea_escaped()
+
+ Check whether bytea values are returned as escaped strings
+
+ :returns: whether or not bytea objects will be returned escaped
+ :rtype: bool
+
+This function checks whether PyGreSQL returns PostgreSQL ``bytea`` values in
+escaped form or in unescaped form as byte strings. By default, bytea values
+will be returned unescaped as byte strings, but you can change this with the
+``set_bytea_escaped()`` function.
+
+.. versionadded:: 5.0
+
+.. function:: set_bytea_escaped(on)
+
+ Set whether bytea values are returned as escaped strings
+
+ :param on: whether or not bytea objects shall be returned escaped
+
+This function can be used to specify whether PyGreSQL shall return
+PostgreSQL ``bytea`` values in escaped form or in unescaped form as byte
+strings. By default, bytea values will be returned unescaped as byte
+strings, but you can change this by calling ``set_bytea_escaped(True)``.
+
+.. versionadded:: 5.0
+
+.. versionchanged:: 5.0
+ Bytea values were returned in escaped form in earlier versions.
+
get/set_namedresult -- conversion to named tuples
-------------------------------------------------
Modified: trunk/pgmodule.c
==============================================================================
--- trunk/pgmodule.c Sun Jan 31 14:51:16 2016 (r801)
+++ trunk/pgmodule.c Sun Jan 31 16:18:28 2016 (r802)
@@ -97,6 +97,7 @@
*jsondecode = NULL; /* function for decoding json strings */
static char decimal_point = '.'; /* decimal point used in money values */
static int use_bool = 0; /* whether or not bool objects shall be returned */
+static int bytea_escaped = 0; /* whether bytea shall be returned escaped */
static int pg_encoding_utf8 = 0;
static int pg_encoding_latin1 = 0;
@@ -283,7 +284,7 @@
break;
case BYTEAOID:
- t = PYGRES_BYTEA;
+ t = bytea_escaped ? PYGRES_TEXT : PYGRES_BYTEA;
break;
case JSONOID:
@@ -339,7 +340,7 @@
break;
case BYTEAARRAYOID:
- t = PYGRES_BYTEA | PYGRES_ARRAY;
+ t = (bytea_escaped ? PYGRES_TEXT : PYGRES_BYTEA) | PYGRES_ARRAY;
break;
case JSONARRAYOID:
@@ -393,6 +394,7 @@
char *tmp_str;
size_t str_len;
+ /* this function should not be called when bytea_escaped is set */
tmp_str = (char *)PQunescapeBytea((unsigned char*)s, &str_len);
obj = PyBytes_FromStringAndSize(tmp_str, str_len);
if (tmp_str)
@@ -412,6 +414,7 @@
switch (type) /* this must be the PyGreSQL internal type */
{
case PYGRES_BYTEA:
+ /* this type should not be passed when bytea_escaped is set */
/* we need to add a null byte */
tmp_str = (char *) PyMem_Malloc(size + 1);
if (!tmp_str) return PyErr_NoMemory();
@@ -5131,6 +5134,50 @@
return ret;
}
+/* check whether bytea values are returned escaped */
+static char pgGetByteaEscaped__doc__[] =
+"get_bytea_escaped() -- check whether bytea will be returned escaped";
+
+static PyObject *
+pgGetByteaEscaped(PyObject *self, PyObject * args)
+{
+ PyObject *ret = NULL;
+
+ if (PyArg_ParseTuple(args, ""))
+ {
+ ret = bytea_escaped ? Py_True : Py_False;
+ Py_INCREF(ret);
+ }
+ else
+ PyErr_SetString(PyExc_TypeError,
+ "Function get_bytea_escaped() takes no arguments");
+
+ return ret;
+}
+
+/* set whether bytea values are returned escaped */
+static char pgSetByteaEscaped__doc__[] =
+"set_bytea_escaped(on) -- set whether bytea will be returned escaped";
+
+static PyObject *
+pgSetByteaEscaped(PyObject *self, PyObject * args)
+{
+ PyObject *ret = NULL;
+ int i;
+
+ /* gets arguments */
+ if (PyArg_ParseTuple(args, "i", &i))
+ {
+ bytea_escaped = i ? 1 : 0;
+ Py_INCREF(Py_None); ret = Py_None;
+ }
+ else
+ PyErr_SetString(PyExc_TypeError,
+ "Function set_bytea_escaped() expects a boolean value as argument");
+
+ return ret;
+}
+
/* get named result factory */
static char pgGetNamedresult__doc__[] =
"get_namedresult() -- get the function used for getting named results";
@@ -5675,6 +5722,10 @@
pgSetDecimal__doc__},
{"get_bool", (PyCFunction) pgGetBool, METH_VARARGS, pgGetBool__doc__},
{"set_bool", (PyCFunction) pgSetBool, METH_VARARGS, pgSetBool__doc__},
+ {"get_bytea_escaped", (PyCFunction) pgGetByteaEscaped, METH_VARARGS,
+ pgGetByteaEscaped__doc__},
+ {"set_bytea_escaped", (PyCFunction) pgSetByteaEscaped, METH_VARARGS,
+ pgSetByteaEscaped__doc__},
{"get_namedresult", (PyCFunction) pgGetNamedresult, METH_VARARGS,
pgGetNamedresult__doc__},
{"set_namedresult", (PyCFunction) pgSetNamedresult, METH_VARARGS,
Modified: trunk/tests/test_classic_dbwrapper.py
==============================================================================
--- trunk/tests/test_classic_dbwrapper.py Sun Jan 31 14:51:16 2016 (r801)
+++ trunk/tests/test_classic_dbwrapper.py Sun Jan 31 16:18:28 2016 (r802)
@@ -2837,11 +2837,15 @@
self.assertEqual(len(r), 2)
self.assertEqual(r[0], 3)
r = r[1]
+ if pg.get_bytea_escaped():
+ self.assertNotEqual(r, s)
+ r = pg.unescape_bytea(r)
self.assertIsInstance(r, bytes)
self.assertEqual(r, s)
def testInsertUpdateGetBytea(self):
query = self.db.query
+ unescape = pg.unescape_bytea if pg.get_bytea_escaped() else None
self.createTable('bytea_test', 'n smallint primary key, data bytea')
# insert null value
r = self.db.insert('bytea_test', n=0, data=None)
@@ -2857,6 +2861,9 @@
self.assertEqual(r['n'], 0)
self.assertIn('data', r)
r = r['data']
+ if unescape:
+ self.assertNotEqual(r, s)
+ r = unescape(r)
self.assertIsInstance(r, bytes)
self.assertEqual(r, s)
r = self.db.update('bytea_test', n=0, data=None)
@@ -2869,6 +2876,9 @@
self.assertEqual(r['n'], 5)
self.assertIn('data', r)
r = r['data']
+ if unescape:
+ self.assertNotEqual(r, s)
+ r = unescape(r)
self.assertIsInstance(r, bytes)
self.assertEqual(r, s)
# update as bytes
@@ -2879,6 +2889,9 @@
self.assertEqual(r['n'], 5)
self.assertIn('data', r)
r = r['data']
+ if unescape:
+ self.assertNotEqual(r, s)
+ r = unescape(r)
self.assertIsInstance(r, bytes)
self.assertEqual(r, s)
r = query('select * from bytea_test where n=5').getresult()
@@ -2887,6 +2900,9 @@
self.assertEqual(len(r), 2)
self.assertEqual(r[0], 5)
r = r[1]
+ if unescape:
+ self.assertNotEqual(r, s)
+ r = unescape(r)
self.assertIsInstance(r, bytes)
self.assertEqual(r, s)
r = self.db.get('bytea_test', dict(n=5))
@@ -2895,6 +2911,9 @@
self.assertEqual(r['n'], 5)
self.assertIn('data', r)
r = r['data']
+ if unescape:
+ self.assertNotEqual(r, s)
+ r = unescape(r)
self.assertIsInstance(r, bytes)
self.assertEqual(r, s)
@@ -2912,6 +2931,9 @@
self.assertIn('n', r)
self.assertEqual(r['n'], 7)
self.assertIn('data', r)
+ if pg.get_bytea_escaped():
+ self.assertNotEqual(r['data'], s)
+ r['data'] = pg.unescape_bytea(r['data'])
self.assertIsInstance(r['data'], bytes)
self.assertEqual(r['data'], s)
r['data'] = None
@@ -2920,7 +2942,7 @@
self.assertIn('n', r)
self.assertEqual(r['n'], 7)
self.assertIn('data', r)
- self.assertIsNone(r['data'], bytes)
+ self.assertIsNone(r['data'])
def testInsertGetJson(self):
try:
@@ -3161,6 +3183,7 @@
self.assertIsNone(r['data'][2])
def testArrayOfBytea(self):
+ unescape = pg.unescape_bytea if pg.get_bytea_escaped() else None
self.createTable('arraytest', 'data bytea[]', oids=True)
r = self.db.get_attnames('arraytest')
self.assertEqual(r['data'], 'bytea[]')
@@ -3168,11 +3191,17 @@
b"It's all \\ kinds \x00 of\r nasty \xff stuff!\n"]
r = dict(data=data)
self.db.insert('arraytest', r)
+ if unescape:
+ self.assertNotEqual(r['data'], data)
+ r['data'] = [unescape(v) if v else v for v in r['data']]
self.assertEqual(r['data'], data)
self.assertIsInstance(r['data'][1], bytes)
self.assertIsNone(r['data'][2])
r['data'] = None
self.db.get('arraytest', r)
+ if unescape:
+ self.assertNotEqual(r['data'], data)
+ r['data'] = [unescape(v) if v else v for v in r['data']]
self.assertEqual(r['data'], data)
self.assertIsInstance(r['data'][1], bytes)
self.assertIsNone(r['data'][2])
@@ -3606,6 +3635,8 @@
cls.set_option('decimal', float)
not_bool = not pg.get_bool()
cls.set_option('bool', not_bool)
+ not_bytea_escaped = not pg.get_bytea_escaped()
+ cls.set_option('bytea_escaped', not_bytea_escaped)
cls.set_option('namedresult', None)
cls.set_option('jsondecode', None)
cls.regtypes = not DB().use_regtypes()
@@ -3617,6 +3648,7 @@
cls.reset_option('jsondecode')
cls.reset_option('namedresult')
cls.reset_option('bool')
+ cls.reset_option('bytea_escaped')
cls.reset_option('decimal')
@classmethod
Modified: trunk/tests/test_classic_functions.py
==============================================================================
--- trunk/tests/test_classic_functions.py Sun Jan 31 14:51:16 2016 (r801)
+++ trunk/tests/test_classic_functions.py Sun Jan 31 16:18:28 2016 (r802)
@@ -732,6 +732,29 @@
self.assertIsInstance(r, bool)
self.assertIs(r, use_bool)
+ def testGetByteaEscaped(self):
+ r = pg.get_bytea_escaped()
+ self.assertIsInstance(r, bool)
+ self.assertIs(r, False)
+
+ def testSetByteaEscaped(self):
+ bytea_escaped = pg.get_bytea_escaped()
+ try:
+ pg.set_bytea_escaped(True)
+ r = pg.get_bytea_escaped()
+ pg.set_bytea_escaped(bytea_escaped)
+ self.assertIsInstance(r, bool)
+ self.assertIs(r, True)
+ pg.set_bytea_escaped(False)
+ r = pg.get_bytea_escaped()
+ self.assertIsInstance(r, bool)
+ self.assertIs(r, False)
+ finally:
+ pg.set_bytea_escaped(bytea_escaped)
+ r = pg.get_bytea_escaped()
+ self.assertIsInstance(r, bool)
+ self.assertIs(r, bytea_escaped)
+
def testGetNamedresult(self):
r = pg.get_namedresult()
self.assertTrue(callable(r))
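
Since the flag is a single module-wide setting, code that changes it temporarily may want to restore the previous value afterwards, as the new test does with try/finally. The following is a minimal sketch of that pattern as a context manager. temporary_setting() is a hypothetical helper, and the _state dictionary with its getter and setter only stands in for pg.get_bytea_escaped() and pg.set_bytea_escaped() so that the example runs without PyGreSQL or a database.

```python
from contextlib import contextmanager

# Stand-in for the module-level flag kept by the C module
# (an assumption of this sketch, not PyGreSQL code).
_state = {'bytea_escaped': False}


def get_bytea_escaped():
    return _state['bytea_escaped']


def set_bytea_escaped(on):
    _state['bytea_escaped'] = bool(on)


@contextmanager
def temporary_setting(getter, setter, value):
    """Set a global option for the duration of a with-block.

    The previous value is restored even if the block raises.
    """
    saved = getter()
    setter(value)
    try:
        yield
    finally:
        setter(saved)


observed = []
with temporary_setting(get_bytea_escaped, set_bytea_escaped, True):
    # Inside the block the old (escaped) behavior is active.
    observed.append(get_bytea_escaped())
# After the block the original value is back in place.
observed.append(get_bytea_escaped())
```

With the real module, the call would presumably read temporary_setting(pg.get_bytea_escaped, pg.set_bytea_escaped, True). Note that the setting is not per connection, so other code running in the same process sees the temporary value as well.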