Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Peter Geoghegan
On 23 February 2011 04:36, Greg Stark gsst...@mit.edu wrote:
 This is only true for server encodings. In a client library I think
 you lose on this and do have to deal with it. I'm not sure what client
 encodings we do support that aren't ascii-supersets though, it's
 possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.

 I'm a bit surprised libpqxx isn't using binary mode internally though.
 This would at least avoid the problems with encoding. However I'm not
 sure things like the array binary format are really stable and
 portable enough to really use from a client library. Some datatypes
 might be dependent on the server ABI (floats -- I'm looking at you) so
 that might make it difficult or impossible.

Yes, that question is addressed here:
http://pqxx.org/development/libpqxx/wiki/BinaryTransfers .

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan



On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

On 23 February 2011 04:36, Greg Starkgsst...@mit.edu  wrote:

This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.



They are not. It's precisely because they are not that they are not 
allowed as server encodings.



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Heikki Linnakangas

On 23.02.2011 17:16, Andrew Dunstan wrote:

On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

On 23 February 2011 04:36, Greg Starkgsst...@mit.edu wrote:

This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.


They are not. It's precisely because they are not that they are not
allowed as server encodings.


To be precise, they are all ASCII supersets in the sense that a valid 
7-bit ASCII string is valid and means the same thing in all of the 
client-only encodings as well. The difference between supported 
server-encodings and those that are only supported as client_encoding is 
whether *all* bytes in a multi-byte character have the high bit set. All 
server-encodings have that property, and we rely on it in the backend. 
In the supported client-only encodings, the *first* byte of a multi-byte 
character is guaranteed to have the high bit set, but the subsequent 
bytes are not.


Even that more loose property isn't true for UTF-16, which is why we 
don't support it even as a client-only encoding.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Greg Stark
On Wed, Feb 23, 2011 at 3:16 PM, Andrew Dunstan and...@dunslane.net wrote:
 On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

 On 23 February 2011 04:36, Greg Starkgsst...@mit.edu  wrote:

 This is only true for server encodings. In a client library I think
 you lose on this and do have to deal with it. I'm not sure what client
 encodings we do support that aren't ascii-supersets though, it's
 possible none of them generate quote characters this way.

 I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
 supersets. The absence of by far the most popular non-ASCII superset
 encoding, UTF-16, as a client encoding indicated that to me. It isn't
 byte oriented, and Postgres is.


 They are not. It's precisely because they are not that they are not allowed
 as server encodings.


Well just as an example, BIG5 is a 16-bit encoding where the first
byte always has the high bit set. The second byte can't be a ' or 
because those aren't in the allowable range for the second byte. So it
might be safe anyways. However \ is in the allowable range so I'm not
sure.

In the case of BIG5 ascii characters are included inline so any byte
with the high bit *not* set that isn't the second byte of a two-byte
sequence is assumed to be ascii. So an ascii parser would work fine
modulo the problem above with backslashes.

But this is just a special case. Wikipedia implies it's also true for
shift-JIS but there's no guarantee it would work for other client
encodings.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 23.02.2011 17:16, Andrew Dunstan wrote:
 On 02/23/2011 10:09 AM, Peter Geoghegan wrote:
 I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
 supersets.

 They are not. It's precisely because they are not that they are not
 allowed as server encodings.

 In the supported client-only encodings, the *first* byte of a multi-byte 
 character is guaranteed to have the high bit set, but the subsequent 
 bytes are not.

And to be even more specific: the problem with the client-only encodings
is that some of them allow the byte values corresponding to \ and
other escaping-critical characters to appear as non-first bytes in a
multibyte character.  This is nasty because you have to be aware of the
encoding to do escaping correctly and not break the data.  And as soon
as the server and client don't agree on what the encoding is, you have
the potential for SQL-injection security holes, not just confused data.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Merlin Moncure
On Tue, Feb 22, 2011 at 10:16 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 I'm investigating the possibility of developing a utility function for
 our C++ client library, libpqxx, that produces array literals that can
 be used in prepared statements. This problem appears to be a bit of a
 tar pit, so I'm hoping that someone can help me out. My goal is to
 produce a template function that accepts arbitrarily nested standard
 library containers, that contain at the most nested level
 constants/literals of some type that can be fed into a stream, such as
 an int or a std::string.

 I'm aware that I cannot assume that types are delimited by a single
 quote, even for built-in types. I thought that I would put the onus on
 the client to specify the correct delimiter, by checking pg_type
 themselves if necessary, but default to ',' . Is this a reasonable
 approach?

 Escaping/quoting individual elements seems tricky. I have produced a
 generic and superficially well behaved implementation by using double
 quotes for constants. However, I have now opened the door to malicious
 parties injecting multiple array elements where only one is allowed,
 or causing malformed array literal errors by simply including a double
 quote of their own. It's not clear where the responsibility should
 rest for escaping constants/ensuring that constants don't contain
 double quotes. Can someone suggest a better approach? I can't very
 well use single quotes, because they are escaped/doubled up when we
 pass the array literal to something similar to PQexecPrepared(), and
 they shouldn't be - strings end up looking like this: 'has errant
 single quotes on either side'.

You can send nested arrays safely.  You just have to be very formal
about escaping *everything* both as you get it and as it goes into the
container.  This is what postgres does on the backend as it sends
arrays out the door in text.  It might be instructive to see what the
server does in terms of escaping.  Note that the way this works it's
not impossible to see 128+ consecutive backslashes when dealing with
arrays of composites.

 Since Postgres only supports encodings that are ASCII supersets, I
 don't believe that I have to consider encoding - only my clients do.

 Can someone please point me in the direction of an established client
 library/driver where all corner cases are covered, or at least enough
 of them to produce a net gain in usefulness? There may well be
 additional subtleties that have not occurred to me.

yes: libpqtypes.  it manages everything in binary.  i've been thinking
for a while that libpqtypes could be wrapped with variadic templates
or other c++ trickery.  Because libpqtypes does everything in binary,
it completely sidesteps all the escaping nastiness.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan



On 02/23/2011 10:22 AM, Heikki Linnakangas wrote:

On 23.02.2011 17:16, Andrew Dunstan wrote:

On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

On 23 February 2011 04:36, Greg Starkgsst...@mit.edu wrote:

This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.


They are not. It's precisely because they are not that they are not
allowed as server encodings.


To be precise, they are all ASCII supersets in the sense that a valid 
7-bit ASCII string is valid and means the same thing in all of the 
client-only encodings as well. The difference between supported 
server-encodings and those that are only supported as client_encoding 
is whether *all* bytes in a multi-byte character have the high bit 
set. All server-encodings have that property, and we rely on it in the 
backend. In the supported client-only encodings, the *first* byte of a 
multi-byte character is guaranteed to have the high bit set, but the 
subsequent bytes are not.


Yes, that's a better explanation.



Even that more loose property isn't true for UTF-16, which is why we 
don't support it even as a client-only encoding.


The fact that UTF-16 uses nul bytes would make it particularly hard to 
handle.


There might be value in having a UTF-16 aware version of libpq that 
would translate strings into UTF-8 on the way to the server and to 
UTF-16 on the way back to the client.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Kenneth Marshall
On Wed, Feb 23, 2011 at 09:34:06AM -0600, Merlin Moncure wrote:
 On Tue, Feb 22, 2011 at 10:16 PM, Peter Geoghegan
 peter.geoghega...@gmail.com wrote:
  I'm investigating the possibility of developing a utility function for
  our C++ client library, libpqxx, that produces array literals that can
  be used in prepared statements. This problem appears to be a bit of a
  tar pit, so I'm hoping that someone can help me out. My goal is to
  produce a template function that accepts arbitrarily nested standard
  library containers, that contain at the most nested level
  constants/literals of some type that can be fed into a stream, such as
  an int or a std::string.
 
  I'm aware that I cannot assume that types are delimited by a single
  quote, even for built-in types. I thought that I would put the onus on
  the client to specify the correct delimiter, by checking pg_type
  themselves if necessary, but default to ',' . Is this a reasonable
  approach?
 
  Escaping/quoting individual elements seems tricky. I have produced a
  generic and superficially well behaved implementation by using double
  quotes for constants. However, I have now opened the door to malicious
  parties injecting multiple array elements where only one is allowed,
  or causing malformed array literal errors by simply including a double
  quote of their own. It's not clear where the responsibility should
  rest for escaping constants/ensuring that constants don't contain
  double quotes. Can someone suggest a better approach? I can't very
  well use single quotes, because they are escaped/doubled up when we
  pass the array literal to something similar to PQexecPrepared(), and
  they shouldn't be - strings end up looking like this: 'has errant
  single quotes on either side'.
 
 You can send nested arrays safely.  You just have to be very formal
 about escaping *everything* both as you get it and as it goes into the
 container.  This is what postgres does on the backend as it sends
 arrays out the door in text.  It might be instructive to see what the
 server does in terms of escaping.  Note that the way this works it's
 not impossible to see 128+ consecutive backslashes when dealing with
 arrays of composites.
 
  Since Postgres only supports encodings that are ASCII supersets, I
  don't believe that I have to consider encoding - only my clients do.
 
  Can someone please point me in the direction of an established client
  library/driver where all corner cases are covered, or at least enough
  of them to produce a net gain in usefulness? There may well be
  additional subtleties that have not occurred to me.
 
 yes: libpqtypes.  it manages everything in binary.  i've been thinking
 for a while that libpqtypes could be wrapped with variadic templates
 or other c++ trickery.  Because libpqtypes does everything in binary,
 it completely sidesteps all the escaping nastiness.
 
 merlin
 

Avoiding the escaping by using binary parameter transmission is
the best method. Shameless plug: libpqtypes is great!
I hope that it can be eventually included in the core distribution.
It is not uncommon to get It's an add-on package??? and avoidance
of pieces outside of the standard dist regardless of its value.

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan



On 02/23/2011 10:40 AM, Kenneth Marshall wrote:



Can someone please point me in the direction of an established client
library/driver where all corner cases are covered, or at least enough
of them to produce a net gain in usefulness? There may well be
additional subtleties that have not occurred to me.

yes: libpqtypes.  it manages everything in binary.  i've been thinking
for a while that libpqtypes could be wrapped with variadic templates
or other c++ trickery.  Because libpqtypes does everything in binary,
it completely sidesteps all the escaping nastiness.


Avoiding the escaping by using binary parameter transmission is
the best method. Shameless plug: libpqtypes is great!
I hope that it can be eventually included in the core distribution.
It is not uncommon to get It's an add-on package??? and avoidance
of pieces outside of the standard dist regardless of its value.





Binary mode had serious limitations, such as portability.

We do need some support in libpq for constructing and deconstructing 
arrays (and probably for composites too, although that will be harder, I 
suspect).


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Chernow



Binary mode had serious limitations, such as portability.



What are the other limitations?

As far as portability is concerned, we are using it on many different 
operating systems and architectures without issue.  Even our most recent 
bump to 9.0.1 and 9.0.3 was flawless in regard to libpq/libpqtypes.



We do need some support in libpq for constructing and deconstructing
arrays (and probably for composites too, although that will be harder, I
suspect).



[sigh...]

--
Andrew Chernow
eSilo, LLC
global backup
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Peter Geoghegan
On 23 February 2011 15:34, Merlin Moncure mmonc...@gmail.com wrote:
 You can send nested arrays safely.  You just have to be very formal
 about escaping *everything* both as you get it and as it goes into the
 container.  This is what postgres does on the backend as it sends
 arrays out the door in text.  It might be instructive to see what the
 server does in terms of escaping.  Note that the way this works it's
 not impossible to see 128+ consecutive backslashes when dealing with
 arrays of composites.

Sounds tedious.

 yes: libpqtypes.  it manages everything in binary.  i've been thinking
 for a while that libpqtypes could be wrapped with variadic templates
 or other c++ trickery.  Because libpqtypes does everything in binary,
 it completely sidesteps all the escaping nastiness.

The fact that libpqtypes does everything in binary mode is
interesting, but doesn't really help me.

Variadic template support is still quite patchy, and I don't think
that it is of particular use here. My proof-of-concept implementation
uses recursive template instantiation and type traits, and just uses
C++98 features. I've attached it for your information. I might be able
to use partial template specialisation to support regular arrays too.
That hasn't been a priority, because C++ generally discourages their
use, and because it's trickier. Arrays don't know their own size,
and I want to provide a uniform, simple interface. On the other hand,
I've seen interesting things done with template specialisation on
static integral values, such as the size of arrays on the stack, so
perhaps it's possible to support arrays while having a uniform
interface.

To be clear: I don't want to take responsibility for correctly
escaping the array literal. The user has a responsibility to use a
prepared statement/explicit escaping to do that, just as they do with
a regular text value, for example. There is no additional threat of a
traditional SQL injection attack, because we cannot break out of the
array literal itself. However, within the array literal, it is
currently possible to break out of a constant/value literal using a
double quote, to perhaps inject additional values (more than
intended), or to cause malformed array literal errors. Sure, I could
write my own function to escape the constant which is wary of double
quotes, but that would have many of the same challenges as writing a
general purpose drop-in replacement for PQescapeStringConn(). It might
be just as misguided.

-- 
Regards,
Peter Geoghegan
#include sstream
#include iostream
#include string
#include vector
#include deque
#include list
#include set

#include pqxx/pqxx

using namespace std;
using namespace pqxx;

struct true_type
{
};

struct false_type
{
};

templatetypename T
struct is_container:public false_type
{
};

templatetypename T, typename U
struct is_containervectorT, U :public true_type
{
};

templatetypename T, typename U
struct is_containerdequeT, U :public true_type
{
};

templatetypename T, typename U
struct is_containerlistT, U :public true_type
{
};

templatetypename T, typename U, typename V
struct is_containersetT, U, V :public true_type
{
};

templatetypename T
void do_pg_empty_element(const T, stringstream, false_type)
{
	// do nothing; there is no element/constant
}

templatetypename container
void do_pg_empty_element(const container elm, stringstream append, true_type)
{
	typedef typename container::value_type contained_type;
	// empty inner array
	append  {;
	do_pg_empty_element(contained_type(), append, is_containercontained_type());
	append  };
}

// terminating condition: individual elements
templatetypename T
void do_pg_array_element(const T elm, stringstream append, false_type)
{
	append  \  elm  \;
}

// append outer part of nested array
templatetypename container
void do_pg_array_element(const container cnt, stringstream append, true_type)
{
	typedef typename container::const_iterator it;
	typedef typename container::value_type contained_type;

	
	it last = cnt.end();
	if(!cnt.empty())
	{
		// there is no neater way of determining if 
		// an element is the last that works across
		// std lib containers
		--last;
	}
	else
	{
		do_pg_empty_element(cnt, append, true_type());
		return;
	}

	append  {;

	for(it i = cnt.begin(); i != cnt.end(); ++i)
	{	
		do_pg_array_element(*i, append, is_containercontained_type());
		if(i != last)
			append  , ;
	}
	append  };
}

templatetypename container
string to_pg_array_constr(const container cnt)
{
	stringstream result;
	do_pg_array_element(cnt, result, true_type());
	return result.str();
}

int main()
{
	vectorstring a;
	listvectorint  b;


	for(int i = 0; i  10; ++i)
	{
		a.push_back(Peter's vector);	
		vectorint sss;
		sss.push_back(0);
		sss.push_back(1);
		sss.push_back(2);
		b.push_front(sss);			
	}
	cout  a:   to_pg_array_constr(a)  endl  endl;
	cout  b:   to_pg_array_constr(b)  endl  endl;

	connection conn(dbname=postgres);

	work test(conn, test);

	conn.prepare(unnest, SELECT 

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Chernow

On 2/23/2011 3:06 PM, Peter Geoghegan wrote:

On 23 February 2011 15:34, Merlin Moncuremmonc...@gmail.com  wrote:

You can send nested arrays safely.  You just have to be very formal
about escaping *everything* both as you get it and as it goes into the
container.  This is what postgres does on the backend as it sends
arrays out the door in text.  It might be instructive to see what the
server does in terms of escaping.  Note that the way this works it's
not impossible to see 128+ consecutive backslashes when dealing with
arrays of composites.


Sounds tedious.



It is tedious, which is one reason why libpqtypes went binary.  There 
are some compelling performance reasons as well that affect both client 
and server.


libpqtypes was originally developed to serve a very particular need and 
wasn't aiming to be general purpose.  That came about along the way 
trying to solve the problem.  Personally, PQexec is dead to me as well 
as text results from a C/C++ app.  I see no advantage over libpqtypes in 
that context.


Unless I am missing your ultimate goal, you'd probably get what you want 
by wrapping libpqtypes.


--
Andrew Chernow
eSilo, LLC
global backup
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan



On 02/23/2011 02:21 PM, Andrew Chernow wrote:



Binary mode had serious limitations, such as portability.



What are the other limitations?

As far as portability is concerned, we are using it on many different 
operating systems and architectures without issue.  Even our most 
recent bump to 9.0.1 and 9.0.3 was flawless in regard to 
libpq/libpqtypes.



It's probably fine if you can control both ends. But there is no 
guarantee of portability, nor does it seem likely to me there ever will 
be, so I don't find your assertion terribly useful. The fact that it 
hasn't broken for you doesn't mean it can't or won't be.


The other downside I see is that binary protocols are often a lot harder 
to debug, but maybe that's just me.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Kenneth Marshall
On Wed, Feb 23, 2011 at 03:34:45PM -0500, Andrew Chernow wrote:
 On 2/23/2011 3:06 PM, Peter Geoghegan wrote:
 On 23 February 2011 15:34, Merlin Moncuremmonc...@gmail.com  wrote:
 You can send nested arrays safely.  You just have to be very formal
 about escaping *everything* both as you get it and as it goes into the
 container.  This is what postgres does on the backend as it sends
 arrays out the door in text.  It might be instructive to see what the
 server does in terms of escaping.  Note that the way this works it's
 not impossible to see 128+ consecutive backslashes when dealing with
 arrays of composites.

 Sounds tedious.


 It is tedious, which is one reason why libpqtypes went binary.  There are 
 some compelling performance reasons as well that affect both client and 
 server.

 libpqtypes was originally developed to serve a very particular need and 
 wasn't aiming to be general purpose.  That came about along the way trying 
 to solve the problem.  Personally, PQexec is dead to me as well as text 
 results from a C/C++ app.  I see no advantage over libpqtypes in that 
 context.

 Unless I am missing your ultimate goal, you'd probably get what you want by 
 wrapping libpqtypes.


The performance is one of the big reasons to use binary parameters.
Converting/packing/transmitting/unpacking/converting use a lot of
CPU resources on both the server and the client in addition to 
the larger communication resources needed by the text-based methods.

Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Chernow





It's probably fine if you can control both ends. But there is no
guarantee of portability, nor does it seem likely to me there ever will
be, so I don't find your assertion terribly useful. The fact that it
hasn't broken for you doesn't mean it can't or won't be.



All true.  If you change the protocol, libpqtypes needs to be adjusted. 
 I think that is a very fair statement.  It already toggles on server 
version around a few changes in the past ... like the money data type. 
So far, since 8.1, the number of changes to the binary protocol has put 
me to sleep :)



The other downside I see is that binary protocols are often a lot harder
to debug, but maybe that's just me.



Also very true.  However, libpqtypes addresses this by abstracting the 
end user from the binary transformation or preparation.  Instead, users 
are presented with a printf/scanf style interface.  PQexecf(conn, 
select %int4 + %int4, 4, 4)  is pretty far removed from the underlying 
byte swapping, parallel array setup for PQexecParams and other 
nastiness.  But yes, the maintainer of the library must deal with 
protocol changes and provide backward compatibility.


--
Andrew Chernow
eSilo, LLC
global backup
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Correctly producing array literals for prepared statements

2011-02-22 Thread Peter Geoghegan
I'm investigating the possibility of developing a utility function for
our C++ client library, libpqxx, that produces array literals that can
be used in prepared statements. This problem appears to be a bit of a
tar pit, so I'm hoping that someone can help me out. My goal is to
produce a template function that accepts arbitrarily nested standard
library containers, that contain at the most nested level
constants/literals of some type that can be fed into a stream, such as
an int or a std::string.

I'm aware that I cannot assume that types are delimited by a single
quote, even for built-in types. I thought that I would put the onus on
the client to specify the correct delimiter, by checking pg_type
themselves if necessary, but default to ',' . Is this a reasonable
approach?

Escaping/quoting individual elements seems tricky. I have produced a
generic and superficially well behaved implementation by using double
quotes for constants. However, I have now opened the door to malicious
parties injecting multiple array elements where only one is allowed,
or causing malformed array literal errors by simply including a double
quote of their own. It's not clear where the responsibility should
rest for escaping constants/ensuring that constants don't contain
double quotes. Can someone suggest a better approach? I can't very
well use single quotes, because they are escaped/doubled up when we
pass the array literal to something similar to PQexecPrepared(), and
they shouldn't be - strings end up looking like this: 'has errant
single quotes on either side'.

Since Postgres only supports encodings that are ASCII supersets, I
don't believe that I have to consider encoding - only my clients do.

Can someone please point me in the direction of an established client
library/driver where all corner cases are covered, or at least enough
of them to produce a net gain in usefulness? There may well be
additional subtleties that have not occurred to me.

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-22 Thread Greg Stark
On Wed, Feb 23, 2011 at 4:16 AM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 Since Postgres only supports encodings that are ASCII supersets, I
 don't believe that I have to consider encoding - only my clients do.


This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm a bit surprised libpqxx isn't using binary mode internally though.
This would at least avoid the problems with encoding. However I'm not
sure things like the array binary format are really stable and
portable enough to really use from a client library. Some datatypes
might be dependent on the server ABI (floats -- I'm looking at you) so
that might make it difficult or impossible.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-22 Thread Tatsuo Ishii
 This is only true for server encodings. In a client library I think
 you lose on this and do have to deal with it. I'm not sure what client
 encodings we do support that aren't ascii-supersets though, it's
 possible none of them generate quote characters this way.

We have a clear definition what encodings are for client
only(mb/pg_wchar.h):

/* followings are for client encoding only */
PG_SJIS,/* Shift JIS 
(Winindows-932) */
PG_BIG5,/* Big5 (Windows-950) */
PG_GBK, /* GBK (Windows-936) */
PG_UHC, /* UHC (Windows-949) */
PG_GB18030, /* GB18030 */
PG_JOHAB,   /* EUC for Korean JOHAB 
*/
PG_SHIFT_JIS_2004,  /* Shift-JIS-2004 */
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers