Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Marko Kreen
On Mon, Jan 23, 2012 at 11:06 PM, Robert Haas  wrote:
> On Mon, Jan 23, 2012 at 11:15 AM, Noah Misch  wrote:
>> As I said upthread, and you appeared to agree, the protocol is independent of
>> individual data type send/recv formats.  Even if we were already adding
>> protocol v4 to PostgreSQL 9.2, having array_send() change its behavior in
>> response to the active protocol version would be wrong.
>
> Sure, it's not directly related to the active protocol version, but my
> point is that we need to decide whether we need an autonegotiation
> mechanism or some kind, or not.  We can reasonably decide that:
>
> 1. It's OK to change the binary format incompatibly, and clients must
> be prepared to deal with that, possibly assisted by a
> backward-compatibility GUC.
>
> -or else-
>
> 2. It's not OK to change the binary format incompatibility, and
> therefore we need some kind of negotiation mechanism to make sure that
> we give the new and improved format only to clients that can cope with
> it.
>
> Not being responsible from the maintenance of any PostgreSQL drivers
> whatsoever, I don't have a strong feeling about which of these is the
> case, and I'd like us to hear from the people who do.  What I do think
> is that we can't look at a GUC (however named) as a poor man's
> replacement for #2.  It's not gonna work, or at least not very well.
> If the default is off, then clients have to go through a round-trip to
> turn it on, which means that every client will have to decide whether
> to pay the price of turning it on (and possibly not recoup the
> investment) or whether to leave it off (and possibly get hosed if many
> large arrays that would have met the criteria for the optimization are
> transferred).  Furthermore, if we turn it on by default, drivers and
> applications that haven't been updated will deliver corrupted results.
>  None of that sounds very good to me.  If there are enough
> dependencies on the details of the binary format that we can't afford
> to just change it, then we'd better have a cheap and reliable way for
> clients to negotiate upward - and a GUC is not going to give us that.

Trying to solve it with startup-time negotiation, or some GUC
is a dead end, in the sense that it will actively discourage any
kind of pass-through protocol processing.  If simple protocol
processor (~pgbouncer) needs to know about some GUC,
and tune it on-the-fly, it's not a payload feature, it's a protocol feature.

Instead this should be solved with extending the per-query text/bin
flag to include version info and maybe also type groups.
Some way of saying "numerics:9.0-bin", "the-rest:8.4-text".

The groups would also solve the problem with no way
of turning on binary formats on result columns safely.

The flags could be text (~http accept), or maybe integers
for more efficiency (group code: group format ver).

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Merlin Moncure
On Mon, Jan 23, 2012 at 3:06 PM, Robert Haas  wrote:
> Not being responsible from the maintenance of any PostgreSQL drivers
> whatsoever, I don't have a strong feeling about which of these is the
> case, and I'd like us to hear from the people who do.

I'm just gonna come right out and say that GUC-based solutions are not
the way -- bytea encoding change is a perfect example of that.
IMSNHO, there's only two plausible paths ahead:

1) document the binary formats and continue with 'go at your own risk'
2) full auto-negotiation for all wire formats (text and binary).

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Robert Haas
On Mon, Jan 23, 2012 at 11:15 AM, Noah Misch  wrote:
> As I said upthread, and you appeared to agree, the protocol is independent of
> individual data type send/recv formats.  Even if we were already adding
> protocol v4 to PostgreSQL 9.2, having array_send() change its behavior in
> response to the active protocol version would be wrong.

Sure, it's not directly related to the active protocol version, but my
point is that we need to decide whether we need an autonegotiation
mechanism or some kind, or not.  We can reasonably decide that:

1. It's OK to change the binary format incompatibly, and clients must
be prepared to deal with that, possibly assisted by a
backward-compatibility GUC.

-or else-

2. It's not OK to change the binary format incompatibility, and
therefore we need some kind of negotiation mechanism to make sure that
we give the new and improved format only to clients that can cope with
it.

Not being responsible from the maintenance of any PostgreSQL drivers
whatsoever, I don't have a strong feeling about which of these is the
case, and I'd like us to hear from the people who do.  What I do think
is that we can't look at a GUC (however named) as a poor man's
replacement for #2.  It's not gonna work, or at least not very well.
If the default is off, then clients have to go through a round-trip to
turn it on, which means that every client will have to decide whether
to pay the price of turning it on (and possibly not recoup the
investment) or whether to leave it off (and possibly get hosed if many
large arrays that would have met the criteria for the optimization are
transferred).  Furthermore, if we turn it on by default, drivers and
applications that haven't been updated will deliver corrupted results.
 None of that sounds very good to me.  If there are enough
dependencies on the details of the binary format that we can't afford
to just change it, then we'd better have a cheap and reliable way for
clients to negotiate upward - and a GUC is not going to give us that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Marko Kreen
On Mon, Jan 23, 2012 at 5:34 PM, Robert Haas  wrote:
> On Mon, Jan 23, 2012 at 9:36 AM, Florian Weimer  wrote:
>> * Robert Haas:
 In this particular case, I knew that the change was coming and could
 push updated Java and Perl client libraries well before the server-side
 change hit our internal repository, but I really don't want to have to
 pay attention to such details.
>>>
>>> But if we *don't* turn this on by default, then chances are very good
>>> that it will get much less use.  That doesn't seem good either.  If
>>> it's important enough to do it at all, then IMHO it's important enough
>>> for it to be turned on by default.  We have never made any guarantee
>>> that the binary format won't change from release to release.
>>
>> The problem here are libpq-style drivers which expose the binary format
>> to the application.  The Java driver doesn't do that, but the Perl
>> driver does.  (However, both transparently decode BYTEA values received
>> in text format, which led to the compatibility issue.)
>
> I can see where that could cause some headaches... but it seems to me
> that if we take that concern seriously, it brings us right back to
> square one.  If breaking the binary format causes too much pain to
> actually go do it, then we shouldn't change it until we're breaking
> everything else anyway (i.e. new protocol version, as Tom suggested).

My suggestion - please avoid per-session-protocol.  Either something
is Postgres version-dependent or it can be toggled/tracked per request.
That means any data can either be passed through, or you need
to understand formats of Postgres version X.Y.

This kind of hints at per-request "gimme-formats-from-version-x.y"
flag for ExecuteV4 packet.  Or some equivalent of it.


Anything that cannot be processed without tracking per-session
state over whole stack (poolers/client frameworks) is major pain
to maintain.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Noah Misch
On Mon, Jan 23, 2012 at 10:34:12AM -0500, Robert Haas wrote:
> On Mon, Jan 23, 2012 at 9:36 AM, Florian Weimer  wrote:
> > * Robert Haas:
> >>> In this particular case, I knew that the change was coming and could
> >>> push updated Java and Perl client libraries well before the server-side
> >>> change hit our internal repository, but I really don't want to have to
> >>> pay attention to such details.
> >>
> >> But if we *don't* turn this on by default, then chances are very good
> >> that it will get much less use. ?That doesn't seem good either. ?If
> >> it's important enough to do it at all, then IMHO it's important enough
> >> for it to be turned on by default. ?We have never made any guarantee
> >> that the binary format won't change from release to release.
> >
> > The problem here are libpq-style drivers which expose the binary format
> > to the application. ?The Java driver doesn't do that, but the Perl
> > driver does. ?(However, both transparently decode BYTEA values received
> > in text format, which led to the compatibility issue.)
> 
> I can see where that could cause some headaches... but it seems to me
> that if we take that concern seriously, it brings us right back to
> square one.  If breaking the binary format causes too much pain to
> actually go do it, then we shouldn't change it until we're breaking
> everything else anyway (i.e. new protocol version, as Tom suggested).

As I said upthread, and you appeared to agree, the protocol is independent of
individual data type send/recv formats.  Even if we were already adding
protocol v4 to PostgreSQL 9.2, having array_send() change its behavior in
response to the active protocol version would be wrong.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Robert Haas
On Mon, Jan 23, 2012 at 9:36 AM, Florian Weimer  wrote:
> * Robert Haas:
>>> In this particular case, I knew that the change was coming and could
>>> push updated Java and Perl client libraries well before the server-side
>>> change hit our internal repository, but I really don't want to have to
>>> pay attention to such details.
>>
>> But if we *don't* turn this on by default, then chances are very good
>> that it will get much less use.  That doesn't seem good either.  If
>> it's important enough to do it at all, then IMHO it's important enough
>> for it to be turned on by default.  We have never made any guarantee
>> that the binary format won't change from release to release.
>
> The problem here are libpq-style drivers which expose the binary format
> to the application.  The Java driver doesn't do that, but the Perl
> driver does.  (However, both transparently decode BYTEA values received
> in text format, which led to the compatibility issue.)

I can see where that could cause some headaches... but it seems to me
that if we take that concern seriously, it brings us right back to
square one.  If breaking the binary format causes too much pain to
actually go do it, then we shouldn't change it until we're breaking
everything else anyway (i.e. new protocol version, as Tom suggested).

Even if you have version negotiation, it doesn't help that much in
this scenario.  Drivers that know about the new protocol can
theoretically negotiate upward, but if they expose the binary format
to their users in turn then it's a can of worms: they then need to
negotiate with their client applications to see what version the
client is OK with, and then turn around and negotiate with the server
to that level and not higher.  That strikes me as a lot of work for a
lot of people given the amount of improvement we can expect out of
this one change.

I think it's also worth noting that a GUC is not really a good
mechanism for negotiating the protocol version.  GUCs can be changed
by the local administrator on a server-wide (or per-user, or
per-database, or per-user-and-database) basis, and that's not what you
want for a protocol negotiation.  Every client will have to query the
server version and then decide whether to try to change it, and handle
errors if that fails.  All of that is going to add start-up overhead
to connections, which will need to make multiple round trips to get
everything straightened out.

I think if the only way to make this change without excessive pain is
by having a good mechanism for negotiating the protocol version, then
we need to defer it to a future release.  Now is not the time to
invent entirely new mechanisms, especially around just one example.
I'm OK with breaking it and adding a GUC for backward-compatibility,
but so far the other suggestions strike me as being not convincingly
well-enough engineered to stand the test of time, and whatever we do
here is going to be with us for quite a while.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Florian Weimer
* Robert Haas:

>> In this particular case, I knew that the change was coming and could
>> push updated Java and Perl client libraries well before the server-side
>> change hit our internal repository, but I really don't want to have to
>> pay attention to such details.
>
> But if we *don't* turn this on by default, then chances are very good
> that it will get much less use.  That doesn't seem good either.  If
> it's important enough to do it at all, then IMHO it's important enough
> for it to be turned on by default.  We have never made any guarantee
> that the binary format won't change from release to release.

The problem here are libpq-style drivers which expose the binary format
to the application.  The Java driver doesn't do that, but the Perl
driver does.  (However, both transparently decode BYTEA values received
in text format, which led to the compatibility issue.)

If you've version negotiation and you don't expose the binary format,
then all clients can use the most efficient format automatically.  If I
understand things correctly, this is where the JDBC driver is heading
(that is, automatic use of binary format).

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Robert Haas
On Mon, Jan 23, 2012 at 4:03 AM, Florian Weimer  wrote:
> * Ants Aasma:
>> I had a run in with this. JDBC driver versions < 9.0 with the default
>> configuration resulted in silent data corruption. The fix was easy, but not
>> having an useful error was what really bothered me.
>
> Same for the DBD::Pg driver.
>
> In this particular case, I knew that the change was coming and could
> push updated Java and Perl client libraries well before the server-side
> change hit our internal repository, but I really don't want to have to
> pay attention to such details.

But if we *don't* turn this on by default, then chances are very good
that it will get much less use.  That doesn't seem good either.  If
it's important enough to do it at all, then IMHO it's important enough
for it to be turned on by default.  We have never made any guarantee
that the binary format won't change from release to release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-23 Thread Florian Weimer
* Ants Aasma:

> I had a run in with this. JDBC driver versions < 9.0 with the default
> configuration resulted in silent data corruption. The fix was easy, but not
> having an useful error was what really bothered me.

Same for the DBD::Pg driver.

In this particular case, I knew that the change was coming and could
push updated Java and Perl client libraries well before the server-side
change hit our internal repository, but I really don't want to have to
pay attention to such details.

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-21 Thread Ants Aasma
On 19 Jan 2012 21:00, "Robert Haas"  wrote:
> I agree.  It occurs to me that we recently changed the default *text*
> output format for bytea for reasons not dissimilar to those
> contemplated here.  Presumably, that's a much more disruptive change,
> and yet we've had minimal complaints because anyone who gets bitten
> can easily set bytea_output='escape' and the problem goes away.

I had a run in with this. JDBC driver versions < 9.0 with the default
configuration resulted in silent data corruption. The fix was easy, but not
having an useful error was what really bothered me.

--
Ants Aasma


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-19 Thread Noah Misch
On Thu, Jan 19, 2012 at 02:00:20PM -0500, Robert Haas wrote:
> On Thu, Jan 19, 2012 at 10:37 AM, Noah Misch  wrote:
> > I agree with Merlin; the frontend/backend protocol is logically distinct 
> > from
> > the binary send/recv formats of data types. ?For one key point, the latter 
> > is
> > not exclusively core-defined; third-party extensions change their send/recv
> > formats on a different schedule. ?They can add myext.binary_format_version
> > GUCs of their own to cope in a similar way.
> 
> I agree.  It occurs to me that we recently changed the default *text*
> output format for bytea for reasons not dissimilar to those
> contemplated here.  Presumably, that's a much more disruptive change,
> and yet we've had minimal complaints because anyone who gets bitten
> can easily set bytea_output='escape' and the problem goes away.  The
> same thing seems like it would work here, only the number of people
> needing to change the parameter will probably be even smaller, because
> fewer people use binary than text.
> 
> Having said that, if we're to follow the precedent set by
> bytea_format, maybe we ought to just add
> binary_array_format={huge,ittybitty} and be done with it, rather than
> inventing a minor protocol version GUC for something that isn't really
> a protocol version change at all.  We could introduce a
> differently-named general mechanism, but I guess I'm not seeing the
> point of that either.  Just because someone has a
> backward-compatibility issue with one change of this type doesn't mean
> they have a similar issue with all of them.  So I think adding a
> special-purpose GUC is more logical and more parallel to what we've
> done in the past, and it doesn't saddle us with having to be certain
> that we've designed the mechanism generally enough to handle all the
> cases that may come later.

That makes sense.  An attraction of a single binary format version was avoiding
the "Is this worth a GUC?" conversation for each change.  However, adding a GUC
should be no more notable than bumping a binary format version.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-19 Thread Robert Haas
On Thu, Jan 19, 2012 at 10:37 AM, Noah Misch  wrote:
> I agree with Merlin; the frontend/backend protocol is logically distinct from
> the binary send/recv formats of data types.  For one key point, the latter is
> not exclusively core-defined; third-party extensions change their send/recv
> formats on a different schedule.  They can add myext.binary_format_version
> GUCs of their own to cope in a similar way.

I agree.  It occurs to me that we recently changed the default *text*
output format for bytea for reasons not dissimilar to those
contemplated here.  Presumably, that's a much more disruptive change,
and yet we've had minimal complaints because anyone who gets bitten
can easily set bytea_output='escape' and the problem goes away.  The
same thing seems like it would work here, only the number of people
needing to change the parameter will probably be even smaller, because
fewer people use binary than text.

Having said that, if we're to follow the precedent set by
bytea_format, maybe we ought to just add
binary_array_format={huge,ittybitty} and be done with it, rather than
inventing a minor protocol version GUC for something that isn't really
a protocol version change at all.  We could introduce a
differently-named general mechanism, but I guess I'm not seeing the
point of that either.  Just because someone has a
backward-compatibility issue with one change of this type doesn't mean
they have a similar issue with all of them.  So I think adding a
special-purpose GUC is more logical and more parallel to what we've
done in the past, and it doesn't saddle us with having to be certain
that we've designed the mechanism generally enough to handle all the
cases that may come later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2012-01-19 Thread Noah Misch
Hi Mikko,

First, for everyone else: this patch provides a more-compact binary output
format for arrays.  When the array contains no NULL elements and has a
fixed-length element type, the new format saves four bytes per array element.
We could do more.  We could add support for arrays containing NULLs by way of
a nulls bitmap.  We could reduce the per-array overhead, currently 20 bytes
for a 1-D array, in addition to the per-element overhead.  Does anyone care
about these broader cases?  If so, speak now.

On Thu, Dec 01, 2011 at 04:42:43PM +0200, Mikko Tiihonen wrote:
> On 11/30/2011 06:52 PM, Merlin Moncure wrote:
>> On Mon, Nov 28, 2011 at 9:18 AM, Mikko Tiihonen
>>   wrote:
>>> Hi,
>>>
>>> As discussed few days ago it would be beneficial if we could change the v3
>>> backend<->client protocol without breaking backwards compatibility.
>>>
>>> Here is a working patch that exports a supported_binary_minor constant and a
>>> binary_minor session variable and a that can be used by clients to enable
>>> newer features.
>>>
>>> I also added an example usage where the array encoding for constant size
>>> elements is optimized if binary_minor version is new enough.
>>>
>>> I have coded the client side support for binary_minor for jdbc driver and
>>> tested that it worked. But lets first discuss if this is an acceptable path
>>> forward.

>> I think all references to
>> 'protocol' should be removed;  Binary wire formats != protocol: the
>> protocol could bump to v4 but the wire formats (by happenstance) could
>> all still continue to work -- therefore the version isn't minor (it's
>> not dependent on protocol version but only on itself).  Therefore,
>> don't much like the name 'supported_binary_minor'.  How about
>> binary_format_version?

> I was thinking that it would be possible use the new minor version to
> introduce also new protocol messages such as streaming of large data
> into database without knowing it's size beforehand.

I agree with Merlin; the frontend/backend protocol is logically distinct from
the binary send/recv formats of data types.  For one key point, the latter is
not exclusively core-defined; third-party extensions change their send/recv
formats on a different schedule.  They can add myext.binary_format_version
GUCs of their own to cope in a similar way.

Client interfaces that do not interpret individual result column data, such as
libpq, require no update for send/recv format changes.  In contrast, all
client interfaces would need changes for a new protocol message.  Most clients
doing so may as well then advertise their support unconditionally.  In
contrast, send/recv format is something individual _users_ of the same client
library may set differently.  It's reasonable to have an application that
still reads its data in send/recv format v0 even after upgrading to a version
of libpq that talks frontend/backend protocol v4.

I do think this is a great way to handle send/recv format changes.

> There needs to be decision on official policy about breaking backwards
> compatibility of postgresql clients. Is it:
>
> A) Every x.y postgres release is free to break any parts of the
>* protocol
>* text encoding
>* binary protocol
>as long as it is documented
>-> all client libraries should be coded so that they refuse to
>   support version x.y+1 or newer (they might have a option to
>   override this but there are no guarantees that it would work)
>Good: no random bugs when using old client library
>Bad: initial complaints from users before they understand that
> this is the best option for everyone

The ability to use old client libraries with new servers is more valuable than
the combined benefits of all currently-contemplated protocol improvements.

> D) My proposed compromise where there is one minor_version setting
>and code in server to support all different minor versions.
>The client requests the minor version so that all releases
>default to backwards compatible version.

This is workable.

>When there combinations starts to create too much maintenance
>overhead a new clean v4 version of protocol is specified.
>Good: Keeps full backwards compatibility
>Good: Allows introducing changes at any time
>Bad: Accumulates conditional code to server and clients until
> new version of protocol is released

We would retain support for protocol V3 for years following the first release
to support protocol V4, so think of the conditional code as lasting forever.


The patch works as advertised.  After "set binary_minor = 1", the output of
this command shrinks from 102 MiB to 63 MiB:

\copy (select array[0,1,2,3,4,5,6,7,8,9] from 
generate_series(1,100)) to mynums with binary

> --- a/src/backend/utils/misc/guc.c
> +++ b/src/backend/utils/misc/guc.c
> @@ -477,6 +477,7 @@ static intsegment_size;
>  static int   wal_block_size;
>  static int   wal_segment_size;
>  static bool integer_datetimes;
> +s