Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Jeff Davis
On Fri, 2011-11-04 at 13:48 +0100, Gabriele Bartolini wrote:
 This patch adds basic support for arrays in foreign keys, by allowing a 
 referencing column to be defined as an array of elements having the same 
 type as the referenced column in the referenced table.
 Every NOT NULL element in the referencing array is matched against the 
 referenced table.

I'm trying to find commonalities between this feature and my future
RANGE FOREIGN KEY feature (not past the hand-waving stage yet).

The first thing I observe is that my idea for range foreign keys is
almost the opposite of your idea for array FKs.

I was imagining a range FK to mean that the referencing side is
contained by the referenced side. This is the common definition in the
temporal world, because the valid period for the referencing row must be
within the valid period for the row it references (same for transaction
time). The referenced side must be a range, and the referencing side
must be either a range of the same type or the subtype of the range.

Other similar definitions exist by replacing "contained by" with some
other operator, though the use cases for those aren't as clear to me.

This definition works for arrays and possibly many other types
(geometry?) as well. It looks like this is orthogonal to your work,
but it does seem to have potential for confusion in the future.

Thoughts?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Broken system timekeeping breaks the stats collector

2012-06-17 Thread Simon Riggs
On 17 June 2012 08:26, Tom Lane t...@sss.pgh.pa.us wrote:

 (1) In backend_read_statsfile, make an initial attempt to read the stats
 file and then read GetCurrentTimestamp after that.  If the local clock
 reading is less than the stats file's timestamp, we know that some sort
 of clock skew or glitch has happened, so force an inquiry message to be
 sent with the local timestamp.  But then accept the stats file anyway,
 since the skew might be small and harmless.  The reason for the forced
 inquiry message is to cause (2) to happen at the collector.

Fine, but please log this as a WARNING ("system time skew detected"), so
we can actually see it has happened rather than just silently
accepting the situation.

It would be useful to document whether there are any other negative
effects from altering system time.

Perhaps we should do the same test at startup, to see if the clock has
gone backwards then, too. Perhaps we should also make note of any
major changes in time since the last startup, which might help us
detect other forms of corruption.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] libpq compression

2012-06-17 Thread Magnus Hagander
On Sat, Jun 16, 2012 at 11:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Sat, Jun 16, 2012 at 12:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It's not obvious to me that we actually *need* anything except the
 ability to recognize that a null-encrypted SSL connection probably
 shouldn't be treated as matching a hostssl line; which is not something
 that requires any fundamental rearrangements, since it only requires an
 after-the-fact check of what was selected.

 Maybe I spelled it out wrong. It does require it insofar that if we
 want to use this for compression, we must *always* enable openssl on
 the connection. So the "with these encryption methods" option boils down to
 "NULL encryption only" or "whatever other standards I have for
 encryption". We don't need the ability to change the "whatever other
 standards" per subnet, but we need to control the
 accept-NULL-encryption on a per-subnet basis.

 After sleeping on it, I wonder if we couldn't redefine the existing
 "list of acceptable ciphers" option as the list of ciphers that are
 considered to provide encrypted transport.  So you'd be allowed to
 connect with SSL using any unapproved cipher (including NULL), the
 backend just considers it as equivalent to a non-SSL connection for
 pg_hba purposes.  Then no change is needed in any configuration stuff.

That seems reasonable. In looking at our current defaults, two things hit me:

Is there a reason why we don't have a parameter on the client
mirroring ssl_ciphers?

and

Shouldn't our default SSL methods include !aNULL, meaning that by default
we exclude all ciphers that don't provide authentication (which means
they can be man-in-the-middle'd)? AFAICT, eNULL/NULL is disabled by
default unless explicitly enabled, but aNULL isn't.

I don't think it matters from a pure security perspective since we
look inside the actual cert anyway (which shouldn't work with these
methods, I think), but it seems like a wrong default.

That, or just have DEFAULT as being the default (which in current
openssl means ALL:!aNULL:!eNULL).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-17 Thread Simon Riggs
On 16 June 2012 23:09, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 Now, looking at the problem with some perspective, the solution
 is obvious: when in single-row mode, the PQgetResult() must return
 proper PGresult for that single row.  And everything else follows that.

 * PQgetRowData(): can be called instead of PQgetResult() to get raw row data
  in a buffer, for more efficient processing.  This is an optional feature
  that provides the original row-callback promise of avoiding unnecessary
  row data copies.

 * Although PQgetRowData() makes the callback API unnecessary, it is still
  fully compatible with it - the callback should not see any difference
  whether the result set is processed in single-row mode or the
  old single-PGresult mode.  Unless it wants to - it can check
  PGRES_TUPLES_OK vs. PGRES_SINGLE_TUPLE.

 I guess this raises the question of whether we ought to revert the
 row-callback patch entirely and support only this approach.  IMO
 it is (barely) not too late to do that for 9.2, if we want to.
 If we don't want to, then this is just another new feature and
 should be considered for 9.3.

 What I like about this is the greatly simpler and harder-to-misuse
 API.  The only arguable drawback is that there's still at least one
 malloc/free cycle per tuple, imposed by the creation of a PGresult
 for each one, whereas the callback approach avoids that.  But worrying
 about that could be considered to be vast overoptimization; the backend
 has certainly spent a lot more overhead than that generating the tuple.

I prefer the description of Marko's API to the one we have now.

Adopting one API in 9.2 and another in 9.3 would be fairly bad.
Perhaps we can have both?

Can we see a performance test? ("Add a row processor API to libpq for
better handling of large result sets.") The idea is we do this many,
many times, so we need to double-check that the extra overhead is not a
problem in cases where the dumping overhead is significant.
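For reference, here is a minimal sketch of the single-row consumption loop using libpq's single-row mode entry points (PQsetSingleRowMode, PGRES_SINGLE_TUPLE); the connection string is a placeholder and a running server is assumed, so this is illustrative rather than a drop-in test:

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* placeholder connection string */
    PGconn *conn = PQconnectdb("dbname=postgres");
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    PQsendQuery(conn, "SELECT generate_series(1, 1000000)");
    PQsetSingleRowMode(conn);   /* must be called right after PQsendQuery */

    long nrows = 0;
    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL)
    {
        /* each row arrives in its own PGresult, so per-row memory is
         * bounded; the final PGresult is PGRES_TUPLES_OK with no rows */
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
            nrows++;
        PQclear(res);
    }

    printf("%ld rows\n", nrows);
    PQfinish(conn);
    return 0;
}
```

This is the usage pattern whose overhead the performance-test question is about: one malloc/free cycle per row for the per-row PGresult, versus the raw-buffer access that PQgetRowData() would provide.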

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-17 Thread Marko Kreen
On Sat, Jun 16, 2012 at 7:58 PM, Marko Kreen mark...@gmail.com wrote:
 So my preference would be to simply remove the callback API
 but keep the processing and provide PQgetRowData() instead.

This is implemented in attached patch.  It also
converts dblink to use single-row API.

The patch should be applied on top of previous
single-row patch.

Both can be seen also here:

  https://github.com/markokr/postgres/commits/single-row

-- 
marko


remove-rowproc.diff.gz
Description: GNU Zip compressed data



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Misa Simic
IMO, both approaches make sense...

From the temporal point of view there is no doubt: the referencing
row should be contained by the referenced one.

On the other side, it could be useful when the master table holds
elements of a simple data type, but some sets of those elements share
common properties kept in another table. Today that is doable only by
giving the other table the same data type and repeating the same
properties for each element... It would be possible with the range
data type, though the data are not always in a range, so an array is
probably the better option...

However, I am not sure about the maintenance point of view, i.e. when
an element should be removed from that common-properties set - but
that is a different topic :)

Kind Regards,

Misa

Sent from my Windows Phone

-Original Message-
From: Jeff Davis
Sent: 17/06/2012 08:55
To: Gabriele Bartolini
Cc: PostgreSQL-development; Marco Nenciarini
Subject: Re: [HACKERS] [PATCH] Support for foreign keys with arrays



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-17 Thread Marko Kreen
On Sun, Jun 17, 2012 at 2:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I prefer the description of Marko's API to the one we have now.

 Adopting one API in 9.2 and another in 9.3 would be fairly bad.
 Perhaps we can have both?

I see no reason to keep the (public) callback API around,
except if we don't bother to remove it now.

 Can we see a performance test? ("Add a row processor API to libpq for
 better handling of large result sets.") The idea is we do this many,
 many times, so we need to double-check that the extra overhead is not a
 problem in cases where the dumping overhead is significant.

Not sure what you want to performance-test.

PQgetRowData() uses exactly the same pipeline
that the callbacks used.  It will use a few more C calls;
not sure it makes sense to benchmark them.

The recent dblink change replaced a palloc() + copy +
zero-termination dance with PQgetResult(), which
does a malloc() + copy dance internally.  This malloc
vs. palloc difference might be benchmarkable, but it seems to
go into micro-benchmarking territory, as the big win came
from avoiding buffering whole result sets.  So yeah, using
PQgetRowData() might be a tiny bit faster, but I don't
expect much difference.

But all this affects new users only.  The thing that affects
everybody is the 2-step row processing change
that was done during the rowproc patch.

I did benchmark it, and it seems there are column-size
+ column-count patterns where the new way is faster,
and some patterns where the old way is faster.  But the
difference did not rise above test noise, so I concluded
it is insignificant and the malloc+copy avoidance is worth it.

Of course, any additional benchmarking is welcome, so feel
free to pick any situation you care about.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-17 Thread Simon Riggs
On 17 June 2012 19:37, Marko Kreen mark...@gmail.com wrote:
 On Sun, Jun 17, 2012 at 2:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I prefer the description of Marko's API to the one we have now.

 Adopting one API in 9.2 and another in 9.3 would be fairly bad.
 Perhaps we can have both?

 I see no reason to keep the (public) callback API around,
 except if we don't bother to remove it now.

OK by me.

 Can we see a performance test? ("Add a row processor API to libpq for
 better handling of large result sets.") The idea is we do this many,
 many times, so we need to double-check that the extra overhead is not a
 problem in cases where the dumping overhead is significant.
...
 I did benchmark it, and it seems there are column-size
 + column-count patterns where the new way is faster,
 and some patterns where the old way is faster.  But the
 difference did not rise above test noise, so I concluded
 it is insignificant and the malloc+copy avoidance is worth it.

As long as we've checked that's fine.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Simon Riggs
On 17 June 2012 19:16, Misa Simic misa.si...@gmail.com wrote:

 IMO, both approaches make sense...

Agreed.

It's also a good reason to do as Peter suggests and come up with a
better description than just "EACH".

Do we need something like Exclusion FKs? i.e. the FK partner of
Exclusion Constraints?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Backup docs

2012-06-17 Thread Magnus Hagander
On Sun, Jun 17, 2012 at 12:13 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Magnus Hagander mag...@hagander.net writes:
 This is copied from the old documentation. It used to say "It is not
 necessary to be concerned about the amount of time elapsed between
 pg_start_backup and the start of the actual backup, nor between the
 end of the backup and pg_stop_backup."

 And the whole idea was to simplify the text at the beginning ;)

 Oh I see, not your patch to fix then. I just quick read the diff, as you
 can see.

 This is copied exactly from what it is today. I'm sure it can be
 improved, but it's not the goal of this patch. Let's not let
 perfection get in the way of improvement...

 Same.

 It does, it's under "standalone hot backups". The second to last part
 of the patch.

 Perfect then.

 Sorry for the noise, regards,

np, thanks for checking. Applied.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Kevin Grittner
Simon Riggs  wrote:
Misa Simic  wrote:
 
 IMO, both approaches make sense...

 Agreed.
 
Can someone provide a practical example of a foreign key with array
use case?  The only situations I'm able to think of right now are the
same cases where you would now use a table with primary keys of two
tables to provide a many-to-many linkage.  Does this proposed feature
handle other cases or handle this type of case better?
 
"The referencing value is contained by the referenced value" has many
obvious uses.  For example, in our courts data we have a statute
table which effectively has a statute cite and effective date range
for the primary key, and we have a charge table with a statute cite
and an offense date used to match it to a statute row.
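For concreteness, the statute/charge example maps naturally onto the 9.2 range types. A sketch with invented table and column names follows; today the containment check can only be written as a join condition, which is exactly what the proposed range FK would enforce declaratively:

```sql
CREATE TABLE statute (
    cite      text,
    effective daterange,
    PRIMARY KEY (cite, effective)
);

CREATE TABLE charge (
    charge_id    serial PRIMARY KEY,
    cite         text NOT NULL,
    offense_date date NOT NULL
);

-- The match a range FK would enforce: each charge's offense_date must
-- fall within the effective period of the statute it references.
SELECT c.charge_id, s.cite, s.effective
FROM charge c
JOIN statute s
  ON s.cite = c.cite
 AND s.effective @> c.offense_date;   -- range-contains-element
```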
 
-Kevin



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Misa Simic
2012/6/17 Simon Riggs si...@2ndquadrant.com


 Do we need something like Exclusion FKs? i.e. the FK partner of
 Exclusion Constraints?


+1
Definitely it would be something useful...  Today's workaround to achieve
that, with an additional table and an additional column in the key, is a bit
awkward...



 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Misa Simic
2012/6/17 Kevin Grittner kevin.gritt...@wicourts.gov



 Can someone provide a practical example of a foreign key with array
 use case?  The only situations I'm able to think of right now are the
 same cases where you would now use a table with primary keys of two
 tables to provide a many-to-many linkage.  Does this proposed feature
 handle other cases or handle this type of case better?



I can't imagine any other usability either... just a many-to-one linkage, or a
many-to-many link with fewer rows in the middle table... Which is better, I
think, should be measured...


Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
The fly in the ointment for strxfrm() adoption may be the need to be
consistent with this earlier behaviour:

commit 656beff59033ccc5261a615802e1a85da68e8fad
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Thu Dec 22 22:50:00 2005 +

Adjust string comparison so that only bitwise-equal strings are considered
equal: if strcoll claims two strings are equal, check it with strcmp, and
sort according to strcmp if not identical.  This fixes inconsistent
behavior under glibc's hu_HU locale, and probably under some other locales
as well.  Also, take advantage of the now-well-defined behavior to speed up
texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise
comparison and not bother with strcoll at all.

NOTE: affected databases may need to REINDEX indexes on text columns to be
sure they are self-consistent.

Here is the relevant code:

/*
 * In some locales strcoll() can claim that nonidentical strings are
 * equal.  Believing that would be bad news for a number of reasons,
 * so we follow Perl's lead and sort equal strings according to
 * strcmp().
 */
if (result == 0)
    result = strcmp(a1p, a2p);

I'm not sure I agree with this decision; why should we presume to know
better than the glibc locale what constitutes equality? What are the
"number of reasons" referred to? It seems very likely that the main
one was the then-need to guard against poor-quality qsort()
implementations that went quadratic in the face of lots of duplicates,
but we already removed a bunch of other such hacks, because of course
we now control the qsort implementation used, and have since 2006, the
year after this commit was made.

Obviously this decision was made a number of years ago now, and at
least one person went on to rely on this behaviour, so it can only be
revisited with that in mind. However, provided we are able to say
"here is a compatibility ordering operator" to those that complain
about this, and provided it is appropriately listed as a compatibility
issue in the 9.3 release notes, I think it would be worth reverting
this commit to facilitate strxfrm().

How many people:

A) are using hu_HU or some other locale where this can happen?

and

B) will care?

Now, I'm sure that there is another workaround too, so this doesn't
need to be a blocker even if it is absolutely unacceptable to revert -
but I have to wonder if that's worth it. People don't have any
business relying on a sort order that is consistent in any way other
than the one they actually asked for. A few people still do even as we
go blue in the face telling them not to, of course, but that's fairly
normal.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Broken system timekeeping breaks the stats collector

2012-06-17 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Fine, but please log this as a WARNING ("system time skew detected"), so
 we can actually see it has happened rather than just silently
 accepting the situation.

I think elog(LOG) is more appropriate, same as we have for the existing
messages for related complaints.  No one backend is going to have a
complete view of the situation, and the collector itself has to use
LOG since it has no connected client at all.  So the postmaster log
is the place to look for evidence of clock trouble.

 Perhaps we should do the same test at startup to see if the clock has
 gone backwards then also.

Uh ... backwards from what?  And what difference would it make?  We
always force an immediate write of the stats file at startup anyway.

regards, tom lane



Re: [HACKERS] libpq compression

2012-06-17 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Is there a reason why we don't have a parameter on the client
 mirroring ssl_ciphers?

Dunno, do we need one?  I am not sure what the cipher negotiation process
looks like or which side has the freedom to choose.

 That, or just have DEFAULT as being the default (which in current
 openssl means ALL:!aNULL:!eNULL).

If our default isn't the same as the underlying default, I have to
question why not.  But are you sure this ! notation will work with
all openssl versions?

regards, tom lane



Re: [HACKERS] libpq compression

2012-06-17 Thread Magnus Hagander
On Sun, Jun 17, 2012 at 11:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Is there a reason why we don't have a parameter on the client
 mirroring ssl_ciphers?

 Dunno, do we need one?  I am not sure what the cipher negotiation process
 looks like or which side has the freedom to choose.

I haven't looked into the details, but it seems reasonable that
*either* side should be able to at least define a list of ciphers it
*doesn't* want to talk with.

Do we need it - well, it makes sense for the client to be able to say
"I won't trust 56-bit encryption" before it sends over the password,
imo..


 That, or just have DEFAULT as being the default (which in current
 openssl means ALL:!aNULL:!eNULL).

 If our default isn't the same as the underlying default, I have to
 question why not.

Yeah, that's exactly what I'm questioning here.

  But are you sure this ! notation will work with
 all openssl versions?

Uh. We have the ! notation in our default *now*. What openssl also
supports is the text DEFAULT, which is currently the equivalent of
ALL:!aNULL:!eNULL. The question, which is valid of course, should be
whether DEFAULT works with all openssl versions.

It would seem reasonable that it does, but I haven't investigated.
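One way to check is to ask the local OpenSSL build directly; a sketch (the expanded cipher list differs across versions, so only the exit status and the presence of output are meaningful):

```shell
# Expand a cipher string into the concrete cipher list this OpenSSL
# build would negotiate; a nonzero exit status means the string is
# not accepted by this version.
openssl ciphers 'ALL:!aNULL:!eNULL' | tr ':' '\n' | head -5
openssl ciphers 'DEFAULT' > /dev/null && echo "DEFAULT accepted"
```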

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] libpq compression

2012-06-17 Thread Dave Page
On Sun, Jun 17, 2012 at 4:45 PM, Magnus Hagander mag...@hagander.net wrote:
 On Sun, Jun 17, 2012 at 11:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Is there a reason why we don't have a parameter on the client
 mirroring ssl_ciphers?

 Dunno, do we need one?  I am not sure what the cipher negotiation process
 looks like or which side has the freedom to choose.

 I haven't looked into the details, but it seems reasonable that
 *either* side should be able to at least define a list of ciphers it
  *doesn't* want to talk with.

 Do we need it - well, it makes sense for the client to be able to say
  "I won't trust 56-bit encryption" before it sends over the password,
 imo..

I would certainly like to see that.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 The fly in the ointment for strxfrm() adoption may be the need to be
 consistent with this earlier behaviour:

 if strcoll claims two strings are equal, check it with strcmp, and
 sort according to strcmp if not identical.

 I'm not sure I agree with this decision; why should we presume to know
 better than the glibc locale what constitutes equality?

The killer reason why it must be like that is that you can't use hash
methods on text if text equality is some unknown condition subtly
different from bitwise equality.  My recollection is that there were
some other problems as well, but I'm too lazy to search the archives
for you.

 It's seems very likely that the main
 one was the then-need to guard against poor quality qsort()
 implementations that went quadratic in the face of lots of duplicates,

No, I don't recall that that had anything to do with it.

regards, tom lane



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Gianni Ciolli
On Sun, Jun 17, 2012 at 09:58:17AM -0500, Kevin Grittner wrote:
 Simon Riggs  wrote:
 Misa Simic  wrote:
  
  IMO, both approaches make sense...
 
  Agreed.
  
 Can someone provide a practical example of a foreign key with array
 use case?  The only situations I'm able to think of right now are the
 same cases where you would now use a table with primary keys of two
 tables to provide a many-to-many linkage.  Does this proposed feature
 handle other cases or handle this type of case better?

The way I think about array foreign keys is that they represent the
aggregated form of a classical foreign key.

In the aggregated form, each row in the referencing side represents a
group of rows in the non-aggregated form.

One advantage is that constraints on each group of rows as a whole are
now possible, because they become constraints on a single row in the
aggregated form.

Example: if you have a table of points, then you can have a table of
polygons where each polygon contains an array of points. The
non-aggregated model would instead require an additional point_polygon
table which references both the point and the polygon tables, because
the point-polygon relationship is many-to-many. In the aggregated
model, you can easily specify a CHECK constraint that requires each
polygon to have at least three points, while the corresponding
condition cannot be specified in the non-aggregated model.
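A minimal sketch of the aggregated model (names invented; the element-wise foreign key itself is exactly what the patch would add, so it appears only as a comment here):

```sql
CREATE TABLE point (
    point_id serial PRIMARY KEY,
    coords   point NOT NULL
);

CREATE TABLE polygon (
    polygon_id serial PRIMARY KEY,
    point_ids  int[] NOT NULL,
    -- the "group as a whole" constraint that the non-aggregated
    -- model cannot express:
    CHECK (array_length(point_ids, 1) >= 3)
    -- the proposed feature would additionally validate every element
    -- of point_ids against point.point_id
);
```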

Cheers,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it



Re: [HACKERS] libpq compression

2012-06-17 Thread Florian Pflug
On Jun16, 2012, at 17:15 , Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Sat, Jun 16, 2012 at 12:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It's not obvious to me that we actually *need* anything except the
 ability to recognize that a null-encrypted SSL connection probably
 shouldn't be treated as matching a hostssl line; which is not something
 that requires any fundamental rearrangements, since it only requires an
 after-the-fact check of what was selected.
 
 Maybe I spelled it out wrong. It does require it insofar that if we
 want to use this for compression, we must *always* enable openssl on
  the connection. So the "with these encryption methods" option boils down to
  "NULL encryption only" or "whatever other standards I have for
  encryption". We don't need the ability to change the "whatever other
  standards" per subnet, but we need to control the
  accept-NULL-encryption on a per-subnet basis.
 
 After sleeping on it, I wonder if we couldn't redefine the existing
 "list of acceptable ciphers" option as the list of ciphers that are
 considered to provide encrypted transport.  So you'd be allowed to
 connect with SSL using any unapproved cipher (including NULL), the
 backend just considers it as equivalent to a non-SSL connection for
 pg_hba purposes.  Then no change is needed in any configuration stuff.

Would we still tell openssl to only negotiate ciphers in the configured
list of available ciphers + NULL? If not, what happens if a connection
happens to use a cipher that is actually stronger than any cipher on
the list of acceptable ciphers? The DBA wouldn't necessarily be
aware that such a cipher even exists, since it could have been made
available by an openssl upgrade…

But if we restrict the negotiable ciphers to the configure list + NULL,
then we're good I think.

best regards,
Florian Pflug




Re: [HACKERS] libpq compression

2012-06-17 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 Would we still tell openssl to only negotiate ciphers in the configured
 list of available ciphers + NULL? If not, what happens if a connection
 happens to use a cipher that is actually stronger than any cipher on
 the list of acceptable ciphers? The DBA wouldn't necessarily be
 aware that such a cipher even exists, since it could have been made
 available by an openssl upgrade

So?  If the DBA has gone so far as to list specific ciphers, who are
we to second guess his judgment?  It's not for us to decide that cipher
X is stronger than the ones he listed.

 But if we restrict the negotiable ciphers to the configure list + NULL,
 then we're good I think.

The fly in the ointment with any of these ideas is that the configure
list is not a list of exact cipher names, as per Magnus' comment that
the current default includes tests like !aNULL.  I am not sure that
we know how to evaluate such conditions if we are applying an
after-the-fact check on the selected cipher.  Does OpenSSL expose any
API for evaluating whether a selected cipher meets such a test?

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On 17 June 2012 17:01, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not sure I agree with this decision; why should we presume to know
 better than the glibc locale what constitutes equality?

 The killer reason why it must be like that is that you can't use hash
 methods on text if text equality is some unknown condition subtly
 different from bitwise equality.

Fair enough, but I doubt that we need to revert the changes made in
this commit to texteq in addition to the changes I'd like to see in
order to be semantically self-consistent. That is because a
distinction is often made between equality and equivalence, and we
could adopt this distinction. strcoll() could be said to be just
making a representation that its two arguments are equivalent (and not
necessarily equal) when it returns 0. This distinction is explicitly
made in the C++ standard library, and failing to understand it can
result in bugs:

http://www.cplusplus.com/reference/algorithm/equal_range/

Note the use of the word "equivalent" rather than "equal" in the text.
equal_range is a bit of a misnomer.

This distinction is important enough to have warranted an entire
subsection of the book Effective STL by Scott Meyers, a
well-respected expert on the language. This comes up more often than
you'd think - std::set::insert determines if an element already
exists (to know if it must replace it) based on equivalency (usually,
though not necessarily, defined in terms of operator<), whereas the
find algorithm finds elements based on equality (operator==).
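The distinction can be made concrete with a small comparator sketch; case-insensitivity here is only a stand-in for whatever a real locale considers insignificant:

```c
#include <ctype.h>
#include <string.h>

/*
 * A strcoll()-style comparator that orders strings case-insensitively.
 * Under this ordering "Foo" and "FOO" are *equivalent* (the comparator
 * returns 0) without being *equal* in the strcmp() sense -- the same
 * distinction equal_range() relies on.
 */
static int
ci_cmp(const char *a, const char *b)
{
	for (;; a++, b++)
	{
		int		ca = tolower((unsigned char) *a);
		int		cb = tolower((unsigned char) *b);

		if (ca != cb)
			return (ca < cb) ? -1 : 1;
		if (ca == '\0')
			return 0;			/* equivalent; maybe not bitwise equal */
	}
}
```

Equality always implies equivalence under such a comparator, but not the other way around.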

 My recollection is that there were
 some other problems as well, but I'm too lazy to search the archives
 for you.

Fair enough. I'll search for it myself later. I'm about to head out now.

 It seems very likely that the main
 one was the then-need to guard against poor quality qsort()
 implementations that went quadratic in the face of lots of duplicates,

 No, I don't recall that that had anything to do with it.

Oh, okay. It looked very much like the "avoid equality at all costs"
thing you still see some of in tuplesort.c.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 17 June 2012 17:01, Tom Lane t...@sss.pgh.pa.us wrote:
 The killer reason why it must be like that is that you can't use hash
 methods on text if text equality is some unknown condition subtly
 different from bitwise equality.

 Fair enough, but I doubt that we need to revert the changes made in
 this commit to texteq in addition to the changes I'd like to see in
 order to be semantically self-consistent. That is because there is
 often a distinction made between equality and equivalence, and we
 could adopt this distinction.

How exactly do you plan to shoehorn that into SQL?  You could invent
some nonstandard equivalence operator I suppose, but what will be the
value?  We aren't going to set things up in such a way that we can't
use hash join or hash aggregation in queries that use the regular =
operator.  IMO there just aren't going to be enough people who care to
use a non-default operator.

regards, tom lane



Re: [HACKERS] libpq compression

2012-06-17 Thread Euler Taveira
On 17-06-2012 12:42, Tom Lane wrote:
 If our default isn't the same as the underlying default, I have to
 question why not.  But are you sure this ! notation will work with
 all openssl versions?
 
What is "all" for you? It seems we don't claim support for a specific version
or later in the docs or even in configure. But looking at an old version (0.9.7,
2003-12-31), it seems to support the "!" notation.


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] libpq compression

2012-06-17 Thread Euler Taveira
On 17-06-2012 12:45, Magnus Hagander wrote:
 Uh. We have the ! notation in our default *now*. What openssl also
 supports is the text "DEFAULT", which is currently the equivalent of
 "ALL:!aNULL:!eNULL". The question, which is valid of course, should be
 if DEFAULT works with all openssl versions.
 
AFAICS, DEFAULT works for ancient openssl versions (~10 years ago).


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] REVIEW: Optimize referential integrity checks (todo item)

2012-06-17 Thread Gurjeet Singh
On Sat, Jun 16, 2012 at 9:50 AM, Dean Rasheed dean.a.rash...@gmail.comwrote:

Then in HEAD:
 EXPLAIN ANALYSE UPDATE fk_table SET b=b+1, c=c+1, d=d+1;

  QUERY PLAN

 ---
  Update on fk_table  (cost=0.00..2300.00 rows=10 width=26) (actual
 time=1390.037..1390.037 rows=0 loops=1)
   ->  Seq Scan on fk_table  (cost=0.00..2300.00 rows=10 width=26)
 (actual time=0.010..60.841 rows=10 loops=1)
  Trigger for constraint fk_table_e_fkey: time=210.184 calls=9
  Total runtime: 1607.626 ms
 (4 rows)

 So the RI trigger is fired 9 times, for the unchanged NULL FK rows.

 With this patch, the RI trigger is not fired at all:
 EXPLAIN ANALYSE UPDATE fk_table SET b=b+1, c=c+1, d=d+1;

  QUERY PLAN

 ---
  Update on fk_table  (cost=0.00..2300.00 rows=10 width=26) (actual
 time=1489.640..1489.640 rows=0 loops=1)
   ->  Seq Scan on fk_table  (cost=0.00..2300.00 rows=10 width=26)
 (actual time=0.010..66.328 rows=10 loops=1)
  Total runtime: 1489.679 ms
 (3 rows)


 Similarly, if I update the FK column in HEAD the RI trigger is fired
 for every row:
 EXPLAIN ANALYSE UPDATE fk_table SET e=e-1;

  QUERY PLAN

 ---
  Update on fk_table  (cost=0.00..1800.00 rows=10 width=26) (actual
 time=1565.148..1565.148 rows=0 loops=1)
   ->  Seq Scan on fk_table  (cost=0.00..1800.00 rows=10 width=26)
 (actual time=0.010..42.725 rows=10 loops=1)
  Trigger for constraint fk_table_e_fkey: time=705.962 calls=10
  Total runtime: 2279.408 ms
 (4 rows)

 whereas with this patch it is only fired for the non-NULL FK rows that
 are changing:
 EXPLAIN ANALYSE UPDATE fk_table SET e=e-1;

  QUERY PLAN

 ---
  Update on fk_table  (cost=0.00..5393.45 rows=299636 width=26) (actual
 time=1962.755..1962.755 rows=0 loops=1)
   ->  Seq Scan on fk_table  (cost=0.00..5393.45 rows=299636 width=26)
 (actual time=0.023..52.850 rows=10 loops=1)
  Trigger for constraint fk_table_e_fkey: time=257.845 calls=1
  Total runtime: 2221.912 ms
 (4 rows)


I find it interesting that 'actual time' for top level 'Update on fk_table'
is always higher in patched versions, and yet the 'Total runtime' is lower
for the patched versions. I would've expected 'Total runtime' to be
proportional to the increase in top-level row-source's 'actual time'.

Even the time consumed by Seq scans is higher in the patched version, so I
think the patch's effect on performance needs to be evaluated.

Best regards,
-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] libpq compression

2012-06-17 Thread Euler Taveira
On 17-06-2012 12:45, Magnus Hagander wrote:
 On Sun, Jun 17, 2012 at 11:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Is there a reason why we don't have a parameter on the client
 mirroring ssl_ciphers?

 Dunno, do we need one?  I am not sure what the cipher negotiation process
 looks like or which side has the freedom to choose.
 
Both. The client sends a cipher list and the server determines which cipher is
used by taking the first supported cipher in the client's list.
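A toy sketch of that handshake logic, with hypothetical cipher names (real TLS negotiation, and OpenSSL's option to prefer the server's ordering, are more involved):

```c
#include <stddef.h>
#include <string.h>

/*
 * Toy model of the negotiation described above: the client offers an
 * ordered list of ciphers, and the server picks the first client
 * cipher that also appears in its own configured list.  Cipher names
 * are made up for illustration.
 */
static const char *
negotiate_cipher(const char *client[], size_t nclient,
				 const char *server[], size_t nserver)
{
	size_t		i,
				j;

	for (i = 0; i < nclient; i++)
		for (j = 0; j < nserver; j++)
			if (strcmp(client[i], server[j]) == 0)
				return client[i];
	return NULL;				/* no cipher in common: handshake fails */
}
```

With this model, either side can veto a cipher simply by leaving it out of its own list, which is the point being made here.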

 I haven't looked into the details, but it seems reasonable that
 *either* side should be able to at least define a list of ciphers it
 *doesn't* want to talk with.
 
+1.


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] REVIEW: Optimize referential integrity checks (todo item)

2012-06-17 Thread Kevin Grittner
Gurjeet Singh  wrote:
 Dean Rasheed wrote:
 
 in HEAD:
 ... (actual time=1390.037..1390.037 rows=0 loops=1)
 Trigger for constraint fk_table_e_fkey: time=210.184 calls=9
 Total runtime: 1607.626 ms
 
 With this patch:
 ... (actual time=1489.640..1489.640 rows=0 loops=1)
 [no triggers fired]
 Total runtime: 1489.679 ms
 
 for every row:
 ... (actual time=1565.148..1565.148 rows=0 loops=1)
 Trigger for constraint fk_table_e_fkey: time=705.962 calls=10
 Total runtime: 2279.408 ms
 
 with this patch
 ... (actual time=1962.755..1962.755 rows=0 loops=1)
 Trigger for constraint fk_table_e_fkey: time=257.845 calls=1
 Total runtime: 2221.912 ms
 
 I find it interesting that 'actual time' for top level 'Update on
 fk_table' is always higher in patched versions, and yet the 'Total
 runtime' is lower for the patched versions. I would've expected
 'Total runtime' to be proportional to the increase in top-level
 row-source's 'actual time'.
 
I figured that the trigger time was counted separately.  It seems to
add up pretty well that way.  I guess the question is whether there
is a case where the increase in seqscan time is *not* compensated by
less time in the triggers.
 
-Kevin



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-06-17 Thread Jeff Davis
On Sun, 2012-06-17 at 21:10 +0800, Simon Riggs wrote:
 Do we need something like Exclusion FKs? i.e. the FK partner of
 Exclusion Constraints?

Yes, Inclusion Constraints. I've known we need something like that
since I did Exclusion Constraints, but I haven't gotten further than
that.

Regards,
Jeff Davis





Re: [HACKERS] REVIEW: Optimize referential integrity checks (todo item)

2012-06-17 Thread Tom Lane
Gurjeet Singh singh.gurj...@gmail.com writes:
 On Sat, Jun 16, 2012 at 9:50 AM, Dean Rasheed dean.a.rash...@gmail.comwrote:
 I find it interesting that 'actual time' for top level 'Update on fk_table'
 is always higher in patched versions, and yet the 'Total runtime' is lower
 for the patched versions. I would've expected 'Total runtime' to be
 proportional to the increase in top-level row-source's 'actual time'.
 Even the time consumed by Seq scans is higher in the patched version, so I
 think the patch's effect on performance needs to be evaluated.

AFAICS, the only way that the given patch could possibly make anything
slower is that if the old value of some tested attribute is NULL, the
comparison routines used to fall out immediately; now, they will do an
additional SPI_getbinval call to extract the new value before making
any decision.  So that would account for some small increase in the
ModifyTable runtime in cases where there are a lot of null keys in FK
rows being updated, which accurately describes Dean's test case, if not
so much the real world.  I don't have a big problem with it, since the
point of the patch is to possibly save a great deal more work in exactly
these cases.

It strikes me though that we are still leaving some money on the table.
The SQL spec says clearly that no RI action need be taken when a null
PK key value is updated to non-null, and I think this is right because
there cannot possibly be any FK rows that are considered to match the
old value.  (Note that both the spec and our FK code treat the RI
equality operators as strict, even if the underlying functions aren't
really.)  So we ought to have asymmetric logic in there when making
checks on PK rows, such that null -> non-null is not considered an
interesting change.  If done properly this would remove the above-
described slowdown in the PK case.

Conversely, if an FK value is changed from non-null to null, that is
either always OK (if MATCH SIMPLE, or if MATCH FULL and all the FK
columns went to null) or a certain failure (if MATCH FULL and we
have a mix of nulls and non-nulls).  There's no need to queue a
trigger event in the always OK cases, so I think we need some
asymmetric logic in the FK case as well.

regards, tom lane



Re: [HACKERS] REVIEW: Optimize referential integrity checks (todo item)

2012-06-17 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I figured that the trigger time was counted separately.

Yeah, it is.

regards, tom lane



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On Jun 17, 2012 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Peter Geoghegan pe...@2ndquadrant.com writes:
  On 17 June 2012 17:01, Tom Lane t...@sss.pgh.pa.us wrote:
 How exactly do you plan to shoehorn that into SQL?  You could invent
 some nonstandard equivalence operator I suppose, but what will be the
 value?  We aren't going to set things up in such a way that we can't
 use hash join or hash aggregation in queries that use the regular =
 operator.

Right, most people won't care. You may or may not want a new
Operator for equivalency. The regular operator for equality doesn't have to
and shouldn't change. It is both useful and conceptually clean to not
guarantee that a compator can be relied upon to indicate equality and not
just equivalency.


Re: [HACKERS] Pg default's verbosity?

2012-06-17 Thread Jeff Janes
On Sat, Jun 16, 2012 at 9:00 PM,  nik9...@gmail.com wrote:
 I've always used -1 -f - < file.sql. It is confusing that -1 doesn't warn you 
 when it won't work though.

Yeah, I just got bitten by that one.  Definitely violates the POLA.

Cheers,

Jeff



Re: [HACKERS] Streaming-only Remastering

2012-06-17 Thread Josh Berkus
Simon,

 The major limitation was solved by repmgr close to 2 years ago now.
 So while you're correct that the patch to fix that assumed that
 archiving worked as well, it has been possible to operate happily
 without it.

repmgr is not able to remaster using only streaming replication.  It
also requires an SSH connection, as well as a bunch of other
administrative setup (and compiling from source on most platforms, a not
at all insignificant obstacle).  So you haven't solved the problem,
you've just provided a somewhat less awkward packaged workaround.

It's certainly possible to devise all kinds of workarounds for the
problem; I have a few myself in Bash and Python.  What I want is to stop
using workarounds.

Without the requirement for archiving, PostgreSQL binary replication is
almost ideally simple to set up and administer.  Turn settings on in
server A and Server B, run pg_basebackup and you're replicating.  It's
like 4 steps, all but one of which can be scripted through puppet.
However, the moment you add log-shipping to the mix things get an order
of magnitude more complicated, repmgr or not.

There are really only two things standing in the way of binary replication
being completely developer-friendly.  Remastering is the big one, and
the separate recovery.conf is the small one.  We can fix both.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com





Re: [HACKERS] Streaming-only Remastering

2012-06-17 Thread Josh Berkus

 Instead of using re-synchronization (e.g. repmgr in its relation to
 rsync), I intend to proxy and also inspect the streaming replication
 traffic and then quiesce all standbys and figure out what node is
 farthest ahead.  Once I figure out the node that is farthest ahead, if
 it is not a node that is eligible for promotion to the master, I need
 to exchange its changes to nodes that are eligible for promotion[0],
 and then promote one of those, repointing all other standbys to that
 node. This must all take place nominally within a second or thirty.
 Conceptually it is simple, but mechanically it's somewhat intense,
 especially in relation to the inconvenience of doing this incorrectly.

So you're suggesting that it would be great to be able to
double-remaster?  i.e. given OM = Original Master, 1S = standby furthest
ahead, NM = desired new master, to do:

1S <--- OM ---> NM

OM dies, then:

1S ---> NM

until NM is caught up, then

1S <--- NM

Yes?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com





Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 Right, most people won't care. You may or may not want a new
 Operator for equivalency. The regular operator for equality doesn't have to
 and shouldn't change. It is both useful and conceptually clean to not
 guarantee that a compator can be relied upon to indicate equality and not
 just equivalency.

Sure, and in general we only expect that = operators mean equivalency;
a concrete example is float8 =, which on IEEE-spec machines will say
that zero and minus zero are equal.

The trick for hashing such datatypes is to be able to guarantee that
equal values hash to the same hash code, which is typically possible
as long as you know the equality rules well enough.  We could possibly
do that for text with pure-strcoll equality if we knew all the details
of what strcoll would consider equal, but we do not.

See also citext for an example of a datatype where we can manage to
treat distinct textual values as equal and still hash them.
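A minimal sketch of that citext-style trick, assuming case-insensitive equality and using djb2 purely as an example hash function:

```c
#include <ctype.h>

/*
 * The citext trick in miniature: if "equal" means case-insensitively
 * equal, hash a normalized (lower-cased) form of the string, so that
 * all values the equality operator considers equal are guaranteed to
 * get the same hash code.
 */
static unsigned long
ci_hash(const char *s)
{
	unsigned long h = 5381;		/* djb2 seed; the hash itself is arbitrary */

	for (; *s; s++)
		h = h * 33 + (unsigned char) tolower((unsigned char) *s);
	return h;
}
```

This is exactly what fails for strcoll()-defined equality: without knowing the collation's full equivalence rules, there is no normalized form to hash.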

regards, tom lane



Re: [HACKERS] Streaming-only Remastering

2012-06-17 Thread Daniel Farina
On Sun, Jun 17, 2012 at 1:11 PM, Josh Berkus j...@agliodbs.com wrote:

 Instead of using re-synchronization (e.g. repmgr in its relation to
 rsync), I intend to proxy and also inspect the streaming replication
 traffic and then quiesce all standbys and figure out what node is
 farthest ahead.  Once I figure out the node that is farthest ahead, if
 it is not a node that is eligible for promotion to the master, I need
 to exchange its changes to nodes that are eligible for promotion[0],
 and then promote one of those, repointing all other standbys to that
 node. This must all take place nominally within a second or thirty.
 Conceptually it is simple, but mechanically it's somewhat intense,
 especially in relation to the inconvenience of doing this incorrectly.

 So you're suggesting that it would be great to be able to
 double-remaster?  i.e. given OM = Original Master, 1S = standby furthest
 ahead, NM = desired new master, to do:

Yeah. Although it seems like it would degenerate to single-remastering
applied a couple times, no?

-- 
fdr



[HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread James Cloos
I'm giving 9.2-beta2 a test simulating a production workflow.

Everything looks OK except the speed.  Most (all?) queries take about
five to six times as long as they do with 9.1.

The configurations are essentially the same, the query plans are the same.

A (hot) example, pulled semi-randomly from a run, with the names mangled
to protect the innocent:

= 9.1 =
 Nested Loop  (cost=0.00..26.92 rows=1 width=28) (actual time=0.114..0.514 
rows=19 loops=1)
   ->  Index Scan using ms_pkey on ms msg  (cost=0.00..26.03 rows=1 width=20) 
(actual time=0.026..0.207 rows=19 loops=1)
 Index Cond: ((ms_id >= 407) AND (ms_id <= 435) AND (mb_id = 50222))
 Filter: (status = ANY ('{0,1,2}'::integer[]))
   ->  Index Scan using ph_pkey on ph pm  (cost=0.00..0.87 rows=1 width=16) 
(actual time=0.010..0.010 rows=1 loops=19)
 Index Cond: (id = msg.ph_id)
 Total runtime: 0.605 ms

= 9.2 =
 Nested Loop  (cost=0.00..30.12 rows=1 width=28) (actual time=0.439..2.540 
rows=19 loops=1)
   ->  Index Scan using ms_pkey on ms msg  (cost=0.00..29.18 rows=1 width=20) 
(actual time=0.155..1.157 rows=19 loops=1)
 Index Cond: ((ms_id >= 407) AND (ms_id <= 435) AND (mb_id = 50222))
 Filter: (status = ANY ('{0,1,2}'::integer[]))
   ->  Index Scan using ph_pkey on ph pm  (cost=0.00..0.93 rows=1 width=16) 
(actual time=0.053..0.054 rows=1 loops=19)
 Index Cond: (id = msg.ph_id)
 Total runtime: 2.752 ms

All of the tables and indices for the run in question fit into ram.  The
effective cache, work mem, costs, etc were optimized in 9.0, and kept
for 9.1 and the beta.  That the plans are the same suggests that isn't
the problem, yes?

I think I recall mention from a previous beta (but goog isn't helping me
confirm) that there is some extra debugging or such enabled in the betas.

If so, and if turning that off would provide a better comparison, where
in the src should I look?

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On 17 June 2012 21:26, Tom Lane t...@sss.pgh.pa.us wrote:
 Sure, and in general we only expect that = operators mean equivalency;
 a concrete example is float8 =, which on IEEE-spec machines will say
 that zero and minus zero are equal.

Right; the spec says that, and we punt to the spec. No one sensible
thinks that minus zero and zero floats are not equal, by virtue of the
fact that if you compare them in C using ==, it evaluates to true.
Equivalency as I use the term can't really be talked about without
talking about sorting or less than operators.

 The trick for hashing such datatypes is to be able to guarantee that
 equal values hash to the same hash code, which is typically possible
 as long as you know the equality rules well enough.  We could possibly
 do that for text with pure-strcoll equality if we knew all the details
 of what strcoll would consider equal, but we do not.

I see. I am tentatively suggesting that we don't change the definition
of equal, but allow that equivalent text values may not be equal.

 See also citext for an example of a datatype where we can manage to
 treat distinct textual values as equal and still hash them.

I'm not talking about textually distinct values being
equal/equivalent. That doesn't quite capture it (although I suppose a
corollary of what I am trying to express is that equivalent though
non-equal values should invariably have different textual
representations in Postgres).

I think a good example that illustrates the difference between
equivalency and equality is the German ß character (eszett), which is
regarded as equivalent to 'ss' (two 's' characters). The following
German words are sorted correctly as required by de_DE.UTF-8:

Arg
Ärgerlich
Arm
Assistant
Aßlar
Assoziation

So if you take the word "Aßlar" here - that is equivalent to "Asslar",
and so strcoll("Aßlar", "Asslar") will return 0 if you have the right
LC_COLLATE (if you tried this out for yourself and found that I was
actually lying through my teeth, pretend I said Hungarian instead of
German and some really obscure character rather than ß). It seems a
bit unsatisfactory to me that a unique constraint will happily accept
these two equivalent strings because they're not bitwise identical,
when the collation was supposed to have that normalisation baked in.
At the same time, I find it intuitively obvious that "Aßlar" ==
"Asslar" is false, and at the very least I think that very few people
would not grant that it is a not unreasonable state of affairs for
both of those two things to hold, at least on the face of it
(presuming that I wasn't actually lying about the particulars of this,
which I was, since this is apparently confined to less popular locales
and I'm too lazy to research the exact details).

Now, I do realise that there is what might appear to be a tension in
what I'm saying; it is precisely the fact that we can traverse a btree
index using comparators that allows a btree to satisfy an equality
condition (or an equivalency condition; however you choose to
characterise whatever it is that the '=' operator does).

To restate the problem: The '=' operator implies equivalency and not
equality. Or does it? Look at this code from nbtutils.c's
_bt_checkkeys() function:

test = FunctionCall2Coll(&key->sk_func, key->sk_collation,
                         datum, key->sk_argument);

if (!DatumGetBool(test))
{
    /*
     * Tuple fails this qual.  If it's a required qual for the current
     * scan direction, then we can conclude no further tuples will
     * pass, either.
     *
     * Note: because we stop the scan as soon as any required equality
     * qual fails, it is critical that equality quals be used for the
     * initial positioning in _bt_first() when they are available. See
     * comments in _bt_first().
     */
    ***SNIP***

The qual is verified for the index tuple itself (the test return value
lets us know if it matches) - the equality operator is actually
called, and is actually re-verified via texteq(). So what appears to
happen is that the btree code finds equivalent tuples, and then,
knowing that all pairs of equal tuples are equivalent (but not
necessarily the inverse) checks that it actually has a tuple that's
equal/satisfies the qual. Makes sense, and by this standard I'd judge
that '=' was actually an equality operator that sometimes took
advantage of equivalency purely as an implementation detail, but I'm
not familiar enough with that part of the code to have any degree of
confidence that I haven't made a leap here that I shouldn't have.

ISTM if '=' was really a mere equivalency operator, we'd only ever
check !(a < b) && !(b < a) in the btree code.

It occurs to me that we might also have a new 

Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread Andres Freund
Hi,

On Monday, June 18, 2012 12:51:51 AM James Cloos wrote:
 I'm giving 9.2-beta2 a test simulating a production workflow.
 
 Everything looks OK except the speed.  Most (all?) queries take about
 five to six times as long as they do with 9.1.
 
 The configurations are essentially the same, the query plans are the same.
Is it possible that you compiled with assertions enabled? That would roughly 
fit that magnitude. "SHOW debug_assertions;" should show you whether it was 
enabled.

Greetings,

Andres



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread Peter Eisentraut
On sön, 2012-06-17 at 18:51 -0400, James Cloos wrote:
 I think I recall mention from a previous beta (but goog isn't helping
 me confirm) that there is some extra debugging or such enabled in the
 betas.

That depends on how you built it.  Just being a beta by itself doesn't
turn on any extra debugging.
 
 If so, and if turning that off would provide a better comparison,
 where in the src should I look?

Compare the output of pg_config --configure from both installations.




Re: [HACKERS] compare lower case and upper case when encoding is utf-8

2012-06-17 Thread Peter Eisentraut
On lör, 2012-06-16 at 16:21 +0800, Quan Zongliang wrote:
 I found that lower case is less than upper case when the db is
 created 
 with utf8.
 I tried below
  locale   en_US.utf8   'A' < 'a'  false
  locale   ja_JP.utf8   'A' < 'a'  true
  locale   zh_CN.utf8   'A' < 'a'  false
 Under Windows
  locale   Chinese_China   'A' < 'a'  false
 
 I am not sure it is normal or not.
 But in Chinese, the lower case should be greater than upper, same as 
 locale C.

The operating system locale determines that, so you need to look there
if you don't agree with the result.

http://wiki.postgresql.org/wiki/FAQ#Why_do_my_strings_sort_incorrectly.3F




Re: [HACKERS] sortsupport for text

2012-06-17 Thread Peter Geoghegan
On 17 June 2012 23:58, Peter Geoghegan pe...@2ndquadrant.com wrote:
 We can decree that equivalency implies equality, or make all this
 internal (which, perversely, I suppose the C++ committee people
 cannot).

Sorry, that should obviously read "equality implies equivalency". We
may not have to decree it, because it may already be a tacit
assumption - I'm not sure.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] sortsupport for text

2012-06-17 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 ISTM if '=' was really a mere equivalency operator, we'd only ever
 check !(a < b) && !(b < a) in the btree code.

You're not really making a lot of sense here, or at least I'm not
grasping the distinction you want to draw.  btree indexes (and sorting
in general) require the trichotomy law to hold, so what you say above is
tautological.  The only reason we test a = b and not a < b || b < a
is that the latter is at least twice as expensive to evaluate.
The last section of src/backend/access/nbtree/README has some notes
that you might find relevant.

 Simple question: if you were to just remove the strcmp tie-breaker for
 strcoll() in varstr_cmp(), but not touch anything else, would Postgres
 exhibit objectively incorrect behaviour?

Yes, it would, and did, before we put that in; see the archives for the
discussions that led up to the patch you mentioned earlier.

 So, I may have lost sight of why I starting on about equivalency,
 which is that it sure would be nice if we could use strxfrm to prepare
 strings for sorting, which looks to be a fairly significant win.

We could still do that as long as we were willing to store the original
strings as well as the strxfrm output.  Given the memory bloat already
implied by strxfrm, I don't think that's necessarily ridiculous on its
face --- it just makes the bar a bit higher for whether this is a win.
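A minimal sketch of the strxfrm() approach under that assumption: transform each string once under the current collation, then compare the transformed blobs cheaply many times.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Build a sort key for s with strxfrm(): the returned blob compares
 * with strcmp() the same way the original strings compare with
 * strcoll().  The blob can be several times larger than the input,
 * which is the memory-bloat caveat; the original string must be kept
 * alongside if a bitwise tie-breaker is still wanted.
 */
static char *
make_sort_key(const char *s)
{
	size_t		n = strxfrm(NULL, s, 0) + 1;	/* required buffer size */
	char	   *key = malloc(n);

	if (key != NULL)
		strxfrm(key, s, n);
	return key;
}
```

The win comes from replacing O(n log n) strcoll() calls in a sort with n strxfrm() calls plus cheap strcmp() comparisons.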

regards, tom lane



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread James Cloos
 AF == Andres Freund and...@2ndquadrant.com writes:

AF Is it possible that you compiled with assertions enabled? That would 
roughly 
AF fit that magnitude. SHOW debug_assertions; Should show you whether it was 
AF enabled.

Thanks, but SHOW debug_assertions reports "off".

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [HACKERS] Testing 9.2 in ~production environment

2012-06-17 Thread James Cloos
 PE == Peter Eisentraut pete...@gmx.net writes:

PE That depends on how you built it.  Just being a beta by itself doesn't
PE turn on any extra debugging.

OK.  So either I misremembered or it was something no longer done.

PE That depends on how you built it.

It's a Gentoo box; both were built from their ebuilds, with the same gcc,
flags, etc.

PE Compare the output of pg_config --configure from both installations.

The only differences are 9.1 vs 9.2 in the paths.

Thanks,

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 10/16] Introduce the concept that wal has a 'origin' node

2012-06-17 Thread Steve Singer

On 12-06-13 01:27 PM, Andres Freund wrote:

The previous mail contained a patch with a mismerge caused by reording
commits. Corrected version attached.

Thanks to Steve Singer for noticing this quickly.



Attached is a more complete review of this patch.

I agree that we will need to identify the node a change originated at.  
We will not only want this for multi-master support but it might also be 
very helpful once we introduce things like cascaded replicas. Using a 16 
bit integer for this purpose makes sense to me.


This patch (with the previous numbered patches already applied) still 
doesn't compile.


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -I../../../../src/include 
-D_GNU_SOURCE   -c -o xact.o xact.c

xact.c: In function 'xact_redo_commit':
xact.c:4678: error: 'xl_xact_commit' has no member named 'origin_lsn'
make[4]: *** [xact.o] Error 1

Your complete patch set did compile.  origin_lsn gets added as part of 
your 12th patch.  Managing so many related patches is going to be a 
pain, but it beats one big patch.  I don't think this patch actually 
requires the origin_lsn change.



Code Review
-
src/backend/utils/misc/guc.c
@@ -1598,6 +1600,16 @@ static struct config_int ConfigureNamesInt[] =
 },

 {
+{"multimaster_node_id", PGC_POSTMASTER, REPLICATION_MASTER,
+gettext_noop("node id for multimaster."),
+NULL
+},
+&guc_replication_origin_id,
+InvalidMultimasterNodeId, InvalidMultimasterNodeId, MaxMultimasterNodeId,
+NULL, assign_replication_node_id, NULL
+},

I'd rather see us refer to this as the 'node id for logical replication' 
over the multimaster node id.  I think that terminology will be less 
controversial.  Multi-master means different things to different people 
and it is still unclear what forms of multi-master we will have 
in-core.  For example,  most people don't consider slony to be 
multi-master replication.  If a future version of slony were to feed off 
logical replication (instead of triggers) then I think it would need 
this node id to determine which node a particular change has come from.


The description inside the gettext call should probably be "Sets the 
node id for ..." to be consistent with the descriptions of the rest of 
the GUCs.


BootStrapXLOG in xlog.c creates an XLogRecord structure; should it set 
xl_origin_id to InvalidMultimasterNodeId?


WriteEmptyXLOG in pg_resetxlog.c should probably also set xl_origin_id to a 
well-defined value.  I think InvalidMultimasterNodeId should be safe 
even for a no-op record from a node that actually has a node_id set 
on real records.


backend/replication/logical/logical.c:
XLogRecPtr current_replication_origin_lsn = {0, 0};

This variable isn't used/referenced by this patch; it probably belongs 
with the later patch.



Steve


Andres








Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-17 Thread Tom Lane
I wrote:
 Anybody have DB2, or something else that might be thought to be pretty
 close to spec-compliant?

Remarkably enough, the DB2 10.1 manuals at www.ibm.com say that it
doesn't support ON UPDATE SET NULL or ON UPDATE SET DEFAULT.  I'm
disappointed in them :-(.  But anyway it seems that we'll not get
that much guidance from looking at other SQL implementations, and
what precedents there are suggest that people are using the set-all-
the-columns interpretation.

After reflection it seems clear to me that set-all-the-columns is
in fact an improvement for the SET DEFAULT case, regardless of match
style.  If we set only some of them, you get a mishmash of old and
new column values which is rather unlikely to match any row of the
referenced table.  If we always set all of them, then (at least for
constant default values) only one fallback entry is required in
the referenced table.  This can be seen in my example script upthread,
where I had to make a bogus referenceable entry "11, 0" to prevent
an RI failure on the MATCH SIMPLE update.  Having just the one fallback
entry "0, 0" definitely seems saner from an application standpoint.

I'm less sold on set-all-the-columns for the MATCH SIMPLE SET NULL
case.  In this match style, setting any referencing column to null
is sufficient to prevent an RI failure, and it could be argued that
zapping all of them discards data that might be useful.  But it does
have the advantage of predictability.

From an implementation standpoint, set-all-the-columns is definitely
easier to deal with: we won't need ri_OneKeyEqual at all any more,
and RI_FKey_setnull_upd no longer has the problem of having to deal
with variant plans depending on which columns it needs to zap.
So I'm attracted to it on that basis, but I don't want to let
implementation concerns drive the decision.

On balance I think we ought to switch to set-all-the-columns, though
only in 9.3+ --- a back-patched behavioral change doesn't seem like a
good idea.

Any objections, or anyone want to do more research before we decide?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compare lower case and upper case when encoding is utf-8

2012-06-17 Thread Quan Zongliang

On 2012/6/18 7:13, Peter Eisentraut wrote:

On lör, 2012-06-16 at 16:21 +0800, Quan Zongliang wrote:

I found that lower case is less than upper case when the db is created
with utf8.
I tried below:
  locale   en_US.utf8      'A' < 'a'   false
  locale   ja_JP.utf8      'A' < 'a'   true
  locale   zh_CN.utf8      'A' < 'a'   false
Under Windows:
  locale   Chinese_China   'A' < 'a'   false

I am not sure it is normal or not.
But in Chinese, the lower case should be greater than upper, same as
locale C.

The operating system locale determines that, so you need to look there
if you don't agree with the result.

http://wiki.postgresql.org/wiki/FAQ#Why_do_my_strings_sort_incorrectly.3F
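The OS-dependence can be seen directly with Python's locale module (an illustration only; which named locales exist is host-dependent, so the portable part of the demonstration uses the always-available C locale):

```python
import locale

# The C locale compares raw byte values, so 'A' (0x41) < 'a' (0x61):
# the ordering expected above for locale C.
locale.setlocale(locale.LC_COLLATE, "C")
print(locale.strcoll("A", "a") < 0)  # -> True

# Whether the same holds under en_US.UTF-8 or zh_CN.UTF-8 is decided by
# the operating system's collation tables, not by PostgreSQL; glibc's
# en_US.UTF-8, for instance, typically sorts lowercase before uppercase.
for name in ("en_US.UTF-8", "zh_CN.UTF-8"):
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        print(name, locale.strcoll("A", "a") < 0)
    except locale.Error:
        print(name, "not installed on this host")
```

The varying second set of results on different machines is exactly why the FAQ points at the operating system locale rather than the database.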



I see, thank you.

Quan Zongliang



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Tab completion of function arguments not working in all cases

2012-06-17 Thread Josh Kupershmidt
[Hope it's OK if I move this thread to -hackers, as part of CF review.]

On Sat, Jun 9, 2012 at 2:40 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Hi,

 I noticed this while testing 9.2, but it seems to go back to at least
 8.3. Tab completion of function arguments doesn't work if the function
 is schema-qualified or double-quoted. So for example,

  DROP FUNCTION my_function ( TAB

 completes the function's arguments, but

  DROP FUNCTION my_schema.my_function ( TAB

 doesn't offer any completions, and nor does

  DROP FUNCTION "my function" ( TAB

+1 for the idea. I find the existing behavior rather confusing,
particularly the fact that a schema-qualified function name will be
tab-completed, i.e. this works:

  DROP FUNCTION my_schema.myTAB

but then, as your second example above shows, no completions are
subsequently offered for the function arguments.

As a side note unrelated to this patch, I also dislike how function
name tab-completions will not fill in the opening parenthesis, which
makes for unnecessary work for the user, as one then has to type the
parenthesis and hit tab again to get possible completions for the
function arguments. The current behavior causes:
  DROP FUNCTION my_fTAB

which completes to:
  DROP FUNCTION my_function

enter parenthesis, and hit tab:
  DROP FUNCTION my_function(TAB

which, if there is only one match, could complete to:
  DROP FUNCTION my_function(integer)

when the last three steps could have been consolidated with better
tab-completion. Perhaps this could be a TODO.

 The attached patch fixes these problems by introducing a new macro
 COMPLETE_WITH_ARG, similar to the existing COMPLETE_WITH_ATTR, which
 seems to be the nearest analogous code that covers all the edge cases.

Anyway, on to the review of the patch itself:

* Submission *

Patch applies cleanly to git head, and regression tests are not
expected for tab-completion enhancements.

* Features & Usability *

I've verified that tab-completing of the first argument to functions
for DROP FUNCTION and ALTER FUNCTION commands for the most part works
as expected. The one catch I noticed was that
Query_for_list_of_arguments wasn't restricting its results to
currently-visible functions, so with a default search_path, if you
have these two functions defined:

  public.doppelganger(text)
  my_schema.doppelganger(bytea)

and then try:

  DROP FUNCTION doppelganger(TAB

you get tab-completions for both "text)" and "bytea)", when you
probably expected only the former. That's easy to fix though; please
see attached v2 patch.
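The visibility rule being applied is, roughly, "only schemas on the search path are searched" — sketched here as a toy Python model (the real fix filters in SQL with pg_catalog.pg_function_is_visible, and the real rule also weighs overloaded argument types, which this simplification ignores):

```python
# Toy catalog: schema -> {function name: argument list}, mirroring the
# doppelganger example above.
functions = {
    "public": {"doppelganger": "text"},
    "my_schema": {"doppelganger": "bytea"},
}

def visible_signatures(name, search_path):
    """Argument-list completions offered for an unqualified function name:
    only functions whose schema appears on the search path are visible."""
    return [sig + ")" for schema in search_path
            for fname, sig in functions.get(schema, {}).items()
            if fname == name]

# Default search_path -> only the public variant is offered.
print(visible_signatures("doppelganger", ["public"]))               # -> ['text)']
# With my_schema on the path, both completions appear.
print(visible_signatures("doppelganger", ["my_schema", "public"]))  # -> ['bytea)', 'text)']
```

The v2 patch's behavior corresponds to the first call: with the default search_path, the bytea variant in my_schema is not visible and should not be offered.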

* Coding *

The new macro COMPLETE_WITH_ARG seems fine. The existing code used
malloc() directly for DROP FUNCTION and ALTER FUNCTION
(tab-complete.c, around lines 867 and 2190), which AIUI is frowned
upon in favor of pg_malloc(). The patch avoids this ugliness by using
the new COMPLETE_WITH_ARG macro, so that's a nice fixup.

Overall, a nice fix for an overlooked piece of the tab-completion machinery.

Josh


tab-complete.funcargs.v2.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-17 Thread Tom Lane
I wrote:
 On balance I think we ought to switch to set-all-the-columns, though
 only in 9.3+ --- a back-patched behavioral change doesn't seem like a
 good idea.

And here is a draft patch for that.  I was interested to find that the
documentation already claims that all columns are set in the relevant
cases (so the docs changes here are just wordsmithing and
clarification).

regards, tom lane

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index ea840fb8468f7d1d65b572d9880d74dd0178e143..013dc7c4dacdc56bbdfd001f6a0b615ecbead84b 100644
*** a/doc/src/sgml/ddl.sgml
--- b/doc/src/sgml/ddl.sgml
*** CREATE TABLE t1 (
*** 735,741 ****
      </para>
  
      <para>
!      A table can contain more than one foreign key constraint.  This is
       used to implement many-to-many relationships between tables.  Say
       you have tables about products and orders, but now you want to
       allow one order to contain possibly many products (which the
--- 735,741 ----
      </para>
  
      <para>
!      A table can have more than one foreign key constraint.  This is
       used to implement many-to-many relationships between tables.  Say
       you have tables about products and orders, but now you want to
       allow one order to contain possibly many products (which the
*** CREATE TABLE order_items (
*** 827,837 ****
       row(s) referencing it should be automatically deleted as well.
       There are two other options:
       <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
!      These cause the referencing columns to be set to nulls or default
       values, respectively, when the referenced row is deleted.
       Note that these do not excuse you from observing any constraints.
       For example, if an action specifies <literal>SET DEFAULT</literal>
!      but the default value would not satisfy the foreign key, the
       operation will fail.
      </para>
  
--- 827,838 ----
       row(s) referencing it should be automatically deleted as well.
       There are two other options:
       <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
!      These cause the referencing column(s) in the referencing row(s)
!      to be set to nulls or their default
       values, respectively, when the referenced row is deleted.
       Note that these do not excuse you from observing any constraints.
       For example, if an action specifies <literal>SET DEFAULT</literal>
!      but the default value would not satisfy the foreign key constraint, the
       operation will fail.
      </para>
  
*** CREATE TABLE order_items (
*** 839,851 ****
       Analogous to <literal>ON DELETE</literal> there is also
       <literal>ON UPDATE</literal> which is invoked when a referenced
       column is changed (updated).  The possible actions are the same.
      </para>
  
      <para>
       Since a <command>DELETE</command> of a row from the referenced table
       or an <command>UPDATE</command> of a referenced column will require
       a scan of the referencing table for rows matching the old value, it
!      is often a good idea to index the referencing columns.  Because this
       is not always needed, and there are many choices available on how
       to index, declaration of a foreign key constraint does not
       automatically create an index on the referencing columns.
--- 840,869 ----
       Analogous to <literal>ON DELETE</literal> there is also
       <literal>ON UPDATE</literal> which is invoked when a referenced
       column is changed (updated).  The possible actions are the same.
+      In this case, <literal>CASCADE</> means that the updated values of the
+      referenced column(s) should be copied into the referencing row(s).
      </para>
  
      <para>
+      Normally, a referencing row need not satisfy the foreign key constraint
+      if any of its referencing columns are null.  If <literal>MATCH FULL</>
+      is added to the foreign key declaration, a referencing row escapes
+      satisfying the constraint only if all its referencing columns are null
+      (so a mix of null and non-null values is guaranteed to fail a
+      <literal>MATCH FULL</> constraint).  If you don't want referencing rows
+      to be able to avoid satisfying the foreign key constraint, declare the
+      referencing column(s) as <literal>NOT NULL</>.
+     </para>
+ 
+     <para>
+      A foreign key must reference columns that either are a primary key or
+      form a unique constraint.  This means that the referenced columns always
+      have an index (the one underlying the primary key or unique constraint);
+      so checks on whether a referencing row has a match will be efficient.
       Since a <command>DELETE</command> of a row from the referenced table
       or an <command>UPDATE</command> of a referenced column will require
       a scan of the referencing table for rows matching the old value, it
!      is often a good idea to index the referencing columns too.  Because this
       is not always needed, and there are many choices available on how
       to index, declaration of a 

Re: [HACKERS] [COMMITTERS] pgsql: New SQL functons pg_backup_in_progress() and pg_backup_start_tim

2012-06-17 Thread Daniel Farina
On Fri, Jun 15, 2012 at 8:49 PM, Magnus Hagander mag...@hagander.net wrote:
 I agree that pg_backup_in_progress() is confusing, if it returns false while
 you're running pg_basebackup. In the doc changes you proposed, you call the
 pg_start/stop_backup() a low level API for taking backups. That's not
 suitable for a function name, but I think we should work on that, and find a
 better term that works.

 Backup mode? Filesystem backup mode?

 We already have backup mode, and it covers both of them really. And
 filesystem backup mode is also what pg_basebackup does - it takes a
 filesystem backup...

 The easiest one I can think of is the manual backup mode, but in the
 other thread Simon didn't like that term.

Let me make things a bit worse since people are trying to figure out
nomenclature and positioning in the documentation, especially taking
consideration of pg_basebackup:

I think that the exclusive nature of the pg_(start|stop)_backup mode
(to use the original terminology under reconsideration) is quite
harmful, related to what was raised in
http://archives.postgresql.org/pgsql-hackers/2009-11/msg00024.php (I
revisited this in
http://archives.postgresql.org/pgsql-hackers/2011-11/msg01696.php)

After mulling over this some more, I am less on the fence about how
unfortunate it is that Postgres cannot restart when doing an
exclusive base backup: I think it is a severe anti-feature that
should gradually be retired.  pg_basebackup has the better contract
(whereby some information is very carefully inserted into the backup
to trigger archive recovery), and pg_(start|stop)_backup has a worse
one. There are more people performing archiving than there are writing
archiving tools, and the latter category should just be expected to
carefully get this right as pg_basebackup does.  Tragically,
pg_basebackup's archiving technique does not meet my requirements (and
it's a non-trivial optimization that I'm not sure makes sense in every
case, so I'm not sure it should be added), so those of us with other
archivers are left with workarounds like moving the backup file around
during the backup process.

Such a move would render the notion of a "backup in progress" or
single "backup start time" more or less obsolete.  That's not to say
that more reporting in the meantime shouldn't be added, because
changing the archiving contract will take time, and meanwhile people
are going to have to use the old contract between the archiving
software of choice and Postgres for quite a while.  However, I think
the eventual deprecation of exclusive backup mode is where things
should go, and maybe this will change someone's perception of how this
should be represented in documentation.

Finally, this complexity goes away (or rather, is moved, but hopefully
made more cohesive) if one can delegate all WAL persistence to other
software.  The presence of backup_label most basically affects whether
one wishes to recover via restore_command or the pg_xlog directory,
which only mattered insomuch that the archiver was an asynchronous
form of replication and pg_xlog was nominally a synchronous one.  It's
becoming less clear to me that these are deserving of being so
distinct in the future: they're all sources of WAL, and with syncrep
and group-commit already available, we might be in a position to
remove some surface area and duplicated concepts in tooling. Slowly.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers