Re: [HACKERS] loss of transactions in streaming replication

2011-10-19 Thread Fujii Masao
On Wed, Oct 19, 2011 at 11:28 AM, Robert Haas robertmh...@gmail.com wrote:
 Convince me.  :-)

Yeah, I'll try.

 My reading of the situation is that you're talking about a problem
 that will only occur if, while the master is in the process of
 shutting down, a network error occurs.

No. This happens even if a network error doesn't occur. I can
reproduce the issue by doing the following:

1. Set up a streaming replication master and standby with archiving
   enabled.
2. Run pgbench -i.
3. Shut down the master in fast mode.

Then I can see that the latest WAL file in the master's pg_xlog
does not exist in the standby's pg_xlog. The WAL record that was
lost is the shutdown checkpoint record.

When a smart or fast shutdown is requested, the master tries to
write and send the WAL switch record (if archiving is enabled) and
the shutdown checkpoint record. Because of the problem I described,
the WAL switch record arrives at the standby but the shutdown
checkpoint record does not.
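
One way to see the gap from SQL (a sketch; the functions are the 9.1
names, and the exact locations are of course installation-specific):

-- On the standby, after the master has been shut down with -m fast:
SELECT pg_last_xlog_receive_location();
-- Compare this against the last WAL segment left in the master's pg_xlog
-- (or against pg_controldata's "Latest checkpoint location"); in the
-- failing case the standby stops short of the shutdown checkpoint record.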

 I am not sure it's a good idea
 to convolute the code to handle that case, because (1) there are going
 to be many similar situations where nothing within our power is
 sufficient to prevent WAL from failing to make it to the standby and

Shutting down the master is not a rare case. So I think it's worth
doing something.

 (2) for this marginal improvement, you're giving up including
 PQerrorMessage(streamConn) in the error message that ultimately gets
 omitted, which seems like a substantial regression as far as
 debuggability is concerned.

I think that it's possible to include PQerrorMessage() in the error
message. Will change the patch.

 Even if we do decide that we want the
 change in behavior, I see no compelling reason to back-patch it.
 Stable releases are supposed to be stable, not change behavior because
 we thought of something we like better than what we originally
 released.

The original behavior, in 9.0, was that all outstanding WAL is
replicated to the standby when the master shuts down normally.
But ISTM the behavior was changed unexpectedly in 9.1, so I think
the fix should be back-patched to 9.1 to restore the original
behavior.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Online base backup from the hot-standby

2011-10-19 Thread Jun Ishiduka

 As I suggested in the reply to Simon, I think that the change of FPW
 should be WAL-logged separately from that of HS parameters. ISTM
 packing them in one WAL record makes XLogReportParameters()
 quite confusing. Thought?

I updated the patch along the lines you suggested (the change of FPW
is now WAL-logged separately from that of the HS parameters).

I'd like to proceed on the basis of this patch if there are no other
opinions.

Regards.



Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka@po.ntts.co.jp



standby_online_backup_09base-07fpw.patch
Description: Binary data



Re: [HACKERS] Silent failure with invalid hba_file setting

2011-10-19 Thread Thom Brown
On 19 October 2011 05:50, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2011-10-18 at 18:38 -0400, Tom Lane wrote:
  The problem with this is you cannot get into the database as it acts
  as if it did find the hba file but found it empty.

 Well, an actually empty pg_hba.conf file would have the same problem,
 and it's pretty hard to see any situation where it would be useful to
 start the postmaster and not let it accept any connections.  Should we
 add a check to consider it an error if the file doesn't contain at least
 one HBA record?

 If you try to connect and it doesn't find a record, it will tell you.

Yes, but then the user could end up pulling their hair out trying to
figure out why it's not matching any of the rules in the pg_hba.conf
file, when the file isn't being used at all.  Because there was no
indication at service start that the file in question couldn't be
found, the user may, rightly or wrongly, assume that the file was
being read but that they had somehow misconfigured it.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-19 Thread Peter Eisentraut
On tis, 2011-10-18 at 21:47 -0700, Jeff Davis wrote:
 On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:
  Presumably because Jeff doesn't have that particular locale installed.
  locale -a would clarify that.
 
 $ locale -a |grep -i tr
 tr_CY.utf8
 tr_TR.utf8
 
 So, yes, I only have the UTF8 version. I didn't realize they were
 different -- do you happen to know what package I need for just plain
 tr_TR?

dpkg-reconfigure locales

or

apt-get install locales-all




Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-19 Thread Peter Eisentraut
On ons, 2011-10-19 at 01:10 -0400, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
  On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:
  Presumably because Jeff doesn't have that particular locale installed.
  locale -a would clarify that.
 
  $ locale -a |grep -i tr
  tr_CY.utf8
  tr_TR.utf8
 
  So, yes, I only have the UTF8 version.
 
 Wow, that's interesting.  Digging around on my Fedora box, I can't find
 any suggestion that it's even possible to subdivide the locale settings
 like that.  I only see one source file for tr_TR --- that's
 /usr/share/i18n/locales/tr_TR --- and it looks like all the stuff under 
 /usr/share/i18n/locales/ is compiled into one big run-time file
 /usr/lib/locale/locale-archive.

It has always been the case on Debian that it doesn't blindly install
all 600+ locales provided by glibc.  Instead, the OS installer picks the
ones that you are likely to use, and generates those from source at the
time the locales package is installed.  (So the locales package
contains the source for all glibc locales, but not the binary form.)

In fact, here is the output from a vanilla Debian stable installation:

$ locale -a
C
en_US.utf8
POSIX

I suspect (this is Ubuntu-specific, so I don't have direct experience
with it) that installing the langpack packages that Jeff mentioned
triggers the compilation of the associated locales.  But as you can
see, apparently only UTF-8 locales are generated by default nowadays.





Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-10-19 Thread Etsuro Fujita

Hi,

(2011/10/18 16:32), Leonardo Francalanci wrote:

New API AnalyzeForeignTable


I didn't look at the patch, but I'm using CSV foreign tables with named pipes
to get near-realtime KPI calculated by postgresql. Of course, pipes can be
read just once, so I wouldn't want an automatic analyze of foreign tables...


The patch does not analyze foreign tables automatically. (Auto-analyze
on foreign tables has been discussed; please refer to [1].)


[1] http://archives.postgresql.org/pgsql-hackers/2011-09/msg00992.php

Best regards,
Etsuro Fujita



Re: [HACKERS] [v9.2] Object access hooks with arguments support (v1)

2011-10-19 Thread Kohei KaiGai
2011/10/18 Robert Haas robertmh...@gmail.com:
 In the example table creation, heap_create_with_catalog() is invoked
 by 5 routines, however, 3 of them are just internal usages, so it is not
 preferable to apply permission checks on table creation

 Some wit once made the remark that if a function has 10 arguments, it
 has 11 arguments, meaning that functions with very large numbers of
 arguments typically indicate a poor choice of abstraction boundary.
 That's pretty much what I think is going on with
 heap_create_with_catalog().  I think maybe some refactoring is in
 order there, though I am not sure quite what.

Sorry, are you complaining about the number of arguments of
heap_create_with_catalog, or about the security-check hooks?

If we rework the access control logic around DefineRelation and
OpenIntoRel to host both DAC and MAC, it will take a long argument
list, because no pg_class entry has been constructed at that point,
so we have to pass the routine all the parameters of the new relation
that both DAC and MAC need. So that routine ends up with a long
argument list, but the number of arguments of heap_create_with_catalog
would not increase much. (It might additionally require a private
datum to deliver the security labels to be assigned to the new
relation.)

If we relocate the DAC logic into heap_create_with_catalog(), it will
need two additional arguments, on top of the 19 it has right now.
It is not a strange idea to tell routines that modify the catalogs
whether the current context needs permission checks or not;
CreateTrigger already has an isInternal flag to skip its permission
check.

 BTW, it seems to me SELECT INTO should also check insert permission
 on DAC level, because it become an incorrect assumption that owner of
 table has insert permission on its creation time.
   :
 You could make the argument that there's no real security hole there,
 because the user could give himself those same privileges right back.
 You could also make the argument that a SELECT .. INTO is not the same
 as an INSERT and therefore INSERT privileges shouldn't be checked.  I
 think there are valid arguments on both sides, but my main point is
 that we shouldn't have core do it one way and sepgsql do it the other
 way.  That's going to be impossible to maintain and doesn't really
 make any logical sense anyway.

My point is that we should minimize the code complexity and redundancy
between the DAC and MAC implementations; however, it is not possible
to reduce their difference to zero, because their standpoints differ.

Yes, I never said that SELECT INTO without DAC checks causes an actual
security hole, because the owner can change the access permissions by
himself, unlike with MAC.
Suppose that there are two security labels: confidential table (TC)
and public table (PT). A typical MAC policy (including SELinux) does
not allow users who can read tables labeled TC to write data out into
tables labeled PT, because PT is, as the name says, readable by the
public, so confidential data could leak into the public domain as a
result.

This is a significant characteristic of MAC, called data-flow control.
So a significant part of its worth is lost if MAC cannot distinguish
CREATE TABLE from SELECT INTO in PostgreSQL, and that is the reason
why I strongly require a flag of contextual information here.

Although you say it is not possible to maintain, doesn't the
explanation above help us understand why MAC needs to distinguish
SELECT INTO from CREATE TABLE, and why it needs a flag to distinguish
the two cases?
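
To make the TC/PT example concrete, here is a rough SQL sketch (the
table names and the label string are only illustrative, not a real
policy):

-- tbl_secret stands for the confidential table (TC); a table created by
-- SELECT INTO would get a public default label (PT).
SECURITY LABEL FOR selinux ON TABLE tbl_secret
    IS 'system_u:object_r:sepgsql_secret_table_t:s0';

-- Under a data-flow-controlling MAC policy this statement must be checked
-- as a write into the new (public) table, not merely as CREATE TABLE:
SELECT * INTO tbl_report FROM tbl_secret;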

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] Separating bgwriter and checkpointer

2011-10-19 Thread Fujii Masao
On Tue, Oct 18, 2011 at 10:18 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Any reason or objection to committing this patch?

The checkpointer doesn't call pgstat_send_bgwriter(), but it should.
Otherwise, some entries in pg_stat_bgwriter will never be updated.

If we adopt the patch, checkpoints are performed by the checkpointer,
so it looks odd that checkpoint-related information lives in
pg_stat_bgwriter. Should we move those columns to a new view even if
that breaks compatibility?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-10-19 Thread Kohei KaiGai
2011/10/19 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Oct 16, 2011 at 4:46 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I tried to reproduce the scenario with enough small from/join_collapse_limit
 (typically 1), but it allows to push down qualifiers into the least scan 
 plan.

 Hmm, you're right.  LIMIT 10 prevents qual pushdown, but
 hitting from_collapse_limit/join_collapse_limit apparently doesn't.  I
 could have sworn I've seen this work the other way, but I guess not.

 No, the collapse_limit variables are entirely unrelated to subquery
 flattening, or to qual pushdown for that matter.  They only restrict the
 number of join paths we consider.  And we will attempt to push down
 quals into an unflattened subquery, too, if it looks safe.  See
 subquery_is_pushdown_safe, qual_is_pushdown_safe, etc in allpaths.c.

I tried to observe the behavior with a small modification to
is_simple_subquery() that makes it always return 'false'.
(This simulates what would happen if a view with security_barrier were
given.)

As expected, the sub-query is kept unflattened. However, the
externally supplied qualifiers are still pushed down into it.

Do we need to focus on the code around the above functions, rather
than on distribute_qual_to_rels, to prevent undesirable push-down
across a security barrier?

postgres=# CREATE VIEW v1 AS SELECT * FROM t1 WHERE a < 100;
CREATE VIEW
postgres=# CREATE VIEW v2 AS SELECT * FROM t2 JOIN t3 ON x = s;
CREATE VIEW
postgres=# EXPLAIN SELECT * FROM v1 WHERE b = 'bbb';
 QUERY PLAN

 Seq Scan on t1  (cost=0.00..28.45 rows=2 width=36)
   Filter: ((a < 100) AND (b = 'bbb'::text))
(2 rows)
postgres=# EXPLAIN SELECT * FROM v2 WHERE t = 'ttt';
   QUERY PLAN

 Hash Join  (cost=25.45..52.73 rows=37 width=72)
   Hash Cond: (t2.x = t3.s)
   ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Hash  (cost=25.38..25.38 rows=6 width=36)
         ->  Seq Scan on t3  (cost=0.00..25.38 rows=6 width=36)
               Filter: (t = 'ttt'::text)
(6 rows)

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp



[HACKERS] [PATCH] Deferrable unique constraints vs join removal -- bug?

2011-10-19 Thread Marti Raudsepp
Hi list,

PostgreSQL 9.0 introduced the join removal feature for cases where
a left join can't return more than one output row. This feature
relies on checking whether a UNIQUE constraint exists on the joined
column in the referenced table.

However, there was another new feature in 9.0: deferrable unique
constraints. It seems that the join removal code currently doesn't
check whether the constraint is deferrable or not. Since a table with
a deferrable unique constraint may contain duplicate values in the
course of a transaction, join removal can change the results returned
by a query.

This probably doesn't affect many real-world applications, but it
seems wrong that a performance feature can affect results returned by
a query.

Test case:

create table uniq (i int unique deferrable initially deferred);
begin;
insert into uniq values(1),(1);
select count(*) from uniq a left join uniq b using (i);
 count
---
 2

An inner join performs as expected:
marti=# select count(*) from uniq a inner join uniq b using (i);
 count
---
 4


Attached is a patch with an attempt to fix it. I'm not at all sure
this is the right approach, but it seems the cheapest way to make sure
is to walk through *all* rows in pg_constraint for the given table,
since there is no index on pg_constraint.conindid. I'm not sure whether
the cost of this check outweighs the usefulness of this patch.
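
At the SQL level, the check the patch has to do in C amounts to roughly
the following (a sketch; uniq_i_key is assumed to be the auto-generated
name of the index from the test case above):

-- Is the unique index used for join removal backed by a constraint that
-- is validated and not deferrable?  Without an index on conindid, the C
-- equivalent has to scan all of the table's pg_constraint rows.
SELECT conname, condeferrable, convalidated
FROM pg_constraint
WHERE conrelid = 'uniq'::regclass
  AND conindid = 'uniq_i_key'::regclass
  AND contype IN ('u', 'p');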

Catalog changes are a no-no for backported patches, right?
Or should I just go ahead and create this index, and not worry about
backporting?

I'm also adding lots of includes to this file. Maybe
unique_index_is_consistent() should be moved to another file instead?

While passing by, I also added an unrelated check to
check_functional_grouping() for the validity of the constraint. This
isn't necessary for now (unique constraints are always valid), but
seems useful just in case this changes in the future.

Regards,
Marti
From 23f33298485ecff80f01feb0dbd349cca2b38032 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp ma...@juffo.org
Date: Tue, 18 Oct 2011 21:31:42 +0300
Subject: [PATCH] Don't allow join removal for deferrable unique constraints

This used to be allowed, but could return incorrect results if the
deferrable constraint is invalid in the middle of a transaction.
---
 src/backend/catalog/pg_constraint.c   |4 +-
 src/backend/optimizer/path/indxpath.c |  101 -
 src/test/regress/expected/join.out|   39 +
 src/test/regress/sql/join.sql |   24 
 4 files changed, 164 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/pg_constraint.c b/src/backend/catalog/pg_constraint.c
index 6997994..7dbaca0 100644
--- a/src/backend/catalog/pg_constraint.c
+++ b/src/backend/catalog/pg_constraint.c
@@ -842,8 +842,8 @@ check_functional_grouping(Oid relid,
 		/* Only PK constraints are of interest for now, see comment above */
 		if (con->contype != CONSTRAINT_PRIMARY)
 			continue;
-		/* Constraint must be non-deferrable */
-		if (con->condeferrable)
+		/* Constraint must be non-deferrable and valid */
+		if (con->condeferrable || !con->convalidated)
 			continue;
 
 		/* Extract the conkey array, ie, attnums of PK's columns */
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 940efb3..6d46e75 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -17,10 +17,14 @@
 
 #include math.h
 
+#include access/genam.h
+#include access/heapam.h
 #include access/skey.h
 #include access/sysattr.h
+#include catalog/indexing.h
 #include catalog/pg_am.h
 #include catalog/pg_collation.h
+#include catalog/pg_constraint.h
 #include catalog/pg_operator.h
 #include catalog/pg_opfamily.h
 #include catalog/pg_type.h
@@ -34,9 +38,12 @@
 #include optimizer/var.h
 #include utils/builtins.h
 #include utils/bytea.h
+#include utils/fmgroids.h
 #include utils/lsyscache.h
 #include utils/pg_locale.h
 #include utils/selfuncs.h
+#include utils/tqual.h
+#include storage/lock.h
 
 
 #define IsBooleanOpfamily(opfamily) \
@@ -116,6 +123,8 @@ static bool matches_any_index(RestrictInfo *rinfo, RelOptInfo *rel,
   Relids outer_relids);
 static List *find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
 	  Relids outer_relids, bool isouterjoin);
+static bool unique_index_is_consistent(Oid reloid, Oid indexoid,
+Oid *conoid);
 static bool match_boolean_index_clause(Node *clause, int indexcol,
 		   IndexOptInfo *index);
 static bool match_special_index_operator(Expr *clause,
@@ -2248,6 +2257,71 @@ find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
 }
 
 /*
+ * unique_index_is_consistent
+ *	  Make sure that a unique index's respective constraint is validated and
+ *	  not deferrable. Sets *conoid to the found constraint OID, or InvalidOid
+ *	  if not found.
+ *
+ * This is expensive; there is no index on pg_constraint.conindid, so we have
+ * to scan all constraints on the relation.
+ */
+static bool

Re: [HACKERS] Separating bgwriter and checkpointer

2011-10-19 Thread Dickson S. Guedes
2011/10/18 Simon Riggs si...@2ndquadrant.com:
 On Wed, Oct 5, 2011 at 8:02 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, Oct 5, 2011 at 5:10 AM, Dickson S. Guedes lis...@guedesoft.net 
 wrote:

 Ah ok! I started reviewing the v4 patch version, this is my comments:

 ...

 Well, all the tests was running with the default postgresql.conf in my
 laptop but I'll setup a more real world environment to test for
 performance regression. Until now I couldn't notice any significant
 difference in TPS before and after patch in a small environment. I'll
 post something soon.

 Great testing, thanks. Likely will have no effect in non-I/O swamped
 environment, but no regression expected either.


 Any reason or objection to committing this patch?

I didn't see any performance regression (as expected) in the
environments that I tested. As for the code, I'd prefer someone with
more experience to review it.

Thanks.
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br



Re: [HACKERS] loss of transactions in streaming replication

2011-10-19 Thread Fujii Masao
On Wed, Oct 19, 2011 at 3:31 PM, Fujii Masao masao.fu...@gmail.com wrote:
 (2) for this marginal improvement, you're giving up including
 PQerrorMessage(streamConn) in the error message that ultimately gets
 omitted, which seems like a substantial regression as far as
 debuggability is concerned.

 I think that it's possible to include PQerrorMessage() in the error
 message. Will change the patch.

Attached is the updated version of the patch. When walreceiver fails to
send data to the WAL stream, it emits a WARNING whose message includes
PQerrorMessage(), along with the following DETAIL message:

Walreceiver process will be terminated after all available data
have been received from WAL stream.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
--- b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
***
*** 49,55  static char *recvBuf = NULL;
  static bool libpqrcv_connect(char *conninfo, XLogRecPtr startpoint);
  static bool libpqrcv_receive(int timeout, unsigned char *type,
   char **buffer, int *len);
! static void libpqrcv_send(const char *buffer, int nbytes);
  static void libpqrcv_disconnect(void);
  
  /* Prototypes for private functions */
--- 49,55 
  static bool libpqrcv_connect(char *conninfo, XLogRecPtr startpoint);
  static bool libpqrcv_receive(int timeout, unsigned char *type,
   char **buffer, int *len);
! static bool libpqrcv_send(const char *buffer, int nbytes);
  static void libpqrcv_disconnect(void);
  
  /* Prototypes for private functions */
***
*** 404,417  libpqrcv_receive(int timeout, unsigned char *type, char **buffer, int *len)
  /*
   * Send a message to XLOG stream.
   *
!  * ereports on error.
   */
! static void
  libpqrcv_send(const char *buffer, int nbytes)
  {
  	if (PQputCopyData(streamConn, buffer, nbytes) <= 0 ||
  		PQflush(streamConn))
! 		ereport(ERROR,
  				(errmsg("could not send data to WAL stream: %s",
! 						PQerrorMessage(streamConn))));
  }
--- 404,429 
  /*
   * Send a message to XLOG stream.
   *
!  * Return true if successful, false otherwise.
   */
! static bool
  libpqrcv_send(const char *buffer, int nbytes)
  {
+ 	/*
+ 	 * Even if we could not send data to WAL stream, we don't emit ERROR
+ 	 * just yet. There might be still data available in the receive buffer. We
+ 	 * emit ERROR after all available data have been received and flushed to
+ 	 * disk. Otherwise, such outstanding data would be lost.
+ 	 */
  	if (PQputCopyData(streamConn, buffer, nbytes) <= 0 ||
  		PQflush(streamConn))
! 	{
! 		ereport(WARNING,
  				(errmsg("could not send data to WAL stream: %s",
! 						PQerrorMessage(streamConn)),
! 				 errdetail("Walreceiver process will be terminated after all available data have been received from WAL stream.")));
! 		return false;
! 	}
! 
! 	return true;
  }
*** a/src/backend/replication/walreceiver.c
--- b/src/backend/replication/walreceiver.c
***
*** 96,101  static struct
--- 96,104 
  static StandbyReplyMessage reply_message;
  static StandbyHSFeedbackMessage feedback_message;
  
+ /* Did previous attempt to send data to WAL stream fail? */
+ static bool walrcv_send_error = false;
+ 
  /*
   * About SIGTERM handling:
   *
***
*** 328,333  WalReceiverMain(void)
--- 331,345 
  		else
  		{
  			/*
+ 			 * If we didn't receive anything new after we had failed to send data
+ 			 * to WAL stream, we can guarantee that there is no data available in
+ 			 * the receive buffer. So we at last emit ERROR.
+ 			 */
+ 			if (walrcv_send_error)
+ 				ereport(ERROR,
+ 						(errmsg("terminating walreceiver process due to failure to send data to WAL stream")));
+ 
+ 			/*
  			 * We didn't receive anything new, but send a status update to the
  			 * master anyway, to report any progress in applying WAL.
  			 */
***
*** 627,632  XLogWalRcvSendReply(void)
--- 639,651 
  	   wal_receiver_status_interval * 1000))
  		return;
  
+ 	/*
+ 	 * If previous attempt to send data to WAL stream has failed, there is
+ 	 * nothing to do here because this attempt would fail again.
+ 	 */
+ 	if (walrcv_send_error)
+ 		return;
+ 
  	/* Construct a new message */
  	reply_message.write = LogstreamResult.Write;
  	reply_message.flush = LogstreamResult.Flush;
***
*** 641,647  XLogWalRcvSendReply(void)
  	/* Prepend with the message type and send it. */
  	buf[0] = 'r';
  	memcpy(&buf[1], &reply_message, sizeof(StandbyReplyMessage));
! 	walrcv_send(buf, sizeof(StandbyReplyMessage) + 1);
  }
  
  /*
--- 660,666 
  	/* Prepend with the message type and send it. */
  	buf[0] = 'r';
  	memcpy(&buf[1], &reply_message, sizeof(StandbyReplyMessage));
! 	walrcv_send_error = !walrcv_send(buf, sizeof(StandbyReplyMessage) + 1);
  }
  
  /*
***
*** 682,687  XLogWalRcvSendHSFeedback(void)
--- 701,714 
  		return;

Re: [HACKERS] Separating bgwriter and checkpointer

2011-10-19 Thread Dickson S. Guedes
2011/10/19 Fujii Masao masao.fu...@gmail.com:
 On Tue, Oct 18, 2011 at 10:18 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Any reason or objection to committing this patch?

 The checkpointer doesn't call pgstat_send_bgwriter(), but it should.
 Otherwise, some entries in pg_stat_bgwriter will never be updated.

Yes, checkpoints_req, checkpoints_timed and buffers_checkpoint are not
being updated with this patch.

 If we adopt the patch, checkpoint is performed by checkpointer. So
 it looks odd that information related to checkpoint exist in
 pg_stat_bgwriter. We should move them to new catalog even if
 it breaks the compatibility?

Would splitting pg_stat_bgwriter into pg_stat_bgwriter and
pg_stat_checkpointer break anything internal?

With this modification we would see applications like monitoring tools
break, but they could use a view to put the data back together in a
compatible way, IMHO.
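
For example, something along these lines could keep existing monitoring
queries working (a sketch only; pg_stat_checkpointer and the way the
columns are split between the two views are hypothetical):

-- Hypothetical compatibility view rebuilding today's pg_stat_bgwriter
-- from a split pair of statistics views.
CREATE VIEW pg_stat_bgwriter_compat AS
SELECT c.checkpoints_timed,
       c.checkpoints_req,
       c.buffers_checkpoint,
       w.buffers_clean,
       w.maxwritten_clean,
       w.buffers_backend,
       w.buffers_backend_fsync,
       w.buffers_alloc,
       w.stats_reset
FROM pg_stat_checkpointer AS c
CROSS JOIN pg_stat_bgwriter AS w;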

-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br



Re: [HACKERS] loss of transactions in streaming replication

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 2:31 AM, Fujii Masao masao.fu...@gmail.com wrote:
 My reading of the situation is that you're talking about a problem
 that will only occur if, while the master is in the process of
 shutting down, a network error occurs.

 No. This happens even if a network error doesn't occur. I can
 reproduce the issue by doing the following:

 1. Set up streaming replication master and standby with archive
   setting.
 2. Run pgbench -i
 3. Shuts down the master with fast mode.

 Then I can see that the latest WAL file in the master's pg_xlog
 doesn't exist in the standby's one. The WAL record which was
 lost was the shutdown checkpoint one.

 When smart or fast shutdown is requested, the master tries to
 write and send the WAL switch (if archiving is enabled) and
 shutdown checkpoint record. Because of the problem I described,
 the WAL switch record arrives at the standby but the shutdown
 checkpoint does not.

Oh, that's not good.

 The original behavior, in 9.0, is that all outstanding WAL are
 replicated to the standby when the master shuts down normally.
 But ISTM the behavior was changed unexpectedly in 9.1. So
 I think that it should be back-patched to 9.1 to revert the behavior
 to the original.

Which commit broke this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Separating bgwriter and checkpointer

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 8:43 AM, Dickson S. Guedes lis...@guedesoft.net wrote:
 2011/10/19 Fujii Masao masao.fu...@gmail.com:
 On Tue, Oct 18, 2011 at 10:18 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Any reason or objection to committing this patch?

 The checkpointer doesn't call pgstat_send_bgwriter(), but it should.
 Otherwise, some entries in pg_stat_bgwriter will never be updated.

 Yes, checkpoints_req, checkpoints_timed and buffer_checkpoint are not
 being updated with this patch.

 If we adopt the patch, checkpoint is performed by checkpointer. So
 it looks odd that information related to checkpoint exist in
 pg_stat_bgwriter. We should move them to new catalog even if
 it breaks the compatibility?

 Splitting pg_stat_bgwriter into pg_stat_bgwriter and
 pg_stat_checkpointer will break something internal?

 With this modification we'll see applications like monitoring tools
 breaking, but they could use a view to put data back together in a
 compatible way, IMHO.

I don't really see any reason to break the monitoring view just
because we did some internal refactoring.  I'd rather have backward
compatibility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] (PATCH) Adding CORRESPONDING to Set Operations

2011-10-19 Thread Kerem Kat
Adding CORRESPONDING to Set Operations
Initial patch, filename: corresponding_clause_v2.patch

This patch adds the CORRESPONDING clause to set operations, per the
SQL:20nn standard draft, Feature F301 (CORRESPONDING in query
expressions).

The CORRESPONDING clause may or may not contain a BY (...) clause. If
it doesn't have a BY (...) clause, the usage is as follows:

SELECT 1 a, 2 b, 3 c UNION CORRESPONDING SELECT 4 b, 5 d, 6 c, 7 f;

with output:
b c
-
2 3
4 6

i.e., only the matching column names are output by the whole set
operation.

If we introduce a BY(...) clause, then column names are further
intersected with that BY clause:

SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (b) SELECT 4 b, 5 d, 6 c, 7 f;

with output:

b
--
2
4
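
For comparison, the same result can be spelled out by hand today;
CORRESPONDING BY (b) effectively rewrites the query so that only the
listed common column is projected in each branch:

-- Hand-written equivalent of the CORRESPONDING BY (b) example above.
SELECT b FROM (SELECT 1 a, 2 b, 3 c) AS q1
UNION
SELECT b FROM (SELECT 4 b, 5 d, 6 c, 7 f) AS q2;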

This patch compiles and tests successfully against the master branch.

It has been tested only on Pardus Linux i686 ( Kernel 2.6.37.6 #1 SMP
i686 i686 i386 GNU/Linux)

This patch includes documentation and adds one regression test file.

This patch addresses the following TODO item:
SQL Commands: Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT


Best Regards,

Kerem KAT
*** a/doc/src/sgml/queries.sgml
--- b/doc/src/sgml/queries.sgml
***
*** 1225,1230 
--- 1225,1233 
 primaryEXCEPT/primary
/indexterm
indexterm zone=queries-union
+primaryCORRESPONDING/primary
+   /indexterm
+   indexterm zone=queries-union
 primaryset union/primary
/indexterm
indexterm zone=queries-union
***
*** 1241,1249 
 The results of two queries can be combined using the set operations
 union, intersection, and difference.  The syntax is
  synopsis
! replaceablequery1/replaceable UNION optionalALL/optional replaceablequery2/replaceable
! replaceablequery1/replaceable INTERSECT optionalALL/optional replaceablequery2/replaceable
! replaceablequery1/replaceable EXCEPT optionalALL/optional replaceablequery2/replaceable
  /synopsis
 replaceablequery1/replaceable and
 replaceablequery2/replaceable are queries that can use any of
--- 1244,1252 
 The results of two queries can be combined using the set operations
 union, intersection, and difference.  The syntax is
  synopsis
! replaceablequery1/replaceable UNION optionalALL/optional optionalCORRESPONDING optionalBY (replaceableselect_list/replaceable)/optional/optional replaceablequery2/replaceable
! replaceablequery1/replaceable INTERSECT optionalALL/optional optionalCORRESPONDING optionalBY (replaceableselect_list/replaceable)/optional/optional replaceablequery2/replaceable
! replaceablequery1/replaceable EXCEPT optionalALL/optional optionalCORRESPONDING optionalBY (replaceableselect_list/replaceable)/optional/optional replaceablequery2/replaceable
  /synopsis
 replaceablequery1/replaceable and
 replaceablequery2/replaceable are queries that can use any of
***
*** 1283,1288 
--- 1286,1299 
/para
  
para
+ literalCORRESPONDING/ returns all columns that are in both replaceablequery1/ and replaceablequery2/ with the same name.
+   /para
+ 
+   para
+ literalCORRESPONDING BY/ returns all columns in the column list that are also in both replaceablequery1/ and replaceablequery2/ with the same name.
+   /para
+ 
+   para
 In order to calculate the union, intersection, or difference of two
 queries, the two queries must be quoteunion compatible/quote,
 which means that they return the same number of columns and
*** a/doc/src/sgml/sql.sgml
--- b/doc/src/sgml/sql.sgml
***
*** 859,865 
  [ WHERE replaceable class=PARAMETERcondition/replaceable ]
  [ GROUP BY replaceable class=PARAMETERexpression/replaceable [, ...] ]
  [ HAVING replaceable class=PARAMETERcondition/replaceable [, ...] ]
! [ { UNION | INTERSECT | EXCEPT } [ ALL ] replaceable class=PARAMETERselect/replaceable ]
  [ ORDER BY replaceable class=parameterexpression/replaceable [ ASC | DESC | USING replaceable class=parameteroperator/replaceable ] [ NULLS { FIRST | LAST } ] [, ...] ]
  [ LIMIT { replaceable class=PARAMETERcount/replaceable | ALL } ]
  [ OFFSET replaceable class=PARAMETERstart/replaceable ]
--- 859,865 
  [ WHERE replaceable class=PARAMETERcondition/replaceable ]
  [ GROUP BY replaceable class=PARAMETERexpression/replaceable [, ...] ]
  [ HAVING replaceable class=PARAMETERcondition/replaceable [, ...] ]
! [ { UNION | INTERSECT | EXCEPT } [ ALL ] [ CORRESPONDING [ BY ( replaceable class=PARAMETERexpression/replaceable ) ] ] replaceable class=PARAMETERselect/replaceable ]
  [ ORDER BY replaceable class=parameterexpression/replaceable [ ASC | DESC | USING replaceable class=parameteroperator/replaceable ] [ NULLS { FIRST | LAST } ] [, ...] ]
  [ LIMIT { replaceable class=PARAMETERcount/replaceable | ALL } ]
  [ OFFSET replaceable class=PARAMETERstart/replaceable ]
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***
*** 2507,2512 
--- 2507,2513 
  	

Re: [HACKERS] Separating bgwriter and checkpointer

2011-10-19 Thread Fujii Masao
On Wed, Oct 19, 2011 at 9:45 PM, Robert Haas robertmh...@gmail.com wrote:
 I don't really see any reason to break the monitoring view just
 because we did some internal refactoring.  I'd rather have backward
 compatibility.

Fair enough.

The patch doesn't change any documentation, but at least the
description of pg_stat_bgwriter seems to need updating.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] [v9.2] Object access hooks with arguments support (v1)

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 6:18 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2011/10/18 Robert Haas robertmh...@gmail.com:
 In the example table creation, heap_create_with_catalog() is invoked
 by 5 routines, however, 3 of them are just internal usages, so it is not
 preferable to apply permission checks on table creation

 Some wit once made the remark that if a function has 10 arguments, it
 has 11 arguments, meaning that functions with very large numbers of
 arguments typically indicate a poor choice of abstraction boundary.
 That's pretty much what I think is going on with
 heap_create_with_catalog().  I think maybe some refactoring is in
 order there, though I am not sure quite what.

 Sorry, are you complained about the number of arguments in
 heap_create_with_catalog? or hooks of security checks?

I'm just saying that heap_create_with_catalog() is big and complex and
does a lot of different things.  The fact it does security checks even
though it's also sometimes called as an internal function strikes me
as a modularity violation.  Normally I would expect to have a
high-level routine (which is invoked more or less directly from SQL)
that does security checks and then calls a low-level routine (that
actually does the work) if they pass.  Other parts of the code that
want to perform the same operation without the security checks can
call the low-level function directly.  But that's not the way it's set
up here.

 Yes, I never say SELECT INTO without DAC checks cause actual
 security hole, because owner can change its access permissions by
 himself, unlike MAC.
 Please suppose that there are two security labels: confidential table
 (TC) and public table (PT). Typical MAC policy (including selinux)
 does not allow users who can read from tables with TC to write out
 data into tables with PT, because PT is readable for public as literal,
 so confidential data may be leaked to public domain in the result.

 It is a significant characteristic of MAC; called as data-flow-control.
 So, it damages significant part of its worth, if MAC could not distinct
 CREATE TABLE from SELECT INTO in PostgreSQL, and it is the
 reason why I strongly requires a flag of contextual information here.

 Although you say it is not possible to maintain, doesn't the above
 introduction help us to understand nothing why MAC needs to
 distinct SELECT INTO from CREATE TABLE?, and why it needs
 a flag to distinct two cases?

Sure.  But what is going to happen when someone needs to modify that
code in a year or two?  In order to understand what to do with the
object access hook, they're going to need to understand all those
details you just mentioned.  And those details do not exist in the
patch as submitted.  Worse, let's suppose we'd committed a patch like
the one you have here before foreign tables went in.  Then, whoever
added foreign tables would have needed to know to add the foreign
table server name to the object access hook invocation, because
apparently sepgsql needs that.  How would they know they needed to do
that?  I've committed practically all of the sepgsql-related patches,
and I would not have known that, so it seems likely to me that other
people adding new functionality to the server wouldn't know it either.

When someone comes along in another year or two and adds materialized
views, will they need to pass some additional data to the object
access hook?  Probably, but I bet you're the only one who can quickly
figure out what it is.  That's no good.  We're not going to make
changes to PostgreSQL core that only you can maintain, and that are
likely to be broken by future commits.  At this point I feel pretty
good that someone can look at the stuff that we've done so far with
SECURITY LABEL and the object access hooks, and if they add a new
object type, they can make those things apply to the new object type
too by copying what's already there, without making any reference to
the sepgsql code.  There's a clear abstraction boundary, and people
who are working on one side of the boundary do not need to understand
the details of what happens on the other side of the boundary.

In this particular case, I think it might be reasonable to change the
DAC behavior, so that a CREATE TABLE AS SELECT or SELECT INTO requires
insert privileges on the new table as well as permission to create it
in the first place.  I don't particularly see any reason to require
different privileges for CREATE TABLE followed by INSERT .. SELECT
than what we require when the two commands are rolled into one.  Prior
to 9.0, this case couldn't arise, because we didn't have default
privileges, so I'm inclined to think that the way it works now is a
historical accident rather than a deliberate design decision.
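
A rough illustration of the asymmetry (this assumes default privileges,
new in 9.0, are used to strip the owner's own INSERT right on future
tables):

-- Run as role alice.
ALTER DEFAULT PRIVILEGES REVOKE INSERT ON TABLES FROM alice;

CREATE TABLE t1 (i int);
INSERT INTO t1 VALUES (1);           -- fails: permission denied
CREATE TABLE t2 AS SELECT 1 AS i;    -- currently succeeds, and populates the table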

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] loss of transactions in streaming replication

2011-10-19 Thread Fujii Masao
On Wed, Oct 19, 2011 at 9:44 PM, Robert Haas robertmh...@gmail.com wrote:
 The original behavior, in 9.0, is that all outstanding WAL are
 replicated to the standby when the master shuts down normally.
 But ISTM the behavior was changed unexpectedly in 9.1. So
 I think that it should be back-patched to 9.1 to revert the behavior
 to the original.

 Which commit broke this?

d3d414696f39e2b57072fab3dd4fa11e465be4ed
b186523fd97ce02ffbb7e21d5385a047deeef4f6

The former introduced problematic libpqrcv_send() (which was my mistake...),
and the latter is the first user of it.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Separating bgwriter and checkpointer

2011-10-19 Thread Simon Riggs
On Wed, Oct 19, 2011 at 3:29 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Oct 19, 2011 at 9:45 PM, Robert Haas robertmh...@gmail.com wrote:
 I don't really see any reason to break the monitoring view just
 because we did some internal refactoring.  I'd rather have backward
 compatibility.

 Fair enough.

 The patch doesn't change any document, but at least the description
 of pg_stat_bgwriter seems to need to be changed.

Thanks for the review.

Will follow up on suggestions.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] new compiler warnings

2011-10-19 Thread Greg Stark
On Tue, Oct 18, 2011 at 6:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The chunks are sent indivisibly, because they are less than the pipe
 buffer size.  Read the pipe man page.  It's guaranteed that the write
 will either succeed or fail as a whole, not write a partial message.
 If we cared to retry a failure, there would be some point in checking
 the return code.

It sounds to me like we should check for a short write, and if it
occurs we should generate an error for the administrator so he knows
his kernel isn't meeting Postgres's expectations and things might not
work correctly.

How to write a log message about the logging infrastructure being
broken is a bit tricky but it seems to me this is a general problem we
need a solution for. We need some kind of fallback for problems with
the postmaster or other important messages that are either so severe
or just so specific that they prevent the normal logging mechanism
from working.


-- 
greg



Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)

2011-10-19 Thread Florian Pflug
On Oct19, 2011, at 17:47 , Greg Jaskiewicz wrote:
 On 15 Oct 2011, at 11:31, Florian Pflug wrote:
 
 Ok, here's a first cut.
 
 So I looked at the patch, and first thing that pops out, 
 is lack of the volatile keyword before the ClientConnectionLostPending 
 variable is defined. Is that done on purpose ? Is that on purpose ?

That's on purpose. volatile is only necessary for variables which are either 
accessed from within signal handlers or which live in shared memory. Neither is 
true for ClientConnectionLostPending, so non-volatile should be fine.

 Otherwise the patch itself looks ok. 
 I haven't tested the code, just reviewed the patch itself. And it obviously 
 needs testing, should be easy to follow your original problem description. 

Yeah, further testing is on my todo list. The interesting case is probably what 
happens if the connection is dropped while there's already a cancellation 
request pending. And also the other way around - a cancellation request 
arriving after we've already discovered that the connection is gone.

 Btw, I just tried to do it through commitfest.postgresql.org , but before I 
 get my head around on how to add myself to the reviewer list there - I 
 thought I'll just send this response here.

You just need to click Edit Patch and put your name into the Reviewer field. 
You do need a postgres community account to edit patches, but the signup 
process is quite quick and painless AFAIR.

best regards,
Florian Pflug




Re: [HACKERS] [v9.2] DROP statement reworks

2011-10-19 Thread Robert Haas
On Thu, Oct 13, 2011 at 12:46 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 And, also I added regression test cases to detect these code paths,
 because some of object types does not cover the case when it was
 dropped.

These regression tests seem busted to me.  First, I applied the part 2
patch.  The regression tests failed.  Then, I applied the part 3
patch.  Then they passed.  So far so good.  Then, I took the
regression test portion of the part 2 and part 3 patches and applied
just those.  That also fails.

Can we come up with a set of regression tests that:

- passes on unmodified master
- still passes with the part 2 patch applied
- also passes with both the part 2 and part 3 patches applied

AIUI, this patch isn't supposed to be changing any behavior, just
consolidating the code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)

2011-10-19 Thread Greg Jaskiewicz

On 19 Oct 2011, at 17:54, Florian Pflug wrote:

 On Oct19, 2011, at 17:47 , Greg Jaskiewicz wrote:
 On 15 Oct 2011, at 11:31, Florian Pflug wrote:
 
 Ok, here's a first cut.
 
 So I looked at the patch, and first thing that pops out, 
 is lack of the volatile keyword before the ClientConnectionLostPending 
 variable is defined. Is that done on purpose ? Is that on purpose ?
 
 That's on purpose. volatile is only necessary for variables which are either 
 accessed from within signal handlers or which live in shared memory. Neither 
 is true for ClientConnectionLostPending, so non-volatile should be fine.
Ok, cool.
I'm aware of the reasons behind volatile; I just noticed that some
other flags used in a similar way are marked as such. At the end of
the day, this is just a hint to the compiler anyway.

 
 Otherwise the patch itself looks ok. 
 I haven't tested the code, just reviewed the patch itself. And it obviously 
 needs testing, should be easy to follow your original problem description. 
 
 Yeah, further testing is on my todo list. The interesting case is probably 
 what happens if the connection is dropped while there's already a 
 cancellation request pending. And also the other way around - a cancellation 
 request arriving after we've already discovered that the connection is gone.
 
 Btw, I just tried to do it through commitfest.postgresql.org , but before I 
 get my head around on how to add myself to the reviewer list there - I 
 thought I'll just send this response here.
 
 You just need to click Edit Patch and put your name into the Reviewer 
 field. You do need a postgres community account to edit patches, but the 
 signup process is quite quick and painless AFAIR.


Ok, clicking 'Edit Patch' sounds a bit heavyweight. It would probably
be better to just be able to click some sort of "I'm in" button or
checkbox.





Re: [HACKERS] loss of transactions in streaming replication

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 10:41 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Oct 19, 2011 at 9:44 PM, Robert Haas robertmh...@gmail.com wrote:
 The original behavior, in 9.0, is that all outstanding WAL are
 replicated to the standby when the master shuts down normally.
 But ISTM the behavior was changed unexpectedly in 9.1. So
 I think that it should be back-patched to 9.1 to revert the behavior
 to the original.

 Which commit broke this?

 d3d414696f39e2b57072fab3dd4fa11e465be4ed
 b186523fd97ce02ffbb7e21d5385a047deeef4f6

 The former introduced problematic libpqrcv_send() (which was my mistake...),
 and the latter is the first user of it.

OK, so this is an artifact of the changes to make libpq communication
bidirectional.  But I'm still confused about where the error is coming
from.  In your OP, you wrote "In 9.2dev and 9.1, when walreceiver
detects an error while sending data to WAL stream, it always emits
ERROR even if there are data available in the receive buffer."  So
that implied to me that this is only going to trigger if you have a
shutdown together with an awkwardly-timed error.  But your scenario
for reproducing this problem doesn't seem to involve an error.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] new compiler warnings

2011-10-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 Unfortunately, the problem we're dealing with here is exactly that
 we can't write to stderr.  So it's a bit hard to see what we can
 usefully do to report that we have a problem (short of crashing,
 which isn't a net improvement).
 
Are you sure that crashing on an assertion-enabled build *isn't* a
net improvement?  It sounds like we're pretty convinced this is a
can't happen situation -- if it does happen either the API is not
honoring its contract or we've badly misinterpreted the contract.
It might allow us to catch bugs in development or testing (where
cassert builds are used) before they mangle production server logs.
 
I have a hard time understanding the argument against an Assert in
this case.
 
-Kevin



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 [ synchronized-snapshots patch ]

Looking through this code, it strikes me that SET TRANSACTION SNAPSHOT
is fundamentally incompatible with SERIALIZABLE READ ONLY DEFERRABLE
mode.  That mode assumes that you should be able to just take a new
snapshot, repeatedly, until you get one that's safe.  With the patch
as written, if the supplied snapshot is unsafe, GetSafeSnapshot()
will just go into an infinite loop.

AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
in a transaction with those properties.  Has anyone got another
interpretation?  Would it be better to silently ignore the DEFERRABLE
property?
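
For reference, the combination in question would look roughly like this
(a sketch based on the patch's interface; the snapshot identifier shown
is made up):

-- Session 1: export a snapshot.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();            -- suppose it returns '000003A1-1'

-- Session 2: the problematic combination.
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
SET TRANSACTION SNAPSHOT '000003A1-1';  -- GetSafeSnapshot() cannot retry a
                                        -- supplied snapshot, so this should error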

regards, tom lane



Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Heikki Linnakangas

On 19.10.2011 19:17, Tom Lane wrote:

Joachim Wieland j...@mcknight.de writes:

[ synchronized-snapshots patch ]


Looking through this code, it strikes me that SET TRANSACTION SNAPSHOT
is fundamentally incompatible with SERIALIZABLE READ ONLY DEFERRABLE
mode.  That mode assumes that you should be able to just take a new
snapshot, repeatedly, until you get one that's safe.  With the patch
as written, if the supplied snapshot is unsafe, GetSafeSnapshot()
will just go into an infinite loop.

AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
in a transaction with those properties.  Has anyone got another
interpretation?  Would it be better to silently ignore the DEFERRABLE
property?


An error seems appropriate to me.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] new compiler warnings

2011-10-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Unfortunately, the problem we're dealing with here is exactly that
 we can't write to stderr.  So it's a bit hard to see what we can
 usefully do to report that we have a problem (short of crashing,
 which isn't a net improvement).
 
 Are you sure that crashing on an assertion-enabled build *isn't* a
 net improvement?

No, it isn't.

 It sounds like we're pretty convinced this is a
 can't happen situation -- if it does happen either the API is not
 honoring its contract or we've badly misinterpreted the contract.

If it happens, the result would be that the syslogger process gets
out-of-sync with the pipe data stream and starts to emit bizarre garbage
(probably what you'd see is weirdly interleaved chunks of messages from
different backends).  There have been no such reports from the field
AFAIR.  Even if it did happen, I don't think that crashing the backend
is an improvement --- keep in mind that for the first fifteen years of
Postgres' existence, we just tolerated that sort of thing as a matter of
course.  Lastly, crashing and restarting the backends *won't fix it*.
The syslogger will still be out of sync, and will stay that way until
random chance causes it to think that a message boundary falls where
there really is one.  (Of course, if the assert takes out the postmaster
instead of a backend, it'll be fixed by the manual intervention that
will be required to get things going again.)

We have many better things to spend our time on than worrying about the
hypothetical, unsupported-by-any-evidence-whatsoever risk that the
kernel doesn't meet the POSIX standard on this point.

regards, tom lane



Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)

2011-10-19 Thread Florian Pflug
On Oct19, 2011, at 18:05 , Greg Jaskiewicz wrote:
 On 19 Oct 2011, at 17:54, Florian Pflug wrote:
 
 On Oct19, 2011, at 17:47 , Greg Jaskiewicz wrote:
 On 15 Oct 2011, at 11:31, Florian Pflug wrote:
 
 Ok, here's a first cut.
 
 So I looked at the patch, and first thing that pops out, 
 is lack of the volatile keyword before the ClientConnectionLostPending 
 variable is defined. Is that done on purpose ? Is that on purpose ?
 
 That's on purpose. volatile is only necessary for variables which are either 
 accessed from within signal handlers or which live in shared memory. Neither 
 is true for ClientConnectionLostPending, so non-volatile should be fine.
 Ok, cool.
 I'm aware of the reasons behind volatile, just noticed that some other flags 
 used in similar way are marked as such. At the end of the day, this is just a 
 hint to the compiler anyway. 

All the other flags which indicate cancellation reasons are set from signal
handlers, I believe. We could of course mark ClientConnectionLostPending as
volatile just to be consistent. Not sure whether that's a good idea, or not. It
might prevent a mistake should we ever add code to detect lost connections
asynchronously (i.e., from somewhere other than pq_flush). And the cost is
probably negligible, because CHECK_FOR_INTERRUPTS tests for InterruptPending
before calling ProcessInterrupts(), so we only pay the cost of volatile if
there's actually an interrupt pending. But I still think it's better to add
qualifiers such as volatile only when really necessary. A comment about why it
*isn't* volatile is probably in order, though, so I'll add that in the next
version of the patch.
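
To spell the distinction out, here's a standalone toy example (not the patch
itself) showing the two kinds of flags:

#include <signal.h>
#include <stdbool.h>

/* Set asynchronously from a signal handler: needs volatile sig_atomic_t. */
static volatile sig_atomic_t cancel_pending = 0;

/*
 * Set and read only from straight-line code in the same process, the way
 * ClientConnectionLostPending is in the patch: a plain bool is enough,
 * because the compiler can see every assignment.
 */
static bool connection_lost_pending = false;

static void
handle_sigint(int signo)
{
    (void) signo;
    cancel_pending = 1;                 /* asynchronous write */
}

static void
note_write_failure(void)
{
    connection_lost_pending = true;     /* synchronous write */
}

int
main(void)
{
    signal(SIGINT, handle_sigint);
    note_write_failure();
    return (connection_lost_pending && !cancel_pending) ? 0 : 1;
}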

best regards,
Florian Pflug

PS: Thanks for the review. It's very much appreciated!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)

2011-10-19 Thread Greg Jaskiewicz

On 15 Oct 2011, at 11:31, Florian Pflug wrote:
 
 Ok, here's a first cut.

So I looked at the patch, and the first thing that pops out
is the lack of the volatile keyword before the ClientConnectionLostPending variable
is defined. Is that done on purpose?


Otherwise the patch itself looks ok. 
I haven't tested the code, just reviewed the patch itself. And it obviously 
needs testing, should be easy to follow your original problem description. 


Btw, I just tried to do it through commitfest.postgresql.org, but before I got
my head around how to add myself to the reviewer list there, I thought I'd
just send this response here.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Log crashed backend's query v3

2011-10-19 Thread Robert Haas
On Thu, Oct 6, 2011 at 10:15 PM, gabrielle gor...@gmail.com wrote:
 On Wed, Oct 5, 2011 at 5:14 PM, Marti Raudsepp ma...@juffo.org wrote:
 I think you intended to use the Waiting on Author status -- that
 leaves the commitfest entry open. I will re-open the commitfest entry
 myself, I hope that's OK.

 No worries, and yeah, I picked the wrong checkbox. :)

 Here is version 3 of the patch.

 Looks good, and performs as advertised.  Thanks!

I think it would be safer to write this so that
pgstat_get_crashed_backend_activity writes its answer into a
statically allocated buffer and returns a pointer to that buffer,
rather than using palloc.  I think the current coding might lead to a
memory leak in the postmaster, but even if it doesn't, it seems better
to make this as simple as possible.

Also, how about having CreateSharedBackendStatus store the length of
the backend activity buffer in a global somewhere, instead of
repeating the calculation here?

For this:

if (*(activity) == '\0')
    return "command string not enabled";

I'd suggest that we instead return "command string not found", and
avoid making judgements about how things got that way.

Other than that, it looks good to me.  It's almost making me cry
thinking about how much time this would have saved me debugging server
crashes.
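
In other words, something shaped roughly like this standalone sketch (the
buffer size and the helper are made up; it's not the actual patch):

#include <stdio.h>

/* statically allocated, so the postmaster never palloc's (or leaks) here */
static char activity_buffer[1024];      /* size is an assumption */

static const char *
get_crashed_backend_activity(const char *shared_copy)
{
    if (shared_copy == NULL || shared_copy[0] == '\0')
        return "command string not found";

    /* copy (and truncate) into the static buffer, then hand that back */
    snprintf(activity_buffer, sizeof(activity_buffer), "%s", shared_copy);
    return activity_buffer;
}

int
main(void)
{
    printf("%s\n", get_crashed_backend_activity("SELECT 1"));
    printf("%s\n", get_crashed_backend_activity(""));
    return 0;
}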

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Florian Pflug
On Oct19, 2011, at 18:17 , Tom Lane wrote:
 Joachim Wieland j...@mcknight.de writes:
 [ synchronized-snapshots patch ]
 
 Looking through this code, it strikes me that SET TRANSACTION SNAPSHOT
 is fundamentally incompatible with SERIALIZABLE READ ONLY DEFERRABLE
 mode.  That mode assumes that you should be able to just take a new
 snapshot, repeatedly, until you get one that's safe.  With the patch
 as written, if the supplied snapshot is unsafe, GetSafeSnapshot()
 will just go into an infinite loop.
 
 AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
 in a transaction with those properties.  Has anyone got another
 interpretation?  Would it be better to silently ignore the DEFERRABLE
 property?

Hm, both features are meant to be used by pg_dump, so I think we should
make the combination work. I'd say SET TRANSACTION SNAPSHOT should throw
an error only if the transaction is marked READ ONLY DEFERRABLE *and*
the provided snapshot isn't safe.

This allows a deferrable snapshot to be used on a second connection (
by e.g. pg_dump), and still be marked as DEFERRABLE. If we throw an
error unconditionally, the second connection has to import the snapshot
without marking it DEFERRABLE, which I think has consequences for
performance. AFAIR, the SERIALIZABLE implementation is able to skip
almost all (or all? Kevin?) SIREAD lock acquisitions in DEFERRABLE READ
ONLY transactions, because those cannot participate in dangerous (i.e.
non-serializable) dependency structures.

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)

2011-10-19 Thread Greg Jaskiewicz

On 19 Oct 2011, at 18:28, Florian Pflug wrote:
 All the other flags which indicate cancellation reasons are set from signal
 handlers, I believe. We could of course mark ClientConnectionLostPending as
 volatile just to be consistent. Not sure whether that's a good idea, or not.
 It might prevent a mistake should we ever add code to detect lost connections
 asynchronously (i.e., from somewhere other than pq_flush). And the cost is
 probably negligible, because CHECK_FOR_INTERRUPTS tests for InterruptPending
 before calling ProcessInterrupts(), so we only pay the cost of volatile if
 there's actually an interrupt pending. But I still think it's better to add
 qualifiers such as volatile only when really necessary. A comment about why it
 *isn't* volatile is probably in order, though, so I'll add that in the next
 version of the patch.
 
Makes sense. 

I had to ask, because it sticks out. And indeed there is a possibility that
someone will one day try to use it from a signal handler, etc.


 best regards,
 Florian Pflug
 
 PS: Thanks for the review. It's very much appreciated!
No problem. I got inspired by the talk I attended here at pgconf.eu. It seems
like a good idea to do these things; I have years of experience as a developer
and am doing (relatively) well thanks to PostgreSQL, so it makes sense to
contribute something back.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SSI implementation question

2011-10-19 Thread Tom Lane
Is it really necessary for GetSerializableTransactionSnapshotInt to
acquire an empty SERIALIZABLEXACT before it acquires a snapshot?
If so, why?  The proposed synchronized-snapshots feature will mean
that the allegedly-new snapshot actually was taken some time before,
so it seems to me that either this is not necessary or we cannot use
a synchronized snapshot in a serializable xact.

In the same vein, why is it necessary to be holding
SerializableXactHashLock (exclusively, yet) while acquiring the
snapshot?  That seems rather bad from a concurrency standpoint, and
again it's going to be pretty meaningless if we're just installing a
pre-existing snapshot.

The reason these things came to mind is that I want to refactor the code
so that the SSI-specific work in GetSerializableTransactionSnapshotInt
is done by a function that is handed an already-taken snapshot, because
I cannot stomach what Joachim did to the APIs of GetSnapshotData and
allied functions.  But refactor or no refactor, it seems like installing
a pre-existing snapshot may be breaking some assumptions here.
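
Roughly, I'm imagining a shape like this (the SSI hook name is invented and
the details are elided):

/* sketch only -- the real refactoring may well look different */
Snapshot
GetSerializableTransactionSnapshot(Snapshot snapshot)
{
    /* take (or adopt) the snapshot with the ordinary machinery first ... */
    snapshot = GetSnapshotData(snapshot);

    /* ... then let predicate.c do its SSI-specific registration on it */
    RegisterSerializableSnapshot(snapshot);     /* invented name */

    return snapshot;
}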

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Oct19, 2011, at 18:17 , Tom Lane wrote:
 AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
 in a transaction with those properties.  Has anyone got another
 interpretation?  Would it be better to silently ignore the DEFERRABLE
 property?

 Hm, both features are meant to be used by pg_dump, so I think we should
 make the combination work. I'd say SET TRANSACTION SNAPSHOT should throw
 an error only if the transaction is marked READ ONLY DEFERRABLE *and*
 the provided snapshot isn't safe.

Um, no, I don't think so.  It would be sensible for the leader
transaction to use READ ONLY DEFERRABLE and then export the snapshot it
got (possibly after waiting).  It doesn't follow that the child
transactions should use DEFERRABLE too.  They're not going to wait.

 This allows a deferrable snapshot to be used on a second connection (
 by e.g. pg_dump), and still be marked as DEFERRABLE. If we throw an
 error unconditionally, the second connection has to import the snapshot
 without marking it DEFERRABLE, which I think has consequences for
 performance.

No, I don't believe that either.  AIUI the performance benefit comes if
the snapshot is recognized as safe.  DEFERRABLE only means to keep
retrying until you get a safe one.  This is nonsense when you're
importing the snapshot.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Deferrable unique constraints vs join removal -- bug?

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 7:35 AM, Marti Raudsepp ma...@juffo.org wrote:
 This probably doesn't affect many real-world applications, but it
 seems wrong that a performance feature can affect results returned by
 a query.

 Test case:

 create table uniq (i int unique deferrable initially deferred);
 begin;
 insert into uniq values(1),(1);
 select count(*) from uniq a left join uniq b using (i);
  count
 ---
     2

Yuck.  Well, that's certainly a bug.  What's weird is that I thought
we had put logic into the join removal code to ignore deferrable
constraints.  Apparently not.  I think maybe what we should do is add
an immediate field to IndexOptInfo, mirroring the existing unique
flag, and have get_relation_info() populate it from indimmediate, and
then make relation_has_unique_index() disqualify any non-immediate
index.

has_unique_index() arguably needs a similar fix, although at present
that appears to be used only for statistics purposes, so maybe it's OK.
A comment update might be a good idea, though.
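
Sketching the IndexOptInfo fix from memory (so the details may be off), it
would look something like:

/* in IndexOptInfo: remember whether uniqueness is enforced immediately */
bool        immediate;

/* in get_relation_info(), next to the existing unique flag: */
info->unique = index->indisunique;
info->immediate = index->indimmediate;

/* in relation_has_unique_index(), skip indexes that can't prove
 * uniqueness at plan time: */
if (!ind->unique || !ind->immediate)
    continue;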

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 1:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 On Oct19, 2011, at 18:17 , Tom Lane wrote:
 AFAICS we should just throw an error if SET TRANSACTION SNAPSHOT is done
 in a transaction with those properties.  Has anyone got another
 interpretation?  Would it be better to silently ignore the DEFERRABLE
 property?

 Hm, both features are meant to be used by pg_dump, so I think we should
 make the combination work. I'd say SET TRANSACTION SNAPSHOT should throw
 an error only if the transaction is marked READ ONLY DEFERRABLE *and*
 the provided snapshot isn't safe.

 Um, no, I don't think so.  It would be sensible for the leader
 transaction to use READ ONLY DEFERRABLE and then export the snapshot it
 got (possibly after waiting).  It doesn't follow that the child
 transactions should use DEFERRABLE too.  They're not going to wait.

 This allows a deferrable snapshot to be used on a second connection (
 by e.g. pg_dump), and still be marked as DEFERRABLE. If we throw an
 error unconditionally, the second connection has to import the snapshot
 without marking it DEFERRABLE, which I think has consequences for
 performance.

 No, I don't believe that either.  AIUI the performance benefit comes if
 the snapshot is recognized as safe.  DEFERRABLE only means to keep
 retrying until you get a safe one.  This is nonsense when you're
 importing the snapshot.

I think the requirement is that we need to do the appropriate push-ups
so that the people who import the snapshot know that it's safe, and
that the SSI stuff can all be skipped.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-19 Thread Jeff Davis
On Wed, 2011-10-19 at 11:44 +0300, Peter Eisentraut wrote:
 On tis, 2011-10-18 at 21:47 -0700, Jeff Davis wrote:
  On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:
   Presumably because Jeff doesn't have that particular locale installed.
   locale -a would clarify that.
  
  $ locale -a |grep -i tr
  tr_CY.utf8
  tr_TR.utf8
  
  So, yes, I only have the UTF8 version. I didn't realize they were
  different -- do you happen to know what package I need for just plain
  tr_TR?
 
 dpkg-reconfigure locales

Did that, and still:

# locale -a|grep -i tr
tr_CY.utf8
tr_TR.utf8

 apt-get install locales-all

# aptitude install locales-all
No candidate version found for locales-all
No candidate version found for locales-all
No packages will be installed, upgraded, or removed.
0 packages upgraded, 0 newly installed, 0 to remove and 80 not upgraded.
Need to get 0 B of archives. After unpacking 0 B will be used.
 
Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-19 Thread Jeff Davis
On Wed, 2011-10-19 at 10:10 -0700, Jeff Davis wrote:
  dpkg-reconfigure locales

I had to manually do

# locale-gen tr_TR

to make it generate tr_TR.ISO-8859-9, and now it works.

I'm not sure what we should do, exactly, but I expect that others who
attempt to run the test on Ubuntu (and maybe Debian) might get confused.
That said, I'd also be fine with leaving it alone.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Deferrable unique constraints vs join removal -- bug?

2011-10-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Yuck.  Well, that's certainly a bug.  What's weird is that I thought
 we had put logic into the join removal code to ignore deferrable
 constraints.

Yeah, I thought we had too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 Is it really necessary for GetSerializableTransactionSnapshotInt
 to acquire an empty SERIALIZABLEXACT before it acquires a
 snapshot?  If so, why?  The proposed synchronized-snapshots
 feature will mean that the allegedly-new snapshot actually was
 taken some time before, so it seems to me that either this is not
 necessary or we cannot use a synchronized snapshot in a
 serializable xact.
 
Hmm.  If the intent is that each serializable transaction sharing
the snapshot is a separate logical transaction, it *might* hold -- I
have to think about that a bit.  If the intent is that the work of
one logical transaction is being split across processes, then SSI
doesn't hold up without somehow tying all of the processes to a
single SERIALIZABLEXACT; and then the direct access to
MySerializableXact falls apart.
 
 In the same vein, why is it necessary to be holding
 SerializableXactHashLock (exclusively, yet) while acquiring the
 snapshot?  That seems rather bad from a concurrency standpoint,
 and again it's going to be pretty meaningless if we're just
 installing a pre-existing snapshot.
 
That seems like something which probably could and should be fixed,
especially since SerializableXactHashLock can become a bottleneck in
9.1 and will be much more of a problem with the elimination of other
bottlenecks in 9.2.
 
 The reason these things came to mind is that I want to refactor
 the code so that the SSI-specific work in
 GetSerializableTransactionSnapshotInt is done by a function that
 is handed an already-taken snapshot, because I cannot stomach what
 Joachim did to the APIs of GetSnapshotData and allied functions. 
 But refactor or no refactor, it seems like installing a
 pre-existing snapshot may be breaking some assumptions here.
 
I guess the other thing to look out for is whether it could possibly
move PredXact->SxactGlobalXmin backwards.  I would have to check for
other possible problems.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 
 This allows a deferrable snapshot to be used on a second
 connection (by e.g. pg_dump), and still be marked as DEFERRABLE.
 If we throw an error unconditionally, the second connection has
 to import the snapshot without marking it DEFERRABLE, which I
 think has consequences for performance.

 No, I don't believe that either.  AIUI the performance benefit
 comes if the snapshot is recognized as safe.  DEFERRABLE only
 means to keep retrying until you get a safe one.
 
Right, there are other circumstances in which a READ ONLY
transaction's snapshot may be recognized as safe, and it can opt out
of all the additional SSI logic.  As you say, DEFERRABLE means we
*wait* for that.
 
 This is nonsense when you're importing the snapshot.
 
Agreed.
 
 I think the requirement is that we need to do the appropriate
 push-ups so that the people who import the snapshot know that it's
 safe, and that the SSI stuff can all be skipped.
 
If the snapshot was safe in the first process, it will be safe for
any others with which it is shared.  Basically, a SERIALIZABLE READ
ONLY DEFERRABLE transaction waits for a snapshot which, as a READ
ONLY transaction, can't see any serialization anomalies.  It can run
exactly like a REPEATABLE READ transaction.  In fact, it would be OK
from a functional perspective if the first transaction in pg_dump
got a safe snapshot through DEFERRABLE techniques and then shared it
with REPEATABLE READ transactions.
 
I don't know which is the best way to implement this, but it should
be fine to skip the DEFERRABLE logic for secondary users, as long as
they are READ ONLY.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 If the intent is that each serializable transaction sharing
 the snapshot is a separate logical transaction, it *might* hold --
 
I think the rules have to be that the snapshot provided to a
serializable transaction must be provided by an active serializable
transaction.  That prevents the serializable global xmin from moving
backwards; which is not allowed except during recovery processing of
prepared transactions.  Each transaction using the snapshot is a
logically separate transaction -- they just have a shared view of
the state of the data.
 
 If the intent is that the work of one logical transaction is being
 split across processes, then SSI doesn't hold up without somehow
 tying all of the processes to a single SERIALIZABLEXACT; and then
 the direct access to MySerializableXact falls apart.
 
Except, as discussed on a separate, concurrent thread, that a READ
ONLY transaction might find its snapshot to be safe -- at which
point it no longer uses a SERIALIZABLEXACT.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Large C files

2011-10-19 Thread Alvaro Herrera

Excerpts from David Fetter's message of lun oct 17 03:00:19 -0300 2011:
 On Fri, Oct 14, 2011 at 07:36:32PM +0100, Peter Geoghegan wrote:
  This evening, David Fetter described a problem to me that he was
  having building the Twitter FDW. It transpired that it was down to its
  dependence on an #include that was recently judged to be
  redundant. This seems like another reason to avoid using pgrminclude -
  even if we can be certain that the #includes are redundant within
  Postgres, we cannot be sure that they are redundant in third party
  code.
 
 Perhaps something that tested some third-party code
 automatically...say, doesn't the new buildfarm stuff allow running
 some arbitrary code?

I think you could run your own buildfarm member and add whatever steps
you wanted (we have one building JDBC).  I'm not sure that we want that
though -- it'd start polluting the greenness of our buildfarm with
failures from code outside of our control.

I mean, if the third party code fails to compile, surely it's the third
party devs that care.

I fail to see why this is such a big deal.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Florian Pflug
On Oct19, 2011, at 19:49 , Kevin Grittner wrote:
 Robert Haas robertmh...@gmail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Florian Pflug f...@phlo.org writes:
 
 This allows a deferrable snapshot to be used on a second
 connection (by e.g. pg_dump), and still be marked as DEFERRABLE.
 If we throw an error unconditionally, the second connection has
 to import the snapshot without marking it DEFERRABLE, which I
 think has consequences for performance.
 
 No, I don't believe that either.  AIUI the performance benefit
 comes if the snapshot is recognized as safe.  DEFERRABLE only
 means to keep retrying until you get a safe one.
 
 Right, there are other circumstances in which a READ ONLY
 transaction's snapshot may be recognized as safe, and it can opt out
 of all the additional SSI logic.  As you say, DEFERRABLE means we
 *wait* for that.

Oh, cool. I thought the opt-out only works for explicitly DEFERRABLE
transactions.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread David E. Wheeler
Hackers,

We've just found an issue with pg_dumpall in 9.1.1 where a dump starts with 
lines like these:

ALTER ROLE dude WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN 
PASSWORD 'md5bdd7f8e73a214981b1519212b02a5530' VALID UNTIL 'infinity';
ALTER ROLE dude SET default_tablespace TO 'users';

And later in the file has lines like this:

CREATE TABLESPACE users OWNER postgres LOCATION 
'/data/postgres/pg_tblspc/users';

Unsurprisingly, perhaps, this results in errors such as:

ERROR:  invalid value for parameter "default_tablespace": "users"

Seems to me that default_tablespace should only be set after tablespaces are 
created, no?

This is wreaking havoc with our ability to run pg_upgrade, FWIW.

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Is it really necessary for GetSerializableTransactionSnapshotInt
 to acquire an empty SERIALIZABLEXACT before it acquires a
 snapshot?  If so, why?  The proposed synchronized-snapshots
 feature will mean that the allegedly-new snapshot actually was
 taken some time before, so it seems to me that either this is not
 necessary or we cannot use a synchronized snapshot in a
 serializable xact.
 
 Hmm.  If the intent is that each serializable transaction sharing
 the snapshot is a separate logical transaction, it *might* hold -- I
 have to think about that a bit.  If the intent is that the work of
 one logical transaction is being split across processes, then SSI
 doesn't hold up without somehow tying all of the processes to a
 single SERIALIZABLEXACT; and then the direct access to
 MySerializableXact falls apart.

No, the intention of the synchronized-snapshots feature is just to be
able to start multiple transactions using exactly the same snapshot.
They're independent after that.  The aspect of it that is worrying me
is that if xact A starts, gets a snapshot and publishes it, and then
xact B starts and wants to adopt that snapshot, then

(1) other transactions may have started or ended meanwhile; does that
break any of SSI's assumptions?

(2) as things stand, xact A need not be running in serializable mode ---
if B is serializable, does *that* break any assumptions?

We already have to have an interlock to ensure that GlobalXmin doesn't
go backwards, by means of requiring A to still be running at the instant
B adopts the snapshot and sets its MyProc->xmin accordingly.  However,
there is not currently anything that guarantees that A is still running
by the time B does GetSerializableTransactionSnapshotInt, shortly later.
So if your answer to question (1) involves an assumption that A is still
running, we're going to have to figure out how to arrange that without
deadlocking on ProcArrayLock vs SerializableXactHashLock.  Which might
be another good reason for changing predicate.c so that we don't hold
the latter while taking a snapshot ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] synchronized snapshots

2011-10-19 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 
 Oh, cool. I thought the opt-out only works for explicitly
 DEFERRABLE transactions.
 
Basically, if there is no serializable read-write transaction active
which overlaps the read-only transaction and also overlaps a
serializable transaction which wrote something and committed in time
to be visible to the read-only transaction, then the read-only
transaction's snapshot is safe and it can stop worrying about SSI
logic.  If these conditions happen to exist when a read-only
transaction is starting, it never needs to set up for SSI; it can
run just like a REPEATABLE READ transaction and still be safe from
serialization anomalies.  We make some effort to spot the transition
to this state while a read-only transaction is running, allowing it
to drop out of SSI while running.
 
The fact that a read-only transaction can often skip some or all of
the SSI overhead (beyond determining that opting out is safe) is why
declaring transactions to be READ ONLY when possible is #1 on my
list of performance considerations for SSI.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Dan Ports
On Wed, Oct 19, 2011 at 12:56:58PM -0400, Tom Lane wrote:
 Is it really necessary for GetSerializableTransactionSnapshotInt to
 acquire an empty SERIALIZABLEXACT before it acquires a snapshot?
 If so, why? 

*That* isn't necessary, no. It is necessary, however, to acquire the
snapshot while SerializableXactHashLock is held. There are a couple
reasons for this: the sxact's lastCommitBeforeSnapshot needs to match
the snapshot, SxactGlobalXmin needs to be set to the correct value,
etc. That's why the call to GetSnapshotData happens from where it does.

 The proposed synchronized-snapshots feature will mean
 that the allegedly-new snapshot actually was taken some time before,
 so it seems to me that either this is not necessary or we cannot use
 a synchronized snapshot in a serializable xact.

There are definitely potential problems here. If the original snapshot
doesn't belong to an active serializable transaction, we may have
discarded the state we need to do SSI, e.g. we might have already
cleaned up SIREAD locks from concurrent committed transactions.

I assume the answer here is going to have to be to either refuse to
start a serializable transaction if that's the case, or make saving a
snapshot inhibit some of the SSI cleanup.

 In the same vein, why is it necessary to be holding
 SerializableXactHashLock (exclusively, yet) while acquiring the
 snapshot?  That seems rather bad from a concurrency standpoint, and
 again it's going to be pretty meaningless if we're just installing a
 pre-existing snapshot.

Yes, it's bad. I'm working on a design to address
SerializableXactHashLock contention, but there needs to be some locking
here for the reasons I mentioned above. I think the best we can do here
is to acquire a lock in shared mode when registering a serializable
transaction and in exclusive mode when committing. (Which is what you'd
expect, I guess; it's the same story as ProcArrayLock, and for most of
the same reasons.) Obviously, we'll also want to minimize the amount of
work we're doing while holding that lock.
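
Very roughly, something like this (a sketch, not working code):

/* register a new serializable transaction under a shared lock ... */
static void
RegisterSerializableXactSketch(void)
{
    LWLockAcquire(SerializableXactHashLock, LW_SHARED);
    /* read lastCommitBeforeSnapshot, link the new sxact into shared state */
    LWLockRelease(SerializableXactHashLock);
}

/* ... and only take it exclusively at commit time */
static void
CommitSerializableXactSketch(void)
{
    LWLockAcquire(SerializableXactHashLock, LW_EXCLUSIVE);
    /* bump the commit sequence number, record conflicts, clean up */
    LWLockRelease(SerializableXactHashLock);
}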

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Tom Lane
Dan Ports d...@csail.mit.edu writes:
 On Wed, Oct 19, 2011 at 12:56:58PM -0400, Tom Lane wrote:
 The proposed synchronized-snapshots feature will mean
 that the allegedly-new snapshot actually was taken some time before,
 so it seems to me that either this is not necessary or we cannot use
 a synchronized snapshot in a serializable xact.

 There are definitely potential problems here. If the original snapshot
 doesn't belong to an active serializable transaction, we may have
 discarded the state we need to do SSI, e.g. we might have already
 cleaned up SIREAD locks from concurrent committed transactions.

 I assume the answer here is going to have to be to either refuse to
 start a serializable transaction if that's the case, or make saving a
 snapshot inhibit some of the SSI cleanup.

We can easily mark an exported snapshot with the IsoLevel of the
transaction that made it, and then for example refuse to adopt a
less-than-serializable snapshot into a serializable transaction.
So that aspect can be had if we need it.  But we still have a race
condition with the patch as it stands, because there is a window for the
original xact to terminate before GetSerializableTransactionSnapshotInt
runs.  It sounds like we have to fix that.

 In the same vein, why is it necessary to be holding
 SerializableXactHashLock (exclusively, yet) while acquiring the
 snapshot?  That seems rather bad from a concurrency standpoint, and
 again it's going to be pretty meaningless if we're just installing a
 pre-existing snapshot.

 Yes, it's bad. I'm working on a design to address
 SerializableXactHashLock contention, but there needs to be some locking
 here for the reasons I mentioned above. I think the best we can do here
 is to acquire a lock in shared mode when registering a serializable
 transaction and in exclusive mode when committing. (Which is what you'd
 expect, I guess; it's the same story as ProcArrayLock, and for most of
 the same reasons.) Obviously, we'll also want to minimize the amount of
 work we're doing while holding that lock.

I wonder whether it would be prudent to set the synchronized-snapshots
patch aside until you've finished that work (assuming you're actively
working on it).  It's evidently going to require some nontrivial changes
in predicate.c, and I don't think the feature should take precedence
over SSI performance improvement.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 We've just found an issue with pg_dumpall in 9.1.1 where a dump starts with 
 lines like these:

 ALTER ROLE dude WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN 
 PASSWORD 'md5bdd7f8e73a214981b1519212b02a5530' VALID UNTIL 'infinity';
 ALTER ROLE dude SET default_tablespace TO 'users';

I'm beginning to think that the correct solution to these problems is to
greatly restrict what you can set in ALTER ROLE/DATABASE SET.  Or at
least to document that if you use it, you get to keep both pieces after
you break pg_dump.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread David E. Wheeler
On Oct 19, 2011, at 2:13 PM, Tom Lane wrote:

ALTER ROLE dude WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN 
 PASSWORD 'md5bdd7f8e73a214981b1519212b02a5530' VALID UNTIL 'infinity';
ALTER ROLE dude SET default_tablespace TO 'users';
 
 I'm beginning to think that the correct solution to these problems is to
 greatly restrict what you can set in ALTER ROLE/DATABASE SET.  Or at
 least to document that if you use it, you get to keep both pieces after
 you break pg_dump.

Sorry, get to keep what two pieces?

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Dan Ports
On Wed, Oct 19, 2011 at 04:36:41PM -0400, Tom Lane wrote:
 No, the intention of the synchronized-snapshots feature is just to be
 able to start multiple transactions using exactly the same snapshot.
 They're independent after that.  The aspect of it that is worrying me
 is that if xact A starts, gets a snapshot and publishes it, and then
 xact B starts and wants to adopt that snapshot, then
 
 (2) as things stand, xact A need not be running in serializable mode ---
 if B is serializable, does *that* break any assumptions?

[taking these in opposite order]

Yes, I think that's going to be a problem. The obvious case where it's
clearly not going to work is if A is older than the oldest active
serializable xact (i.e. SxactGlobalXmin would have to move backwards).
It's probably possible to make it work when that's not the case, but I
think it's better to require A to be serializable -- if nothing else,
it's a far simpler rule to document!

There is another case that could be problematic, if A was READ ONLY,
and B isn't. It sounds to me like that would also be a reasonable thing
to forbid.

 (1) other transactions may have started or ended meanwhile; does that
 break any of SSI's assumptions?

Mostly, no, if A is still running. There's one case that needs to be
handled a bit carefully, but shouldn't be a problem: if A was
SERIALIZABLE READ ONLY, and its snapshot was found to be safe, then
it's actually running (safely) at REPEATABLE READ. If we start a new
read-only transaction at the same snapshot, we need to make it run at
REPEATABLE READ if it requests SERIALIZABLE.
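
So the import-time checks might boil down to something like this (a sketch;
the variable names are invented):

/* sketch of the checks, not actual code */
if (!sourceXactIsSerializable)
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
             errmsg("cannot import a snapshot from a non-serializable transaction")));

if (sourceXactIsReadOnly && !XactReadOnly)
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
             errmsg("cannot import a snapshot from a READ ONLY transaction into a read-write transaction")));

if (sourceSnapshotWasSafe)
    runAsRepeatableRead = true;     /* skip the SSI machinery entirely */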

 We already have to have an interlock to ensure that GlobalXmin doesn't
 go backwards, by means of requiring A to still be running at the instant
 B adopts the snapshot and sets its MyProc-xmin accordingly.  However,
 there is not currently anything that guarantees that A is still running
 by the time B does GetSerializableTransactionSnapshotInt, shortly later.
 So if your answer to question (1) involves an assumption that A is still
 running, we're going to have to figure out how to arrange that without
 deadlocking on ProcArrayLock vs SerializableXactHashLock.

Yep, I think we're going to have to do that. I haven't had a chance to
look at the synchronized snapshots patch yet, so I can't (yet) offer
any suggestions about how to implement it.

 Which might
 be another good reason for changing predicate.c so that we don't hold
 the latter while taking a snapshot ...

It'd be great if we could do that, but I don't really see it being
possible...

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Kevin Grittner
Dan Ports d...@csail.mit.edu wrote:
 
 the sxact's lastCommitBeforeSnapshot needs to match the snapshot,
 SxactGlobalXmin needs to be set to the correct value, etc. That's
 why the call to GetSnapshotData happens from where it does
 
Oh, right.  I knew I was forgetting something.  What if that was
captured as part of building a snapshot?  That seems like it would
be a trivial cost compared to other snapshot-building activity, and
might give us a way to get this out from under the
SerializableXactHashLock locking.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 David E. Wheeler da...@kineticode.com writes:
 We've just found an issue with pg_dumpall in 9.1.1 where a dump starts with 
 lines like these:

     ALTER ROLE dude WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN 
 PASSWORD 'md5bdd7f8e73a214981b1519212b02a5530' VALID UNTIL 'infinity';
     ALTER ROLE dude SET default_tablespace TO 'users';

 I'm beginning to think that the correct solution to these problems is to
 greatly restrict what you can set in ALTER ROLE/DATABASE SET.  Or at
 least to document that if you use it, you get to keep both pieces after
 you break pg_dump.

This is another instance of the general principle that we need to
create all the objects first, and then set their properties.  I
believe you came up with one counterexample where we needed to set the
GUC first in order to be able to create the object, but ISTM most of
them are going the other way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 6:35 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2011/10/19 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Oct 16, 2011 at 4:46 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I tried to reproduce the scenario with a small enough
 from/join_collapse_limit
 (typically 1), but it still allows qualifiers to be pushed down into the
 lowest-level scan plan.

 Hmm, you're right.  LIMIT 10 prevents qual pushdown, but
 hitting from_collapse_limit/join_collapse_limit apparently doesn't.  I
 could have sworn I've seen this work the other way, but I guess not.

 No, the collapse_limit variables are entirely unrelated to subquery
 flattening, or to qual pushdown for that matter.  They only restrict the
 number of join paths we consider.  And we will attempt to push down
 quals into an unflattened subquery, too, if it looks safe.  See
 subquery_is_pushdown_safe, qual_is_pushdown_safe, etc in allpaths.c.

 I tried to observe the behavior with a small modification of is_simple_subquery
 that makes it always return 'false'.
 (It simulates what would happen if a view with security_barrier were given.)

 The expected behavior is to keep the sub-query unflattened.
 However, the externally provided qualifiers are still correctly pushed down.

 Do we need to focus on the code around the above functions, rather than
 distribute_qual_to_rels, to prevent undesirable pushing-down across a
 security barrier?

 postgres=# CREATE VIEW v1 AS SELECT * FROM t1 WHERE a < 100;
 CREATE VIEW
 postgres=# CREATE VIEW v2 AS SELECT * FROM t2 JOIN t3 ON x = s;
 CREATE VIEW
 postgres=# EXPLAIN SELECT * FROM v1 WHERE b = 'bbb';
                     QUERY PLAN
 -----------------------------------------------------
  Seq Scan on t1  (cost=0.00..28.45 rows=2 width=36)
    Filter: ((a < 100) AND (b = 'bbb'::text))
 (2 rows)
 postgres=# EXPLAIN SELECT * FROM v2 WHERE t = 'ttt';
                           QUERY PLAN
 --------------------------------------------------------------
  Hash Join  (cost=25.45..52.73 rows=37 width=72)
    Hash Cond: (t2.x = t3.s)
    ->  Seq Scan on t2  (cost=0.00..22.30 rows=1230 width=36)
    ->  Hash  (cost=25.38..25.38 rows=6 width=36)
          ->  Seq Scan on t3  (cost=0.00..25.38 rows=6 width=36)
                Filter: (t = 'ttt'::text)
 (6 rows)

Well, there's clearly some way to prevent pushdown from happening,
because sticking a LIMIT in there does the trick...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Dan Ports d...@csail.mit.edu wrote:
 the sxact's lastCommitBeforeSnapshot needs to match the snapshot,
 SxactGlobalXmin needs to be set to the correct value, etc. That's
 why the call to GetSnapshotData happens from where it does
 
 Oh, right.  I knew I was forgetting something.  What if that was
 captured as part of building a snapshot?  That seems like it would
 be a trivial cost compared to other snapshot-building activity, and
 might give us a way to get this out from under the
 SerializableXactHashLock locking.

But aren't the values you need to fetch protected by
SerializableXactHashLock?  Having to take an additional LWLock is
*not* a trivial cost.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Dan Ports d...@csail.mit.edu wrote:
 the sxact's lastCommitBeforeSnapshot needs to match the
 snapshot, SxactGlobalXmin needs to be set to the correct value,
 etc. That's why the call to GetSnapshotData happens from where
 it does
  
 Oh, right.  I knew I was forgetting something.  What if that was
 captured as part of building a snapshot?  That seems like it
 would be a trivial cost compared to other snapshot-building
 activity, and might give us a way to get this out from under the
 SerializableXactHashLock locking.
 
 But aren't the values you need to fetch protected by
 SerializableXactHashLock?  Having to take an additional LWLock is
 *not* a trivial cost.
 
I was thinking that this would become a more general commit
sequence number and could be bundled into the snapshot.  It would
also allow cleaning up the funny double-increment we did as a
workaround for this not being incremented at the actual moment of
commit.  There was actually a patch floated to do it that way near
the end of the 9.1 development cycle.  I imagine that's probably
suffered major bitrot because of Robert's 9.2 work, but the general
idea is the same.
 
I agree that if it can't fit under existing locks at commit and
snapshot build times, it isn't feasible.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Oct 19, 2011 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm beginning to think that the correct solution to these problems is to
 greatly restrict what you can set in ALTER ROLE/DATABASE SET.  Or at
 least to document that if you use it, you get to keep both pieces after
 you break pg_dump.

 This is another instance of the general principle that we need to
 create all the objects first, and then set their properties.  I
 believe you came up with one counterexample where we needed to set the
 GUC first in order to be able to create the object, but ISTM most of
 them are going the other way.

Well, a general principle for which we already know one counterexample
isn't general enough for me.  The problem that I'm having here is that
it's not clear that there is any general solution, short of pg_dumpall
having variable-by-variable knowledge of which GUCs to set when, and
maybe even that wouldn't be good enough.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-10-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Well, there's clearly some way to prevent pushdown from happening,
 because sticking a LIMIT in there does the trick...

I already pointed you at subquery_is_pushdown_safe ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Dan Ports
On Wed, Oct 19, 2011 at 05:04:52PM -0400, Tom Lane wrote:
 I wonder whether it would be prudent to set the synchronized-snapshots
 patch aside until you've finished that work (assuming you're actively
 working on it).  It's evidently going to require some nontrivial changes
 in predicate.c, and I don't think the feature should take precedence
 over SSI performance improvement.

I wouldn't hold the patch up on my account. Improving the SSI locking
situation looks to be a fairly substantial project. I've been drawing
up a plan to fix it, but I'm also travelling for most of the next two
weeks and probably won't be able to do any serious hacking on it until
I'm back to the office.

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ProcessStandbyHSFeedbackMessage can make global xmin go backwards

2011-10-19 Thread Tom Lane
ProcessStandbyHSFeedbackMessage has a race condition: it thinks it can
call GetOldestXmin and then the result will politely hold still while
it considers what to do next.  But in fact, whoever has the oldest xmin
could exit their transaction, allowing the global minimum to advance.
If a VACUUM process then inspects the ProcArray, it could compute an
oldest xmin that is newer than the value that
ProcessStandbyHSFeedbackMessage installs just afterwards.  So much
for keeping the data the standby wanted.

AFAICS we have to do all the logic about choosing the new value for
MyProc->xmin while holding ProcArrayLock, which IMO means that it should
go into a function in procarray.c.  The fact that walsender.c is taking
ProcArrayLock and whacking MyProc->xmin around is already a horrid
violation of modularity, even if it weren't incorrect.

Also, it seems like using GetOldestXmin is quite wrong here anyway; we
don't really want a result modified by vacuum_defer_cleanup_age do we?
It looks to me like that would result in vacuum_defer_cleanup_age being
applied twice: the walsender process sets its xmin the defer age into
the past, and then subsequent calls of GetOldestXmin compute a result
that is the defer age further back.  IOW this is an independent
mechanism that also results in the computed global xmin going backwards.

(Now that we have a standby feedback mechanism, I'm a bit tempted to
propose getting rid of vacuum_defer_cleanup_age altogether, rather than
hacking things to avoid the above.)
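
What I have in mind is a procarray.c entry point shaped roughly like this
(the name is invented and the details are glossed over):

void
ProcArraySetWalSenderXmin(TransactionId feedbackXmin)   /* invented name */
{
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

    /*
     * Decide on and install the new xmin while still holding the lock, so
     * nobody can compute a newer global xmin in between.  The real thing
     * would also need to check that feedbackXmin is still covered by the
     * current global xmin before accepting it.
     */
    if (TransactionIdIsNormal(feedbackXmin))
        MyProc->xmin = feedbackXmin;

    LWLockRelease(ProcArrayLock);
}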

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread Florian Pflug
On Oct20, 2011, at 00:09 , Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Oct 19, 2011 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm beginning to think that the correct solution to these problems is to
 greatly restrict what you can set in ALTER ROLE/DATABASE SET.  Or at
 least to document that if you use it, you get to keep both pieces after
 you break pg_dump.
 
 This is another instance of the general principle that we need to
 create all the objects first, and then set their properties.  I
 believe you came up with one counterexample where we needed to set the
 GUC first in order to be able to create the object, but ISTM most of
 them are going the other way.
 
 Well, a general principle for which we already know one counterexample
 isn't general enough for me.  The problem that I'm having here is that
 it's not clear that there is any general solution, short of pg_dumpall
 having variable-by-variable knowledge of which GUCs to set when, and
 maybe even that wouldn't be good enough.

This whole issue reminds me of the situation we had before pg_dump
had the smarts to traverse the object dependency graph and emit schema
objects in the correct order. (pg_dump gained that ability somewhere
around 7.3 or 7.4 if memory serves correctly)

So here's a wild idea. Could we somehow make use of the dependency
machinery to solve this once and for all? Maybe we could record the
dependency between per-role and/or per-database GUC settings and the
referenced objects.

Or we could add a flag FORCE to ALTER ROLE/DATABASE SET for pg_dump's
benefit which would skip all validity checks on the value (except it being
of the correct type, maybe).

Taking this even further, why do we bother with non-immutable (i.e.,
depending on the database's contents) checks during ALTER ROLE/DATABASE SET
at all? If we don't record a dependency on referenced schema objects,
nothing prevents that object from being dropped *after* the ALTER ROLE/DATABASE
SET occurred...

If we're trying to protect against typos in settings such as default_tablespace,
a WARNING ought to be sufficient.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI implementation question

2011-10-19 Thread Dan Ports
I think it would be fairly sensible to push some of this into
ProcArray, actually. The commit sequence numbers just involve assigning/
incrementing a global counter when taking a snapshot and finishing a
transaction -- that's not too much work, doesn't require any additional
locking beyond ProcArrayLock, and isn't too tied to SSI. (I could
imagine it being useful for other purposes, though I'm not going to
make that argument too seriously without actually having one in mind.)
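
As a toy sketch of that idea (all the names are invented, and the counter
would of course live in shared memory):

static uint64 sharedCommitSeqNo = 1;

/* called at commit, already under ProcArrayLock (e.g. from
 * ProcArrayEndTransaction()) */
static void
AdvanceCommitSeqNo(void)
{
    sharedCommitSeqNo++;
}

/* called while building a snapshot, also already under ProcArrayLock */
static void
StampSnapshotWithCommitSeqNo(Snapshot snapshot)
{
    snapshot->lastCommitSeqNo = sharedCommitSeqNo;      /* invented field */
}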

SxactGlobalXmin and WritableSxactCount are obviously more SSI-specific,
but I think we can come up with something reasonable to do with them.

The part that's harder is building the list of potential conflicts
that's used to identify safe snapshots for r/o transactions. That
(currently) has to happen atomically with taking the snapshot. We'll
probably have to do this in some significantly different way, but I
haven't quite worked out what it is yet.

On the bright side, if we can address these three issues, we shouldn't
need to take SerializableXactHashLock at all when starting a
transaction. (Realistically, we might have to take it, or some other
lock, in shared mode to handle one of them -- but I really want starting a
serializable xact to not take any exclusive locks.)

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 Taking this even further, why do we bother with non-immutable (i.e.,
 depending on the database's contents) checks during ALTER ROLE/DATABASE SET
 at all?

Yeah, I was wondering about that one too.  It would not solve all the
problems here, but skipping validity checks would fix some of them.
The trouble of course is what happens if the value is found to be bad
when you try to use it ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dumpall Sets Roll default_tablespace Before Creating Tablespaces

2011-10-19 Thread Florian Pflug
On Oct20, 2011, at 01:19 , Tom Lane wrote:
 Florian Pflug f...@phlo.org writes:
 Taking this even further, why do we bother with non-immutable (i.e.,
 depending on the database's contents) checks during ALTER ROLE/DATABASE SET
 at all?
 
 Yeah, I was wondering about that one too.  It would not solve all the
 problems here, but skipping validity checks would fix some of them.
 The trouble of course is what happens if the value is found to be bad
 when you try to use it ...

Presumably we'd detect that during logon, because the GUC assignment
hook will quite probably complain. I'd vote for emitting a warning in
that case. This is also what we do currently if we fail to set the
GUC to the desired value due to permission issues:

postgres=# create role r1 login;
CREATE ROLE
postgres=# create role r2;
CREATE ROLE
postgres=# alter role r1 set role = r2;
ALTER ROLE
postgres=# \connect - r1
WARNING:  permission denied to set role r2
WARNING:  permission denied to set role r2
You are now connected to database postgres as user r1.

(Dunno why that WARNING appears twice)

Since an ALTER DATABASE/ROLE SET doesn't prevent the user from overriding
the value, ignoring invalid settings shouldn't be a security risk.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Update on documentation builds on OSX w/ macports

2011-10-19 Thread Florian Pflug
I've recently gotten annoyed again by the sorry state of docbook
SGML support in macports, and finally decided to do something
about it (OK, the fact that I recently deleted my Ubuntu VM,
forgetting that one of the reasons I had it was to be able to
build our docs has something to do with it also...)

I've patched the ports for openjade, iso8879 and docbook-dsssl,
and added a new port for docbook-sgml-4.2. These patches are sitting
in the macports trac now, waiting to be applied.

In the mean time, the modified ports are all contained in the
attached tar.bz2, should any of ye fellow OSX users want to try them
out before that.

Simply extract that archive, and add
  file://<Absolute path to the extracted archive>
to /opt/local/etc/macports/sources.conf. After that, 
  port install openjade docbook-sgml-4.2
should give you a working docbook SGML environment.

Should openjade fail to build, try
  port install openjade -universal
instead. On my machine, with XCode 4.2 installed, the universal variant
of openjade fails to build for some reason.

Many thanks to Bernd Helmle for his blog entry on the subject of
docbook SGML and macports[1]. Without that, I probably would have
created a new Ubuntu VM instead of playing with this.

best regards,
Florian Pflug

[1] http://psoos.blogspot.com/2009/09/building-postgresql-documentation-on.html


macports.docbook-sgml-4.2.tar.bz2
Description: BZip2 compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-19 Thread Josh Kupershmidt
On Mon, Sep 26, 2011 at 5:03 PM, Andreas Karlsson andr...@proxel.se wrote:
 Magnus's patch for adding tab completion of views to the TABLE statement
 reminded me of a minor annoyance of mine -- that EXECUTE always completes
 with PROCEDURE as if it were part of CREATE TRIGGER ... EXECUTE
 even when it is the first word of the line.

+1

 Attached is a simple patch which adds completion of prepared statement names
 to the EXECUTE statement.

 What could perhaps be added is that if you press tab again after completing
 the prepared statement name you might want to see a single "(" appear. Did
 not add that though since "EXECUTE foo();" is not valid syntax (while
 "EXECUTE foo(1);" is) and I did not feel the extra lines of code to add a
 query to check if the number of expected parameters is greater than 0 were
 worth it.

Yeah, that doesn't seem worth the trouble. The patch looks fine to me;
it doesn't break the existing EXECUTE completion intended for CREATE
TRIGGER and seems to work OK on a few examples I tried.

I guess the only quibble I can see is that the two comment lines might
be better written together, to mimic the neighboring comment styles,
as in:
/* EXECUTE */
/* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */
else if ...
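
Spelled out, the whole clause might then look roughly like this (the query
macro name below is just a placeholder for whatever the patch defines):

/* EXECUTE, but must not match CREATE TRIGGER ... EXECUTE PROCEDURE */
else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 &&
         pg_strcasecmp(prev2_wd, "EXECUTE") == 0)
    COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements);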

Incidentally, I was wondering what the heck was up with a clause like this:
else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 &&
         pg_strcasecmp(prev2_wd, "EXECUTE") == 0)

though that looks to be some strange quirk of previous_word()'s behavior.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-19 Thread Tom Lane
Josh Kupershmidt schmi...@gmail.com writes:
 Incidentally, I was wondering what the heck was up with a clause like this:
 else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 &&
          pg_strcasecmp(prev2_wd, "EXECUTE") == 0)

Hmm, maybe || was meant not && ?  It seems pretty unlikely that the
above test would ever trigger on valid SQL input.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXECUTE tab completion

2011-10-19 Thread Josh Kupershmidt
On Wed, Oct 19, 2011 at 10:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Kupershmidt schmi...@gmail.com writes:
 Incidentally, I was wondering what the heck was up with a clause like this:
     else if (pg_strcasecmp(prev_wd, "EXECUTE") == 0 &&
              pg_strcasecmp(prev2_wd, "EXECUTE") == 0)

 Hmm, maybe || was meant not && ?  It seems pretty unlikely that the
 above test would ever trigger on valid SQL input.

Well, changing '&&' to '||' breaks the stated comment of the patch, namely:
  /* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */

I assume this is an accepted quirk of previous_word() since we have
this existing similar code:

/* DROP, but watch out for DROP embedded in other commands */
/* complete with something you can drop */
else if (pg_strcasecmp(prev_wd, "DROP") == 0 &&
         pg_strcasecmp(prev2_wd, "DROP") == 0)

and the patch does seem to auto-complete a beginning EXECUTE
correctly. We could probably use a comment somewhere explaining this
quirk.
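
Something along these lines, perhaps -- assuming the quirk really is that
previous_word() keeps handing back the first word of the line once it runs
off the start of the buffer:

/*
 * previous_word() returns the first word again when asked to look back
 * past the start of the line, so prev_wd == prev2_wd == "EXECUTE"
 * identifies EXECUTE as the first word of the command.
 */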

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update on documentation builds on OSX w/ macports

2011-10-19 Thread Dan Ports
On Thu, Oct 20, 2011 at 02:02:09AM +0200, Florian Pflug wrote:
 I've patched the ports for openjade, iso8879 and docbook-dsssl,
 and added a new port for docbook-sgml-4.2. These patches are sitting
 in the macports trac now, waiting to be applied.

I'll try to take a look at them in the next couple days (with my
MacPorts hat on), unless someone beats me to it.

Dan

-- 
Dan R. K. Ports   MIT CSAIL   http://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] DROP statement reworks

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 3:30 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 part-1: only regression test of DROP [IF EXISTS] on unmodified master

Committed.  Which I just noticed broke the build farm.  Will go fix that now...

 part-2: drop statement reworks that uses T_DropStmt

Committed with some changes.

 part-3: drop statement reworks for other object classes

This is going to need some rebasing.

 Unfortunately, part-3 had to change the regression test portion a bit,
 because ...
  - Unmodified master does not print ", skipping" when we tried to
   drop a non-existent operator class with IF EXISTS.

OK, we should fix that.

  - Unmodified master raised an error, not a notice, when we tried to
   drop a non-existent operator family with IF EXISTS.
   That is a bug to be fixed, isn't it?

Yeah, that seems like a bug.
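
For reference, the asymmetry presumably looks something like the following
on unmodified master (hypothetical session, message wording sketched):

postgres=# DROP OPERATOR CLASS IF EXISTS no_such_opclass USING btree;
NOTICE:  operator class "no_such_opclass" does not exist for access method "btree"
DROP OPERATOR CLASS
postgres=# DROP OPERATOR FAMILY IF EXISTS no_such_opfamily USING btree;
ERROR:  operator family "no_such_opfamily" does not exist for access method "btree"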

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] DROP statement reworks

2011-10-19 Thread Robert Haas
On Wed, Oct 19, 2011 at 11:33 PM, Robert Haas robertmh...@gmail.com wrote:
 Committed.  Which I just noticed broke the build farm.  Will go fix that 
 now...

Wow, that was a lot of pretty colors.  Or not so pretty colors.

Seems to be turning green again now, so I'm going to bed.  Will check
it again in the morning.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers