Re: [HACKERS] REVIEW: EXPLAIN and nfiltered

2011-01-21 Thread Hitoshi Harada
2011/1/21 Florian Pflug :
> On Jan21, 2011, at 03:29 , Tom Lane wrote:
>> "Kevin Grittner"  writes:
>>>> Robert Haas  wrote:
>>>> Oh, you mean water that had some things you didn't want taken out
>>>> of it?
>>
>>> Right -- God only knows the number of things were filtered out to
>>> leave me with filtered water.  What's "filtered" in this case is what
>>> was passed through, not what was removed.
>>
>> I think it's pretty common to use the phrase "filtered out" to identify
>> the stuff that gets removed by the filter, as opposed to what gets
>> through.  So we could possibly use "Rows Filtered Out: nnn".  I still
>> think that's more awkward than "Rows Removed: nnn" though.
>
> "Rows Skipped: nnn", maybe?

+1. Very straightforward to me.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto-sizing wal_buffers

2011-01-21 Thread Jaime Casanova
On Sat, Jan 22, 2011 at 12:33 AM, Robert Haas  wrote:
> On Sat, Jan 15, 2011 at 11:52 AM, Greg Smith  wrote:
>> Where I was expecting that setting to be "4" instead for 32kB.  So there's
>> probably some minor bug left in where I inserted this into the
>> initialization sequence.
>
> So I exposed the relevant formatting logic from guc.c as a separate function

I have read this very briefly, so not much comment... just a few questions...

why is this better than using XLOG_BUFFER_MIN? (the same for the 8
buffers assigned just above it)

+   else if (XLOGbuffers < 4)
+   XLOGbuffers = 4;


also this
+   Assert(XLOGbuffers > 0);
maybe should be
Assert(XLOGbuffers >= XLOG_BUFFER_MIN);


while you move the code, why didn't you keep this comment?
-   /*
-* Use int64 arithmetic to avoid 
overflows in units
-* conversion.
-*/



-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training



Re: [HACKERS] auto-sizing wal_buffers

2011-01-21 Thread Robert Haas
On Sat, Jan 15, 2011 at 11:52 AM, Greg Smith  wrote:
> Where I was expecting that setting to be "4" instead for 32kB.  So there's
> probably some minor bug left in where I inserted this into the
> initialization sequence.

So I started taking a look at this patch tonight with an eye to
committing it, but ended up having to whack it around fairly hard to
fix the problem described above: the problem is, I believe, that the
initialization code in question doesn't run in every backend.  My
first thought was "gee, it's a bad idea for us to be manipulating the
value of the GUC variable directly, I should use an assign_hook".
But of course that turns out to be a bad idea, because there's no
guarantee that wal_buffers will be initialized after shared_buffers.
So I put it back the way you had it, and jiggered things so that the
copy of the value that's actually used for memory allocation gets
stored in XLogCtl.  That made it possible to define a show_hook that
DTRT, but that wasn't entirely straightforward either: the show_hook
has to deliver the finished string, not just a replacement integer for
guc.c to format.  So I exposed the relevant formatting logic from
guc.c as a separate function (duplicating that code didn't seem
smart), and now it all seems to work.  See attached, which also
includes some other rewriting of code and docs that I hope constitute
improvements.

Barring screams of agony^W^W^Whelpful suggestions for how to code this
more neatly, I'll go ahead and commit this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8e2a2c5..bb67131 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1638,12 +1638,24 @@ SET ENABLE_SEQSCAN TO OFF;
   
   

-The amount of memory used in shared memory for WAL data.  The
-default is 64 kilobytes (64kB).  The setting need only
-be large enough to hold the amount of WAL data generated by one
-typical transaction, since the data is written out to disk at
-every transaction commit.  This parameter can only be set at server
-start.
+The amount of shared memory used for storing WAL data.  The
+default setting of -1 selects a size equal to 1/32 of
+shared_buffers (about 3%), but not less than
+64kB or more than the size of one WAL segment,
+typically 16MB.  This value may be manually
+changed if it is too large or too small, but any positive value
+less than 32kB will be treated as
+32kB.
+   
+
+   
+The contents of these buffers are written out to disk at every
+transaction commit, so extremely large values are unlikely to
+provide a significant benefit.  However, setting this value to at
+least a few megabytes can improve write performance on a busy
+server where many clients are committing at once.  The auto-tuning
+selected by the default setting of -1 should give reasonable
+results in most cases.

 

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5b6a230..a5df3bb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -69,7 +69,8 @@
 /* User-settable parameters */
 int			CheckPointSegments = 3;
 int			wal_keep_segments = 0;
-int			XLOGbuffers = 8;
+int			wal_buffers = -1;/* raw GUC setting */
+int			XLOGbuffers = -1;/* buffers actually allocated */
 int			XLogArchiveTimeout = 0;
 bool		XLogArchiveMode = false;
 char	   *XLogArchiveCommand = NULL;
@@ -381,6 +382,7 @@ typedef struct XLogCtlData
 	int			XLogCacheBlck;	/* highest allocated xlog buffer index */
 	TimeLineID	ThisTimeLineID;
 	TimeLineID	RecoveryTargetTLI;
+	int			wal_buffers;	/* so backends can see auto-tuned value */
 
 	/*
 	 * archiveCleanupCommand is read from recovery.conf but needs to be in
@@ -4778,6 +4780,47 @@ GetSystemIdentifier(void)
 }
 
 /*
+ * Auto-tune the number of XLOG buffers.
+ *
+ * When wal_buffers == -1, we auto-tune to about 3% of shared_buffers, with
+ * a maximum of one XLOG segment and a minimum of 8 blocks (the default value
+ * prior to PostgreSQL 9.1, when auto-tuning was added).  We also clamp the
+ * manually set minimum to four blocks (prior to PostgreSQL 9.1, a smaller
+ * value would have been out of range, but since the minimum is now -1 for
+ * auto-tuning, we just silently treat such values as a request for the
+ * minimum).
+ */
+static void
+XLOGTuneNumBuffers()
+{
+	XLOGbuffers = wal_buffers;
+
+	if (XLOGbuffers == -1)
+	{
+		XLOGbuffers = NBuffers / 32;
+		if (XLOGbuffers > XLOG_SEG_SIZE / XLOG_BLCKSZ)
+			XLOGbuffers = XLOG_SEG_SIZE / XLOG_BLCKSZ;
+		if (XLOGbuffers < 8)
+			XLOGbuffers = 8;
+	}
+	else if (XLOGbuffers < 4)
+		XLOGbuffers = 4;
+}
+
+/*
+ * Show the number of WAL buffers; we want this to display the auto-tuned value
+ * from sha

Re: [HACKERS] WIP: RangeTypes

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 2:30 PM, Jeff Davis  wrote:
> On Fri, 2011-01-21 at 12:31 -0500, Robert Haas wrote:
>> On Thu, Jan 20, 2011 at 4:29 AM, Jeff Davis  wrote:
>> > New patch. I added a lot of generic range functions, and a lot of
>> > operators.
>> >
>> > There is still more work to do, this is just an updated patch. The
>> > latest can be seen on the git repository, as well:
>>
>> So is this 9.2 material at this point?
>
> Regardless of whether it's eligible to be in 9.1, I plan to keep working
> on it.
>
> I would appreciate some overall feedback during this commitfest. Much of
> the code is there, so it would be helpful if we could settle issues like
> representation, functionality, interface, catalog, API, grammar, and
> naming. Otherwise, those issues will just be a reason to bounce it from
> commitfest-next, as well.

Agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kääriäinen Anssi
"When I test your example, though, I'm getting the serialization
failure on T3 rather than T2, so I'd call that a bug.  Will
investigate.  Thanks again for your tests!  You seem to be able to
shake out issues better than anyone else!  Once found, fixing them
is not usually very hard, it's coming up with that creative usage
pattern to *find* the problem which is the hard part."

Thank you very much, but I do not deserve this honor. I was just constructing 
an example for myself so that I could understand why a read-only transaction 
might pose a problem. I posted it to help other people see a concrete 
example of the problem. I had no idea this would show an actual bug in the code.

"OK if I add this one to our dcheck test suite, too?"

It is of course OK. And if you want to add this as an example in the 
documentation, it would be great. This is a simple but concrete example of why 
a read-only serializable transaction might cause an anomaly. If I am not 
mistaken, there isn't any complete example in the documentation. It was hard 
for me to grasp why there might be a problem and I don't think I am alone.

 - Anssi



Re: [HACKERS] READ ONLY fixes

2011-01-21 Thread Kevin Grittner
Robert Haas  wrote:
 
> Upon further review, I am wondering if it wouldn't be simpler and
> more logical to allow idempotent changes of these settings at any
> time, and to restrict only changes that actually change something.
 
I don't care a lot about that either -- if I remember correctly, we
got here based largely on my somewhat tentative interpretation of the
standard.  Even if my reading was right (of which I'm far from sure),
it would just mean that we have an extension to the standard in
allowing the benign declarations.  I'm sure not going to lose any
sleep over that.
 
I'll do whatever people want in this regard with no reservations.
 
-Kevin



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
> Jeff Davis  wrote:
> On Fri, 2011-01-21 at 18:52 -0600, Kevin Grittner wrote:
>> My assumption is that when we have a safe snapshot (which should
>> be pretty close to all the time), we immediately provide it to any
>> serializable transaction requesting a snapshot, except it seems to
>> make sense to use the new DEFERRABLE mode to mean that you want to
>> use the *next* one to arrive.
> 
> How would it handle this situation:
> 1. Standby has safe snapshot S1
> 2. Primary does a VACUUM which removes some stuff visible in S1
> 3. Standby can't replay the VACUUM because it still has S1, but
> also can't get a new S2 because the WAL needed for that is behind
> the VACUUM
> 
> So, S1 needs to be discarded. What do we do on the standby while
> there is no safe snapshot? I suppose throw errors -- I can't think
> of anything else.
 
We could wait for the next safe snapshot to arrive.  I don't know how
often that combination would occur, particularly in a situation where
there were long-running serializable read-write transactions on the
master which would prevent a new safe snapshot from being generated. 
It seems as though a long-running transaction on the master would
also block vacuum activity.
 
I'm not sure how we can *really* know the frequency without field
experience.
 
>> This would effectively cause the point in time which was visible
>> to serializable transactions to lag behind what is visible to
>> other transactions by a variable amount, but would ensure that a
>> serializable transaction couldn't see any serialization anomalies.
>> It would also be immune to serialization failures from SSI logic;
>> but obviously, standby-related cancellations would be in play. I
>> don't know whether the older snapshots would tend to increase the
>> standby-related cancellations, but it wouldn't surprise me.
> 
> I'm also a little concerned about the user-understandability here.
> Is it possible to make the following guarantees in this approach:
> 
> 1. If transactions are completing on the primary, new snapshots
> will be taken on the standby; and
 
The rules there are rather complicated.  Based on previous responses
to posts where I've gotten into that detail, I fear that specifying
it with complete accuracy would cause so many eyes to glaze over that
nobody would get to the end of the description.  I will do it if
anybody asks, but without that I'll just say that the conditions for
blocking a safe snapshot in a mix of short-lived read-write
transactions are so esoteric that I expect that they would be
uncommon in practical use.  On the other hand, one long-running
read-write transaction could block generation of a new safe snapshot
indefinitely.  Transactions declared as read-only or running at an
isolation level other than serializable would have no impact on
generation of a safe snapshot.
 
> 2. If no write transactions are in progress on the primary, then
> the standby will get a snapshot that represents the exact same data
> as on the primary?
 
A snapshot taken while there are no serializable read-write
transactions active can immediately be declared safe.  Whether such a
snapshot is always available on the standby depends on what sort of
throttling, if any, is used.
 
> That would be fairly easy to explain to users. If there is a
> visibility lag, then we just say "finish the write transactions,
> and progress will be made". And if the system is idle, they should
> see identical data.
 
Well, unless it's sync rep, you'll always have some latency between
the master and the standby.  And any throttling to control resource
utilization could also cause latency between other transactions and
serializable ones.  But other than that, you're exactly on target.
 
-Kevin



Re: [HACKERS] READ ONLY fixes

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 7:08 PM, Kevin Grittner
 wrote:
> Robert Haas  wrote:
>> Jeff Janes  wrote:
>>> I found the following message somewhat confusing:
>>> ERROR:  read-only property must be set before any query
>>
>> I think what we need here is two messages, this one and a similar
>> one that starts with "read-write property...".
>
> Done.  I started out by being cute with plugging "only" or "write"
> into a single message, but then figured that might be hard on
> translators; so I went with two separate messages.

Make sense.

I committed the part of this that applies to SET TRANSACTION ISOLATION
LEVEL; the remainder is attached.

Upon further review, I am wondering if it wouldn't be simpler and more
logical to allow idempotent changes of these settings at any time, and
to restrict only changes that actually change something.  It feels
really weird to allow changing these properties to their own values at
any time within a subtransaction, but not in a top-level transaction.
Why not:

if (source != PGC_S_OVERRIDE && newval && XactReadOnly)
{
if (IsSubTransaction())
cannot set transaction read-write mode inside a read-only transaction;
else if (FirstSnapshotSet)
transaction read-write mode must be set before any query;
else if (RecoveryInProgress())
cannot set transaction read-write mode during recovery;
}

That seems a lot more straightforward than this logic, and it saves
one translatable message, too.

I'm not bent on this route if people feel strongly otherwise, but it
seems like it'd be simpler without really losing anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c
index 1e9bdc3..02825c2 100644
--- a/src/backend/commands/variable.c
+++ b/src/backend/commands/variable.c
@@ -544,6 +544,49 @@ show_log_timezone(void)
 
 
 /*
+ * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE
+ *
+ * These should be transaction properties which can be set in exactly the
+ * same points in time that transaction isolation may be set.
+ */
+bool
+assign_transaction_read_only(bool newval, bool doit, GucSource source)
+{
+	/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+	if (source != PGC_S_OVERRIDE)
+	{
+		/* Can't go to r/w mode inside a r/o transaction */
+		if (newval == false && XactReadOnly && IsSubTransaction())
+		{
+			ereport(GUC_complaint_elevel(source),
+	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+	 errmsg("cannot set transaction read-write mode inside a read-only transaction")));
+			return false;
+		}
+		/* Top level transaction can't change this after first snapshot. */
+		if (FirstSnapshotSet && !IsSubTransaction())
+		{
+			ereport(GUC_complaint_elevel(source),
+	(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+	 errmsg(newval
+			? "transaction read-only mode must be set before any query"
+			: "transaction read-write mode must be set before any query")));
+			return false;
+		}
+		/* Can't go to r/w mode while recovery is still active */
+		if (newval == false && XactReadOnly && RecoveryInProgress())
+		{
+			ereport(GUC_complaint_elevel(source),
+	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+	 errmsg("cannot set transaction read-write mode during recovery")));
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
  * SET TRANSACTION ISOLATION LEVEL
  */
 const char *
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 601..a7616df 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -168,7 +168,6 @@ static bool assign_bonjour(bool newval, bool doit, GucSource source);
 static bool assign_ssl(bool newval, bool doit, GucSource source);
 static bool assign_stage_log_stats(bool newval, bool doit, GucSource source);
 static bool assign_log_stats(bool newval, bool doit, GucSource source);
-static bool assign_transaction_read_only(bool newval, bool doit, GucSource source);
 static const char *assign_canonical_path(const char *newval, bool doit, GucSource source);
 static const char *assign_timezone_abbreviations(const char *newval, bool doit, GucSource source);
 static const char *show_archive_command(void);
@@ -7843,34 +7842,6 @@ assign_log_stats(bool newval, bool doit, GucSource source)
 	return true;
 }
 
-static bool
-assign_transaction_read_only(bool newval, bool doit, GucSource source)
-{
-	/* Can't go to r/w mode inside a r/o transaction */
-	if (newval == false && XactReadOnly && IsSubTransaction())
-	{
-		ereport(GUC_complaint_elevel(source),
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("cannot set transaction read-write mode inside a read-only transaction")));
-		/* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
-		if (source != PGC_S_OVERRIDE)
-			return false;
-	}
-
-	/* Can't go to r/w mode while recovery is still active */
-	if (newval == false && XactReadOnly && RecoveryInProgress())
-	{

Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 1:59 PM, Aidan Van Dyk  wrote:
> Yup.  And I'm OK with that.  In my case, it would be much better to
> have a few quick failures, which can complete automatically a few
> seconds later, than to have a big buildup of transactions to re-verify
> by hand upon starting manual processing.

Why would you need to do that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Jeff Davis
On Fri, 2011-01-21 at 18:52 -0600, Kevin Grittner wrote:
> My assumption is that when we have a safe snapshot (which should be
> pretty close to all the time), we immediately provide it to any
> serializable transaction requesting a snapshot, except it seems to
> make sense to use the new DEFERRABLE mode to mean that you want to
> use the *next* one to arrive.

How would it handle this situation:
1. Standby has safe snapshot S1
2. Primary does a VACUUM which removes some stuff visible in S1
3. Standby can't replay the VACUUM because it still has S1, but also
can't get a new S2 because the WAL needed for that is behind the VACUUM
 
So, S1 needs to be discarded. What do we do on the standby while there
is no safe snapshot? I suppose throw errors -- I can't think of anything
else.

> This would effectively cause the point in time which was visible to
> serializable transactions to lag behind what is visible to other
> transactions by a variable amount, but would ensure that a
> serializable transaction couldn't see any serialization anomalies. 
> It would also be immune to serialization failures from SSI logic;
> but obviously, standby-related cancellations would be in play.  I
> don't know whether the older snapshots would tend to increase the
> standby-related cancellations, but it wouldn't surprise me.

I'm also a little concerned about the user-understandability here. Is it
possible to make the following guarantees in this approach:

1. If transactions are completing on the primary, new snapshots will be
taken on the standby; and
2. If no write transactions are in progress on the primary, then the
standby will get a snapshot that represents the exact same data as on
the primary?

That would be fairly easy to explain to users. If there is a visibility
lag, then we just say "finish the write transactions, and progress will
be made". And if the system is idle, they should see identical data.

Regards,
Jeff Davis




Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
I wrote:
 
> We're not talking about passing them backwards.  I'm suggesting
> that we probably don't even need to pass them forward, but that
> suggestion has been pretty handwavy so far.  I guess I should fill
> it out, because everyone's been ignoring it so far.
 
It's been too hectic today to flesh this out very well, but I can at
least do a better brain dump -- you know, wave my hands a little
less vaguely.
 
The idea of communicating regarding a safe snapshot through the WAL
without actually *sending* snapshot XIDs through the WAL might work
something like this:
 
(1)  We communicate when we are starting to consider a snapshot. 
This would always be related to the commit or rollback of a
serializable read-write transaction, so perhaps we could include the
information in an existing WAL record.  We would need to find one
free bit somewhere, or make room for it.  Alternatively, we could
send a new WAL record type to communicate this.  At the point that a
standby processes such a WAL record, it would grab a snapshot effective
after the commit, and save it as the "latest candidate", releasing
the previous candidate, if any.
 
(2)  If a snapshot fails to make it to a safe status on the master,
it will pick a new candidate and repeat (1) -- there's no need to
explicitly quash a failed candidate.
 
(3)  We communicate when we find that the last candidate made it to
"safe" status.  Again, this would be related to the commit or
rollback of a serializable read-write transaction.  Same issues
about needing (another) bit or using a new record type.  When a
standby receives this, it promotes the latest candidate to the new
"safe snapshot" to be used when a serializable transaction asks for
a snapshot, replacing the previous value, if any.  Any transactions
waiting for a snapshot (either because there previously wasn't a
safe snapshot on record or because they requested DEFERRABLE) could
be provided the new snapshot and turned loose.
 
(4)  It's not inconceivable that we might want to send both (1) and
(3) with the same commit.
 
(5)  Obviously, we can pick our heuristics for how often we try to
refresh this, limiting it to avoid too much overhead, at the cost of
less frequent snapshot updates for serializable transactions on the
standbys.
 
My assumption is that when we have a safe snapshot (which should be
pretty close to all the time), we immediately provide it to any
serializable transaction requesting a snapshot, except it seems to
make sense to use the new DEFERRABLE mode to mean that you want to
use the *next* one to arrive.
 
This would effectively cause the point in time which was visible to
serializable transactions to lag behind what is visible to other
transactions by a variable amount, but would ensure that a
serializable transaction couldn't see any serialization anomalies. 
It would also be immune to serialization failures from SSI logic;
but obviously, standby-related cancellations would be in play.  I
don't know whether the older snapshots would tend to increase the
standby-related cancellations, but it wouldn't surprise me.
 
Hopefully this is enough for people to make something of it.
 
-Kevin



Re: [HACKERS] READ ONLY fixes

2011-01-21 Thread Kevin Grittner
Robert Haas  wrote:
> Jeff Janes  wrote:
>> I found the following message somewhat confusing:
>> ERROR:  read-only property must be set before any query
> 
> I think what we need here is two messages, this one and a similar
> one that starts with "read-write property...".
 
Done.  I started out by being cute with plugging "only" or "write"
into a single message, but then figured that might be hard on
translators; so I went with two separate messages.
 
Also, I noticed we seemed to be using "transaction read-only mode"
and "transaction read-write mode" elsewhere, so I made this
consistent with the others while I was at it.  Hopefully that was a
good idea.
 
>> When a subtransaction has set the mode more stringent than the
>> top-level transaction did, that setting is reversed when the
>> subtransaction ends (whether by success or by rollback), which
>> was discussed as the desired behavior.  But the included
>> regression tests do not exercise that case by testing the case
>> where a SAVEPOINT is either rolled back or released.  Should
>> those tests be included?
> 
> +1.
 
Done.
 
-Kevin
*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 544,572  show_log_timezone(void)
  
  
  /*
   * SET TRANSACTION ISOLATION LEVEL
   */
  
  const char *
  assign_XactIsoLevel(const char *value, bool doit, GucSource source)
  {
!   if (FirstSnapshotSet)
{
!   ereport(GUC_complaint_elevel(source),
!   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
!errmsg("SET TRANSACTION ISOLATION LEVEL must 
be called before any query")));
!   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact 
abort */
!   if (source != PGC_S_OVERRIDE)
return NULL;
!   }
!   else if (IsSubTransaction())
!   {
!   ereport(GUC_complaint_elevel(source),
!   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
!errmsg("SET TRANSACTION ISOLATION LEVEL must 
not be called in a subtransaction")));
!   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact 
abort */
!   if (source != PGC_S_OVERRIDE)
return NULL;
}
  
if (strcmp(value, "serializable") == 0)
--- 544,617 
  
  
  /*
+  * SET TRANSACTION READ ONLY and SET TRANSACTION READ WRITE
+  *
+  * These should be transaction properties which can be set in exactly the
+  * same points in time that transaction isolation may be set.
+  */
+ bool
+ assign_transaction_read_only(bool newval, bool doit, GucSource source)
+ {
+   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
+   if (source != PGC_S_OVERRIDE)
+   {
+   /* Can't go to r/w mode inside a r/o transaction */
+   if (newval == false && XactReadOnly && IsSubTransaction())
+   {
+   ereport(GUC_complaint_elevel(source),
+   
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot set transaction 
read-write mode inside a read-only transaction")));
+   return false;
+   }
+   /* Top level transaction can't change this after first 
snapshot. */
+   if (FirstSnapshotSet && !IsSubTransaction())
+   {
+   ereport(GUC_complaint_elevel(source),
+   
(errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
+errmsg(newval
+   ? "transaction 
read-only mode must be set before any query"
+   : "transaction 
read-write mode must be set before any query")));
+   return false;
+   }
+   /* Can't go to r/w mode while recovery is still active */
+   if (newval == false && XactReadOnly && RecoveryInProgress())
+   {
+   ereport(GUC_complaint_elevel(source),
+   
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot set transaction 
read-write mode during recovery")));
+   return false;
+   }
+   }
+ 
+   return true;
+ }
+ 
+ /*
   * SET TRANSACTION ISOLATION LEVEL
   */
+ extern char *XactIsoLevel_string; /* in guc.c */
  
  const char *
  assign_XactIsoLevel(const char *value, bool doit, GucSource source)
  {
!   /* source == PGC_S_OVERRIDE means do it anyway, eg at xact abort */
!   if (source != PGC_S_OVERRIDE)
{
!   if (FirstSnapshotSet)
!   {
!   ereport(GUC_complaint_elevel(source),
!   
(errcode(ERRCODE_ACTIVE_SQL_TRAN

Re: [HACKERS] Problem building pgtestfsync.sgml

2011-01-21 Thread Bruce Momjian
Tom Lane wrote:
> "Kevin Grittner"  writes:
> > Anyone else seeing anything like this on `make world`?
> 
> Looks like Bruce didn't bother to test that before committing.  Fixed.

Thanks.  I built it several times and it displayed fine, but I now see
that it throws a warning when built.  Thanks for the fixes.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



[HACKERS] Semijoin selectivity estimation

2011-01-21 Thread Tom Lane
Since 8.4, we've been using the definition that join selectivity for
a clause in a semijoin is the fraction of outer-relation tuples that
pass the clause (or more strictly, for which there exists at least
one inner-relation tuple with which the clause succeeds).  While this
looks fairly straightforward, it's become apparent to me today that
this definition fails to support one of the usual expectations for
selectivity, namely

sel(NOT X) = 1 - sel(X)

(Please ignore questions of nulls for the moment; things are bad enough
without that.)  The reason is that, if sel(NOT X) is the probability
that there is an inner row for which NOT X succeeds, then this is not
one minus the probability that there is an inner row for which X
succeeds; rather it's one minus the probability that X succeeds for
*every* inner row.

It was noted earlier today in pgsql-performance that this breaks
neqjoinsel's implementation in terms of eqjoinsel; but it also means
that clause_selectivity's handling of NOT clauses is wrong in this
context, and more generally there are all sorts of gotchas in terms of
trying to reason about the relationships of related selectivities.

I can't avoid the feeling that this means we should be using some other
definition for semijoin selectivity.  I can't find any other definition
in use in the literature, though.

Right at the moment the consequences are pretty limited, since eqjoinsel
is actually the only code that tries to compute a special selectivity
estimate for semijoin/antijoin contexts.  But I can foresee this
becoming a real mess if we try to expand our intelligence about such
cases.

Any thoughts out there?

regards, tom lane



Re: [HACKERS] Problem building pgtestfsync.sgml

2011-01-21 Thread Tom Lane
"Kevin Grittner"  writes:
> Anyone else seeing anything like this on `make world`?

Looks like Bruce didn't bother to test that before committing.  Fixed.

regards, tom lane



[HACKERS] Problem building pgtestfsync.sgml

2011-01-21 Thread Kevin Grittner
Anyone else seeing anything like this on `make world`?
 
make[3]: Entering directory
`/home/kgrittn/git/postgresql/kgrittn/doc/src/sgml'
{ \
  echo ""; \
  echo ""; \
} > version.sgml
"/usr/bin/perl" ./mk_feature_tables.pl YES
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt > features-supported.sgml
"/usr/bin/perl" ./mk_feature_tables.pl NO
../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt >
features-unsupported.sgml
openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
-c /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d
stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
openjade:pgtestfsync.sgml:23:7:E: document type does not allow element
"SECT2" here
openjade:pgtestfsync.sgml:62:7:E: document type does not allow element
"SECT2" here
openjade:pgtestfsync.sgml:70:7:E: "SECT2" not finished but containing
element ended
openjade:pgtestfsync.sgml:70:7:E: end tag for "SECT2" omitted, but
OMITTAG NO was specified
openjade:pgtestfsync.sgml:20:1: start tag was here
make[3]: *** [HTML.index] Error 1
make[3]: *** Deleting file `HTML.index'
make[3]: Leaving directory
`/home/kgrittn/git/postgresql/kgrittn/doc/src/sgml'
 
Docs have been building fine until today.  I can't rule out it being my
problem somehow, though.
 
-Kevin



Re: [HACKERS] One Role, Two Passwords

2011-01-21 Thread Andreas Karlsson
On Fri, 2011-01-21 at 11:00 -0500, Garick Hamlin wrote:
> I can't tell if PG supports querying a secondary RADIUS server?  
> 
> I don't see how I would do it with the syntax here ... 
> http://developer.postgresql.org/pgdocs/postgres/auth-methods.html
> 
> Are multiple servers not supported?
> 
> Garick

From a quick look at the source code it does not look like it can.

Andreas





Re: [HACKERS] review: FDW API

2011-01-21 Thread Tom Lane
Heikki Linnakangas  writes:
> Some quick comments on that:

> * I wonder if CREATE FOREIGN DATA WRAPPER should automatically create 
> the handler function, if it doesn't exist yet. That's what CREATE 
> LANGUAGE does, which is similar. Although it doesn't seem to be 
> documented for CREATE LANGUAGE either, is it deprecated?

Doing that would require the equivalent of pg_pltemplate for FDWs, no?
I think we're a long way from wanting to do that.  Also, it seems to me
that add-on FDWs are likely to end up getting packaged as extensions,
so the extension machinery will probably render the question moot pretty
soon.

regards, tom lane



Re: [HACKERS] REVIEW: WIP: plpgsql - foreach in

2011-01-21 Thread Pavel Stehule
Hello

I merged your changes and slightly enhanced the comments.

Regards

Pavel Stehule


2011/1/20 Stephen Frost :
> Greetings,
>
> * Pavel Stehule (pavel.steh...@gmail.com) wrote:
>> attached patch contains a implementation of iteration over a array:
>
> I've gone through this patch and, in general, it looks pretty reasonable
> to me.  There's a number of places where I think additional comments
> would be good and maybe some variable name improvements.  Also, my
> changes should be reviewed to make sure they make sense.
>
> Attached is a patch against master which includes my changes, and a
> patch against Pavel's patch, so he can more easily see my changes and
> include them if he'd like.
>
> I'm going to mark this returned to author with feedback.
>
> commit 30295015739930e68c33b29da4f7ef535bc293ea
> Author: Stephen Frost 
> Date:   Wed Jan 19 17:58:24 2011 -0500
>
>    Clean up foreach-in-array PL/PgSQL code/comments
>
>    Minor clean-up of the PL/PgSQL foreach-in-array patch, includes
>    some white-space cleanup, grammar fixes, additional errhint where
>    it makes sense, etc.
>
>    Also added a number of 'XXX' comments asking for clarification
>    and additional comments on what's happening in the code.
>
> commit f1a02fe3a8fa84217dae32d5ba74e9764c77431c
> Author: Stephen Frost 
> Date:   Wed Jan 19 15:11:53 2011 -0500
>
>    PL/PgSQL - Add iterate-over-array support
>
>    This patch adds support for iterating over an array in PL/PgSQL.
>
>    Patch Author: Pavel Stehule
>
>        Thanks,
>
>                Stephen
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.10 (GNU/Linux)
>
> iEYEARECAAYFAk03bf8ACgkQrzgMPqB3kihxuwCfZYKFpEraRCIltlUeYtD9AyX0
> tvoAnjuxddXhZB6w2/V9oVSD1+K7Idu9
> =w38Z
> -END PGP SIGNATURE-
>
>
*** ./doc/src/sgml/plpgsql.sgml.orig	2011-01-16 14:18:58.0 +0100
--- ./doc/src/sgml/plpgsql.sgml	2011-01-17 11:31:54.086217514 +0100
***
*** 2238,2243 
--- 2238,2268 
  
 
  
+
+ Looping Through Array
+ 
+  <

Re: [HACKERS] exceptions not present in plpy with Python 3

2011-01-21 Thread Peter Eisentraut
On tor, 2011-01-20 at 22:30 +0200, Peter Eisentraut wrote:
> On lör, 2010-12-18 at 18:56 +0100, Jan Urbański wrote:
> > there seems to be a problem in the way we add exceptions to the plpy
> > module in PL/Python compiled with Python 3k.
> > 
> > Try this: DO $$ plpy.SPIError $$ language plpython3u;
> > 
> > I'm not a Python 3 expert, but I nicked some code from the Internet and
> > came up with this patch (passes regression tests on both Python 2 and 3).
> 
> It looks like the PyModule_AddObject() approach also works in Python 2.
> Anyone see an issue with using that uniformly?

Committed using PyModule_AddObject.




Re: [HACKERS] WIP: RangeTypes

2011-01-21 Thread Jeff Davis
On Fri, 2011-01-21 at 12:31 -0500, Robert Haas wrote:
> On Thu, Jan 20, 2011 at 4:29 AM, Jeff Davis  wrote:
> > New patch. I added a lot of generic range functions, and a lot of
> > operators.
> >
> > There is still more work to do, this is just an updated patch. The
> > latest can be seen on the git repository, as well:
> 
> So is this 9.2 material at this point?

Regardless of whether it's eligible to be in 9.1, I plan to keep working
on it.

I would appreciate some overall feedback during this commitfest. Much of
the code is there, so it would be helpful if we could settle issues like
representation, functionality, interface, catalog, API, grammar, and
naming. Otherwise, those issues will just be a reason to bounce it from
commitfest-next, as well.

Regards,
Jeff Davis





Re: [HACKERS] [GENERAL] Large object corruption during 'piped' pg_restore

2011-01-21 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jan 21, 2011 at 12:44 PM, Bosco Rama  wrote:
>> Tom Lane wrote:
>>> So I'm not sure whether to fix it, or leave it as a known failure case
>>> in old branches.  Comments?

>> As an end user there is one area of the DB that I want to work correctly
>> 100% of the time and that is the dump/restore tool(s).

> Yeah, I lean toward saying we should back-patch this.

Fair enough, I'll go do it.  I just wanted to hear at least one other
person opine that it was worth taking some risk for.

regards, tom lane



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Aidan Van Dyk
On Fri, Jan 21, 2011 at 1:32 PM, Robert Haas  wrote:

>> Again, I'm trying to stop "forward progress" as soon as possible when
>> a sync slave isn't replicating.  And I'd like clients to fail with
>> errors sooner (hopefully they get to the commit point) rather than
>> accumulate the WAL synced to the master and just wait at the commit.

> Well, stopping all WAL activity with an error sounds *more* reasonable
> than refusing all logins, but I'm not personally sold on it.  For
> example, a brief network disruption on the connection between master
> and standby would cause the master to grind to a halt... and then
> almost immediately resume operations.

Yup.  And I'm OK with that.  In my case, it would be much better to
have a few quick failures, which can complete automatically a few
seconds later, than to have a big buildup of transactions to re-verify
by hand upon starting manual processing.

But again, I'll stress that I'm talking about when the master has no
sync slave connected.  A "brief network disruption" between the
master/slave isn't likely going to disconnect the slave.  TCP is
pretty good at handling those.  If the master thinks it has a sync
slave connected, I'm fine with it continuing to queue WAL for it even
if it's lagging noticeably.

>    More generally, if you have
> short-running transactions, there's not much difference between
> wait-at-commit and wait-at-WAL, and if you have long-running
> transactions, then wait-at-WAL might be gumming up the works more than
> necessary.

Again, when there is no sync slave *connected*, I don't want to wait
*at all*.  I want to fail ASAP.  If there is a sync slave, and it's
just slow, I don't really care where it waits.

From my experience, if the slave is not connected (i.e. the TCP
connection has been disconnected), then we're in something like:

1) Proper slave shutdown: pilot error here stopping it if the master requires it
2) Master start, slave not connected yet:  I'm fine with getting
errors here... We *hope* a slave will be here soon, but...
3) network has separated master/slave:  TCP means it's been like this
for a long time already...
4) Slave hardware/os low-level hang/crash: TCP means it's been like
this for a while already before master's os tears down the connection
5) Slave has crashed (or rebooted) and slave OS has closed/rejected
our TCP connection

In all of these, I'd love for my master not to be generating WAL and
letting clients think they are making progress.  And I'm hoping that
for #3 & 4 above, PG will have keepalive-type traffic that will
prevent me from queuing WAL for normal TCP connection time values.
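PostgreSQL does expose such keepalive knobs (tcp_keepalives_idle,
tcp_keepalives_interval, tcp_keepalives_count).  At the socket level the
mechanism boils down to options like these (a sketch, not the server's
code; the TCP_KEEP* names shown are the Linux ones):

```python
import socket

# Enable TCP keepalives so a dead peer is detected well before the
# default multi-hour TCP timeout.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
if hasattr(socket, "TCP_KEEPIDLE"):   # Linux-specific option names
    # Seconds of idleness before the first keepalive probe.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30)
    # Seconds between subsequent probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
    # Probes lost before the connection is declared dead.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)
assert sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With settings like these, scenarios 3 and 4 above are detected in well
under a minute instead of after the kernel's default timeout.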

> One idea might be to wait both before and after commit.  If
> allow_standalone_primary is off, and a commit is attempted, we check
> whether there's a slave connected, and if not, wait for one to
> connect.  Then, we write and sync the commit WAL record.  Next, we
> wait for the WAL to be ack'd.  Of course, the standby might disappear
> between the first check and the second, but it would greatly reduce
> the possibility of the master being ahead of the standby after a
> crash, which might be useful for some people.

Ya, but that becomes much more expensive.  Instead of it just being a
"write WAL, fsync WAL, send WAL, wait for slave", it becomes "write
WAL, fsync WAL, send WAL, wait for slave fsync, write WAL, fsync WAL,
send WAL, wait for slave fsync".  And its expense is there all the time,
rather than just when the "no slave no go" situations arise.

And it doesn't reduce the transactions I need to verify by hand
either, because that waiting/error still only happens at the COMMIT
statement from the client.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 1:09 PM, Aidan Van Dyk  wrote:
> On Fri, Jan 21, 2011 at 1:03 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk  wrote:
 When no sync slave is connected, yes, I want to stop things hard.
>>
>>> What you're proposing is to fail things earlier than absolutely
>>> necessary (when they try to XLOG, rather than at commit) but still
>>> later than what I think Simon is proposing (not even letting them log
>>> in).
>>
>> I can't see a reason to disallow login, because read-only transactions
>> can still run in such a situation --- and, indeed, might be fairly
>> essential if you need to inspect the database state on the way to fixing
>> the replication problem.  (Of course, we've already had the discussion
>> about it being a terrible idea to configure replication from inside the
>> database, but that doesn't mean there might not be views or status you
>> would wish to look at.)
>
> And just disallowing new logins is probably not even enough, because
> it allows currently logged-in clients "forward progress", leading
> towards an eventual hang (with now committed data on the master).
>
> Again, I'm trying to stop "forward progress" as soon as possible when
> a sync slave isn't replicating.  And I'd like clients to fail with
> errors sooner (hopefully they get to the commit point) rather than
> accumulate the WAL synced to the master and just wait at the commit.
>
> So I think that's a more complete picture of my quick "not do anything
> with no synchronous slave replicating" that I think was what led to
> the no-login approach.

Well, stopping all WAL activity with an error sounds *more* reasonable
than refusing all logins, but I'm not personally sold on it.  For
example, a brief network disruption on the connection between master
and standby would cause the master to grind to a halt... and then
almost immediately resume operations.  More generally, if you have
short-running transactions, there's not much difference between
wait-at-commit and wait-at-WAL, and if you have long-running
transactions, then wait-at-WAL might be gumming up the works more than
necessary.

One idea might be to wait both before and after commit.  If
allow_standalone_primary is off, and a commit is attempted, we check
whether there's a slave connected, and if not, wait for one to
connect.  Then, we write and sync the commit WAL record.  Next, we
wait for the WAL to be ack'd.  Of course, the standby might disappear
between the first check and the second, but it would greatly reduce
the possibility of the master being ahead of the standby after a
crash, which might be useful for some people.
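The before-and-after wait might look roughly like this (a sketch with
stand-in names of my own for the WAL and the standby connection, not
actual backend code):

```python
# Toy model of the "wait both before and after commit" idea.
class Standby:
    def __init__(self, connected=True):
        self.connected = connected
        self.acked_upto = 0

    def wait_for_connection(self):
        # In the real system this would block; the stub just flips state.
        self.connected = True

    def wait_for_ack(self, lsn):
        # Stub: pretend the standby flushes and acknowledges promptly.
        self.acked_upto = lsn

def commit(wal, standby, allow_standalone_primary=False):
    # 1. With allow_standalone_primary off, wait for a standby *before*
    #    writing the commit record, narrowing the master-ahead window.
    if not allow_standalone_primary and not standby.connected:
        standby.wait_for_connection()
    # 2. Write and fsync the commit WAL record locally.
    wal.append("COMMIT")
    lsn = len(wal)
    # 3. Wait *after* as well, until the standby acknowledges the record.
    standby.wait_for_ack(lsn)
    return lsn

wal, standby = [], Standby(connected=False)
lsn = commit(wal, standby)
assert standby.connected and standby.acked_upto == lsn
```

The race remains between steps 1 and 2, but a standby vanishing in that
window is far less likely than one vanishing at any arbitrary point.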

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Review: compact fsync request queue on overflow

2011-01-21 Thread Chris Browne
robertmh...@gmail.com (Robert Haas) writes:
> On Mon, Jan 17, 2011 at 8:23 PM, Greg Smith  wrote:
>> Quite.  It's taken me 12 days of machine time running pgbench to find the
>> spots where this problem occurs on a system with a reasonably sized
>> shared_buffers (I'm testing against 256MB).  It's one of those things it's
>> hard to reproduce with test data.
>>
>> Thanks for the thorough code review.  I've got a clear test plan I'm
>> progressing through this week to beat on the performance measurement aspects
>> of the patch.
>
> Any update on this?  I think the test results you've posted previously
> - particularly, the fact that when the queue fills up, there are
> always many duplicates - is pretty much sufficient for us to convince
> ourselves that this will provide a benefit in cases where that occurs.

Agreed.  This showed up eminently nicely when beating up the database
using pgbench.

I imagine it would be interesting to run it against a different test
than pgbench, particularly one which involves a larger number of tables.

From the behavior I have seen thus far, I'm expecting that the queue
essentially gets compressed to a size indicating the number of active
tables.  With pgbench, there are 4 tables, and the queue kept getting
compressed to 3 or 4 entries, which nicely corresponds with that.
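The compaction itself is essentially order-preserving de-duplication of
pending fsync requests; a rough sketch of the idea (names are mine, not
the backend's, which works on BufferTag-like entries in shared memory):

```python
# Compact an fsync request queue by dropping duplicate requests while
# keeping first-seen order.
def compact_fsync_queue(queue):
    seen = set()
    compacted = []
    for req in queue:
        if req not in seen:
            seen.add(req)
            compacted.append(req)
    return compacted

# A pgbench-style workload touches the same few relations repeatedly,
# so a full queue collapses to roughly one entry per active table.
queue = ["accounts", "branches", "accounts", "tellers",
         "accounts", "history", "branches", "accounts"]
print(compact_fsync_queue(queue))
# ['accounts', 'branches', 'tellers', 'history']
```

A workload spread over many more tables would compress less well, which
is why testing beyond pgbench is interesting.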

>  And, in cases where the queue doesn't fill up, we'll never hit the
> test that triggers this code, so it seems pretty clear there won't be
> a negative impact there either.  I don't want to rush your testing
> process, but if it's already fairly clear that this will have some
> benefit, I think it would be good to get it committed and move on to
> working on the parts we're less sure about, like sorting writes and
> spreading fsyncs, where we will probably need a lot more testing than
> here to be sure that we have the right behavior.

I'm pretty happy with what I've seen thus far; I don't want to be
over-antsy about getting it all dealt with Right Quick Instantly, but it
seems like a change that doesn't have a terribly bad risk of a big
downside.
-- 
(reverse (concatenate 'string "ofni.secnanifxunil" "@" "enworbbc"))
The statistics on  sanity are that one out of  every four Americans is
suffering from some  form of mental illness. Think  of your three best
friends. If they're okay, then it's you. -- Rita Mae Brown



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
"Kevin Grittner"  wrote:
 
> When I test your example, though, I'm getting the serialization
> failure on T3 rather than T2, so I'd call that a bug.  Will
> investigate.  Thanks again for your tests!
 
Fixed with this:
 
http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=b91460812396b68362c812d6e4fb67799fc6147e
 
Thanks again!
 
-Kevin



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Aidan Van Dyk
On Fri, Jan 21, 2011 at 1:03 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk  wrote:
>>> When no sync slave is connected, yes, I want to stop things hard.
>
>> What you're proposing is to fail things earlier than absolutely
>> necessary (when they try to XLOG, rather than at commit) but still
>> later than what I think Simon is proposing (not even letting them log
>> in).
>
> I can't see a reason to disallow login, because read-only transactions
> can still run in such a situation --- and, indeed, might be fairly
> essential if you need to inspect the database state on the way to fixing
> the replication problem.  (Of course, we've already had the discussion
> about it being a terrible idea to configure replication from inside the
> database, but that doesn't mean there might not be views or status you
> would wish to look at.)

And just disallowing new logins is probably not even enough, because
it allows currently logged-in clients "forward progress", leading
towards an eventual hang (with now committed data on the master).

Again, I'm trying to stop "forward progress" as soon as possible when
a sync slave isn't replicating.  And I'd like clients to fail with
errors sooner (hopefully they get to the commit point) rather than
accumulate the WAL synced to the master and just wait at the commit.

So I think that's a more complete picture of my quick "not do anything
with no synchronous slave replicating" that I think was what led to
the no-login approach.

a.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk  wrote:
>> When no sync slave is connected, yes, I want to stop things hard.

> What you're proposing is to fail things earlier than absolutely
> necessary (when they try to XLOG, rather than at commit) but still
> later than what I think Simon is proposing (not even letting them log
> in).

I can't see a reason to disallow login, because read-only transactions
can still run in such a situation --- and, indeed, might be fairly
essential if you need to inspect the database state on the way to fixing
the replication problem.  (Of course, we've already had the discussion
about it being a terrible idea to configure replication from inside the
database, but that doesn't mean there might not be views or status you
would wish to look at.)

regards, tom lane



Re: [HACKERS] [GENERAL] Large object corruption during 'piped' pg_restore

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 12:44 PM, Bosco Rama  wrote:
> Tom Lane wrote:
>>
>> So I'm not sure whether to fix it, or leave it as a known failure case
>> in old branches.  Comments?
>
> I understand the reluctance to fool with stable code.  I have zero insight
> into your installed versions distribution and backward compatibility needs
> so any comment I may have here is purely selfish.
>
> As an end user there is one area of the DB that I want to work correctly
> 100% of the time and that is the dump/restore tool(s).  If it's not going
> to work under certain circumstances it should at least tell me so and
> fail.  I don't think having the tool appear to work while corrupting the
> data (even if documented as doing so) is a viable method of operation.

Yeah, I lean toward saying we should back-patch this.  Yeah, it'll be
lightly tested, but it's a pretty confined change, so it's unlikely to
break anything else.  ISTM the worst case scenario is that it takes
two minor releases to get it right, and even that seems fairly
unlikely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Moving test_fsync to /contrib?

2011-01-21 Thread Bruce Momjian
Josh Berkus wrote:
> On 1/20/11 6:15 AM, Robert Haas wrote:
> > On Thu, Jan 20, 2011 at 9:13 AM, Bruce Momjian  wrote:
> >> OK, I am ready to move test_fsync to /contrib.  Is pg_test_fsync the
> >> best name?  pg_check_fsync?  pg_fsync_performance?  pg_verify_fsync?
> > 
> > I don't see too much reason to rename it more than necessary, so how
> > about pg_test_fsync?
> 
> +1.

OK, src/tools/test_fsync moved to contrib/pg_test_fsync.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [GENERAL] Large object corruption during 'piped' pg_restore

2011-01-21 Thread Bosco Rama
Tom Lane wrote:
> 
> So I'm not sure whether to fix it, or leave it as a known failure case
> in old branches.  Comments?

I understand the reluctance to fool with stable code.  I have zero insight
into your installed versions distribution and backward compatibility needs
so any comment I may have here is purely selfish.

As an end user there is one area of the DB that I want to work correctly
100% of the time and that is the dump/restore tool(s).  If it's not going
to work under certain circumstances it should at least tell me so and
fail.  I don't think having the tool appear to work while corrupting the
data (even if documented as doing so) is a viable method of operation.

Just my $0.02

Bosco.



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
Anssi Kääriäinen wrote:
 
> I am beginning to understand the problem. If you don't mind, here
> is a complete example if somebody else is having troubles
> understanding this.
> 
> Let's say we have tables D1 and D2. Both contain a single column,
> id, and a single row. The data in the beginning is as follows:
> 
> D1:  id = 1
> D2:  id = 1
> 
> The constraints: D1.id can only be incremented. Whenever D2.id is 
> updated, it must be updated to D1.id + 1.
> 
> The transactions:
> T1: begin; update D1 set id = id + 1;
> T2: begin; update D2 set id = (select id+1 from D1);
> T1: commit;
> T3: begin; select id from D1; select id from D2; commit; Data
> seen: (2, 1) -- this is a possible state
> T2: commit;
> T4: begin; select id from D1; select id from D2; commit; Data
> seen: (2, 2)
> This is again a possible state. But if we compare this to the
> state seen 
> by T3 this is not valid. From state (2, 1) we cannot get to state
> (2, 2) without breaking one of the constraints. Thus, the state of
> T3 is not valid in the database.
> 
> So, I finally got it! :-) I hope this example will help somebody
> else understand the problem.
 
Yeah, interesting example.  Under SSI, once T3 selects from D2 you
have a dangerous structure, and either T2 or T3 must fail to prevent
the possibility of the sort of anomaly your example demonstrates. 
We would prefer to see T2 fail, because if T3 fails it will continue
to fail on retry until T2 completes.  We're trying to avoid that
kind of thrashing.  If T2 fails and is retried, it will immediately
succeed and generate results consistent with what T3 saw.
 
When I test your example, though, I'm getting the serialization
failure on T3 rather than T2, so I'd call that a bug.  Will
investigate.  Thanks again for your tests!  You seem to be able to
shake out issues better than anyone else!  Once found, fixing them
is not usually very hard, it's coming up with that creative usage
pattern to *find* the problem which is the hard part.
 
OK if I add this one to our dcheck test suite, too?
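For what it's worth, the impossibility in Anssi's example can be checked
mechanically by enumerating the serial orders of the two writers (a quick
sketch of my own, not part of the dcheck suite):

```python
from itertools import permutations

# Replay the two writing transactions in each possible serial order and
# record the visible (D1.id, D2.id) states.
def run(order):
    d1, d2 = 1, 1
    states = [(d1, d2)]
    for t in order:
        if t == "T1":
            d1 += 1            # T1: update D1 set id = id + 1
        else:
            d2 = d1 + 1        # T2: update D2 set id = (select id+1 from D1)
        states.append((d1, d2))
    return states

# No serial order ever shows (2, 1) and then (2, 2): the reads made by
# T3 and T4 cannot both come from a single serial history.
for order in permutations(["T1", "T2"]):
    s = run(order)
    if (2, 1) in s:
        assert (2, 2) not in s[s.index((2, 1)):]
```

The two serial histories are (1,1)→(2,1)→(2,3) and (1,1)→(1,2)→(2,2), so
observing (2,1) followed by (2,2) is exactly the anomaly SSI must reject.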
 
-Kevin



Re: [HACKERS] WIP: RangeTypes

2011-01-21 Thread Robert Haas
On Thu, Jan 20, 2011 at 4:29 AM, Jeff Davis  wrote:
> New patch. I added a lot of generic range functions, and a lot of
> operators.
>
> There is still more work to do, this is just an updated patch. The
> latest can be seen on the git repository, as well:

So is this 9.2 material at this point?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk  wrote:
> On Fri, Jan 21, 2011 at 11:59 AM, Simon Riggs  wrote:
>
>> We all think our own proposed options are the only reasonable thing, but
>> that helps us not at all in moving forwards. I've put much time into
>> delivering options many other people want, so there is a range of
>> function. I think we should hear from Aidan first before we decide to
>> remove that aspect.
>
> Since invited, I'll describe what I *want* to do.  I understand I
> may not get it ;-)
>
> When no sync slave is connected, yes, I want to stop things hard.  I
> don't mind read-only queries working, but what I want to avoid (if
> possible) is having the master do lots of inserts/updates/deletes for
> clients, fsyncing them all to disk (so on some strange event causing
> recovery they'll be considered committed) and just delay the commit
> return until it has a valid sync slave connected and caught up again.
> And *I*'d prefer if client transactions get errors right away rather
> than begin to hang if a sync slave is not connected.
>
> Even with single server, there's the window where stuff could be
> "committed" but the client not notified yet.  And that leads to
> transactions which need to be verified.  And with sync rep, that
> window gets a little larger.  But I'd prefer not to make it a hangar
> door, *especially* when it gets flung open at the point where the shit
> has hit the fan and we're in the midst of switching over to manual
> processing...
>
> So, in my case, I'd like it if PG couldn't do anything to generate
> any user-initiated WAL unless there is a sync slave connected.  Yes, I
> understand that leads to hard-fail, and yes, I understand I'm in the
> minority, maybe almost singular in that desire.

What you're proposing is to fail things earlier than absolutely
necessary (when they try to XLOG, rather than at commit) but still
later than what I think Simon is proposing (not even letting them log
in).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Aidan Van Dyk
On Fri, Jan 21, 2011 at 11:59 AM, Simon Riggs  wrote:

> We all think our own proposed options are the only reasonable thing, but
> that helps us not at all in moving forwards. I've put much time into
> delivering options many other people want, so there is a range of
> function. I think we should hear from Aidan first before we decide to
> remove that aspect.

Since invited, I'll describe what I *want* to do.  I understand I
may not get it ;-)

When no sync slave is connected, yes, I want to stop things hard.  I
don't mind read-only queries working, but what I want to avoid (if
possible) is having the master do lots of inserts/updates/deletes for
clients, fsyncing them all to disk (so on some strange event causing
recovery they'll be considered committed) and just delay the commit
return until it has a valid sync slave connected and caught up again.
And *I*'d prefer if client transactions get errors right away rather
than begin to hang if a sync slave is not connected.

Even with single server, there's the window where stuff could be
"committed" but the client not notified yet.  And that leads to
transactions which need to be verified.  And with sync rep, that
window gets a little larger.  But I'd prefer not to make it a hangar
door, *especially* when it gets flung open at the point where the shit
has hit the fan and we're in the midst of switching over to manual
processing...

So, in my case, I'd like it if PG couldn't do anything to generate
any user-initiated WAL unless there is a sync slave connected.  Yes, I
understand that leads to hard-fail, and yes, I understand I'm in the
minority, maybe almost singular in that desire.

a.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-21 Thread Simon Riggs
On Thu, 2011-01-20 at 21:36 +, Simon Riggs wrote:

> I'll review your patch and commit it, problems or objections excepted.

Tom's comments elsewhere prevent me from committing.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] pg_dump directory archive format / parallel pg_dump

2011-01-21 Thread Euler Taveira de Oliveira

On 21-01-2011 12:47, Andrew Dunstan wrote:

Maybe we could change the hint to say "--file=DESTINATION" or
"--file=FILENAME|DIRNAME" ?


... "--file=OUTPUT" or "--file=OUTPUTNAME".


--
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Simon Riggs
On Fri, 2011-01-21 at 14:34 +0100, Magnus Hagander wrote:
> On Fri, Jan 21, 2011 at 14:24, Simon Riggs  wrote:
> > On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
> >> * it seems like overkill to not let clients to even connect when
> >> allow_standalone_primary=off and no synchronous standbys are available.
> >> What if you just want to run a read-only query?
> >
> > That's what Aidan requested, I agreed and so its there. You're using
> > sync rep because of writes, so you have a read-write app. If you allow
> > connections then half of the app will work, half will not. Half-working
> > isn't very useful, as Aidan eloquently explained. If your app is all
> > read-only you wouldn't be using sync rep anyway. That's the argument,
> > but I've not got especially strong feelings it has to be this way.
> >
> > Perhaps discuss that on a separate thread? See what everyone thinks?
> 
> I'll respond here once, and we'll see if more people want to comment
> then we can move it :-)
> 
> Doesn't this make a pretty strange assumption - namely that you have a
> single application? We support multiple databases, and multiple users,
> and multiple pretty much anything - in most cases, people deploy
> multiple apps. (They may well be part of the same "solution" or
> whatever you want to call it, but parts may well be readonly - like a
> reporting app, or even just a monitoring client)

There are various problems whatever we do. If we don't like one way, we
must balance that by judging what happens if we do things the other way.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Simon Riggs
On Fri, 2011-01-21 at 17:33 +0200, Heikki Linnakangas wrote:
> On 21.01.2011 15:24, Simon Riggs wrote:
> > On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
> >> * it seems like overkill to not let clients to even connect when
> >> allow_standalone_primary=off and no synchronous standbys are available.
> >> What if you just want to run a read-only query?
> >
> > That's what Aidan requested, I agreed and so its there. You're using
> > sync rep because of writes, so you have a read-write app. If you allow
> > connections then half of the app will work, half will not. Half-working
> > isn't very useful, as Aidan eloquently explained. If your app is all
> > read-only you wouldn't be using sync rep anyway. That's the argument,
> > but I've not got especially strong feelings it has to be this way.
> 
> It's also possible that most of your transactions in fact do "set 
> synchronous_replication=off", and only a few actually do synchronous 
> replication. It would be pretty bad to not allow connections in that 
> case. And what if you want to connect to the server to diagnose the 
> issue? Oh, you can't... Besides, we're not kicking out existing 
> connections, are we? Seems inconsistent to let the old connections live.
> 
> IMHO the only reasonable option is to allow connections as usual, and 
> only fail (or block forever) at COMMIT.

We all think our own proposed options are the only reasonable thing, but
that helps us not at all in moving forwards. I've put much time into
delivering options many other people want, so there is a range of
function. I think we should hear from Aidan first before we decide to
remove that aspect.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] More detailed auth info

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 11:47 AM, Tom Lane  wrote:
> Magnus Hagander  writes:
>> On Fri, Jan 21, 2011 at 16:40, Robert Haas  wrote:
>>> So... is there centralized structure which contains the info you're
>>> thinking of exposing?
>
>> No, not today.
>
> I think that most of the info Magnus suggested isn't stored at all,
> anywhere, at the moment; much less stored in shared memory.

Oh.

Well, that's another can of worms, then.  I think this conversation is
best deferred at least until we get a 9.1 beta out the door; we're
clearly not going to do anything about it right now...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Tom Lane
Magnus Hagander  writes:
> On Fri, Jan 21, 2011 at 16:40, Robert Haas  wrote:
>> So... is there centralized structure which contains the info you're
>> thinking of exposing?

> No, not today.

I think that most of the info Magnus suggested isn't stored at all,
anywhere, at the moment; much less stored in shared memory.

regards, tom lane



Re: [HACKERS] Review: compact fsync request queue on overflow

2011-01-21 Thread Robert Haas
On Mon, Jan 17, 2011 at 8:23 PM, Greg Smith  wrote:
> Quite.  It's taken me 12 days of machine time running pgbench to find the
> spots where this problem occurs on a system with a reasonably sized
> shared_buffers (I'm testing against 256MB).  It's one of those things it's
> hard to reproduce with test data.
>
> Thanks for the thorough code review.  I've got a clear test plan I'm
> progressing through this week to beat on the performance measurement aspects
> of the patch.

Any update on this?  I think the test results you've posted previously
- particularly, the fact that when the queue fills up, there are
always many duplicates - are pretty much sufficient for us to convince
ourselves that this will provide a benefit in cases where that occurs.
And, in cases where the queue doesn't fill up, we'll never hit the
test that triggers this code, so it seems pretty clear there won't be
a negative impact there either.  I don't want to rush your testing
process, but if it's already fairly clear that this will have some
benefit, I think it would be good to get it committed and move on to
working on the parts we're less sure about, like sorting writes and
spreading fsyncs, where we will probably need a lot more testing than
here to be sure that we have the right behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Tom Lane
Robert Haas  writes:
> I don't want to go there, and it's not what Tom was proposing anyway.
> The idea is - if the user creates a function which is NOT a trusted
> procedure and executes it, and then subsequently changes the system
> security policy so that it becomes a trusted procedure, the user will
> be responsible for flushing the cached plans before the new value will
> take effect.

Yeah.  Given the rather limited set of things that can be inlined,
I don't think that it's worth the complexity or performance cost to
do differently.  Note also that it's pretty easy to force the cache
flush if you are the procedure's owner: any sort of dummy ALTER on
the procedure should do it.

Mind you, I think there probably *is* a case for fixing REVOKE to force
a cache flush on the procedure as well.  I just don't want to have to
deal with magic outside-the-database changes.

regards, tom lane



Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 18.01.2011 17:26, Shigeru HANADA wrote:

3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
option to FOREIGN DATA WRAPPER object.


Some quick comments on that:

* I wonder if CREATE FOREIGN DATA WRAPPER should automatically create 
the handler function if it doesn't exist yet. That's what the similar 
CREATE LANGUAGE does. It doesn't seem to be documented for CREATE 
LANGUAGE either, though; is that behavior deprecated?


* The elogs in parse_func_options() should be ereports.

* pg_dump should check the version number and only try to select the 
fdwhandler column if the server is >= 9.1. See the other functions there 
for an example of that.


* dumpForeignDataWrapper() in pg_dump checks if the fdwhandler field is "-". 
I don't think we use that as a magic value there, do we? Same with the validator.


* Please check that the HANDLER and VALIDATOR options that pg_dump 
creates are properly quoted.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
Tom Lane  wrote:
> Simon Riggs  writes:
>> On Fri, 2011-01-21 at 11:19 +0200, Heikki Linnakangas wrote:
>>> It's not the order in which the xid was assigned that matters,
>>> but the order the transactions started and got their snapshots.
>>> The xids might be assigned a lot later, after the transactions
>>> have already read data.
> 
>> So if a read-write transaction assigns an xid before it takes a
>> snapshot then we'll be OK? That seems much easier to arrange than
>> passing chunks of snapshot data backwards and forwards.
 
We're not talking about passing them backwards.  I'm suggesting that
we probably don't even need to pass them forward, but that
suggestion has been pretty handwavy so far.  I guess I should fill
it out, because everyone's been ignoring it.
 
> No, that idea is DOA from a performance standpoint.  We sweated
> blood to avoid having to assign XIDs to read-only transactions,
> and we're not going back.  If SSI requires that, SSI is not
> getting committed.
 
SSI doesn't require that.  The suggestion that it would in *any* way
help with the interaction with hot standby is off-base.
 
-Kevin



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Kohei KaiGai
2011/1/22 Robert Haas :
> On Fri, Jan 21, 2011 at 9:55 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> For that matter, I wonder what happens with regular function
>>> permissions.  If the plan inlines the function and then somebody goes
>>> and changes the permission on the function and makes it SECURITY
>>> DEFINER, what happens?
>>
>> ALTER FUNCTION is supposed to cause plan invalidation in such a case.
>> Not sure if GRANT plays nice with that though.
>
> And in the case of SE-Linux, this could get changed from outside the
> database.  Not sure how to handle that.  I guess we could just never
> inline anything, but that might be an overreaction.
>
We can take two standpoints.

One is that the function was already allowed to execute at plan time,
so we don't need to check it again at execution time, and it can be
inlined.

The other is that permission checks should be done at execution time,
so we never allow functions to be inlined at all.  This attitude is
stricter, but mostly an overreaction.

In my opinion, the latter is the more correct standpoint when we put
the highest priority on security.

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 11:00 AM, Kohei KaiGai  wrote:
> 2011/1/22 Robert Haas :
>> On Fri, Jan 21, 2011 at 10:46 AM, Tom Lane  wrote:
>>> Robert Haas  writes:
 On Fri, Jan 21, 2011 at 9:55 AM, Tom Lane  wrote:
> ALTER FUNCTION is supposed to cause plan invalidation in such a case.
> Not sure if GRANT plays nice with that though.
>>>
 And in the case of SE-Linux, this could get changed from outside the
 database.  Not sure how to handle that.  I guess we could just never
 inline anything, but that might be an overreaction.
>>>
>>> I think SELinux is just out of luck in that case.  If it didn't refuse
>>> execution permission at the time we checked before inlining (which we
>>> do), it doesn't get to change its mind later.
>>
>> Seems reasonable to me, if it works for KaiGai.
>>
> I assume users of SE-PostgreSQL put their first priority on security,
> not best-performance. So, I also think it is reasonable to kill a part of
> optimization for the strict security checks.
>
> Here is one request for the hook.
> needs_fmgr_hook() is called by fmgr_info_cxt_security() and routines
> to inline. I need a flag to distinct these cases, because we don't need
> to invoke all the functions via fmgr_security_definer(), even if it never
> allows to inline.

I don't want to go there, and it's not what Tom was proposing anyway.
The idea is - if the user creates a function which is NOT a trusted
procedure and executes it, and then subsequently changes the system
security policy so that it becomes a trusted procedure, the user will
be responsible for flushing the cached plans before the new value
takes effect.  That doesn't require nearly as much de-optimization, and
I don't believe it is a serious issue from a security perspective,
either.  (Note that the reverse case, where a trusted procedure is
demoted to a non-trusted procedure, isn't an issue, because we will
have suppressed inlining and the new execution will follow the right
rules, just with reduced performance.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 21.01.2011 17:57, Robert Haas wrote:

How much review have you done of parts (3) and (4)?


Not much. I'm getting there..


The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.


Yep. The API that's there now was originally suggested by me, so I 
probably won't have big complaints about it. I'll have to also look at 
the PostgreSQL and file implementations of it to see that it really fits 
the bill.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] One Role, Two Passwords

2011-01-21 Thread Garick Hamlin
On Thu, Jan 20, 2011 at 08:55:42PM -0500, Daniel Farina wrote:
> The problem is that running a reliable,
> centralized LDAP service is not justifiable as compared to role
> mangling on a per-node level, and the role mangling seems has some
> shortcomings that are negotiable with gritted teeth.

Wandering slightly off-topic here...

I often find RADIUS is an easier way to set up a highly available 
(and perhaps too flexible) AuthN/AuthZ service, but I've never used 
PG's RADIUS implementation.   I was curious and I took a look at 
the documentation...

I can't tell whether PG supports querying a secondary RADIUS server.

I don't see how I would do it with the syntax here ... 
http://developer.postgresql.org/pgdocs/postgres/auth-methods.html

Are multiple servers not supported?

Garick



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Kohei KaiGai
2011/1/22 Robert Haas :
> On Fri, Jan 21, 2011 at 10:46 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> On Fri, Jan 21, 2011 at 9:55 AM, Tom Lane  wrote:
 ALTER FUNCTION is supposed to cause plan invalidation in such a case.
 Not sure if GRANT plays nice with that though.
>>
>>> And in the case of SE-Linux, this could get changed from outside the
>>> database.  Not sure how to handle that.  I guess we could just never
>>> inline anything, but that might be an overreaction.
>>
>> I think SELinux is just out of luck in that case.  If it didn't refuse
>> execution permission at the time we checked before inlining (which we
>> do), it doesn't get to change its mind later.
>
> Seems reasonable to me, if it works for KaiGai.
>
I assume users of SE-PostgreSQL put their first priority on security,
not best performance.  So I also think it is reasonable to give up part
of the optimization in exchange for strict security checks.

Here is one request for the hook: needs_fmgr_hook() is called both by
fmgr_info_cxt_security() and by the inlining routines.  I need a flag to
distinguish these cases, because we don't need to invoke every function
via fmgr_security_definer(), even though inlining is never allowed.

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] review: FDW API

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:17 AM, Heikki Linnakangas
 wrote:
> On 18.01.2011 17:26, Shigeru HANADA wrote:
>>
>> 1) 20110118-no_fdw_perm_check.patch - This patch is not included in
>> last post.  This had been proposed on 2011-01-05 first, but maybe has
>> not been reviewd yet.  I re-propose this patch for SQL standard
>> conformance.  This patch removes permission check that requires USAGE
>> on the foreign-data wrapper at CREATE FOREIGN TABLE.
>> Please see original post for details.
>>
>> http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp
>
> Committed this part.

How much review have you done of parts (3) and (4)?  The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:43 AM, Magnus Hagander  wrote:
> On Fri, Jan 21, 2011 at 16:40, Robert Haas  wrote:
>> On Fri, Jan 21, 2011 at 10:34 AM, Magnus Hagander  
>> wrote:
>>> Um, none of the fields I've suggested so far was "connection string".
>>> In fact, that would be Pretty Darn Hard without modifying the client
>> to actually *send* the connection string. Which it doesn't.
>>
>> So... is there centralized structure which contains the info you're
>> thinking of exposing?
>
> No, not today. That's what would have to be created. (And before you
> or somebody says something, no, it's not on the CF, so this is likely
> a 9.2 feature unless that structure thingy turns out to be a lot
> *less* code than I think it will)

Well, unlike Tom, I don't object to the basic idea, but I reserve the
right to object in detail.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:33 AM, Heikki Linnakangas
 wrote:
> It's also possible that most of your transactions in fact do "set
> synchronous_replication=off", and only a few actually do synchronous
> replication. It would be pretty bad to not allow connections in that case.
> And what if you want to connect to the server to diagnose the issue? Oh, you
> can't... Besides, we're not kicking out existing connections, are we? Seems
> inconsistent to let the old connections live.
>
> IMHO the only reasonable option is to allow connections as usual, and only
> fail (or block forever) at COMMIT.

Another point is that the synchronous standby could come back at any
time.  There's no reason not to let the client do all the work they
want up until the commit - maybe the standby will pop back up before
the COMMIT is actually issued.  Or even if it doesn't, as soon as it pops
back up, all those COMMITs get released.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:46 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jan 21, 2011 at 9:55 AM, Tom Lane  wrote:
>>> ALTER FUNCTION is supposed to cause plan invalidation in such a case.
>>> Not sure if GRANT plays nice with that though.
>
>> And in the case of SE-Linux, this could get changed from outside the
>> database.  Not sure how to handle that.  I guess we could just never
>> inline anything, but that might be an overreaction.
>
> I think SELinux is just out of luck in that case.  If it didn't refuse
> execution permission at the time we checked before inlining (which we
> do), it doesn't get to change its mind later.

Seems reasonable to me, if it works for KaiGai.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] How to know killed by pg_terminate_backend

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:35 AM, Tom Lane  wrote:
> Itagaki Takahiro  writes:
>> On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii  wrote:
>>> Anyone has better idea? Tom dislikes my patch but I don't know how to
>>> deal with it.
>
>> There was another design in the past discussion:
>> One idea is postmaster sets a flag in the shared memory area
> indicating it received SIGTERM before forwarding the signal to
>> backends.
>
>> Is it enough for your purpose and do we think it is more robust way?
>
> To put this as briefly as possible: I don't want to add even one line of
> code to distinguish pg_terminate_backend from database-wide shutdown.
> That function should be a last-ditch tool, not something used on a daily
> basis.  So I disagree with the premise as much as with any particular
> implementation.

Well, that seems awfully unfriendly.

Frequency of use is beside the point - people are trying to write
client applications - like pgpool-II - that understand the behavior of
PG.  If we send the same error code in two different situations with
different behaviors, such applications have to resort to silly
workarounds to figure out what really happened.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_dump directory archive format / parallel pg_dump

2011-01-21 Thread Andrew Dunstan



On 01/21/2011 10:34 AM, Heikki Linnakangas wrote:

On 21.01.2011 15:35, Robert Haas wrote:

On Fri, Jan 21, 2011 at 4:41 AM, Heikki Linnakangas
  wrote:

There's one UI thing that bothers me. The option to specify the target
directory is called --file. But it's clearly not a file. OTOH, I'd hate
to introduce a parallel --dir option just for this. Any thoughts on this?


If we were starting over, I'd probably suggest calling the option -o,
--output.  But since -o is already taken (for --oids) I'd be inclined
to just make the help text read:

   -f, --file=FILENAME          output file (or directory) name
   -F, --format=c|t|p|d         output file format (custom, tar, text, dir)


Ok, that's exactly what the patch does now. I guess it's fine then.



Maybe we could change the hint to say "--file=DESTINATION" or 
"--file=FILENAME|DIRNAME" ?


Just a thought.

cheers

andrew





Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jan 21, 2011 at 9:55 AM, Tom Lane  wrote:
>> ALTER FUNCTION is supposed to cause plan invalidation in such a case.
>> Not sure if GRANT plays nice with that though.

> And in the case of SE-Linux, this could get changed from outside the
> database.  Not sure how to handle that.  I guess we could just never
> inline anything, but that might be an overreaction.

I think SELinux is just out of luck in that case.  If it didn't refuse
execution permission at the time we checked before inlining (which we
do), it doesn't get to change its mind later.

regards, tom lane



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Magnus Hagander
On Fri, Jan 21, 2011 at 16:40, Robert Haas  wrote:
> On Fri, Jan 21, 2011 at 10:34 AM, Magnus Hagander  wrote:
>> Um, none of the fields I've suggested so far was "connection string".
>> In fact, that would be Pretty Darn Hard without modifying the client
>> to actually *send* the connection string. Which it doesn't.
>
> So... is there centralized structure which contains the info you're
> thinking of exposing?

No, not today. That's what would have to be created. (And before you
or somebody says something, no, it's not on the CF, so this is likely
a 9.2 feature unless that structure thingy turns out to be a lot
*less* code than I think it will)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:34 AM, Magnus Hagander  wrote:
> Um, none of the fields I've suggested so far was "connection string".
> In fact, that would be Pretty Darn Hard without modifying the client
> to actually *send* the connection string. Which it doesn't.

So... is there a centralized structure which contains the info you're
thinking of exposing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Heikki Linnakangas

On 21.01.2011 15:24, Simon Riggs wrote:

On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:

* it seems like overkill to not let clients even connect when
allow_standalone_primary=off and no synchronous standbys are available.
What if you just want to run a read-only query?


That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.


It's also possible that most of your transactions in fact do "set 
synchronous_replication=off", and only a few actually do synchronous 
replication. It would be pretty bad to not allow connections in that 
case. And what if you want to connect to the server to diagnose the 
issue? Oh, you can't... Besides, we're not kicking out existing 
connections, are we? Seems inconsistent to let the old connections live.


IMHO the only reasonable option is to allow connections as usual, and 
only fail (or block forever) at COMMIT.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:32 AM, Tom Lane  wrote:
> Simon Riggs  writes:
>> On Fri, 2011-01-21 at 11:19 +0200, Heikki Linnakangas wrote:
>>> It's not the order in which the xid was assigned that matters, but the
>>> order the transactions started and got their snapshots. The xids might
>>> be assigned a lot later, after the transactions have already read data.
>
>> So if a read-write transaction assigns an xid before it takes a snapshot
>> then we'll be OK? That seems much easier to arrange than passing chunks
>> of snapshot data backwards and forwards. Optionally.
>
> No, that idea is DOA from a performance standpoint.  We sweated blood to
> avoid having to assign XIDs to read-only transactions, and we're not
> going back.  If SSI requires that, SSI is not getting committed.

So far I think all of the ideas proposed for generalizing this across
the master-standby connection seem likely to be DOA from a performance
perspective.  But I think we have a pretty broad consensus that it's
OK to punt this issue for 9.1.  We can always add this in 9.2 if it
can be demonstrated to work well, but it's all vapor-ware right now
anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] How to know killed by pg_terminate_backend

2011-01-21 Thread Tom Lane
Itagaki Takahiro  writes:
> On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii  wrote:
>> Anyone has better idea? Tom dislikes my patch but I don't know how to
>> deal with it.

> There was another design in the past discussion:
> One idea is postmaster sets a flag in the shared memory area
> indicating it received SIGTERM before forwarding the signal to
> backends.

> Is it enough for your purpose and do we think it is more robust way?

To put this as briefly as possible: I don't want to add even one line of
code to distinguish pg_terminate_backend from database-wide shutdown.
That function should be a last-ditch tool, not something used on a daily
basis.  So I disagree with the premise as much as with any particular
implementation.

regards, tom lane



Re: [HACKERS] pg_dump directory archive format / parallel pg_dump

2011-01-21 Thread Heikki Linnakangas

On 21.01.2011 15:35, Robert Haas wrote:

On Fri, Jan 21, 2011 at 4:41 AM, Heikki Linnakangas
  wrote:

There's one UI thing that bothers me. The option to specify the target
directory is called --file. But it's clearly not a file. OTOH, I'd hate to
introduce a parallel --dir option just for this. Any thoughts on this?


If we were starting over, I'd probably suggest calling the option -o,
--output.  But since -o is already taken (for --oids) I'd be inclined
to just make the help text read:

   -f, --file=FILENAME      output file (or directory) name
   -F, --format=c|t|p|d     output file format (custom, tar, text, dir)


Ok, that's exactly what the patch does now. I guess it's fine then.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Magnus Hagander
On Fri, Jan 21, 2011 at 16:32, Robert Haas  wrote:
> On Fri, Jan 21, 2011 at 10:14 AM, Magnus Hagander  wrote:
>> On Fri, Jan 21, 2011 at 15:51, Tom Lane  wrote:
>>> Magnus Hagander  writes:
 I came across a case this week where I wanted to be able to determine
 more detailed auth information on already logged in sessions - not
 from the client, but from the server. In this specific case, I wanted
 to examine the "is ssl" flag on the connection. But I can see other
 things being interesting, such as which user is on the other end (when
 pg_ident is in use), more detailed SSL information, full kerberos
 principal when kerberos in use etc.
>>>
 I doubt this is common enough to want to stick it in pg_stat_activity
 though, but what do people think? And if not there, as a separate view
 or just as a function to call (e.g.
 pg_get_detailed_authinfo())
>>>
>>> By and large, it's been thought to be a possible security hole to expose
>>> such information, except possibly in the postmaster log.  I'm certainly
>>> *not* in favor of creating a view for it.
>>
>> Well, it would obviously be superuser only.
>
> What if the user's password is in their connection string?

Um, none of the fields I've suggested so far was "connection string".
In fact, that would be Pretty Darn Hard without modifying the client
to actually *send* the connection string. Which it doesn't.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Is there a way to build PostgreSQL client libraries with MinGW

2011-01-21 Thread Magnus Hagander
On Fri, Jan 21, 2011 at 16:24, Andrew Dunstan  wrote:
>
>
> On 01/21/2011 05:24 AM, Magnus Hagander wrote:
>>>
>>> That advice needs to be taken with a grain or two of salt. First, while
>>> you
>>> probably should not use Cygwin postgres as a production server, it is
>>> still
>>> the best way to run psql on Windows that I know of. And second, the stuff
>>
>> Yeah, I agree for psql the client tool (though it used to suck badly
>> if you were in a non-english locale, but they may have fixed that).
>> But not for PostgreSQL the full product. I guess we could add a
>> sentence about the client side, but it needs to be clear that the
>> non-sucky part only applies to the client.
>
>
> It's not so bad it can't be used for development, and I have known people
> who do that, and indeed I have deployed one very complex app developed in
> just that way.
>
> More importantly from my POV, there is no support in the buildfarm for just
> building the client side, and I have no intention of providing it. So it's
> not insignificant for us to be able to continue supporting a complete build
> on Cygwin, however much you dislike it.

That's certainly a reasonable argument. And I don't mind supporting a
complete build env for it either - as long as *I* don't have to do it.
And you seem to be doing a good job at it.

>>> about not being able to generate 64-bit binaries with Mingw is no longer
>>> true (that's why it's no longer called Mingw32), although it is true that
>>> nobody I know has yet tried to do so. It's on my long TODO list, and well
>>> worth doing. (Relying on one compiler is the techno equivalent of
>>> monolingualism, which my sister's bumper sticker used to tell me is a
>>> curable condition.)
>>
>> It's true from the perspective of *postgresql* - you can't use those
>> compilers to generate 64-bit binaries of PostgreSQL. And it's referring
>> to "these builds", not the compiler itself.
>>
>> And I'm certainly not going to stand in the way of somebody adding
>> build support for it if they (you or others) want to spend time on it
>> - that patch should just include an update to that documentation
>> paragraph, of course.
>>
>> Personally, I'm going to put what time I can put into "windows build
>> system updates" into making us work with VS 2010 because I find that
>> more important - but that's just me personally.
>>
>
>
> VS2010 is important, no doubt. But clearly there's some demand for continued
> Mingw support, hence the OP's question.
>
> As I've remarked before, I think we should support as many build
> platforms/environments as we can.

Definitely agreed - as long as it doesn't mean we have to avoid adding
useful features because a specific compiler/env can't deal with it.
Which we've been reasonably able to avoid so far.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Tom Lane
Simon Riggs  writes:
> On Fri, 2011-01-21 at 11:19 +0200, Heikki Linnakangas wrote:
>> It's not the order in which the xid was assigned that matters, but the 
>> order the transactions started and got their snapshots. The xids might 
>> be assigned a lot later, after the transactions have already read data.

> So if a read-write transaction assigns an xid before it takes a snapshot
> then we'll be OK? That seems much easier to arrange than passing chunks
> of snapshot data backwards and forwards. Optionally.

No, that idea is DOA from a performance standpoint.  We sweated blood to
avoid having to assign XIDs to read-only transactions, and we're not
going back.  If SSI requires that, SSI is not getting committed.

regards, tom lane



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:14 AM, Magnus Hagander  wrote:
> On Fri, Jan 21, 2011 at 15:51, Tom Lane  wrote:
>> Magnus Hagander  writes:
>>> I came across a case this week where I wanted to be able to determine
>>> more detailed auth information on already logged in sessions - not
>>> from the client, but from the server. In this specific case, I wanted
>>> to examine the "is ssl" flag on the connection. But I can see other
>>> things being interesting, such as which user is on the other end (when
>>> pg_ident is in use), more detailed SSL information, full kerberos
>>> principal when kerberos in use etc.
>>
>>> I doubt this is common enough to want to stick it in pg_stat_activity
>>> though, but what do people think? And if not there, as a separate view
>>> or just as a function to call (e.g.
>>> pg_get_detailed_authinfo())
>>
>> By and large, it's been thought to be a possible security hole to expose
>> such information, except possibly in the postmaster log.  I'm certainly
>> *not* in favor of creating a view for it.
>
> Well, it would obviously be superuser only.

What if the user's password is in their connection string?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Is there a way to build PostgreSQL client libraries with MinGW

2011-01-21 Thread Andrew Dunstan



On 01/21/2011 05:24 AM, Magnus Hagander wrote:


That advice needs to be taken with a grain or two of salt. First, while you
probably should not use Cygwin postgres as a production server, it is still
the best way to run psql on Windows that I know of. And second, the stuff

Yeah, I agree for psql the client tool (though it used to suck badly
if you were in a non-english locale, but they may have fixed that).
But not for PostgreSQL the full product. I guess we could add a
sentence about the client side, but it needs to be clear that the
non-sucky part only applies to the client.



It's not so bad it can't be used for development, and I have known 
people who do that, and indeed I have deployed one very complex app 
developed in just that way.


More importantly from my POV, there is no support in the buildfarm for 
just building the client side, and I have no intention of providing it. 
So it's not insignificant for us to be able to continue supporting a 
complete build on Cygwin, however much you dislike it.






about not being able to generate 64-bit binaries with Mingw is no longer
true (that's why it's no longer called Mingw32), although it is true that
nobody I know has yet tried to do so. It's on my long TODO list, and well
worth doing. (Relying on one compiler is the techno equivalent of
monolingualism, which my sister's bumper sticker used to tell me is a
curable condition.)

It's true from the perspective of *postgresql* - you can't use those
compilers to generate 64-bit binaries of PostgreSQL. And it's referring
to "these builds", not the compiler itself.

And I'm certainly not going to stand in the way of somebody adding
build support for it if they (you or others) want to spend time on it
- that patch should just include an update to that documentation
paragraph, of course.

Personally, I'm going to put what time I can put into "windows build
system updates" into making us work with VS 2010 because I find that
more important - but that's just me personally.




VS2010 is important, no doubt. But clearly there's some demand for 
continued Mingw support, hence the OP's question.


As I've remarked before, I think we should support as many build 
platforms/environments as we can.


cheers

andrew



Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-21 Thread Tom Lane
Fujii Masao  writes:
> Probably true. But we cannot create a tablespace in a root-owned directory.
> The directory must be owned by the PostgreSQL system user. So ISTM you are
> saying that creating a tablespace on a mount point itself is a security
> hole.

Generally, the root user would have to mount the filesystem and then
create a Postgres-owned directory under it, yes.  This is a feature not
a bug.

>> In the case you sketch, there would be nothing to stop the (non root)
>> postgres user from renaming $PGDATA/mnt to something else and then
>> inserting his own trojan-horse directories.

> Hmm.. can a non-root postgres user really rename the root-owned directory
> while it's being mounted?

If you have write privilege on the parent directory, you can rename any
filesystem entry.

>> Moreover, I see no positive *good* reason to do it.  There isn't
>> anyplace under $PGDATA that users should be randomly creating
>> directories, much less mount points.

> When taking a base backup, you don't need to take a backup of tablespaces
> separately from that of $PGDATA. You have only to take a backup of $PGDATA.

Doesn't work, and doesn't tell you it didn't work, if the mount point
isn't mounted.  I believe "what happens if the secondary filesystem
isn't mounted" is exactly one of the basic reasons for the
mount-points-must-be-owned-by-root rule.  Otherwise, applications may
scribble directly on the / drive, which results in serious problems when
the mount eventually comes back.  There's an example in our archives
(from Joe Conway if memory serves) about someone destroying their
database that way.
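The failure mode described here can also be guarded against from the application side: before writing into a directory that is supposed to be a separate filesystem, check whether it is actually mounted. A minimal Python sketch of that idea (the helper name is hypothetical; PostgreSQL itself performs no such check):

```python
import os

def tablespace_dir_is_mounted(path):
    """Return True only if `path` exists and is the root of a mounted
    filesystem.  If the mount is missing, writes under `path` would
    silently land on the parent (e.g. /) filesystem instead."""
    # os.path.ismount() compares the device number of `path` with that
    # of its parent directory; they differ exactly at a mount point.
    return os.path.isdir(path) and os.path.ismount(path)

# "/" is by definition always a mount point; a missing or unmounted
# directory fails the check.
```

A backup script could refuse to proceed when this returns False, rather than quietly archiving an empty directory on the root filesystem.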

regards, tom lane



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 9:55 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> For that matter, I wonder what happens with regular function
>> permissions.  If the plan inlines the function and then somebody goes
>> and changes the permission on the function and makes it SECURITY
>> DEFINER, what happens?
>
> ALTER FUNCTION is supposed to cause plan invalidation in such a case.
> Not sure if GRANT plays nice with that though.

And in the case of SE-Linux, this could get changed from outside the
database.  Not sure how to handle that.  I guess we could just never
inline anything, but that might be an overreaction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 18.01.2011 17:26, Shigeru HANADA wrote:

1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewed yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.
http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp


Committed this part.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Magnus Hagander
On Fri, Jan 21, 2011 at 15:51, Tom Lane  wrote:
> Magnus Hagander  writes:
>> I came across a case this week where I wanted to be able to determine
>> more detailed auth information on already logged in sessions - not
>> from the client, but from the server. In this specific case, I wanted
>> to examine the "is ssl" flag on the connection. But I can see other
>> things being interesting, such as which user is on the other end (when
>> pg_ident is in use), more detailed SSL information, full kerberos
>> principal when kerberos in use etc.
>
>> I doubt this is common enough to want to stick it in pg_stat_activity
>> though, but what do people think? And if not there, as a separate view
>> or just as a function to call (e.g.
>> pg_get_detailed_authinfo())
>
> By and large, it's been thought to be a possible security hole to expose
> such information, except possibly in the postmaster log.  I'm certainly
> *not* in favor of creating a view for it.

Well, it would obviously be superuser only.

Would you object to a function as well?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Nicolas Barbier
2011/1/21 Robert Haas :

> On Fri, Jan 21, 2011 at 8:05 AM, Nicolas Barbier
>  wrote:
>
>> 2011/1/21 Anssi Kääriäinen :
>>
>>> Sorry for bothering all of you, but I just don't get this. What if T2 rolls
>>> back instead of committing? Then the snapshot of T3 would have been valid,
>>> right? Now, for the snapshot of T3 it doesn't matter if T2 commits or if it
>>> doesn't, because it can't see the changes of T2 in any case. Thus, it would
>>> seem that the snapshot is valid. On the other hand I can't see anything
>>> wrong in the logic in your post. What am I missing? I am feeling stupid...
>>>
>>> At least for dumps I don't see how T2 can matter (assuming T3 is the
>>> pg_dump's snapshot). Because if you reload from the dump, T2 never happened
>>> in that dump. In the reloaded database it just did not exist at all.
>>
>> This has been discussed before; in [1] I summarized:
>>
>> "IOW, one could say that the backup is consistent only if it were
>> never compared against the system as it continued running after the
>> dump took place."
>
> But that's a pretty fair way to look at it, isn't it?

Indeed, I just wanted to point Anssi to the previous discussion.

> It occurs to me that focusing on how this is going to work on Hot
> Standby might be looking at the question too narrowly.  The general
> issue is - does this technique generalize to a distributed computing
> environment, with distributed transactions across multiple PostgreSQL
> databases?  For example, what if the control record in Kevin's example
> is stored in another database, or on another server.  Or what if some
> tables are replicated via Slony?  I realize this is all outside the
> scope of the patch, but that's exactly the point: making this stuff
> work across multiple databases (even if they are replicas of each
> other) is much more complex than getting it to work on just one
> machine.  Even if we could agree on how to do it, coming up with some
> hack that can only ever possibly work in the Hot Standby case might
> not be the best thing to do.

You seem to be questioning whether the normal (?) way of using 2PC on
multiple DBs (just start transactions, and let 2PC coordinate the
commits) that all use SERIALIZABLE isolation mode always results in
global serializable behavior.

I must say that I don't immediately see the answer (my gut feeling
says "nope"), but it sure is an interesting question.

In the special case where all databases use SS2PL as their concurrency
mechanism, the answer is "yes". (Because any "possibly conflicting"
local transactions of any global transactions that touch the same
tables ("conflict") will necessarily commit in some globally
consistent order, which in the case of SS2PL is consistent with the
"must-have-happened-before-in-any-serialized-order" relation, i.e.,
using the terminology from the literature: "because SS2PL is CO".)

I don't know whether it is generally considered a requirement for a
2PC environment to guarantee global serializability.

Nicolas



Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-21 Thread Bruce Momjian
Robert Haas wrote:
> On Thu, Jan 20, 2011 at 4:40 PM, Simone Aiken
>  wrote:
> > After playing with this in benchmarks and researching the weird results I
> > got I'm going to advise dropping the todo for now unless something happens
> > to change how postgres handles clustering.
> 
> I agree, let's remove it.
> 
> That having been said, analyzing TODO items to figure out which ones
> are worthless is a useful thing to do, so please feel free to keep at
> it.

OK, removed.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
Robert Haas  wrote:
> On Fri, Jan 21, 2011 at 8:05 AM, Nicolas Barbier
 
>> This has been discussed before; in [1] I summarized:
>>
>> "IOW, one could say that the backup is consistent only if it were
>> never compared against the system as it continued running after
>> the dump took place."
> 
> But that's a pretty fair way to look at it, isn't it?  I mean, I
> guess it's a question of what you plan to use that backup for, but
> if it's disaster recovery, everything that happened after the dump
> is gone, so no such comparison will occur.  And that's probably
> the most common reason for taking a dump.
 
It's not, however, a reason for having a hot standby (versus a warm
standby or PITR backup).
 
> It occurs to me that focusing on how this is going to work on Hot
> Standby might be looking at the question too narrowly.  The
> general issue is - does this technique generalize to a distributed
> computing environment, with distributed transactions across
> multiple PostgreSQL databases?
 
No, and I can pretty much guarantee that you can't have such a
solution without blocking on all masters at commit time.  What
you're suggesting goes *way* beyond two phase commit, which just
guarantees the integrity rules of each database are honored and
that all transactions either commit or don't.  You're talking about
sharing lock information across high-latency communication links
which in SSI are communicated via LW locking.  Expect any such
generalized "and world peace!" solution to be rather slow.
 
> For example, what if the control record in Kevin's example is
> stored in another database, or on another server.  Or what if some
> tables are replicated via Slony?  I realize this is all outside
> the scope of the patch
 
Yep.  Again, the patch achieves true serializability with minimal
cost and *no blocking*.  Spend a few minutes thinking about how you
might coordinate what you propose, and you'll see it's going to
involve blocking based on waiting for messages from across the wire.
 
> but that's exactly the point: making this stuff work across
> multiple databases (even if they are replicas of each other) is
> much more complex than getting it to work on just one machine. 
> Even if we could agree on how to do it, coming up with some hack
> that can only ever possibly work in the Hot Standby case might not
> be the best thing to do.
 
I don't see it as a hack.  It's the logical extension of SSI onto
read only replicas.  If you're looking for something more than that
(as the above suggests), it's not a good fit; but I suspect that
there are people besides me who would want to use hot standby for
reporting and read only web access who would want a serializable
view.  What this proposal does is to say that there are two time
streams to look at on the standby -- how far along you are for
purposes of recovery, and how far along you are for purposes of
seeing a view of the data sure to be consistent with the later state of
the master.  With SSI they can't be the same.  If someone wants them
to be, they could implement a traditional S2PL serializable mode,
complete with blocking and deadlocks, and then you'd have it
automatically on the replicas, because with S2PL the apparent order
of execution matches the commit order.
 
-Kevin



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Tom Lane
Robert Haas  writes:
> For that matter, I wonder what happens with regular function
> permissions.  If the plan inlines the function and then somebody goes
> and changes the permission on the function and makes it SECURITY
> DEFINER, what happens?

ALTER FUNCTION is supposed to cause plan invalidation in such a case.
Not sure if GRANT plays nice with that though.

regards, tom lane



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Anssi Kääriäinen

On 01/21/2011 02:21 PM, Florian Pflug wrote:

Still, the dump would reflect a database state that *logically* never existed 
(i.e. not in any serial schedule). If you dump for disaster recovery, you might 
not care. If you dump to copy the data onto some reporting server you might.

best regards,
Florian Pflug


I am beginning to understand the problem. If you don't mind, here is a 
complete example in case somebody else is having trouble understanding this.


Let's say we have tables D1 and D2. Both contain a single column, id, 
and a single row. The data in the beginning is as follows:


D1:  id = 1
D2:  id = 1

The constraints: D1.id can only be incremented. Whenever D2.id is 
updated, it must be updated to D1.id + 1.


The transactions:
T1: begin; update D1 set id = id + 1;
T2: begin; update D2 set id = (select id+1 from D1);
T1: commit;
T3: begin; select id from D1; select id from D2; commit; Data seen: (2, 1) -- this is a possible state

T2: commit;
T4: begin; select id from D1; select id from D2; commit; Data seen: (2, 2)
This is again a possible state. But if we compare it to the state seen 
by T3, it is not valid: from state (2, 1) we cannot get to state (2, 2) 
without breaking one of the constraints. Thus, the state seen by T3 is 
not valid in the database.
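The pair of observed states can be checked mechanically. A small sketch (plain Python rather than SQL; the function name is hypothetical, and it makes the simplifying assumption that any D2 update between two observations happened while D1.id lay between its two observed values):

```python
def consistent_step(before, after):
    """Can the database legally move from `before` to `after`?
    Each state is an observed (D1.id, D2.id) pair."""
    d1_old, d2_old = before
    d1_new, d2_new = after
    if d1_new < d1_old:
        return False                     # D1.id may only be incremented
    if d2_new != d2_old:
        # D2.id must have been set to (then-current D1.id) + 1, and
        # D1.id lay somewhere between d1_old and d1_new at that moment.
        return d1_old + 1 <= d2_new <= d1_new + 1
    return True

# Serial schedules pass: (1,1)->(2,1) is T1 alone, (1,1)->(2,2) is T1
# then T2.  But the observed pair fails: from T3's (2,1), any D2
# update must produce at least 3, never 2.
```

Running the check on T3's view (2, 1) followed by T4's view (2, 2) returns False, which is exactly the anomaly described above.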


So, I finally got it! :-) I hope this example will help somebody else 
understand the problem. The problem I had understanding this was that 
the state in T3 is in fact perfectly valid. I thought that there must be 
some problem with that state alone. There isn't, unless you compare it 
to the state after T2 has committed.


Thanks to all explaining this to me,
 - Anssi



Re: [HACKERS] More detailed auth info

2011-01-21 Thread Tom Lane
Magnus Hagander  writes:
> I came across a case this week where I wanted to be able to determine
> more detailed auth information on already logged in sessions - not
> from the client, but from the server. In this specific case, I wanted
> to examine the "is ssl" flag on the connection. But I can see other
> things being interesting, such as which user is on the other end (when
> pg_ident is in use), more detailed SSL information, full kerberos
> principal when kerberos in use etc.

> I doubt this is common enough to want to stick it in pg_stat_activity
> though, but what do people think? And if not there, as a separate view
> or just as a function to call (e.g.
> pg_get_detailed_authinfo())

By and large, it's been thought to be a possible security hole to expose
such information, except possibly in the postmaster log.  I'm certainly
*not* in favor of creating a view for it.

regards, tom lane



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
"Kevin Grittner"  wrote:
 
> (1)  A read write transaction might need to be canceled to
> prevent the view of the data a committed read only transaction has
> already seen from becoming inconsistent.  (Dan's example)
 
And this one seems entirely a theoretical possibility.  I spent a
little time looking it over, and I don't see how it could be made to
work from hot standbys without an unbounded flow of predicate lock
information from all standbys to the master *plus* blocking commits
on the master for the duration of the longest round trip latency to
any standby.  I think we can call this one dead on arrival.
 
-Kevin



[HACKERS] More detailed auth info

2011-01-21 Thread Magnus Hagander
I came across a case this week where I wanted to be able to determine
more detailed auth information on already logged in sessions - not
from the client, but from the server. In this specific case, I wanted
to examine the "is ssl" flag on the connection. But I can see other
things being interesting, such as which user is on the other end (when
pg_ident is in use), more detailed SSL information, full kerberos
principal when kerberos in use etc.

I doubt this is common enough to want to stick it in pg_stat_activity
though, but what do people think? And if not there, as a separate view
or just as a function to call (e.g.
pg_get_detailed_authinfo())

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] SQL/MED - file_fdw

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 8:59 AM, Itagaki Takahiro
 wrote:
> On Fri, Jan 21, 2011 at 22:12, Shigeru HANADA  
> wrote:
>>> My concern is the explainInfo interface is not ideal for the purpose
>>> and therefore it will be unstable interface. If we support nested plans
>>> in FDWs, each FDW should receive a tree writer used internally in
>>> explain.c. explainInfo, that is a plan text, is not enough for complex
>>> FdwPlans. However, since we don't have any better solution for now,
>>> we could have the variable for 9.1. It's much better than nothing.
>>
>> When I was writing file_fdw, I hoped to use static functions in
>> explain.c such as ExplainProperty() to handle complex information.
>> Even for single plan node, I think that filename and size (currently
>> they are printed in a plain text together) should be separated in the
>> output of explain, especially when the format was XML or JSON.
>
> Just an idea -- we could return complex node trees with explainInfo
> if we use XML or JSON for the format. For example, pgsql_fdw can
> return the result from "EXPLAIN (FORMAT json)" without modification.
>
> It might be one of the reasons we should support JSON in the core :)

Nice try, but I think that'd be a real drag.  You wouldn't want to
return JSON when the explain format is text, or XML.

I think we probably need to modify the EXPLAIN code so that FDWs get a
chance to inject their own custom properties into the output, but I
don't know that we need to get that done right this minute.  We can
ship something really crude/basic for 9.1, if need be, and fix this up
for 9.2.  Of course if it turns out that getting EXPLAIN working the
way we'd like is really easy, then we can just do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sepgsql contrib module

2011-01-21 Thread Robert Haas
2011/1/21 KaiGai Kohei :
> - Add checks to avoid inlining function without db_procedure:{execute}
>  permission. Sorry, process:{transition} shall be checked in other place.

Hrm.  What happens if permissions change between plan time and execution time?

For that matter, I wonder what happens with regular function
permissions.  If the plan inlines the function and then somebody goes
and changes the permission on the function and makes it SECURITY
DEFINER, what happens?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SQL/MED - file_fdw

2011-01-21 Thread Itagaki Takahiro
On Fri, Jan 21, 2011 at 22:12, Shigeru HANADA  wrote:
>> My concern is the explainInfo interface is not ideal for the purpose
>> and therefore it will be unstable interface. If we support nested plans
>> in FDWs, each FDW should receive a tree writer used internally in
>> explain.c. explainInfo, that is a plan text, is not enough for complex
>> FdwPlans. However, since we don't have any better solution for now,
>> we could have the variable for 9.1. It's much better than nothing.
>
> When I was writing file_fdw, I hoped to use static functions in
> explain.c such as ExplainProperty() to handle complex information.
> Even for single plan node, I think that filename and size (currently
> they are printed in a plain text together) should be separated in the
> output of explain, especially when the format was XML or JSON.

Just an idea -- we could return complex node trees with explainInfo
if we use XML or JSON for the format. For example, pgsql_fdw can
return the result from "EXPLAIN (FORMAT json)" without modification.

It might be one of the reasons we should support JSON in the core :)
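As a reference point, the tree that EXPLAIN (FORMAT json) emits is already structured JSON that a client can walk directly; a minimal illustrative sketch in Python, using a hand-written, abridged stand-in for real EXPLAIN output (the table names are made up):

```python
import json

# Hand-written, abridged stand-in for EXPLAIN (FORMAT json) output.
explain_json = """
[
  {
    "Plan": {
      "Node Type": "Hash Join",
      "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "d1"},
        {"Node Type": "Hash",
         "Plans": [{"Node Type": "Seq Scan", "Relation Name": "d2"}]}
      ]
    }
  }
]
"""

def node_types(plan):
    """Depth-first walk of a plan tree, yielding each node's type."""
    yield plan["Node Type"]
    for child in plan.get("Plans", []):
        yield from node_types(child)

top = json.loads(explain_json)[0]["Plan"]
print(list(node_types(top)))
# ['Hash Join', 'Seq Scan', 'Hash', 'Seq Scan']
```

The point being that an FDW which simply forwarded such a fragment would hand the client a ready-made tree, whereas in text mode it would have to be re-rendered.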

-- 
Itagaki Takahiro



Re: [HACKERS] [GENERAL] Large object corruption during 'piped' pg_restore

2011-01-21 Thread Vick Khera
On Thu, Jan 20, 2011 at 6:14 PM, Tom Lane  wrote:
> So I'm not sure whether to fix it, or leave it as a known failure case
> in old branches.  Comments?

Since there is a workaround, I think it is best to document it and
leave it as-is.



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 8:05 AM, Nicolas Barbier
 wrote:
> 2011/1/21 Anssi Kääriäinen :
>
>> Sorry for bothering all of you, but I just don't get this. What if T2 rolls
>> back instead of committing? Then the snapshot of T3 would have been valid,
>> right? Now, for the snapshot of T3 it doesn't matter if T2 commits or if it
>> doesn't, because it can't see the changes of T2 in any case. Thus, it would
>> seem that the snapshot is valid. On the other hand I can't see anything
>> wrong in the logic in your post. What am I missing? I am feeling stupid...
>>
>> At least for dumps I don't see how T2 can matter (assuming T3 is the
>> pg_dump's snapshot). Because if you reload from the dump, T2 never happened
>> in that dump. In the reloaded database it just did not exist at all.
>
> This has been discussed before; in [1] I summarized:
>
> "IOW, one could say that the backup is consistent only if it were
> never compared against the system as it continued running after the
> dump took place."

But that's a pretty fair way to look at it, isn't it?  I mean, I guess
it's a question of what you plan to use that backup for, but if it's
disaster recovery, everything that happened after the dump is gone, so
no such comparison will occur.  And that's probably the most common
reason for taking a dump.

It occurs to me that focusing on how this is going to work on Hot
Standby might be looking at the question too narrowly.  The general
issue is - does this technique generalize to a distributed computing
environment, with distributed transactions across multiple PostgreSQL
databases?  For example, what if the control record in Kevin's example
is stored in another database, or on another server.  Or what if some
tables are replicated via Slony?  I realize this is all outside the
scope of the patch, but that's exactly the point: making this stuff
work across multiple databases (even if they are replicas of each
other) is much more complex than getting it to work on just one
machine.  Even if we could agree on how to do it, coming up with some
hack that can only ever possibly work in the Hot Standby case might
not be the best thing to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Kevin Grittner
Heikki Linnakangas  wrote:
 
> It's not the order in which the xid was assigned that matters, but
> the order the transactions started and got their snapshots. The
> xids might be assigned a lot later, after the transactions have
> already read data.
 
From the "Apparent Serial Order of Execution" sub-section of the
Overview section of the Wiki page:
 
http://wiki.postgresql.org/wiki/Serializable#Apparent_Serial_Order_of_Execution
 
| The order in which transactions appear to have executed is
| determined by something more subtle than in S2PL: read/write
| dependencies. If a transaction attempts to read data which is not
| visible to it because the transaction which wrote it (or will later
| write it) is concurrent (one of them was running when the other
| acquired its snapshot), then the reading transaction appears to
| have executed first, regardless of the actual sequence of
| transaction starts or commits (since it sees a database state prior
| to that in which the other transaction leaves it).
 
It is the fact that it generates results consistent with some serial
order of execution which makes this truly serializable.  With SSI
that order isn't tied to transaction start or transaction commit when
transactions overlap -- the one which *appears* to have run first need
not have been the first to start *or* the first to commit.  And as Dan
pointed out with a repeatable example, a read only transaction can
contribute to a cycle in the apparent order of execution, even if it commits
ahead of one of the writers.
 
There really are only two ways a read only transaction can be
guaranteed to see consistent data under SSI:
 
(1)  A read write transaction might need to be cancelled to prevent
the view of the data a committed read only transaction has already
seen from becoming inconsistent.  (Dan's example)
 
(2)  The read only transaction needs to get a snapshot which the
master has determined is safe.
 
There really aren't any other alternatives under SSI.  The question
is which we support on hot standbys and how we achieve it.  Or
whether we just don't, I guess; but that doesn't seem very
satisfactory as a long-term solution.
 
-Kevin



Re: [HACKERS] pg_dump directory archive format / parallel pg_dump

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 4:41 AM, Heikki Linnakangas
 wrote:
> There's one UI thing that bothers me. The option to specify the target
> directory is called --file. But it's clearly not a file. OTOH, I'd hate to
> introduce a parallel --dir option just for this. Any thoughts on this?

If we were starting over, I'd probably suggest calling the option -o,
--output.  But since -o is already taken (for --oids) I'd be inclined
to just make the help text read:

  -f, --file=FILENAME      output file (or directory) name
  -F, --format=c|t|p|d     output file format (custom, tar, text, dir)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Magnus Hagander
On Fri, Jan 21, 2011 at 14:24, Simon Riggs  wrote:
> On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
>> * it seems like overkill not to let clients even connect when
>> allow_standalone_primary=off and no synchronous standbys are available.
>> What if you just want to run a read-only query?
>
> That's what Aidan requested, I agreed and so it's there. You're using
> sync rep because of writes, so you have a read-write app. If you allow
> connections then half of the app will work, half will not. Half-working
> isn't very useful, as Aidan eloquently explained. If your app is all
> read-only you wouldn't be using sync rep anyway. That's the argument,
> but I've not got especially strong feelings it has to be this way.
>
> Perhaps discuss that on a separate thread? See what everyone thinks?

I'll respond here once, and we'll see if more people want to comment
then we can move it :-)

Doesn't this make a pretty strange assumption - namely that you have a
single application? We support multiple databases, and multiple users,
and multiple pretty much anything - in most cases, people deploy
multiple apps. (They may well be part of the same "solution" or
whatever you want to call it, but parts may well be readonly - like a
reporting app, or even just a monitoring client)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 7:45 AM, Heikki Linnakangas
 wrote:
> * it seems like overkill not to let clients even connect when
> allow_standalone_primary=off and no synchronous standbys are available. What
> if you just want to run a read-only query?

For what it's worth, +1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Error code for "terminating connection due to conflict with recovery"

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 7:48 AM, Simon Riggs  wrote:
> Ah, thanks Florian. Now I understand. There are two related issues here.
>
> 1. The discussion around ERRCODE_ADMIN_SHUTDOWN is incorrect and the
> specific patch should be rejected as is. No changes are required in
> ProcessInterrupts(), nor new errcodes.

Can you please justify that statement instead of simply asserting it?
Tatsuo-san and I both seem to agree that it looks wrong.
ERRCODE_ADMIN_SHUTDOWN is in class 57, operator intervention, and it's
used elsewhere when a SIGTERM is received and the database is shutting
down.  That's a world away from what's actually happening here.
Wanting to have a different error code for this type of failure may
make sense, but that doesn't mean that this is the right one.

> 2. Robert is correct that CheckRecoveryConflictDeadlock() returns
> ERRCODE_QUERY_CANCELED. Thanks to Florian for noting that we had
> switched away from the original discussion onto another part of the
> code, which confused me. I agree the use of ERRCODE_QUERY_CANCELED is a
> mistake; CheckRecoveryConflictDeadlock() should return
> ERRCODE_T_R_DEADLOCK_DETECTED. This was an omission from my commit of 12
> May 2010.

This part sounds good.

> This should be backpatched to 9.0.

Hmm, I don't necessarily agree.  The standard for changing behavior in
an existing release is fairly high.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Simon Riggs
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
> (grr, I wrote this on Monday already, but just found it in my drafts 
> folder, unsent)

No worries, thanks for commenting.

> Thanks! Some quick observations after first read-through:
> 
> * The docs for synchronous_replication still claim that it means two 
> different things in master and standby. Looking at the code, I believe 
> that's not true anymore.

Probably. The docs changed so many times I had gone "code-blind".

> * it seems like overkill not to let clients even connect when
> allow_standalone_primary=off and no synchronous standbys are available.
> What if you just want to run a read-only query?

That's what Aidan requested, I agreed and so it's there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.

Perhaps discuss that on a separate thread? See what everyone thinks?

> * Please separate the hot standby feedback loop into a separate patch on 
> top of the synch rep patch. I know it's not a lot of code, but it's 
> still easier to handle features separately.

I tried to do that initially, but there is interaction between those
features. The way I have it is that the replies from the standby act as
keepalives to the master. So the hot standby feedback is just an extra
parameter and an extra field. Removing that doesn't really make the
patch any easier to understand.

> * The UI differs from what was agreed on here: 
> http://archives.postgresql.org/message-id/4d1dcf5a.7070...@enterprisedb.com.

You mean synchronous_standbys is not there yet? Yes, I know. It can be
added after we commit this; it's only a small bit of code with no
dependencies. I figured we had bigger things to agree on first.

> * Instead of the short-circuit for autovacuum in SyncRepWaitOnQueue(), 
> it's probably better to set synchronous_commit=off locally when the 
> autovacuum process starts.

Even better plan, thanks.

> * the "queue id" thing is dead code at the moment, as there is only one 
> queue. I gather this is a leftover from having different queues for 
> "apply", "sync", "write" modes, but I think it would be better to just 
> remove it for now.

It's a trivial patch to add options to either fsync or apply, so I was
expecting to add that back in this release also.

> PS, I'm surprised how small this patch is. Thinking about it some more, 
> I don't know why I expected this to be a big patch.

Yes, it's the decisions which seem fairly big this time.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] REVIEW: "writable CTEs" - doc patch

2011-01-21 Thread Magnus Hagander
On Thu, Jan 20, 2011 at 03:48, Peter Geoghegan
 wrote:
> I think that a major goal of the DocBook format is that it separates
> content from presentation, so whatever tool is used to render that
> content as HTML for .org isn't necessarily publicly available.

Sure it is. And FWIW, it just uses the HTML generated by the docbook
tools, we don't actually work off the SGML.

The current version can be found at
https://pgweb.postgresql.org/browser/trunk/portal/tools/docs.

The new, upcoming, version that does things like attempt-to-make-xhtml
is at 
http://git.postgresql.org/gitweb?p=pgweb.git;a=blob;f=tools/docs/docload.py;h=5abc2f734003d28836a85fbc0ec6c01804937af8;hb=a3523e2ba8a250950a928879ae7fa5c0a8a46d94

You will quickly notice they do basically nothing other than read in
the HTML, and then "copy/paste" it into the current framework.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] SQL/MED - file_fdw

2011-01-21 Thread Shigeru HANADA
On Thu, 20 Jan 2011 22:21:37 +0900
Itagaki Takahiro  wrote:
> On Wed, Jan 19, 2011 at 00:34, Shigeru HANADA  
> wrote:
> > Attached patch requires FDW API patches and copy_export-20110114.patch.
> 
> Some minor comments:

Thanks for the comments.
I'll post revised version of patches in new threads.

> * Can you pass slot->tts_values and tts_isnull directly to NextCopyFrom()?
> It won't allocate the arrays; just fill the array buffers.

I think it's safe to pass them to NextCopyFrom() directly because those
arrays are allocated in ExecSetSlotDescriptor() during
ExecInitForeignScan(), and the sizes of the arrays are taken from 
Please let me know if I've missed your point.

> * You can pass NULL for the 4th argument for NextCopyFrom().
> | Oid tupleoid; /* just for required parameter */

I didn't know that it's NULL-safe, thanks.

> * file_fdw_validator still has duplicated codes with BeginCopy,
> but I have no idea to share the validation code in clean way...

It would be necessary to change a considerable part of BeginCopy() to
separate out its validation logic so that file_fdw could reuse it...

> * Try strVal() instead of DefElem->val.str
> * FdwEPrivate seems too abbreviated for me. How about FileFdwPrivate?

Thanks, fixed.

> * "private" is a bad identifier name because it's a C++ keyword.
> We should rename FdwExecutionState->private.

Renamed to fdw_private, including another 'private' in FdwPlan.

> > In that message, you also pointed out that FDW must generate
> > explainInfo in every PlanRelScan call even if the planning is not for
> > EXPLAIN.  I'll try to defer generating explainInfo until EXPLAIN
> > VERBOSE really uses it.  It might need new hook point in expalain.c,
> > though.
> 
> I complained about the overhead, but it won't be a problem for
> file_fdw and pgsql_fdw. file_fdw can easily generate the text,
> and pgsql_fdw needs to generate a SQL query anyway.
> 
> My concern is the explainInfo interface is not ideal for the purpose
> and therefore it will be unstable interface. If we support nested plans
> in FDWs, each FDW should receive a tree writer used internally in
> explain.c. explainInfo, that is a plan text, is not enough for complex
> FdwPlans. However, since we don't have any better solution for now,
> we could have the variable for 9.1. It's much better than nothing.

When I was writing file_fdw, I hoped to use static functions in
explain.c such as ExplainProperty() to handle complex information. 
Even for single plan node, I think that filename and size (currently
they are printed in a plain text together) should be separated in the
output of explain, especially when the format was XML or JSON.

Regards,
--
Shigeru Hanada





Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Nicolas Barbier
2011/1/21 Anssi Kääriäinen :

> Sorry for bothering all of you, but I just don't get this. What if T2 rolls
> back instead of committing? Then the snapshot of T3 would have been valid,
> right? Now, for the snapshot of T3 it doesn't matter if T2 commits or if it
> doesn't, because it can't see the changes of T2 in any case. Thus, it would
> seem that the snapshot is valid. On the other hand I can't see anything
> wrong in the logic in your post. What am I missing? I am feeling stupid...
>
> At least for dumps I don't see how T2 can matter (assuming T3 is the
> pg_dump's snapshot). Because if you reload from the dump, T2 never happened
> in that dump. In the reloaded database it just did not exist at all.

This has been discussed before; in [1] I summarized:

"IOW, one could say that the backup is consistent only if it were
never compared against the system as it continued running after the
dump took place."

Nicolas

[1] http://archives.postgresql.org/pgsql-hackers/2010-09/msg01763.php



Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Florian Pflug
On Jan21, 2011, at 10:19 , Heikki Linnakangas wrote:
> On 21.01.2011 11:10, Simon Riggs wrote:
>> So any xid that commits in a different sequence to the order in which
>> the xid was assigned creates a potential for unserialization? Or?
> 
> It's not the order in which the xid was assigned that matters, but the order 
> the transactions started and got their snapshots. The xids might be assigned 
> a lot later, after the transactions have already read data.


Any pair of concurrent transactions on the master between which 
r/w-dependencies exist is a potential risk. If their order in every equivalent 
serial schedule doesn't match their commit order, any snapshot taken between 
the two commits doesn't represent a fully consistent view of the database. 

Hm, wait a minute... Thinks...

Since transactions on the slave don't write, they really don't need to be aware 
of any SIREAD locks taken on the master, right? We'd still need to detect 
conflicts between SIREAD locks taken on the slaves and writes by transaction on 
the master, but that could be handled by the recovery process without having to 
report anything back to the master, and without logging SIREAD lock 
acquisitions. So, how about the following

A) We log r/w-dependencies between transactions committed on the master in the 
WAL, probably in the COMMIT record
B) SERIALIZABLE queries on the slave use the SIREAD lock machinery like they'd 
do on the master. The recovery process does the necessary conflict flagging in 
case the write happens (in wall-clock time) after the read on the slave, 
mimicking what the writing transaction would have done on the master had it 
seen the SIREAD lock
C) By using the r/w-dependency information from the WAL plus the r/w-dependency 
information generated on the slave we can detect dangerous situations on the 
slave, and abort the offending query on the slave.

(A) and (B) seem quite straight-forward. Regarding (C), I'm not familiar enough 
with the inner workings of the SSI patch to judge that.

best regards,
Florian Pflug




Re: [HACKERS] Error code for "terminating connection due to conflict with recovery"

2011-01-21 Thread Simon Riggs
On Fri, 2011-01-21 at 13:09 +0100, Florian Pflug wrote:
> > 
> >>> I'd also be in favor of changing the one that uses
> >>> ERRCODE_QUERY_CANCELLED to use ERRCODE_T_R_SERIALIZATION_FAILURE, as
> >>> the former might be taken to imply active user intervention, and for
> >>> consistency.
> >> 
> >> +1.
> > 
> > We already use ERRCODE_T_R_SERIALIZATION_FAILURE for retryable errors,
> > which is almost every error. So no change required there.
> > 
> > ERRCODE_ADMIN_SHUTDOWN is used only in situations where we cannot
> > reconnect or retry because the database we said we wished to connect to
> > no longer exists. That needs to be a different error code to a normal,
> > retryable error, so that pgpool can tell the difference between things
> > it can help with and things it cannot help with.
> 
> Yeah. Clients absolutely need to be able to distinguish transient and
> permanent errors. Otherwise, how would a client know when to retry
> a transaction (as he needs to in case of a serialization anomaly) and
> when to report the error to the user?
> 
> ERRCODE_T_R_SERIALIZATION_FAILURE  and ERRCODE_T_R_DEADLOCK_DETECTED
> are probably both assumed to be transient failure by client aready. So
> we should use those two for transient recovery conflicts (i.e. those
> which go away if you retry) and something else for the others (like
> database dropped)
> 
> This'd mean that the code is fine as it is, except that we should
> raise ERRCODE_T_R_DEADLOCK_DETECTED instead of ERRCODE_QUERY_CANCELED
> in CheckRecoveryConflictDeadlock(). I might be missing something though -
> Simon, what were your reasons for using ERRCODE_QUERY_CANCELED there?

Ah, thanks Florian. Now I understand. There are two related issues here.

1. The discussion around ERRCODE_ADMIN_SHUTDOWN is incorrect and the
specific patch should be rejected as is. No changes are required in
ProcessInterrupts(), nor new errcodes.

2. Robert is correct that CheckRecoveryConflictDeadlock() returns
ERRCODE_QUERY_CANCELED. Thanks to Florian for noting that we had
switched away from the original discussion onto another part of the
code, which confused me. I agree the use of ERRCODE_QUERY_CANCELED is a
mistake; CheckRecoveryConflictDeadlock() should return
ERRCODE_T_R_DEADLOCK_DETECTED. This was an omission from my commit of 12
May 2010.

Tatsuo, would you like to modify the patch to correct the issue in
CheckRecoveryConflictDeadlock() ? Or would you prefer me to fix?

This should be backpatched to 9.0.
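As an aside, the transient/permanent distinction is exactly what client-side retry loops key on. A minimal driver-agnostic sketch (the exception class and helper are made up for illustration; the SQLSTATE values are the standard ones behind the two errcodes discussed here):

```python
# SQLSTATE codes for the two transient failures discussed above.
SERIALIZATION_FAILURE = "40001"  # ERRCODE_T_R_SERIALIZATION_FAILURE
DEADLOCK_DETECTED = "40P01"      # ERRCODE_T_R_DEADLOCK_DETECTED
RETRYABLE = {SERIALIZATION_FAILURE, DEADLOCK_DETECTED}

class DatabaseError(Exception):
    """Stand-in for a driver exception carrying an SQLSTATE code."""
    def __init__(self, pgcode):
        super().__init__(pgcode)
        self.pgcode = pgcode

def run_with_retry(txn, max_attempts=3):
    """Re-run txn() on transient errors; re-raise anything permanent."""
    for attempt in range(max_attempts):
        try:
            return txn()
        except DatabaseError as e:
            if e.pgcode not in RETRYABLE or attempt == max_attempts - 1:
                raise

# A transaction that deadlocks once, then succeeds on retry.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 2:
        raise DatabaseError(DEADLOCK_DETECTED)
    return "ok"

print(run_with_retry(flaky))  # ok
```

If recovery conflicts report class-40 codes, existing loops like this keep working unchanged; report ERRCODE_ADMIN_SHUTDOWN and they (correctly) give up.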

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Sync Rep for 2011CF1

2011-01-21 Thread Heikki Linnakangas
(grr, I wrote this on Monday already, but just found it in my drafts 
folder, unsent)


On 15.01.2011 23:40, Simon Riggs wrote:
> Here's the latest patch for sync rep.
>
> From here, I will be developing the patch further on public git
> repository towards commit. My expectation is that commit is at least 2
> weeks away, though there are no major unresolved problems. I expect
> essential follow on patches to continue for a further 2-4 weeks after
> that first commit.


Thanks! Some quick observations after first read-through:

* The docs for synchronous_replication still claim that it means two 
different things in master and standby. Looking at the code, I believe 
that's not true anymore.


* it seems like overkill not to let clients even connect when 
allow_standalone_primary=off and no synchronous standbys are available. 
What if you just want to run a read-only query?


* Please separate the hot standby feedback loop into a separate patch on 
top of the synch rep patch. I know it's not a lot of code, but it's 
still easier to handle features separately.


* The UI differs from what was agreed on here: 
http://archives.postgresql.org/message-id/4d1dcf5a.7070...@enterprisedb.com.


* Instead of the short-circuit for autovacuum in SyncRepWaitOnQueue(), 
it's probably better to set synchronous_commit=off locally when the 
autovacuum process starts.


* the "queue id" thing is dead code at the moment, as there is only one 
queue. I gather this is a leftover from having different queues for 
"apply", "sync", "write" modes, but I think it would be better to just 
remove it for now.


PS, I'm surprised how small this patch is. Thinking about it some more, 
I don't know why I expected this to be a big patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-21 Thread Magnus Hagander
On Fri, Jan 21, 2011 at 07:02, Fujii Masao  wrote:
> On Fri, Jan 21, 2011 at 1:00 AM, Tom Lane  wrote:
>> Fujii Masao  writes:
>>> On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane  wrote:
>> In the case you sketch, there would be nothing to stop the (non root)
>> postgres user from renaming $PGDATA/mnt to something else and then
>> inserting his own trojan-horse directories.
>
> Hmm.. can non-root postgres user really rename the root-owned directory
> while it's being mounted?

No, but you can rename the parent directory of it, and then create
another directory inside it with the same name as the root owned
directory had.
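The rename trick is easy to demonstrate with ordinary directories (no real mounts or root privileges involved; the paths below are made up for illustration):

```python
import os
import pathlib

# Simulated layout: the user owns "pgdata"; "pgdata/mnt" stands in for a
# root-owned mount point the user cannot modify directly.
os.makedirs("demo/pgdata/mnt")

# The user can still rename mnt's *parent* directory ...
os.rename("demo/pgdata", "demo/pgdata.orig")

# ... and recreate a look-alike tree that the user fully controls.
os.makedirs("demo/pgdata/mnt")
pathlib.Path("demo/pgdata/mnt/trojan").touch()

print(os.listdir("demo/pgdata/mnt"))  # ['trojan']
```

With a real mount, the root-owned directory stays mounted under the renamed parent while the user-controlled replacement takes its place in the expected path.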


>> Moreover, I see no positive *good* reason to do it.  There isn't
>> anyplace under $PGDATA that users should be randomly creating
>> directories, much less mount points.
>
> When taking a base backup, you don't need to take a backup of tablespaces
> separately from that of $PGDATA. You have only to take a backup of $PGDATA.

But why are you creating tablespaces in the first place, if you're
sticking them in $PGDATA?

I'd put myself in the +1 camp for "throw an error when someone tries
to create a tablespace inside $PGDATA".

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] REVIEW: EXPLAIN and nfiltered

2011-01-21 Thread Florian Pflug
On Jan21, 2011, at 03:29 , Tom Lane wrote:
> "Kevin Grittner"  writes:
>>> Robert Haas  wrote:
>>> Oh, you mean water that had some things you didn't want taken out
>>> of it?
> 
>> Right -- God only knows the number of things were filtered out to
>> leave me with filtered water.  What's "filtered" in this case is what
>> was passed through, not what was removed.
> 
> I think it's pretty common to use the phrase "filtered out" to identify
> the stuff that gets removed by the filter, as opposed to what gets
> through.  So we could possibly use "Rows Filtered Out: nnn".  I still
> think that's more awkward than "Rows Removed: nnn" though.

"Rows Skipped: nnn", maybe?

best regards,
Florian Pflug




Re: [HACKERS] SSI and Hot Standby

2011-01-21 Thread Florian Pflug
On Jan21, 2011, at 12:55 , Anssi Kääriäinen wrote:
> On 01/21/2011 03:25 AM, Florian Pflug wrote:
>> The COMMIT order in the actual, concurrent, schedule doesn't necessarily
>> represent the order of the transaction in an equivalent serial schedule. 
>> Here's
>> an example
>> 
>> T1: BEGIN SERIALIZABLE; -- (Assume snapshot is set here)
>> T1: UPDATE D1 ... ;
>> T2: BEGIN SERIALIZABLE; -- (Assume snapshot is set here)
>> T2: SELECT * FROM D1 ... ;
>> T2: UPDATE D2 ... ;
>> T1: COMMIT;
>> T3: SELECT * FROM D1, D2;
>> T2: COMMIT;
>> 
>> Now, the COMMIT order is T1, T3, T2. Lets check if there is a equivalent
>> serial schedule. In any such schedule
>> 
>> T2 must run before T1 because T2 didn't see T1's changes to D1
>> T3 must run after T1 because T3 did see T1's changes to D1
>> T3 must run before T2 because T3 didn't see T2's changes to D2
>> 
>> This is obviously impossible - if T3 runs before T2 and T2 runs before T1
>> then T3 runs before T1, contradicting the second requirement. There is thus
>> no equivalent serial schedule and we must abort of these transactions with
>> a serialization error.
>> 
>> Note that aborting T3 is sufficient, even though T3 is READ ONLY!. With T3 
>> gone,
>> an equivalent serial schedule is T2,T1!

> Sorry for bothering all of you, but I just don't get this. What if T2 rolls 
> back instead of committing? Then the snapshot of T3 would have been valid, 
> right? 

Yeah. If T2 is removed from the picture, the only remaining ordering constraint 
is "T3 must run after T1 because T3 did see T1's changes to D1", and thus T1,T3 
is an equivalent serial schedule.

> Now, for the snapshot of T3 it doesn't matter if T2 commits or if it doesn't, 
> because it can't see the changes of T2 in any case. Thus, it would seem that 
> the snapshot is valid. On the other hand I can't see anything wrong in the 
> logic in your post. What am I missing? I am feeling stupid...  

The problem is that T3 sees the effects of T1 but not those of T2. Since T2 
must run *before* T1 in any equivalent serial schedule, that is impossible. In 
other words, if you look at an equivalent serial schedule of the *writing* 
transactions T1 and T2 you won't find a place to insert T3 such that it gives 
the same answer as in the concurrent schedule.

It isn't really T3's snapshot that is invalid, it's the interleaving of 
T1,T2,T3, because there is no equivalent serial schedule (a serial schedule that 
produces the same results). If, for example, T3 reads only *one* of the tables 
D1,D2, then the whole thing suddenly *is* serializable! If T3 reads only D1, an 
equivalent serial schedule must run T3 after T1, and if it reads only D2, then 
it must run before T2.
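For what it's worth, the "no equivalent serial schedule" claim can be checked mechanically. Here's a minimal Python sketch (the helper and names are mine, purely illustrative, not PostgreSQL code) that brute-forces all total orders of the transactions against the three ordering constraints above:

```python
from itertools import permutations

# Each constraint (a, b) means "a must run before b in any equivalent
# serial schedule", per the three observations above.
constraints = [
    ("T2", "T1"),  # T2 didn't see T1's changes to D1
    ("T1", "T3"),  # T3 did see T1's changes to D1
    ("T3", "T2"),  # T3 didn't see T2's changes to D2
]

def serial_schedules(txns, constraints):
    """Yield every total order of txns satisfying all before/after constraints."""
    for order in permutations(txns):
        pos = {t: i for i, t in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in constraints):
            yield order

# With all three transactions the constraints are cyclic: no schedule exists.
print(list(serial_schedules(["T1", "T2", "T3"], constraints)))  # []

# Abort T3 (dropping its constraints): T2,T1 is an equivalent serial schedule.
print(list(serial_schedules(["T1", "T2"], [("T2", "T1")])))  # [('T2', 'T1')]
```

The empty first result is exactly the serialization failure; the second shows why aborting the read-only T3 is enough to restore serializability.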

That "validity" of snapshots comes into play if you attempt to distinguish safe 
and unsafe interleaved schedules *without* taking the dataset inspected by T3 
into account. So you simply assume that T3 reads the *whole* database (since 
that's the worst case), and must thus run *after* all transactions it saw 
as COMMITTED in any serial schedule.

The root of the whole issue is that this might not be possible! Some 
not-yet-committed transaction (T2 in the example) may have to be placed 
*before* some transaction seen as COMMITTED by T3 (T1 in our example). Since T3 
needs to run *after* T1 (it saw T1 as committed), it would thus also see T2 in 
any serial schedule. But it didn't see T2 in the interleaved schedule, so we're 
hosed.
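Put differently, the "must run before" constraints form a directed graph, and an equivalent serial schedule exists exactly when that graph is acyclic. A small sketch (again my own illustrative helper, not anything from the PostgreSQL source) makes the cycle in the example visible:

```python
def has_cycle(edges):
    """DFS-based cycle detection over a directed 'must run before' graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited / on stack / done
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GRAY
        for m in graph[n]:
            if color[m] == GRAY or (color[m] == WHITE and visit(m)):
                return True           # back edge: cycle found
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# T2 before T1, T1 before T3, T3 before T2: a cycle, so no serial order exists.
print(has_cycle([("T2", "T1"), ("T1", "T3"), ("T3", "T2")]))  # True
# With T3 aborted, the cycle is broken and T2,T1 serializes fine.
print(has_cycle([("T2", "T1")]))  # False
```

This is only the abstract picture; the actual SSI implementation tracks rw-dependencies rather than materializing such a graph of committed orderings.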

> At least for dumps I don't see how T2 can matter (assuming T3 is the 
> pg_dump's snapshot). Because if you reload from the dump, T2 never happened 
> in that dump. In the reloaded database it just did not exist at all.

Still, the dump would reflect a database state that *logically* never existed 
(i.e. not in any serial schedule). If you dump for disaster recovery, you might 
not care. If you dump to copy the data onto some reporting server, you might.

best regards,
Florian Pflug



