Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Noah Misch
On Tue, Jul 26, 2011 at 06:04:16PM -0400, Tom Lane wrote:
 Noah Misch n...@2ndquadrant.com writes:
  On Tue, Jul 26, 2011 at 05:05:15PM -0400, Tom Lane wrote:
  Dirty cache line, maybe not, but what if the assembly code commands the
  CPU to load those variables into CPU registers before doing the
  comparison?  If they're loaded with maxMsgNum coming in last (or at
  least after resetState), I think you can have the problem without any
  assumptions about cache line behavior at all.  You just need the process
  to lose the CPU at the right time.
 
  True.  If the compiler places the resetState load first, you could hit the
  anomaly by merely setting a breakpoint on the next instruction, waiting for
  exactly MSGNUMWRAPAROUND messages to enqueue, and letting the backend
  continue.  I think, though, we should either plug that _and_ the cache
  incoherency case or worry about neither.
 
 How do you figure that?  The poor-assembly-code-order risk is both a lot
 easier to fix and a lot higher probability.  Admittedly, it's still way
 way down there, but you only need a precisely-timed sleep, not a
 precisely-timed sleep *and* a cache line that somehow remained stale.

I think both probabilities are too low to usefully distinguish.  An sinval
wraparound takes a long time even in a deliberate test setup: almost 30 hours @
10k messages/sec.  To get a backend to sleep that long, you'll probably need
something like SIGSTOP or a debugger attach.  The sleep has to fall within the
space of no more than a few instructions.  Then, you'd need to release the
process at the exact moment for it to observe wrapped equality.  In other words,
you get one split-millisecond opportunity every 30 hours of process sleep time.
If your backends don't have multi-hour sleeps, it can't ever happen.
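
For reference, the arithmetic behind that figure, taking MSGNUMWRAPAROUND to be
on the order of 2^30 messages (MAXNUMMESSAGES * 262144 in sinvaladt.c, if
memory serves):

    \frac{2^{30}\ \text{messages}}{10^{4}\ \text{messages/s}}
        \approx 1.07 \times 10^{5}\ \text{s} \approx 29.8\ \text{hours}

i.e. just under 30 hours of sustained 10k-messages/sec traffic per wraparound.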

Even so, all the better if we settle on an approach that has neither hazard.

-- 
Noah Misch                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread Nikhil Sontakke
Hi,


  Yeah, but I think we need to take that chance.  At the very least, we
 need to support the equivalent of a non-inherited CHECK (false) on
 parent tables.


 Indeed. I usually enforce that with a trigger that raises an exception, but
 of course that doesn't help at all with constraint exclusion, and I saw a
 result just a few weeks ago (I forget the exact details) where it appeared
 that the plan chosen was significantly worse because the parent table wasn't
 excluded, so there's a  non-trivial downside from having this restriction.


The downside appears to be non-trivial indeed! I cooked up the attached
patch to try to allow ALTER...ONLY...CHECK(false) on parent tables.

If this approach looks acceptable, I can provide a complete patch later with
some documentation changes (I think we ought to tell about this special case
in the documentation) and a minor test case along with it (if the need be
felt for the test case).

Although partitioning ought to be looked at from a different angle
completely, maybe this small patch can help out a bit in the current scheme
of things, although this is indeed an unusual special case... Thoughts?

Regards,
Nikhils
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 82bb756..5340402 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5433,6 +5433,7 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	ListCell   *lcon;
 	List	   *children;
 	ListCell   *child;
+	bool		skip_children = false;
 
 	/* At top level, permission check was done in ATPrepCmd, else do it */
 	if (recursing)
@@ -5502,9 +5503,31 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	 * tables; else the addition would put them out of step.
 	 */
 	if (children && !recurse)
-		ereport(ERROR,
+	{
+		/*
+		 * Try a bit harder and check if this is a CHECK(FALSE) kinda
+		 * constraint. Allow if so, otherwise error out
+		 */
+		if (list_length(newcons) == 1)
+		{
+			CookedConstraint *cooked = linitial(newcons);
+
+			if (cooked->contype == CONSTR_CHECK && cooked->expr)
+			{
+				Node *expr = cooked->expr;
+				if (IsA(expr, Const) && ((Const *) expr)->consttype == BOOLOID &&
+					((Const *) expr)->constvalue == 0)
+				{
+					skip_children = true;
+				}
+			}
+		}
+
+		if (!skip_children)
+			ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 				 errmsg("constraint must be added to child tables too")));
+	}
 
 	foreach(child, children)
 	{
@@ -5512,6 +5535,13 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		Relation	childrel;
 		AlteredTableInfo *childtab;
 
+		/*
+		 * Skipping the constraint should be good enough for the special case.
+		 * No need to even release the locks on the children immediately..
+		 */
+		if (skip_children)
+			break;
+
 		/* find_inheritance_children already got lock */
 		childrel = heap_open(childrelid, NoLock);
 		CheckTableNotInUse(childrel, "ALTER TABLE");



Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Alexander Korotkov
I found a problem in WAL with this patch. I use a simplified insert algorithm
in my patch which doesn't insert downlinks one by one but inserts them all at
once. Thus the FollowRight flag is left uncleared when redoing from WAL,
because only one flag can be cleared by one WAL record. Do you think a
modification of the WAL record structure is possible, or do I have to insert
downlinks one by one in the buffering build too?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Heikki Linnakangas

On 27.07.2011 15:29, Alexander Korotkov wrote:

I found a problem in WAL with this patch. I use a simplified insert algorithm
in my patch which doesn't insert downlinks one by one but inserts them all at
once. Thus the FollowRight flag is left uncleared when redoing from WAL,
because only one flag can be cleared by one WAL record. Do you think a
modification of the WAL record structure is possible, or do I have to insert
downlinks one by one in the buffering build too?


Dunno, both approaches seem reasonable to me. There's no rule against 
changing WAL record structure across major releases, if that's what you 
were worried about.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [COMMITTERS] pgsql: Add missing newlines at end of error messages

2011-07-27 Thread Peter Eisentraut
On tis, 2011-07-26 at 15:13 -0700, David Fetter wrote:
 This seems like a mechanical check.  Should it be part of what gets
 checked when people push?

It's not really that mechanical, because some libpq functions supply the
newline, so you can't just mechanically add or enforce it everywhere.





Re: [HACKERS] Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))

2011-07-27 Thread Robert Haas
On Tue, Jul 26, 2011 at 5:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeb Havinga yebhavi...@gmail.com writes:
 A few days ago I read Tomas Vondra's blog post about dss tpc-h queries
 on PostgreSQL at
 http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in
 which he showed how to manually pull up a dss subquery to get a large
 speed up. Initially I thought: cool, this is probably now handled by
 Hitoshi's patch, but it turns out the subquery type in the dss query is
 different.

 Actually, I believe this example is the exact opposite of the
 transformation Hitoshi proposes.  Tomas was manually replacing an
 aggregated subquery by a reference to a grouped table, which can be
 a win if the subquery would be executed enough times to amortize
 calculation of the grouped table over all the groups (some of which
 might never be demanded by the outer query).  Hitoshi was talking about
 avoiding calculations of grouped-table elements that we don't need,
 which would be a win in different cases.  Or at least that was the
 thrust of his original proposal; I'm not sure where the patch went since
 then.

 This leads me to think that we need to represent both cases as the same
 sort of query and make a cost-based decision as to which way to go.
 Thinking of it as a pull-up or push-down transformation is the wrong
 approach because those sorts of transformations are done too early to
 be able to use cost comparisons.

I think you're right.  OTOH, our estimates of what will pop out of an
aggregate are so poor that denying the user the ability to control the plan on the
basis of how they write the query might be a net negative.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] XMLATTRIBUTES vs. values of type XML

2011-07-27 Thread Peter Eisentraut
On tis, 2011-07-26 at 22:44 +0200, Florian Pflug wrote:
 While reviewing the (now applied) XPATH escaping patches, Radoslaw
 found one
 case where the previous failure of XPATH to escape its return value
 was offset
 by XMLATTRIBUTES insistence to escape all input values, even if
 they're
 already of type XML.
 
 To wit, if you do
 
   SELECT XMLELEMENT(NAME t, XMLATTRIBUTES('&amp;'::XML AS a))
 
 you get
 
  xmlelement 
 
 <t a="&amp;amp;"/> 

Per SQL standard, the attribute values may not be of type XML, so maybe
we should just prohibit it.




Re: [HACKERS] Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))

2011-07-27 Thread Yeb Havinga

On 2011-07-27 16:16, Robert Haas wrote:

On Tue, Jul 26, 2011 at 5:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Yeb Havinga yebhavi...@gmail.com writes:

A few days ago I read Tomas Vondra's blog post about dss tpc-h queries
on PostgreSQL at
http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in
which he showed how to manually pull up a dss subquery to get a large
speed up. Initially I thought: cool, this is probably now handled by
Hitoshi's patch, but it turns out the subquery type in the dss query is
different.

Actually, I believe this example is the exact opposite of the
transformation Hitoshi proposes.  Tomas was manually replacing an
aggregated subquery by a reference to a grouped table, which can be
a win if the subquery would be executed enough times to amortize
calculation of the grouped table over all the groups (some of which
might never be demanded by the outer query).  Hitoshi was talking about
avoiding calculations of grouped-table elements that we don't need,
which would be a win in different cases.  Or at least that was the
thrust of his original proposal; I'm not sure where the patch went since
then.

This leads me to think that we need to represent both cases as the same
sort of query and make a cost-based decision as to which way to go.
Thinking of it as a pull-up or push-down transformation is the wrong
approach because those sorts of transformations are done too early to
be able to use cost comparisons.

I think you're right.  OTOH, our estimates of what will pop out of an
aggregate are so poor that denying the user the ability to control the plan on the
basis of how they write the query might be a net negative.  :-(



Tom and Robert, thank you both for your replies. I think I'm having some 
blind spots and maybe false assumptions regarding the overall work in the 
optimizer, as it is not clear to me what 'the same sort of query' would 
look like. I was under the impression that using cost to select the best 
paths is only done per simple query, and fail to see how a total 
combined plan with pulled up subquery could be compared on cost with a 
total plan where the subquery is still a separate subplan, since the 
range tables / simple-queries to compare are different.


regards,
Yeb




Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Alexander Korotkov
On Wed, Jul 27, 2011 at 6:05 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Dunno, both approaches seem reasonable to me. There's no rule against
 changing WAL record structure across major releases, if that's what you were
 worried about.


OK, thanks. I also found behaviour of GiST (without the patch) under streaming
replication that seems strange to me. On the master only a few rightlinks are
InvalidBlockNumber, while on the slave a lot of them are. I hacked gevel to get
the index structure on the slave (accessing the tree without
AccessExclusiveLock).

On master:
# create table test as (select point(random(),random()) from
generate_series(1,10));
# create index test_idx on test using gist(point);
# \copy (select gist_tree('test_idx')) to 'tree1r.txt';

On slave:
# \copy (select gist_tree('test_idx')) to 'tree2r.txt';

In bash:
# cat tree1r.txt | sed 's/\\n/\n/g' > tree1.txt
# cat tree2r.txt | sed 's/\\n/\n/g' > tree2.txt
# diff tree1.txt tree2.txt

2,89c2,89
< 1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:637 (OK)
< 1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%) rightlink:319 (OK)
< 2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%) rightlink:153 (OK)
< 3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%) rightlink:551 (OK)
< 4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%) rightlink:555 (OK)
< 5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%) rightlink:74 (OK)
< 6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%) rightlink:294 (OK)
< 7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%) rightlink:567 (OK)
.
---
> 1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:4294967295 (InvalidBlockNumber)
> 1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%) rightlink:4294967295 (InvalidBlockNumber)
> 2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%) rightlink:4294967295 (InvalidBlockNumber)
> 3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%) rightlink:4294967295 (InvalidBlockNumber)
> 4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%) rightlink:4294967295 (InvalidBlockNumber)
> 5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%) rightlink:4294967295 (InvalidBlockNumber)
> 6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%) rightlink:4294967295 (InvalidBlockNumber)
> 7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%) rightlink:4294967295 (InvalidBlockNumber)
.

Isn't it a bug?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))

2011-07-27 Thread Tom Lane
Yeb Havinga yebhavi...@gmail.com writes:
 Tom and Robert, thank you both for your replies. I think I'm having some 
 blind spots and maybe false assumptions regarding the overal work in the 
 optimizer, as it is not clear to me what 'the same sort of query' would 
 look like. I was under the impression that using cost to select the best 
 paths is only done per simple query, and fail to see how a total 
 combined plan with pulled up subquery could be compared on cost with a 
 total plan where the subquery is still a separate subplan, since the 
 range tables / simple-queries to compare are different.

Well, you could look at what planagg.c does to decide whether an
indexscan optimization of MIN/MAX is worthwhile, or at the calculations
in planner.c that decide which way to do grouping/aggregation/ordering.

It could be fairly expensive to handle this type of problem because of
the need to cost out two fundamentally different scan/join trees, but
we're assuming the queries are expensive enough to make that worthwhile
...

regards, tom lane



Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-27 Thread Robert Haas
On Tue, Jul 26, 2011 at 9:21 PM, Josh Kupershmidt schmi...@gmail.com wrote:
 On Tue, Jul 26, 2011 at 9:53 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 25, 2011 at 10:29 PM, Josh Kupershmidt schmi...@gmail.com 
 wrote:
 I think this is basically the right approach but I found what you did
 here a bit wordy, so I simplified it, committed it, and back-patched
 to 9.0 with suitable adjustment.  Hopefully I didn't simplify it into
 a form you don't like.

 That looks fine. Minor grammar quibble about:

 +      When commenting on a column,
+      <replaceable class="parameter">relation_name</replaceable> must refer
 +      to a table, view, composite types, or foreign table.

 "types" should probably be the singular "type".

Woops.  That was dumb.  Fixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: Fast GiST index build

2011-07-27 Thread Heikki Linnakangas

On 27.07.2011 17:43, Alexander Korotkov wrote:

 1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:4294967295 (InvalidBlockNumber)
 1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%) rightlink:4294967295 (InvalidBlockNumber)
 2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%) rightlink:4294967295 (InvalidBlockNumber)
 3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%) rightlink:4294967295 (InvalidBlockNumber)
 4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%) rightlink:4294967295 (InvalidBlockNumber)
 5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%) rightlink:4294967295 (InvalidBlockNumber)
 6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%) rightlink:4294967295 (InvalidBlockNumber)
 7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%) rightlink:4294967295 (InvalidBlockNumber)
.

Isn't it a bug?


Yeah, looks like a bug. I must've messed up the WAL logging in my recent 
changes to this. I'll look into that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))

2011-07-27 Thread Hitoshi Harada
2011/7/27 Yeb Havinga yebhavi...@gmail.com:
 On 2011-07-22 17:35, Hitoshi Harada wrote:

 2011/7/23 Yeb Havinga yebhavi...@gmail.com:

 A few days ago I read Tomas Vondra's blog post about dss tpc-h queries on
 PostgreSQL at
 http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in which
 he showed how to manually pull up a dss subquery to get a large speed up.
 Initially I thought: cool, this is probably now handled by Hitoshi's patch,
 but it turns out the subquery type in the dss query is different.

 The original and rewritten queries are below. The debug_print_plan output
 shows the subquery is called from an opexpr (< l_quantity, subquery output)
 and the sublink type is EXPR_SUBLINK. Looking at the source code;
 pull_up_sublink only considers ANY and EXISTS sublinks. I'm wondering if
 this could be expanded to deal with EXPR sublinks. Clearly in the example
 Tomas has given this can be done. I'm wondering if there are show stoppers
 that prevent this to be possible in the general case, but can't think of
 any, other than the case of a sublink returning NULL and the opexpr is part
 of a larger OR expression or IS NULL; in which case it should not be pulled
 up, or perhaps it could be pulled up as an outer join.

 Thoughts?

Good catch. I was not aware of this sublink case, so I'm not sure if it
is possible, but I believe it would be worth modifying the optimizer to
handle it in the same way.  Since my latest proposal is based on
parameterized NestLoop, the first step is how to transform the sublink
expression into a join.  I bet there are chances in simple cases, since we
have the Semi/Anti Join technique.  On the other hand, those pseudo-join
types easily fail to be transformed into a join in cases like the one
above, where there is a filter clause other than the join qual expression.

If the TPC benchmark can be sped up, that's a good use case which you
pointed out and I was missing.

Regards,

-- 
Hitoshi Harada



Re: [HACKERS] Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))

2011-07-27 Thread Hitoshi Harada
2011/7/27 Tom Lane t...@sss.pgh.pa.us:
 Yeb Havinga yebhavi...@gmail.com writes:
 A few days ago I read Tomas Vondra's blog post about dss tpc-h queries
 on PostgreSQL at
 http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in
 which he showed how to manually pull up a dss subquery to get a large
 speed up. Initially I thought: cool, this is probably now handled by
 Hitoshi's patch, but it turns out the subquery type in the dss query is
 different.

 Actually, I believe this example is the exact opposite of the
 transformation Hitoshi proposes. Tomas was manually replacing an
 aggregated subquery by a reference to a grouped table, which can be
 a win if the subquery would be executed enough times to amortize
 calculation of the grouped table over all the groups (some of which
 might never be demanded by the outer query).  Hitoshi was talking about
 avoiding calculations of grouped-table elements that we don't need,
 which would be a win in different cases.  Or at least that was the
 thrust of his original proposal; I'm not sure where the patch went since
 then.

My first proposal, which was about pulling up an aggregate-like sublink
expression, is the exact opposite of this (Tomas pushed the sublink
expression down into a join subquery). But the latest proposal is built on
parameterized NestLoop, so I think my latest patch might help somewhat
with the *second* query. Actually the problem is the same: we want to
reduce grouping work that is not interesting to the final output by
filtering on the other relations' expressions. In this case, if the joined
lineitem-part relation has very few rows because of the WHERE conditions
(p_brand, p_container), we don't want to calculate the avg over the huge
lineitem table, because we know almost all of the avg results will not
appear in the upper result. However, the current optimizer cannot pass the
upper query's condition (that it will have only a few rows) down to the
lower aggregate query.

 This leads me to think that we need to represent both cases as the same
 sort of query and make a cost-based decision as to which way to go.
 Thinking of it as a pull-up or push-down transformation is the wrong
 approach because those sorts of transformations are done too early to
 be able to use cost comparisons.

Wrapping up my thoughts above and reading this paragraph, making a sublink
expression look the same as a join might be a separate piece of work.
But what we want to solve is the same goal, I think.

Regards,

-- 
Hitoshi Harada



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Robert Haas
On Tue, Jul 26, 2011 at 9:57 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 26, 2011 at 4:38 PM, Noah Misch n...@2ndquadrant.com wrote:
 No new ideas come to mind, here.

 OK, I have a new idea.  :-)

 1. Add a new flag to each procState called something like 
 timeToPayAttention.
 2. Each call to SIGetDataEntries() iterates over all the ProcStates
 whose index is < lastBackend and sets stateP->timeToPayAttention =
 TRUE for each.
 3. At the beginning of SIGetDataEntries(), we do an unlocked if
 (!stateP->timeToPayAttention) return 0.
 4. Immediately following that if statement and before acquiring any
 locks, we set stateP->timeToPayAttention = FALSE.

 The LWLockRelease() in SIGetDataEntries() will be sufficient to force
 all of the stateP->timeToPayAttention writes out to main memory, and
 the read side is OK because we either just took a lock (which acted as
 a fence) or else there's a race anyway.  But unlike my previous
 proposal, it doesn't involve *comparing* anything.  We needn't worry
 about whether we could read two different values that are through
 great misfortune out of sync because we're only reading one value.

 If, by chance, the value is set to true just after we set it to false,
 that won't mess us up either: we'll still read all the messages after
 acquiring the lock.

There turned out to be a little bit of further subtlety to this, but
it seems to work.  Patch attached.
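
For anyone skimming the thread, here is a self-contained toy model of the
pattern being proposed -- it is not the attached patch and not PostgreSQL's
actual data structures; the names (has_messages, queue_lock, next_msg_num)
and the pthread mutex standing in for our LWLocks are purely illustrative:

/*
 * Toy model: writers bump a shared counter and set a per-reader flag;
 * readers test their flag without taking the lock, clear it, and only
 * then acquire the lock to consume messages.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NREADERS 4

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int  max_msg_num;                /* protected by queue_lock */
static bool has_messages[NREADERS];     /* one flag per reader */
static int  next_msg_num[NREADERS];     /* per-reader read position */

/* Writer side: publish a message, then flag every reader. */
static void
send_message(void)
{
	int			i;

	pthread_mutex_lock(&queue_lock);
	max_msg_num++;
	for (i = 0; i < NREADERS; i++)
		has_messages[i] = true;
	pthread_mutex_unlock(&queue_lock);	/* the release orders the stores */
}

/* Reader side: cheap unlocked check first, lock only when flagged. */
static int
get_messages(int self)
{
	int			n;

	if (!has_messages[self])			/* unlocked fast path */
		return 0;

	has_messages[self] = false;			/* clear before reading */

	pthread_mutex_lock(&queue_lock);
	n = max_msg_num - next_msg_num[self];
	next_msg_num[self] = max_msg_num;
	pthread_mutex_unlock(&queue_lock);

	return n;
}

int
main(void)
{
	send_message();
	send_message();
	printf("reader 0 sees %d new messages\n", get_messages(0));
	printf("reader 0 sees %d new messages\n", get_messages(0));
	return 0;
}

If a writer sets the flag again right after the reader clears it, the reader
merely takes the locked path one extra time on its next call; nothing is
lost, because the read position only advances under the lock.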

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


sinval-hasmessages.patch
Description: Binary data



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 26, 2011 at 9:57 PM, Robert Haas robertmh...@gmail.com wrote:
 1. Add a new flag to each procState called something like 
 timeToPayAttention.
 2. Each call to SIGetDataEntries() iterates over all the ProcStates
 whose index is < lastBackend and sets stateP->timeToPayAttention =
 TRUE for each.
 3. At the beginning of SIGetDataEntries(), we do an unlocked if
 (!stateP->timeToPayAttention) return 0.
 4. Immediately following that if statement and before acquiring any
 locks, we set stateP->timeToPayAttention = FALSE.

 There turned out to be a little bit of further subtlety to this, but
 it seems to work.  Patch attached.

And?

It didn't sound to me like this could possibly be a performance win,
but I await some numbers ...

regards, tom lane



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 There turned out to be a little bit of further subtlety to this,
 but it seems to work.  Patch attached.
 
Stylistic question: Why is stateP->hasMessages set to false in one
place and FALSE and TRUE in others?  It seems like it would be less
confusing to be consistent.
 
A quick count of the usage of both forms in the PostgreSQL source
codes shows the lowercase is used about 10 times more often:
 
kgrittn@kgrittn-desktop:~/git/postgresql/kgrittn$ find -name '*.h'
-or -name '*.c' | egrep -v '/tmp_check/' | xargs cat | egrep -c
'\b(FALSE|TRUE)\b'
1670
 
kgrittn@kgrittn-desktop:~/git/postgresql/kgrittn$ find -name '*.h'
-or -name '*.c' | egrep -v '/tmp_check/' | xargs cat | egrep -c
'\b(false|true)\b'
17349
 
-Kevin



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Robert Haas
On Wed, Jul 27, 2011 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 26, 2011 at 9:57 PM, Robert Haas robertmh...@gmail.com wrote:
 1. Add a new flag to each procState called something like 
 timeToPayAttention.
 2. Each call to SIGetDataEntries() iterates over all the ProcStates
 whose index is < lastBackend and sets stateP->timeToPayAttention =
 TRUE for each.
 3. At the beginning of SIGetDataEntries(), we do an unlocked if
 (!stateP->timeToPayAttention) return 0.
 4. Immediately following that if statement and before acquiring any
 locks, we set stateP->timeToPayAttention = FALSE.

 There turned out to be a little bit of further subtlety to this, but
 it seems to work.  Patch attached.

 And?

 It didn't sound to me like this could possibly be a performance win,
 but I await some numbers ...

On master, with the patch, scale factor 100, pgbench -S -c $CLIENTS -j
$CLIENTS -T 300 results for different client counts, on a 32-core
machine with 128GB RAM, shared_buffers = 8GB:

01 tps = .280459 (including connections establishing)
01 tps = 4438.365576 (including connections establishing)
01 tps = 4423.718466 (including connections establishing)
08 tps = 33187.827872 (including connections establishing)
08 tps = 33288.247330 (including connections establishing)
08 tps = 33304.307835 (including connections establishing)
32 tps = 178876.350559 (including connections establishing)
32 tps = 177293.082295 (including connections establishing)
32 tps = 175577.058885 (including connections establishing)
80 tps = 159202.449993 (including connections establishing)
80 tps = 151541.717088 (including connections establishing)
80 tps = 150454.658132 (including connections establishing)

Without the patch:

01 tps = 4447.660101 (including connections establishing)
01 tps = 4440.830713 (including connections establishing)
01 tps = 4411.610348 (including connections establishing)
08 tps = 33135.773476 (including connections establishing)
08 tps = 33365.387051 (including connections establishing)
08 tps = 33364.859705 (including connections establishing)
32 tps = 175834.280471 (including connections establishing)
32 tps = 176713.118271 (including connections establishing)
32 tps = 176830.687087 (including connections establishing)
80 tps = 135211.036587 (including connections establishing)
80 tps = 130642.264016 (including connections establishing)
80 tps = 133621.549513 (including connections establishing)

I'm fairly certain the results will be even more dramatic with the
lazy-vxid patch applied, since master still has to fight with the lock
manager on this workload.  I haven't tested that yet, but there's not
much reason to suppose that the results will be dramatically different
from any of the other methods of reducing the sinval overhead that
I've benchmarked, many of which I've already posted about.  See, for
example, the OP on this thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Noah Misch
On Tue, Jul 26, 2011 at 09:57:10PM -0400, Robert Haas wrote:
 On Tue, Jul 26, 2011 at 4:38 PM, Noah Misch n...@2ndquadrant.com wrote:
  No new ideas come to mind, here.
 
 OK, I have a new idea.  :-)
 
 1. Add a new flag to each procState called something like 
 timeToPayAttention.
 2. Each call to SIGetDataEntries() iterates over all the ProcStates
 whose index is < lastBackend and sets stateP->timeToPayAttention =
 TRUE for each.
 3. At the beginning of SIGetDataEntries(), we do an unlocked if
 (!stateP->timeToPayAttention) return 0.
 4. Immediately following that if statement and before acquiring any
 locks, we set stateP->timeToPayAttention = FALSE.
 
 The LWLockRelease() in SIGetDataEntries() will be sufficient to force
 all of the stateP->timeToPayAttention writes out to main memory, and
 the read side is OK because we either just took a lock (which acted as
 a fence) or else there's a race anyway.  But unlike my previous
 proposal, it doesn't involve *comparing* anything.  We needn't worry
 about whether we could read two different values that are through
 great misfortune out of sync because we're only reading one value.
 
 If, by chance, the value is set to true just after we set it to false,
 that won't mess us up either: we'll still read all the messages after
 acquiring the lock.

This approach would work if a spinlock release constrained the global stores
timeline.  It makes a weaker guarantee: all stores preceding the lock release
in program order will precede it globally.  Consequently, no processor will
observe the spinlock to be available without also observing all stores made by
the last holding processor before that processor released it.  That guarantee
is not enough for this sequence of events:

Backend 0:
add a message for table a
store 5 = maxMsgNum
store true = timeToPayAttention
LWLockRelease(SInvalWriteLock)
plenty of time passes
Backend 2:
LOCK TABLE a;
load timeToPayAttention = true
Backend 1:
add a message for table b
store 6 = maxMsgNum
SpinLockRelease(vsegP->msgnumLock);
store true = timeToPayAttention
Backend 2:
store false = timeToPayAttention
LWLockAcquire(SInvalReadLock, LW_SHARED)
load maxMsgNum = 5 [!]
Backend 1:
LWLockRelease(SInvalWriteLock);
Backend 2:
LOCK TABLE b;
load timeToPayAttention = false
use b without processing updates

The SpinLockRelease(vsegP->msgnumLock) guarantees that any process
observing the backend 2 store of timeToPayAttention will also observe
maxMsgNum = 6.  However, nothing constrains which timeToPayAttention store
will win in main memory.  Backend 1 can observe neither update from backend
2, yet still have its store appear later than the backend 1 stores in the
global timeline.

 Now, there's some downside to this approach - it involves doing O(N)
 work for each invalidation message we receive.  But as long as it's
 only O(N) stores and not O(N) lock acquisitions and releases, I think
 that might be OK.

I think a benchmark is in order, something like 900 idle connections and 80
connections running small transactions that create a few temporary tables.  If
that shows no statistically significant regression, then we're probably fine
here.  I'm not sure what result to expect, honestly.

What did you think of making the message number a uint64, removing wraparound
handling, and retaining SISeg.msgnumLock for 32-bit only?  You could isolate the
variant logic in READ_MSGNUM and WRITE_MSGNUM macros.
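
To make that concrete, here is a hypothetical sketch of the idea -- not a
patch; pointer width is used as a crude stand-in for "aligned 8-byte access
is atomic" (which, as discussed downthread, is not something you can really
assume that simply), and a pthread mutex stands in for SISeg.msgnumLock:

/*
 * READ_MSGNUM/WRITE_MSGNUM sketch: a uint64 message counter that never
 * wraps, accessed directly where 8-byte access is presumed atomic and
 * under the lock elsewhere.  Illustration only.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef struct
{
	uint64_t		maxMsgNum;
	pthread_mutex_t msgnumLock;		/* only consulted in the 32-bit case */
} MsgCounter;

#if UINTPTR_MAX > 0xffffffffu		/* presume atomic 8-byte loads/stores */

#define READ_MSGNUM(c)		((c)->maxMsgNum)
#define WRITE_MSGNUM(c, v)	((c)->maxMsgNum = (v))

#else								/* 32-bit: wrap the access in the lock */

static uint64_t
READ_MSGNUM(MsgCounter *c)
{
	uint64_t	v;

	pthread_mutex_lock(&c->msgnumLock);
	v = c->maxMsgNum;
	pthread_mutex_unlock(&c->msgnumLock);
	return v;
}

static void
WRITE_MSGNUM(MsgCounter *c, uint64_t v)
{
	pthread_mutex_lock(&c->msgnumLock);
	c->maxMsgNum = v;
	pthread_mutex_unlock(&c->msgnumLock);
}

#endif

int
main(void)
{
	MsgCounter	c = {0, PTHREAD_MUTEX_INITIALIZER};

	WRITE_MSGNUM(&c, 42);
	printf("msgnum = %llu\n", (unsigned long long) READ_MSGNUM(&c));
	return 0;
}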

-- 
Noah Misch                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] isolation test deadlocking on buildfarm member coypu

2011-07-27 Thread Noah Misch
On Tue, Jul 26, 2011 at 05:04:28PM -0400, Alvaro Herrera wrote:
 *** /home/pgbuildfarm/workdir/HEAD/pgsql.20950/src/test/isolation/expected/fk-deadlock2.out	Sun Jul 24 08:46:44 2011
 --- /home/pgbuildfarm/workdir/HEAD/pgsql.20950/src/test/isolation/results/fk-deadlock2.out	Sun Jul 24 15:11:42 2011
 ***
 *** 22,29 
   step s2u1:  UPDATE B SET Col2 = 1 WHERE BID = 2; 
  step s1u2:  UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
   step s2u2:  UPDATE B SET Col2 = 1 WHERE BID = 2; 
 - step s1u2: ... completed
   ERROR:  deadlock detected
   step s1c:  COMMIT; 
   step s2c:  COMMIT; 
   
 --- 22,29 
   step s2u1:  UPDATE B SET Col2 = 1 WHERE BID = 2; 
  step s1u2:  UPDATE B SET Col2 = 1 WHERE BID = 2; <waiting ...>
   step s2u2:  UPDATE B SET Col2 = 1 WHERE BID = 2; 
   ERROR:  deadlock detected
 + step s1u2: ... completed
   step s1c:  COMMIT; 
   step s2c:  COMMIT; 
 
 
 
 I think the only explanation necessary for this is that one process
 reports its status before the other one.  I think it would be enough to
 add another variant of the expected file to fix this problem, but I
 don't quite want to do that because we already have three of them, and I
 think we would need to add one per existing expected, so we'd end up
 with 6 expected files which would be a pain to work with.

To really cover the problem in this way, we would need 16*3 variations covering
every permutation of the deadlocks detected when s1 runs the first command.  I
see these other options:

1. Keep increasing the s1 deadlock_timeout until we stop getting these.  This is
simple, but it proportionally slows the test suite for everyone.  No value will
ever be guaranteed sufficient.

2. Post-process the output to ascribe the deadlock detections in a standard way
before comparing with expected output.  This would also let us drop
deadlock_timeout arbitrarily low, giving a rapid test run.

3. Implement actual deadlock priorities, per discussion at
http://archives.postgresql.org/message-id/aanlktimaqfzkv4sc1dscrufft_be78y-owpeuurcd...@mail.gmail.com
This is much more work, but it would let us drop deadlock_timeout arbitrarily
low and still get consistent results from the start.

-- 
Noah Misch                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] [COMMITTERS] pgsql: Add missing newlines at end of error messages

2011-07-27 Thread David Fetter
On Wed, Jul 27, 2011 at 05:14:58PM +0300, Peter Eisentraut wrote:
 On tis, 2011-07-26 at 15:13 -0700, David Fetter wrote:
  This seems like a mechanical check.  Should it be part of what gets
  checked when people push?
 
 It's not really that mechanical, because some libpq functions supply the
 newline, so you can't just mechanically add or enforce it everywhere.
 
Thanks for the heads-up :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Robert Haas
On Wed, Jul 27, 2011 at 12:55 PM, Noah Misch n...@2ndquadrant.com wrote:
 This approach would work if a spinlock release constrained the global stores
 timeline.  It makes a weaker guarantee: all stores preceding the lock release
 in program order will precede it globally.  Consequently, no processor will
 observe the spinlock to be available without also observing all stores made by
 the last holding processor before that processor released it.  That guarantee
 is not enough for this sequence of events:

 Backend 0:
        add a message for table a
        store 5 = maxMsgNum
        store true = timeToPayAttention
        LWLockRelease(SInvalWriteLock)
 plenty of time passes
 Backend 2:
        LOCK TABLE a;
        load timeToPayAttention = true
 Backend 1:
        add a message for table b
        store 6 = maxMsgNum
        SpinLockRelease(vsegP->msgnumLock);
        store true = timeToPayAttention
 Backend 2:
        store false = timeToPayAttention
        LWLockAcquire(SInvalReadLock, LW_SHARED)
        load maxMsgNum = 5 [!]

Eh, how can this possibly happen?  You have to hold msgNumLock to
set maxMsgNum and msgNumLock to read maxMsgNum.  If that's not enough
to guarantee that we never read a stale value, what is?

 I think a benchmark is in order, something like 900 idle connections and 80
 connections running small transactions that create a few temporary tables.  If
 that shows no statistically significant regression, then we're probably fine
 here.  I'm not sure what result to expect, honestly.

That's setting the bar pretty high.  I don't mind doing the
experiment, but I'm not sure that's the case we should be optimizing
for.

 What did you think of making the message number a uint64, removing wraparound
 handling, and retaining SISeg.msgnumLock for 32-bit only?  You could isolate
 the variant logic in READ_MSGNUM and WRITE_MSGNUM macros.

Well, what you really need to know is whether the platform has 8-byte
atomic stores, which doesn't seem trivial to figure out, plus you need
a memory fence.  If that's the only method of fixing this problem we
can agree on, I'm willing to work on it, but an
architecture-independent fix would be nicer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] XMLATTRIBUTES vs. values of type XML

2011-07-27 Thread Florian Pflug
On Jul27, 2011, at 16:18 , Peter Eisentraut wrote:
 On tis, 2011-07-26 at 22:44 +0200, Florian Pflug wrote:
 While reviewing the (now applied) XPATH escaping patches, Radoslaw
 found one
 case where the previous failure of XPATH to escape its return value
 was offset
 by XMLATTRIBUTES insistence to escape all input values, even if
 they're
 already of type XML.
 
 To wit, if you do
 
  SELECT XMLELEMENT(NAME t, XMLATTRIBUTES('&amp;'::XML AS a))
 
 you get
 
 xmlelement 
 
 <t a="&amp;amp;"/> 
 
 Per SQL standard, the attribute values may not be of type XML, so maybe
 we should just prohibit it.

We probably should have, but I think it's too late for that. I don't
believe I'm the only one who uses XPATH results as attribute values,
and we'd severely break that use-case.

You might say the same thing about my proposal, of course, but I believe
the risk is much smaller there. Applications would only break if they
  (a) Pass XML from a source other than an XPath expression selecting
  a text or attribute and
  (b) actually want double-escaping to occur.

As a data point, I've written an application which makes heavy use of
our XML infrastructure over the last few months (as you might have guessed
from the stream of patches ;-)). That application would be pretty much
untroubled by the changes to XMLATTRIBUTES I proposed, but would be
severely broken if we rejected values of type XML altogether.

best regards,
Florian Pflug





[HACKERS] PQescapeByteaConn - returns wrong string for PG9.1 Beta3

2011-07-27 Thread Petro Meier
If I use PQescapeByteaConn() for a connection to a PG9.1 Beta3 server,
this function returns (e.g.) \xea2abd8ef31... (and so on).

Here is the problem: there should be a second backslash in the prefix.
The SQL statement which uses this string (an INSERT statement in my case)
returns with an error (Invalid byte sequence...). If I add the second
backslash manually, everything works fine.

When connecting to a PG9.0 server and using this function, the
return value is correct (with two backslashes): \\xea2abd8ef31... (and so
on).

This appears to be a bug in PG9.1 Beta3.

 Regards

Petro 


Re: [HACKERS] PQescapeByteaConn - returns wrong string for PG9.1 Beta3

2011-07-27 Thread Florian Pflug
On Jul27, 2011, at 08:51 , Petro Meier wrote:
 If I use PQescapeByteaConn() for a connection to a PG9.1 Beta3 server, this 
 function returns (e.g.) \xea2abd8ef31...(and so on.)
 Here the problem: there should be a second backslash in the prefix. The SQL 
 Statement which uses this string (INSERT statement in my case) returns with 
 an error (Invalid byte sequence...). If I add the second backslash manually 
 everything works fine.
 When connecting to a PG9.0 server and using this function, the return value 
 is correct (with two backslashes): \\xea2abd8ef31...( and so on.)
 This should be a bug in PG9.1 Beta3

Sounds as if PQescapeByteaConn() is confused about whether 
standard_conforming_strings is on or off. What value does that setting have in 
your 9.0 and 9.1 instances?

BTW, I think 9.1 is the first release where that setting defaults to on, so 
maybe that adds to PQescapeByteaConn()'s confusion. In theory it shouldn't 
since PQescapeByteaConn() should simply detect the server's setting and react 
accordingly, but who knows...

best regards,
Florian Pflug




Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Noah Misch
On Wed, Jul 27, 2011 at 01:30:47PM -0400, Robert Haas wrote:
 On Wed, Jul 27, 2011 at 12:55 PM, Noah Misch n...@2ndquadrant.com wrote:
  [wrong objection]
 
 Eh, how can this possibly happen?  You have to hold msgNumLock to to
 set maxMsgNum and msgNumLock to read maxMsgNum.  If that's not enough
 to guarantee that we never read a stale value, what is?

Indeed, my analysis was all wrong.

  I think a benchmark is in order, something like 900 idle connections and 80
  connections running small transactions that create a few temporary tables.  If
  that shows no statistically significant regression, then we're probably fine
  here.  I'm not sure what result to expect, honestly.
 
 That's setting the bar pretty high.  I don't mind doing the
 experiment, but I'm not sure that's the case we should be optimizing
 for.

Granted.  How about 32 clients running the temporary table transaction, no idle
connections?  Given the meager benefit of this patch compared to your previous
version, it would be hard to justify a notable performance drop in return.

  What did you think of making the message number a uint64, removing wraparound
  handling, and retaining SISeg.msgnumLock for 32-bit only?  You could isolate
  the variant logic in READ_MSGNUM and WRITE_MSGNUM macros.
 
 Well, what you really need to know is whether the platform has 8-byte
 atomic stores, which doesn't seem trivial to figure out, plus you need
 a memory fence.  If that's the only method of fixing this problem we
 can agree on, I'm willing to work on it, but an
 architecture-independent fix would be nicer.

Fair enough.

Thanks,
nm

-- 
Noah Misch                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] PQescapeByteaConn - returns wrong string for PG9.1 Beta3

2011-07-27 Thread Alvaro Herrera
Excerpts from Petro Meier's message of mié jul 27 02:51:22 -0400 2011:

 If I use PQescapeByteaConn() for a connection to a PG9.1 Beta3 server, 
 this function returns (e.g.) \xea2abd8ef31...(and so on.)
 
 Here the problem: there should be a second backslash in the prefix. 
 The SQL Statement which uses this string (INSERT statement in my case) 
 returns with an error (Invalid byte sequence...). If I add the second 
 backslash manually everything works fine.

You're just being bitten by the fact that the
standard_conforming_strings setting changed its default from false to
true.  If you want the old behavior, you can just flip the switch, but
the recommended action is to change your expectations.  You can use E''
if you want backslashes to continue working without changing the switch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] sinval synchronization considered harmful

2011-07-27 Thread Robert Haas
On Wed, Jul 27, 2011 at 1:58 PM, Noah Misch n...@2ndquadrant.com wrote:
  I think a benchmark is in order, something like 900 idle connections and 80
  connections running small transactions that create a few temporary tables.  If
  that shows no statistically significant regression, then we're probably fine
  here.  I'm not sure what result to expect, honestly.

 That's setting the bar pretty high.  I don't mind doing the
 experiment, but I'm not sure that's the case we should be optimizing
 for.

 Granted.  How about 32 clients running the temporary table transaction, no idle
 connections?  Given the meager benefit of this patch compared to your previous
 version, it would be hard to justify a notable performance drop in return.

The reason the benefit is smaller is, I believe, because the previous
numbers were generated with the lazy vxid locks patch applied, and
these were generated against master.  With the lock manager as a
bottleneck, the sinval stuff doesn't get hit quite as hard, so the
benefit is less.  I can regenerate the numbers with the lazy vxid
patch applied; I suspect they'll be similar to what we saw before.
I'll also test out creating and dropping some tables.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] PQescapeByteaConn - returns wrong string for PG9.1 Beta3

2011-07-27 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Petro Meier's message of mié jul 27 02:51:22 -0400 2011:
 If I use PQescapeByteaConn() for a connection to a PG9.1 Beta3 server, 
 this function returns (e.g.) \xea2abd8ef31...(and so on.)
 
 Here the problem: there should be a second backslash in the prefix. 

 You're just being bitten by the fact that the
 standard_conforming_strings setting changed its default from false to
 true.

Well, the question is why is it actually failing for him.  AFAICS the
value being emitted is correct for the 9.1 server.  Perhaps we need to
see a complete example...

regards, tom lane



Re: [HACKERS] PQescapeByteaConn - returns wrong string for PG9.1 Beta3

2011-07-27 Thread Florian Pflug
On Jul27, 2011, at 20:05 , Alvaro Herrera wrote:
 Excerpts from Petro Meier's message of mié jul 27 02:51:22 -0400 2011:
 
 If I use PQescapeByteaConn() for a connection to a PG9.1 Beta3 server, 
 this function returns (e.g.) \xea2abd8ef31...(and so on.)
 
Here the problem: there should be a second backslash in the prefix. 
 The SQL Statement which uses this string (INSERT statement in my case) 
 returns with an error (Invalid byte sequence...). If I add the second 
 backslash manually everything works fine.
 
 You're just being bitten by the fact that the
 standard_conforming_strings setting changed its default from false to
 true.  If you want the old behavior, you can just flip the switch, but
 the recommended action is to change your expectations.  You can use E''
 if you want backslashes to continue working without changing the switch.

Hm, but PQescapeByteaConn() shouldn't produce a literal that the server
later rejects, no matter what standard_conforming_strings is set to.

It looks like PQescapeByteaConn() does the right thing here, though -
it doesn't escape the backslash in its result when dealing with 9.1,
presumably because that server has standard_conforming_strings set to on.
But why then is the server rejecting the result?

The only way I can see to make that happen would be to prefix the
string returned by PQescapeByteaConn() with 'E'.
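
For comparison, a minimal sketch of the intended usage -- the connection
string, table name, and sample bytes are made up for illustration; the point
is simply that the escaped string goes inside plain single quotes, with no
E'' prefix in front of the literal:

/*
 * PQescapeByteaConn() usage sketch: the result is dropped into an ordinary
 * quoted literal, so it works whatever standard_conforming_strings is.
 */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=test");	/* made-up connection */
	const unsigned char data[] = {0xea, 0x2a, 0xbd, 0x8e};
	unsigned char *escaped;
	size_t		escaped_len;
	char		query[256];
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	escaped = PQescapeByteaConn(conn, data, sizeof(data), &escaped_len);

	/*
	 * Plain quotes only: prefixing E'' here would make the server interpret
	 * the backslash itself, which is not what the \x bytea format expects.
	 */
	snprintf(query, sizeof(query),
			 "INSERT INTO blobs (payload) VALUES ('%s')", (char *) escaped);

	res = PQexec(conn, query);
	if (PQresultStatus(res) != PGRES_COMMAND_OK)
		fprintf(stderr, "INSERT failed: %s", PQerrorMessage(conn));

	PQclear(res);
	PQfreemem(escaped);
	PQfinish(conn);
	return 0;
}

(Using PQexecParams() with a binary parameter sidesteps the escaping question
entirely, of course.)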

@OP: Could you post the code fragment that causes the error?

best regards,
Florian Pflug




Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread Robert Haas
On Wed, Jul 27, 2011 at 6:39 AM, Nikhil Sontakke
nikhil.sonta...@enterprisedb.com wrote:

 Yeah, but I think we need to take that chance.  At the very least, we
 need to support the equivalent of a non-inherited CHECK (false) on
 parent tables.

 Indeed. I usually enforce that with a trigger that raises an exception,
 but of course that doesn't help at all with constraint exclusion, and I saw
 a result just a few weeks ago (I forget the exact details) where it appeared
 that the plan chosen was significantly worse because the parent table wasn't
 excluded, so there's a  non-trivial downside from having this restriction.


 The downside appears to be non-trivial indeed! I cooked up the attached
 patch to try to allow ALTER...ONLY...CHECK(false) on parent tables.

 If this approach looks acceptable, I can provide a complete patch later with
 some documentation changes (I think we ought to tell about this special case
 in the documentation) and a minor test case along with it (if the need be
 felt for the test case).
 Although partitioning ought to be looked at from a different angle
 completely, maybe this small patch can help out a bit in the current scheme
 of things, although this is indeed an unusual special case... Thoughts?

Well, I don't have anything strongly against the idea of an
uninherited constraint, though it sounds like Tom does.  But I think
allowing it just in the case of CHECK (false) would be pretty silly.
And, I'm fairly certain that this isn't going to play nice with
coninhcount... local constraints would have to be marked as local,
else the wrong things will happen later on when you drop them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Well, I don't have anything strongly against the idea of an
 uninherited constraint, though it sounds like Tom does.  But I think
 allowing it just in the case of CHECK (false) would be pretty silly.
 And, I'm fairly certain that this isn't going to play nice with
 coninhcount... local constraints would have to be marked as local,
 else the wrong things will happen later on when you drop them.

Yeah.  If we're going to allow this then we should just have a concept
of a non-inherited constraint, full stop.  This might just be a matter
of removing the error thrown in ATAddCheckConstraint, but I'd be worried
about whether pg_dump will handle the case correctly, what happens when
a new child is added later, etc etc.

regards, tom lane



Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread David E. Wheeler
On Jul 27, 2011, at 1:08 PM, Tom Lane wrote:

 Yeah.  If we're going to allow this then we should just have a concept
 of a non-inherited constraint, full stop.  This might just be a matter
 of removing the error thrown in ATAddCheckConstraint, but I'd be worried
 about whether pg_dump will handle the case correctly, what happens when
 a new child is added later, etc etc.

Is this looking at the wrong problem? The reason I've wanted to get a parent 
check constraint not to fire in a child is because I'm using the parent/child 
relationship for partitioning. Will this be relevant if/when an independent 
partitioning feature is added that does not rely on table inheritance?

Best,

David




Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread Andrew Dunstan



On 07/27/2011 04:14 PM, David E. Wheeler wrote:

On Jul 27, 2011, at 1:08 PM, Tom Lane wrote:


Yeah.  If we're going to allow this then we should just have a concept
of a non-inherited constraint, full stop.  This might just be a matter
of removing the error thrown in ATAddCheckConstraint, but I'd be worried
about whether pg_dump will handle the case correctly, what happens when
a new child is added later, etc etc.

Is this looking at the wrong problem? The reason I've wanted to get a parent 
check constraint not to fire in a child is because I'm using the parent/child 
relationship for partitioning. Will this be relevant if/when an independent 
partitioning feature is added that does not rely on table inheritance?




Yes, I have clients using inheritance for non-partitioning purposes, and 
they would love to have this.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI error messages

2011-07-27 Thread Peter Eisentraut
On Sat, 2011-07-16 at 15:01 -0400, Tom Lane wrote:
 Well, as I mentioned in the commit message, I've thought for some time
 that there were use cases for errdetail_internal.  Whether these
 particular places in predicate.c use it or not doesn't affect that.

Looking at commit 1af37ec96d97722aeb527f5f43d6f6f2304f0861, not all of
these are strings that don't need to be translated.  For example, you
can't assume that the translation of "%s: %s" is "%s: %s" or that the
translation of "%s (%x)" is "%s (%x)".  I'll review those in detail when
I find some time, but some of them will probably have to be reverted.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread Robert Haas
On Wed, Jul 27, 2011 at 4:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, I don't have anything strongly against the idea of an
 uninherited constraint, though it sounds like Tom does.  But I think
 allowing it just in the case of CHECK (false) would be pretty silly.
 And, I'm fairly certain that this isn't going to play nice with
 coninhcount... local constraints would have to be marked as local,
 else the wrong things will happen later on when you drop them.

 Yeah.  If we're going to allow this then we should just have a concept
 of a non-inherited constraint, full stop.  This might just be a matter
 of removing the error thrown in ATAddCheckConstraint, but I'd be worried
 about whether pg_dump will handle the case correctly, what happens when
 a new child is added later, etc etc.

Right.  I'm fairly sure all that stuff is gonna break with the
proposed implementation.  It's a solvable problem, but it's going to
take more than an afternoon to crank it out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI error messages

2011-07-27 Thread Peter Eisentraut
On Sat, 2011-07-16 at 21:55 +0300, Heikki Linnakangas wrote:
 I find it strange to simply leave those strings untranslated. It's going
 to look wrong, like someone just forgot to translate them. However, I
 agree it's perhaps a bit too much detail to translate all of those
 messages, and the translations would probably sound weird because there
 aren't established terms for these things yet.
 
 I think I would prefer something like this:
 
 ERROR:  could not serialize access due to read/write dependencies among
 transactions
 DETAIL: Reason code: %s
 HINT:  The transaction might succeed if retried.

Yes, I think that would be better.  I will work on that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI error messages

2011-07-27 Thread Alvaro Herrera
Excerpts from Peter Eisentraut's message of Wed Jul 27 16:19:22 -0400 2011:
 On Sat, 2011-07-16 at 15:01 -0400, Tom Lane wrote:
  Well, as I mentioned in the commit message, I've thought for some time
  that there were use cases for errdetail_internal.  Whether these
  particular places in predicate.c use it or not doesn't affect that.
 
 Looking at commit 1af37ec96d97722aeb527f5f43d6f6f2304f0861, not all of
 these are strings that don't need to be translated.  For example, you
 can't assume that the translation of "%s: %s" is "%s: %s" or that the
 translation of "%s (%x)" is "%s (%x)".  I'll review those in detail when
 I find some time, but some of them will probably have to be reverted.

Hmm, if "%s: %s" has different uses in different places, we're going to
need a more involved solution because they might not translate
identically.

In any case, we need /* translator: */ comments to explain what's going
on.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] patch: move dumpUserConfig call in dumpRoles function of pg_dumpall.c

2011-07-27 Thread Phil Sorber
Hello,

The attached patch changes the location of the dumpUserConfig call in
the dumpRoles function of pg_dumpall.

This is related to this thread:
http://archives.postgresql.org/pgsql-hackers/2011-02/msg02359.php

Currently if you use 'ALTER ROLE rolename SET ROLE', pg_dumpall will
dump an 'ALTER ROLE' out right after the 'CREATE ROLE' statement.
Sometimes this will cause a conflict when a dependent role is not yet
created:

--
-- Roles
--

CREATE ROLE a;
ALTER ROLE a WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN
NOREPLICATION;
ALTER ROLE a SET role TO 'b';
CREATE ROLE b;
ALTER ROLE b WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN
NOREPLICATION;
CREATE ROLE postgres;
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN
NOREPLICATION;

As you can see, role a is set to role b before role b is created.
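
(For reference, the state above can be produced with nothing more than:

CREATE ROLE a;
CREATE ROLE b;
ALTER ROLE a SET role TO 'b';

and then running pg_dumpall.)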

This patch moves the call to dumpUserConfig to after the loop where
all the roles are created. This produces output like this:

--
-- Roles
--

CREATE ROLE a;
ALTER ROLE a WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN
NOREPLICATION;
CREATE ROLE b;
ALTER ROLE b WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN
NOREPLICATION;
CREATE ROLE postgres;
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN
NOREPLICATION;
ALTER ROLE a SET role TO 'b';

Now this dump will succeed upon restore.

This passed all regression tests.

Thanks.
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
new file mode 100644
index b5f64e8..ee597d5
*** a/src/bin/pg_dump/pg_dumpall.c
--- b/src/bin/pg_dump/pg_dumpall.c
*** dumpRoles(PGconn *conn)
*** 804,814 
  			 buf, "ROLE", rolename);
  
  		fprintf(OPF, "%s", buf->data);
- 
- 		if (server_version >= 70300)
- 			dumpUserConfig(conn, rolename);
  	}
  
  	PQclear(res);
  
  	fprintf(OPF, "\n\n");
--- 804,815 
  			 buf, "ROLE", rolename);
  
  		fprintf(OPF, "%s", buf->data);
  	}
  
+ 	if (server_version >= 70300)
+ 		for (i = 0; i < PQntuples(res); i++)
+ 			dumpUserConfig(conn, PQgetvalue(res, i, i_rolname));
+ 
  	PQclear(res);
  
  	fprintf(OPF, "\n\n");

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XMLATTRIBUTES vs. values of type XML

2011-07-27 Thread Peter Eisentraut
On Wed, 2011-07-27 at 19:37 +0200, Florian Pflug wrote:
  Per SQL standard, the attribute values may not be of type XML, so maybe
  we should just prohibit it.
 
 We probably should have, but I think it's too late for that. I don't
 believe I'm the only one who uses XPATH results as attribute values,
 and we'd severely break that use-case.
 
 You might say the same thing about my proposal, of course, but I
 believe
 the risk is much smaller there. Applications would only break if they
   (a) Pass XML from a source other than a XPath expression selecting
   a text or attribute and
   (b) actually want double-escaping to occur.

Well, offhand I would expect that passing an XML value to XMLATTRIBUTES
would behave as in

SELECT XMLELEMENT(NAME t, XMLATTRIBUTES(XMLSERIALIZE(content '&amp;'::XML AS 
text) AS a))

which is what it indeed does in 9.1.

So if we don't want to restrict this, for backward compatibility, then I
would suggest that we fix it to work like it used to.

I would be very hesitant about adding another escape mechanism that
escapes some things but not others.  We already have two or three of
those for XML, and it doesn't seem worth adding another one just for
this, which is outside the standard and for which a valid workaround
exists.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-27 Thread Peter Eisentraut
On Tue, 2011-07-26 at 09:53 -0400, Robert Haas wrote:
 On Mon, Jul 25, 2011 at 10:29 PM, Josh Kupershmidt
 schmi...@gmail.com wrote:
  That seems like a good way to document this; patch for master
 updated.
  I avoided mucking with the documentation for COMMENT ON RULE and
  COMMENT ON TRIGGER this time; they both say table when they really
  mean table or view, but maybe trying to differentiate between
  table, table_or_view, and relation will make things overly
  complicated.
 
 I think this is basically the right approach but I found what you did
 here a bit wordy, so I simplified it, committed it, and back-patched
 to 9.0 with suitable adjustment.  Hopefully I didn't simplify it into
 a form you don't like.

I would like to argue for reverting this.  If you want to word-smith
details like this, relation doesn't carry any additional meaning.  PG
hackers know that internally, a relation is a table, view, index,
sequence, etc., but for the user, it doesn't mean anything.

Note that we don't use relation_name anywhere else in SQL command
synopses.  So far, no one has complained that we don't mention that
views are allowed in the SELECT command or the GRANT command.

I think table_name is fine, and if you are very worried, add below that
a table_name also includes views (or whatever).

As a side note, backpatching this creates additional translation work in
backbranches.  So please keep it to a minimum if it's not outright
wrong.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQescapeByteaConn - returns wrong string for PG9.1 Beta3

2011-07-27 Thread Andrew Dunstan



On 07/27/2011 02:05 PM, Alvaro Herrera wrote:

Excerpts from Petro Meier's message of mié jul 27 02:51:22 -0400 2011:


If I use PQescapeByteaConn() for a connection to a PG 9.1 Beta3 server,
this function returns (e.g.) \xea2abd8ef31... (and so on).

 Here is the problem: there should be a second backslash in the prefix.
The SQL statement which uses this string (an INSERT statement in my case)
returns with an error (invalid byte sequence...). If I add the second
backslash manually, everything works fine.

You're just being bitten by the fact that the
standard_conforming_strings setting changed its default from false to
true.  If you want the old behavior, you can just flip the switch, but
the recommended action is to change your expectations.  You can use E''
if you want backslashes to continue working without changing the switch.


Or even better, don't interpolate it into SQL at all, but use a statement 
placeholder.
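
For instance, a minimal libpq sketch of the placeholder approach (sending the
bytea as an out-of-line binary parameter, so no escaping of any kind is
involved; the table name, column name, and hardcoded bytea OID are only for
illustration):

#include <stdio.h>
#include <libpq-fe.h>

/* Insert raw bytes using a parameter placeholder instead of splicing
 * an escaped literal into the SQL text. */
static int
insert_blob(PGconn *conn, const char *bytes, int nbytes)
{
    const Oid   paramTypes[1]   = { 17 };       /* 17 = BYTEAOID, hardcoded for the sketch */
    const char *paramValues[1]  = { bytes };
    int         paramLengths[1] = { nbytes };
    int         paramFormats[1] = { 1 };        /* 1 = binary */

    PGresult   *res = PQexecParams(conn,
                                   "INSERT INTO blobs (data) VALUES ($1)",
                                   1, paramTypes,
                                   paramValues, paramLengths, paramFormats,
                                   0);          /* text-format result */
    int         ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    if (!ok)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQclear(res);
    return ok;
}

Because the value never passes through the SQL lexer, neither
standard_conforming_strings nor backslash handling comes into play.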


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XMLATTRIBUTES vs. values of type XML

2011-07-27 Thread Florian Pflug
On Jul27, 2011, at 23:08 , Peter Eisentraut wrote:
 Well, offhand I would expect that passing an XML value to XMLATTRIBUTES
 would behave as in
 
 SELECT XMLELEMENT(NAME t, XMLATTRIBUTES(XMLSERIALIZE(content '&amp;'::XML 
 AS text) AS a))

With both 9.1 and 9.2 this query returns

     xmlelement     
--------------------
 <t a="&amp;amp;"/>

i.e. makes the value of a represent the *literal* string '&amp;', *not*
the literal string '&'. Just to be sure there's no misunderstanding here
- is this what you expect?

 which is what it indeed does in 9.1.
 
 So if we don't want to restrict this, for backward compatibility, then I
 would suggest that we fix it to work like it used to.

There's currently no difference in behaviour between 9.1 and 9.2 there.
We've only modified XPATH to always correctly escape its result in 9.2,
so there's only a difference if you pass the result of XPATH() to
XMLATTRIBUTES. Which I figured to be the most likely reason to pass
values of type XML to XMLATTRIBUTES, but maybe you disagree there.
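
For concreteness, the kind of use-case I have in mind looks roughly like
this (the table and names are made up) -- an attribute selected with
XPATH() fed straight back into XMLATTRIBUTES:

  -- x holds '<item label="A &amp; B"/>'; XPATH() returns the attribute
  -- as a value of type xml
  SELECT XMLELEMENT(NAME item,
                    XMLATTRIBUTES((XPATH('/item/@label', x))[1] AS label))
  FROM (VALUES ('<item label="A &amp; B"/>'::xml)) AS t(x);

Whether the ampersand ends up escaped once or twice in the result is
exactly the difference under discussion.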

 I would be very hesitant about adding another escape mechanism that
 escapes some things but not others.  We already have two or three of
 those for XML, and it doesn't seem worth adding another one just for
 this, which is outside the standard and for which a valid workaround
 exists.

What's the workaround if you have a value of type XML, e.g. '&amp;',
and want to set an attribute to the value represented by that XML fragment
(i.e. '&')? Since we have no XMLUNESCAPE function, I don't see an easy
way to do that. Maybe I'm missing something, though.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Is a heads-up in 9.1 in order regarding the XML-related changes in 9.2?

2011-07-27 Thread Florian Pflug
Hi

As it stands, we're going to release 9.1, knowing that 9.2 will change the
behavior of XPATH. This brings forth the question whether we should somehow
warn about that in either the release notes or the documentation of 9.1

If we don't, then applications developed on 9.1 might contain workarounds for
the deficiencies of XPATH in that version (like for example manually escaping
its output), which make them break on 9.2. That seems a bit unfriendly.

OTOH, had we committed the changes to 9.2 a month or two from now, then
9.1 certainly couldn't have warned about them. So maybe it shouldn't
warn now, either.

Is there an established practice for situations like this, i.e. a behavior-
changing bug-fix committed to X.Y+1 before X.Y is released?

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Check constraints on partition parents only?

2011-07-27 Thread Alex Hunsaker
On Wed, Jul 27, 2011 at 14:08, Tom Lane t...@sss.pgh.pa.us wrote:

 Yeah.  If we're going to allow this then we should just have a concept
 of a non-inherited constraint, full stop.  This might just be a matter
 of removing the error thrown in ATAddCheckConstraint, but I'd be worried
 about whether pg_dump will handle the case correctly, what happens when
 a new child is added later, etc etc.

[ For those who missed it ]
pg_dump getting things wrong was a big reason to disallow ONLY
constraints. That is, pg_dump did not treat ONLY constraints
correctly; it always tried to stick them on the parent table:
http://archives.postgresql.org/pgsql-bugs/2007-04/msg00026.php

I for example had some backups that had to be manually fixed (by
removing constraints) to get them to import. I would wager the
mentioned clients that have been doing this have broken backups as
well :-(

Now that we have coninhcount, conislocal etc... we can probably support
ONLY. But I agree with Robert it's probably a bit more than an
afternoon to crank out :-)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-27 Thread Josh Kupershmidt
On Wed, Jul 27, 2011 at 5:19 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Tue, 2011-07-26 at 09:53 -0400, Robert Haas wrote:
 On Mon, Jul 25, 2011 at 10:29 PM, Josh Kupershmidt
 schmi...@gmail.com wrote:
  That seems like a good way to document this; patch for master
 updated.
  I avoided mucking with the documentation for COMMENT ON RULE and
  COMMENT ON TRIGGER this time; they both say table when they really
  mean table or view, but maybe trying to differentiate between
  table, table_or_view, and relation will make things overly
  complicated.

 I think this is basically the right approach but I found what you did
 here a bit wordy, so I simplified it, committed it, and back-patched
 to 9.0 with suitable adjustment.  Hopefully I didn't simplify it into
 a form you don't like.

 I would like to argue for reverting this.  If you want to word-smith
 details like this, relation doesn't carry any additional meaning.  PG
 hackers know that internally, a relation is a table, view, index,
 sequence, etc., but for the user, it doesn't mean anything.

The original page used table_name in the synopsis in three places:
COMMENT ON {COLUMN, CONSTRAINT, and RULE}. If you're suggesting that
it's intuitively obvious what's meant by "table" in each of those
three cases, I respectfully disagree: I only think I know now because
I bothered to test all of these recently, and read a bit of comment.c.
(Hint: "table" means different things in all three cases.)

I'll also note that you included index in your list of what a
relation is, and omitted composite type -- this is exactly the
confusion I was trying to avoid. COMMENT ON COLUMN no longer supports
indexes, and does support composite types. Plus, I don't think we
should be designing docs so that only PG hackers know what's really
meant. That hasn't seemed to be the modus operandi of maintaining the
rest of the docs.

 Note that we don't use relation_name anywhere else in SQL command
 synopses.  So far, no one has complained that we don't mention that
 views are allowed in the SELECT command or the GRANT command.

I actually complained upthread about CREATE RULE using the term
"table" in its synopsis, when it really means "table or view", but I
thought that was OK because there was an immediate clarification right
below the synopsis.

 I think table_name is fine, and if you are very worried, add below that
 a table_name also includes views (or whatever).

It includes tables, views, composite types, and foreign tables. Is
"table" really an appropriate description for all those objects?

 As a side note, backpatching this creates additional translation work in
 backbranches.  So please keep it to a minimum if it's not outright
 wrong.

That's a legitimate concern; I don't have a strong opinion about
whether stuff like this ought to be backpatched.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: bogus descriptions displayed by \d+

2011-07-27 Thread Robert Haas
On Wed, Jul 27, 2011 at 5:19 PM, Peter Eisentraut pete...@gmx.net wrote:
 I would like to argue for reverting this.  If you want to word-smith
 details like this, relation doesn't carry any additional meaning.  PG
 hackers know that internally, a relation is a table, view, index,
 sequence, etc., but for the user, it doesn't mean anything.

Well, I don't think we're going to do very well trying to get by
without a generic term of some sort.  Calling it a table is more
confusing, because the user might easily be forgiven for thinking that
he knows what the word table means and reading no further.  If you
say relation, then either (a) the user knows what that means, or (b)
he'll read the text and find out.  I am not very excited about the
idea of documenting table_name as either a table name, or the name
of some kind of object that isn't a table; I think that's just weird.

Also, while it may be true that we haven't used the term specifically
in SQL synopses, it's been extensively used in other parts of the
documentation, in the names of system functions such as
pg_relation_size(), and in user-visible error messages (cd
src/backend/po; git grep relation), so I think you may be trying to
close the barn door after the horse has got out.

 Note that we don't use relation_name anywhere else in SQL command
 synopses.  So far, no one has complained that we don't mention that
 views are allowed in the SELECT command or the GRANT command.

Well, for the record, I have previously been *extremely* confused by
both that documentation and the actual syntax.

 I think table_name is fine, and if you are very worried, add below that
 a table_name also includes views (or whatever).

 As a side note, backpatching this creates additional translation work in
 backbranches.  So please keep it to a minimum if it's not outright
 wrong.

I was on the fence about whether this was important enough to
back-patch, and I'll add translation effort to my list of things to
worry about in future cases.  We do pretty commonly back-patch
documentation corrections to the then-current major release, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-07-27 Thread Florian Pflug
On Jul27, 2011, at 23:20 , Pavel Stehule wrote:
 this is a refreshed patch. Only constraints and RI are supported now.
 There are about 1000 ereport calls where enhanced diagnostics should
 be used, but we probably don't have to modify them all at once.

I wonder if it wouldn't be better to have something like the machinery
around ErrorContextCallback to fill in the constraint details. You'd then
only need to modify the places which initiate constraint checks, instead
of every single ereport() in the constraint implementations.

Just a wild idea, though - I haven't checked whether this is actually
feasible or not.
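
For reference, the existing machinery looks roughly like this (the
callback body, the Constraint payload, and the wrapper function are only
illustrative, not code from the tree); the idea would be an analogous
stack whose callbacks fill in the constraint fields rather than just an
errcontext() line:

#include "postgres.h"
#include "nodes/parsenodes.h"

/* Hypothetical callback that attaches constraint details to any error
 * raised while a constraint check is in progress. */
static void
constraint_check_error_callback(void *arg)
{
	Constraint *con = (Constraint *) arg;

	errcontext("while checking constraint \"%s\"", con->conname);
}

/* Sketch of a place that initiates a constraint check. */
static void
check_constraint_with_context(Constraint *con)
{
	ErrorContextCallback errcallback;

	errcallback.callback = constraint_check_error_callback;
	errcallback.arg = (void *) con;
	errcallback.previous = error_context_stack;
	error_context_stack = &errcallback;

	/* ... run the actual check; any ereport() raised in here picks up
	 * the extra information automatically ... */

	error_context_stack = errcallback.previous;
}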

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is a heads-up in 9.1 in order regarding the XML-related changes in 9.2?

2011-07-27 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 Is there an established practice for situations like this, i.e. a behavior-
 changing bug-fix committed to X.Y+1 before X.Y is released?

Generally, we do nothing.  It's a bit premature (in fact a lot
premature) to assume that the current behavior of HEAD is exactly what
will be released in 9.2, but putting statements about it into 9.1 docs
would amount to assuming that.  It's the job of the 9.2 release notes
to point out incompatibilities, not the job of the 9.1 docs to guess
what will happen in the future.

If you think that the incompatibilities in question are so earth-shaking
as to require retroactive advance warnings, maybe we should reconsider
whether they're a good thing to do at all.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is a heads-up in 9.1 in order regarding the XML-related changes in 9.2?

2011-07-27 Thread Florian Pflug
On Jul28, 2011, at 01:28 , Tom Lane wrote:
 Florian Pflug f...@phlo.org writes:
 Is there an established practice for situations like this, i.e. a behavior-
 changing bug-fix committed to X.Y+1 before X.Y is released?
 
 Generally, we do nothing.  It's a bit premature (in fact a lot
 premature) to assume that the current behavior of HEAD is exactly what
 will be released in 9.2, but putting statements about it into 9.1 docs
 would amount to assuming that.  It's the job of the 9.2 release notes
 to point out incompatibilities, not the job of the 9.1 docs to guess
 what will happen in the future.

Fair enough.

 If you think that the incompatibilities in question are so earth-shaking
 as to require retroactive advance warnings, maybe we should reconsider
 whether they're a good thing to do at all.

Certainly not earth-shaking, no. Also an obvious improvement, and probably
equally likely to fix existing applications as they are to break them. So
let's by all means not revert them.

I simply thought that putting a warning about XPATH() escaping deficiencies
might save some people the trouble of (a) finding out about that the hard
way and (b) developing work-arounds which are bound to be broken by 9.2.

I'm not saying we must absolutely do so - heck, I'm not even totally
convinced myself that we should do so. I simply happened to realize
today that the timing was a bit unfortunate, figured it wouldn't hurt to
get additional opinions on this, and thus asked.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: move dumpUserConfig call in dumpRoles function of pg_dumpall.c

2011-07-27 Thread Tom Lane
Phil Sorber p...@omniti.com writes:
 Currently if you use 'ALTER ROLE rolename SET ROLE', pg_dumpall will
 dump an 'ALTER ROLE' out right after the 'CREATE ROLE' statement.

I think pg_dumpall is the very least of your problems if you do
something like that.  We probably ought to forbid it entirely.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Ripping out pg_restore's attempts to parse SQL before sending it

2011-07-27 Thread Tom Lane
In
http://archives.postgresql.org/message-id/201107270042.22427.jul...@mehnle.net
it's pointed out that pg_restore in direct-to-database mode is pretty
badly broken if standard_conforming_strings=on.  The reason is that it
tries to lex SQL commands well enough to find line boundaries, and the
code that does so has never heard of standard_conforming_strings.
While we could possibly teach it about that, I have a more radical
solution in mind: I think we should get rid of that code entirely.

The reason it's attempting to do this is so that it can switch between
send-SQL and send-copy-data modes at line boundaries.  However, there
really is no reason to do that, because COPY data is stored separately
from SQL commands in pg_dump archives, and has been for a very long
time.  So we could simply shove the whole archive entry into either
PQexec or PQputCopyData, saving a lot of complexity and probably a
nontrivial number of cycles.
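
Roughly speaking, the direct-to-database path could then reduce to
something like this (only a sketch to illustrate the point -- the
function, its arguments, and its error handling are made up, and for COPY
data the caller is assumed to have already issued the COPY command):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <libpq-fe.h>

static void
send_archive_entry(PGconn *conn, const char *data, int len, bool is_copy_data)
{
	if (is_copy_data)
	{
		/* COPY data is its own archive entry, so the whole buffer can
		 * go straight down the COPY pipe. */
		if (PQputCopyData(conn, data, len) < 0)
		{
			fprintf(stderr, "%s", PQerrorMessage(conn));
			exit(1);
		}
	}
	else
	{
		/* Everything else is plain SQL; PQexec copes with several
		 * statements in one buffer on its own. */
		PGresult   *res = PQexec(conn, data);

		if (PQresultStatus(res) != PGRES_COMMAND_OK &&
			PQresultStatus(res) != PGRES_TUPLES_OK &&
			PQresultStatus(res) != PGRES_COPY_IN)
			fprintf(stderr, "%s", PQerrorMessage(conn));
		PQclear(res);
	}
}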

While I've not yet done any excavation in the commit logs to confirm
this, the nearby comments in the code indicate that separation of COPY
data from SQL commands was adopted in archive format version 1.3, which
is ancient.  In fact, it's so ancient that there was never a production
release of pg_dump that created pre-1.3 archives --- only 7.1devel
versions ever produced those.  It's thus a fairly safe bet that no such
archive files exist in the field.  And just to add insult to injury,
there's this in RestoreArchive():

    if (AH->version < K_VERS_1_3)
        die_horribly(AH, modulename, "direct database connections are not supported in pre-1.3 archives\n");

which means that we wouldn't get to the code that is trying to parse
the SQL commands if we did have a pre-1.3 archive to read.

So as far as I can tell, this code is as dead as code can possibly be,
and we ought to rip it out instead of adding still another layer of
complexity to it.

I don't have a problem with back-patching a fix of that ilk to 9.1,
but I'm less sure that it's appropriate to do so in pre-9.1 branches.
Given the lack of prior complaints, maybe we don't have to fix the
older branches, although certainly something must be done in 9.1.

Opinions?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] error: could not find pg_class tuple for index 2662

2011-07-27 Thread daveg

My client has been seeing regular instances of the following sort of problem:

...
 03:06:09.453 exec_simple_query, postgres.c:900
 03:06:12.042 XX000: could not find pg_class tuple for index 2662 at character 
13
 03:06:12.042 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.042 INSERT INTO zzz_k(k) SELECT ...
 03:06:12.045 0: statement: ABORT
 03:06:12.045 exec_simple_query, postgres.c:900
 03:06:12.045 0: duration: 0.100 ms
 03:06:12.045 exec_simple_query, postgres.c:1128
 03:06:12.046 0: statement: INSERT INTO temp_807
  VALUES (...)
 03:06:12.046 exec_simple_query, postgres.c:900
 03:06:12.046 XX000: could not find pg_class tuple for index 2662 at character 
13
 03:06:12.046 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.046 INSERT INTO temp_807
  VALUES (...)
 03:06:12.096 08P01: unexpected EOF on client connection
 03:06:12.096 SocketBackend, postgres.c:348
 03:06:12.096 XX000: could not find pg_class tuple for index 2662
 03:06:12.096 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.121 0: disconnection: session time: 0:06:08.537 user=ZZZ 
database=ZZZ_01
 03:06:12.121 log_disconnections, postgres.c:4339


The above happens regularly (but not completely predictably) corresponding
with a daily cronjob that checks the catalogs for bloat and does vacuum full
and/or reindex as needed. Since some of the applications make very heavy
use of temp tables, this will usually mean pg_class and pg_index get a
VACUUM FULL and a REINDEX.

Sometimes queries will fail due to being unable to open a table's containing
file. On investigation the file will be absent from both the catalogs and the
filesystem, so I don't know what table it refers to:

 20:41:19.063  ERROR:  could not open file 
"pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory
 20:41:19.063  STATEMENT:  insert into r_ar__30
   select aid, mid, pid, sum(wdata) as wdata, ...
--
 20:41:19.430  ERROR:  could not open file 
"pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory
 20:41:19.430  STATEMENT: SELECT nextval('j_id_seq')


Finallly, I have seen a several instances of failure to read data by
vacuum full itself:

 03:05:45.699 0: statement: vacuum full pg_catalog.pg_index;
 03:05:45.699 exec_simple_query, postgres.c:900
 03:05:46.142 XX001: could not read block 65 in file 
"pg_tblspc/16401/PG_9.0_201008051/16416/1049146489": read only 0 of 8192 bytes
 03:05:46.142 mdread, md.c:656
 03:05:46.142 vacuum full pg_catalog.pg_index;

This occurs on PostgreSQL 9.0.4 on 32-core, 512GB Dell boxes. We have
identical systems still running 8.4.8 that do not have this issue, so I'm
assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
so it may be timing related.

This seems possibly related to the issues in:

  Bizarre buildfarm failure on baiji: can't find pg_class_oid_index
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02038.php
  Broken HOT chains in system catalogs
http://archives.postgresql.org/pgsql-hackers/2011-04/msg00777.php

As far as I can tell from the logs I have, once a session sees one of these
errors any subsequent query will hit it again until the session exits.
However, it does not seem to harm other sessions or leave any persistent
damage (crossing fingers and hoping here).

I'm ready to do any testing/investigation/instrumented builds etc that may be
helpful in resolving this.

Regards

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] cheaper snapshots

2011-07-27 Thread Robert Haas
On Wed, Oct 20, 2010 at 10:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder whether we could do something involving WAL properties --- the
 current tuple visibility logic was designed before WAL existed, so it's
 not exploiting that resource at all.  I'm imagining that the kernel of a
 snapshot is just a WAL position, ie the end of WAL as of the time you
 take the snapshot (easy to get in O(1) time).  Visibility tests then
 reduce to "did this transaction commit with a WAL record located before
 the specified position?".  You'd need some index datastructure that made
 it reasonably cheap to find out the commit locations of recently
 committed transactions, where "recent" means back to recentGlobalXmin.
 That seems possibly do-able, though I don't have a concrete design in
 mind.

I was mulling this idea over some more (the same ideas keep floating
back to the top...).  I don't think an LSN can actually work, because
there's no guarantee that the order in which the WAL records are
emitted is the same order in which the effects of the transactions
become visible to new snapshots.  For example:

1. Transaction A inserts its commit record, flushes WAL, and begins
waiting for sync rep.
2. A moment later, transaction B sets synchronous_commit=off, inserts
its commit record, requests a background WAL flush, and removes itself
from the ProcArray.
3. Transaction C takes a snapshot.

Sync rep doesn't create this problem; there's a race anyway.  The
order of acquisition for WALInsertLock needn't match that for
ProcArrayLock.  This has the more-than-slightly-odd characteristic
that you could end up with a snapshot on the master that can see A but
not B and a snapshot on the slave that can see B but not A.

But having said that an LSN can't work, I don't see why we can't just
use a 64-bit counter.  In fact, the predicate locking code already
does something much like this, using an SLRU, for serializable
transactions only.  In more detail, what I'm imagining is an array
with 4 billion entries, one per XID, probably broken up into files of
say 16MB each with 2 million entries per file.  Each entry is a 64-bit
value.  It is 0 if the XID has not yet started, is still running, or
has aborted.  Otherwise, it is the commit sequence number of the
transaction.  For reasons I'll explain below, I'm imagining starting
the commit sequence number counter at some very large value and having
it count down from there.  So the basic operations are:

- To take a snapshot, you just read the counter.
- To commit a transaction which has an XID, you read the counter,
stamp all your XIDs with that value, and decrement the counter.
- To find out whether an XID is visible to your snapshot, you look up
the XID in the array and get the counter value.  If the value you read
is greater than your snapshot value, it's visible.  If it's less, it's
not.
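
To make that concrete, here's a very rough sketch of the three operations,
assuming a flat in-memory array indexed by XID and ignoring the file
mapping, locking, and XID wraparound entirely (every name here is made
up):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;         /* same width as PG's TransactionId */
typedef uint64_t CommitSeqNo;

/* The counter starts very high and counts *down*; 0 in the array means
 * "never committed (so far)". */
static volatile CommitSeqNo csn_counter = UINT64_MAX;
static CommitSeqNo csn_array[1u << 20]; /* one slot per XID (toy size) */

/* Taking a snapshot is just reading the counter. */
static CommitSeqNo
take_snapshot(void)
{
	return csn_counter;
}

/* Committing: stamp all of the transaction's XIDs with the current
 * counter value, then decrement the counter (lock/fence elided). */
static void
commit_xids(const TransactionId *xids, int nxids)
{
	CommitSeqNo csn = csn_counter;

	for (int i = 0; i < nxids; i++)
		csn_array[xids[i]] = csn;
	/* memory barrier needed here on weakly-ordered hardware */
	csn_counter = csn - 1;
}

/* Visibility: the XID committed before my snapshot was taken iff its
 * stamp is greater than my snapshot value; 0 (never committed) is never
 * greater, so it comes out "not visible" for free. */
static bool
xid_visible_in_snapshot(TransactionId xid, CommitSeqNo snap)
{
	return csn_array[xid] > snap;
}

The backward-running counter is also what makes the torn-read argument
below work: a half-written stamp can only look smaller than the real
value, i.e. "still in progress".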

Now, is this algorithm any good, and how little locking can we get away with?

It seems to me that if we used an SLRU to store the array, the lock
contention would be even worse than it is under our current system,
wherein everybody fights over ProcArrayLock.  A system like this is
going to involve lots and lots of probes into the array (even if we
build a per-backend cache of some kind) and an SLRU will require at
least one LWLock acquire and release per probe.  Some kind of locking
is pretty much unavoidable, because you have to worry about pages
getting evicted from shared memory.  However, what if we used a set of
files (like SLRU) but mapped them separately into each backend's
address space?  I think this would allow both loads and stores from
the array to be done unlocked.  One fly in the ointment is that 8-byte
stores are apparently done as two 4-byte stores on some platforms.
But if the counter runs backward, I think even that is OK.  If you
happen to read an 8 byte value as it's being written, you'll get 4
bytes of the intended value and 4 bytes of zeros.  The value will
therefore appear to be less than what it should be.  However, if the
value was in the midst of being written, then it's still in the midst
of committing, which means that that XID wasn't going to be visible
anyway.  Accidentally reading a smaller value doesn't change the
answer.

Committing will require a lock on the counter.

Taking a snapshot can be done unlocked if (1) 8-byte reads are atomic
and either (2a) the architecture has strong memory ordering (no
store/store reordering) or (2b) you insert a memory fence between
stamping the XIDs and decrementing the counter.  Otherwise, taking a
snapshot will also require a lock on the counter.

Once a particular XID precedes RecentGlobalXmin, you no longer care
about the associated counter value.  You just need to know that it
committed; the order no longer matters.  So after a crash, assuming
that you have the CLOG bits available, you can just throw away all the
array contents and start the counter over at the highest possible
value.  And, as RecentGlobalXmin advances, you can 

Re: [HACKERS] Ripping out pg_restore's attempts to parse SQL before sending it

2011-07-27 Thread Tom Lane
I wrote:
 While I've not yet done any excavation in the commit logs to confirm
 this, the nearby comments in the code indicate that separation of COPY
 data from SQL commands was adopted in archive format version 1.3, which
 is ancient.  In fact, it's so ancient that there was never a production
 release of pg_dump that created pre-1.3 archives --- only 7.1devel
 versions ever produced those.

I've now poked around in the commit history, and found that pg_dump's
archive format, as well as pg_restore itself, were introduced in commit
500b62b05 of 2000-07-04.  The 1.3 format revision was added in commit
e8f69be05 of 2000-07-21.  So not only did the prior formats not get
written by any production releases, there was only an interval of a
couple of weeks where they'd even have been emitted by development HEAD.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-07-27 Thread Pavel Stehule
2011/7/28 Florian Pflug f...@phlo.org:
 On Jul27, 2011, at 23:20 , Pavel Stehule wrote:
 this is a refreshed patch. Only constraints and RI are supported now.
 There are about 1000 ereport calls where enhanced diagnostics should
 be used, but we probably don't have to modify them all at once.

 I wonder if it wouldn't be better to have something like the machinery
 around ErrorContextCallback to fill in the constraint details. You'd then
 only need to modify the places which initiate constraint checks, instead
 of every single ereport() in the constraint implementations.

 Just a wild idea, though - I haven't checked whether this is actually
 feasible or not.

I thought about this too, but sometimes it is relatively difficult to
specify the fields before the exception -- see the ri_triggers part.
TABLE_NAME and TABLE_SCHEMA should contain not the name of the table
being processed, but the name of the table the error relates to, which
can be different. And if we wanted to use enhanced errors inside
functions, then some injection into ErrorContextCallback would be
necessary - but again, these fields are not related to the function's
scope, so it means more manipulation of the ErrorContext.

Regards

Pavel Stehule


 best regards,
 Florian Pflug


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers