Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Dean Rasheed
On 17 March 2015 at 23:25, Peter Geoghegan p...@heroku.com wrote:
 Possibly I'm missing something though.

 I think that you may have. Did you read the commit message/docs of the
 RLS commit 0004-*? You must consider the second point here, I believe:

 
 The 3 places that RLS policies are enforced are:

 * Against row actually inserted, after insertion proceeds successfully
   (INSERT-applicable policies only).

 * Against row in target table that caused conflict.  The implementation
   is careful not to leak the contents of that row in diagnostic
   messages (INSERT-applicable *and* UPDATE-applicable policies).

 * Against the version of the row added to the relation after
   ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
   policies).

 


Yes, I read that, and I agree with the intention to not leak data
according to both the INSERT and UPDATE policies, however...

 You're seeing a failure that applies to the target tuple of the UPDATE
 (the tuple that we can't leak the contents of). I felt it was best to
 check all policies against the target/existing tuple, including both
 WITH CHECK OPTIONS and USING quals (which are both enforced).


I think that's an incorrect implementation of the RLS UPDATE policy.
The WITH CHECK quals of a RLS policy are intended to be applied to the
NEW data, not the existing data. This patch is applying the WITH CHECK
quals to both the existing and NEW tuples, which runs counter to the
way RLS policies are normally enforced, and I think that will just lead
to confusion.

 I can see why you might not like that behavior, but it is the intended
 behavior. I thought that this whole intersection of RLS + UPSERT is
 complex enough that it would be best to be almost as conservative as
 possible in what fails and what succeeds. The one exception is when
 the insert path is actually taken, since the statement is an INSERT
 statement.

The problem with that is that the user will see errors saying that the
data violates the RLS WITH CHECK policy, when they might quite
reasonably argue that it doesn't. That's not really being
conservative. I'd argue it's a bug.
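
For reference, here is a minimal sketch (hypothetical table and policy names) of
the split I'd expect: USING filters the existing rows, WITH CHECK validates only
the rows being written.

CREATE TABLE accounts (id int PRIMARY KEY, owner text, balance numeric);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- USING is checked against existing rows; WITH CHECK only against NEW rows.
CREATE POLICY accounts_upd ON accounts FOR UPDATE
    USING (owner = current_user)
    WITH CHECK (owner = current_user);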

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Pavel Stehule
2015-03-18 3:45 GMT+01:00 Jim Nasby jim.na...@bluetreble.com:

 On 3/17/15 8:06 PM, Alvaro Herrera wrote:

 My main question regarding this patch is whether the behavior with MD
 arrays is useful at all.  Suppose I give it this:

 alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
 3);
   array_offset
 --
  3
 (1 fila)

 What can I do with the 3 value it returned?  Certainly not use it as
 an offset to get a slice of the original array.  The only thing that
 seems sensible to me here is to reject the whole thing with an error,
 that is, only accept 1-D arrays here.  We can later extend the function
 by allowing higher dimensionality as long as the second argument is an
 array one dimension less than the first argument.  But if we allow the
 case on its appearance, it's going to be difficult to change the
 behavior later.


This behavior is consistent with the unnest function, where all multidimensional
arrays are reduced to 1-D arrays.

Another argument for this behavior is that it is impossible to design a different
one. The array_offset function always has to return an integer. It cannot return
an array of integers, which is what would be needed for a multidimensional
position, and a single integer can only be a position in the flattened 1-D array.
I agree that this is not user friendly, but there is no other possible solution -
we have anyarray but no anymdarray type. I designed this possibility (accepting
multidimensional arrays) mainly as a way to check whether some value exists or
not.

I think this behavior is correct (there is no other possibility), but it is only
a corner case for this functionality - and if you think it is better to disallow
it, I am not against that. My main focus is the 1-D case.
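
To illustrate (a sketch of the behavior described above; array_offset is the
function name used in this patch version):

-- 1-D case: the result is directly usable as a subscript
SELECT array_offset(ARRAY['a','b','c'], 'b');            -- 2

-- multidimensional case: the result is only a position in the
-- flattened element list, not a usable subscript
SELECT array_offset('{{1,2},{3,4}}'::int[], 3);          -- 3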

Regards

Pavel





 +1

  Has a case been made for the current behavior?


 Not that I remember. There was discussion about how this should properly
 support MD arrays.

 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] MD5 authentication needs help -SCRAM

2015-03-18 Thread Abhijit Menon-Sen
At 2015-03-14 09:44:02 +0200, hlinn...@iki.fi wrote:

 Perhaps it would be time to restart the discussion on standardizing
 SRP as a SASL mechanism in IETF.

I haven't seen much evidence that there's any interest in doing this; in
fact, I can't remember the author of the draft you pointed to being very
active in the discussions either.

 Assume that the connection is not encrypted, and Eve captures the
 SCRAM handshake between Alice and Bob. Using the captured handshake,
 she can try to guess the password, offline. With a PAKE protocol, she
 cannot do that.

OK. I agree that this is a nice property. SCRAM made the design decision
to hinder such attacks by using PBKDF2 rather than a zero-knowledge key
exchange mechanism as SRP does. This was partly due to the trend that I
mentioned of wanting to require TLS everywhere.

I'm obviously biased in this matter, but I think it's acceptable for the
potential attack to be frustrated by the use of PBKDF2 and defeated by
the use of TLS (which is already possible with Postgres); and that in
the balance, SCRAM is easier to implement securely than SRP.

Of course, if you want to use x as your password everywhere, then SRP
is preferable. ;-)

-- Abhijit

P.S. I don't know why the SRP code was removed from LibreSSL; nor am I
sure how seriously to take that. It's possible that it's only because
it's (still) rather obscure.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Andres Freund
On 2015-03-17 20:50:48 -0700, Peter Geoghegan wrote:
 On Mon, Mar 16, 2015 at 6:22 AM, Petr Jelinek p...@2ndquadrant.com wrote:
  Do you think it is ready for committer?
 
 
  In my opinion, yes.

 If it wasn't for the autoconf parts of this, I'd probably agree with
 you. I need to go over that more carefully.

I think it's a pretty direct copy of the 64bit code. I'm not entirely
sure why this needs an AC_TRY_RUN with a compile fallback (for cross builds) and
why an AC_TRY_LINK isn't sufficient? But then, you just copied that
decision.

Tom, it seems you have added the 64bit check to configure, all the way
back in 1998 ;). C.f. 07ae591c87. Do you see a reason why we need to
actually run code?

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help -SCRAM

2015-03-18 Thread Abhijit Menon-Sen
As a followup, I spoke to an IETF friend who's used and implemented both
SRP and SCRAM. He agrees that SRP is cryptographically solid, but that it's
significantly more difficult to implement (and therefore has a bit of a
monoculture risk overall, though of course that wouldn't apply to us if
we were to write the code from scratch).

Apparently the patent status is still not entirely clear. Two of the
patents expired, but there are others that may be relevant. Stanford
claims a patent, but apparently grants a free license if you meet
certain conditions. But he doesn't know of anyone having to go to
court over the use of SRP.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2015-03-18 Thread Kyotaro HORIGUCHI
Hello. The attached is a non-search version of unique join.

It has not been fully examined, but it looks to work as expected. Many
small changes make the patch larger, but the affected area is rather
small.

What do you think about this?

 Hello, I don't have enough time for now, but I have made some
 considerations on this.
 
 One way might be to mark unique-join peers at the bottom and propagate
 that upward, rather than searching from the top of the join list.
 Somewhere around create_join_clause might be a candidate for it.
 I'll investigate that later.
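
For context, a sketch (assumed table names) of the kind of query the patch
targets: when the inner side is joined on a unique column, each outer row can
match at most one inner row, so the executor may stop scanning after the first
match, just as it already does for semijoins.

-- customers.id is the primary key, so the inner side is unique here and
-- the join could be flagged as "(inner unique)" in EXPLAIN by this patch.
SELECT o.id, c.name
  FROM orders o
  JOIN customers c ON c.id = o.customer_id;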


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a951c55..b8a68b5 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1151,9 +1151,16 @@ ExplainNode(PlanState *planstate, List *ancestors,
 		appendStringInfo(es->str, " %s Join", jointype);
 	else if (!IsA(plan, NestLoop))
 		appendStringInfoString(es->str, " Join");
+	if (((Join *) plan)->inner_unique)
+		appendStringInfoString(es->str, " (inner unique)");
+
 }
 else
+{
 	ExplainPropertyText("Join Type", jointype, es);
+	ExplainPropertyText("Inner unique",
+			((Join *) plan)->inner_unique ? "true" : "false", es);
+}
 			}
 			break;
 		case T_SetOp:
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 1d78cdf..d3b14e5 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -306,10 +306,11 @@ ExecHashJoin(HashJoinState *node)
 	}
 
 	/*
-	 * In a semijoin, we'll consider returning the first
-	 * match, but after that we're done with this outer tuple.
+	 * We'll consider returning the first match if the inner
+	 * is unique, but after that we're done with this outer
+	 * tuple.
 	 */
-	if (node->js.jointype == JOIN_SEMI)
+	if (node->js.inner_unique)
 		node->hj_JoinState = HJ_NEED_NEW_OUTER;
 
 	if (otherqual == NIL ||
@@ -451,6 +452,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	hjstate = makeNode(HashJoinState);
 	hjstate->js.ps.plan = (Plan *) node;
 	hjstate->js.ps.state = estate;
+	hjstate->js.inner_unique = node->join.inner_unique;
 
 	/*
 	 * Miscellaneous initialization
@@ -498,8 +500,10 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 	/* set up null tuples for outer joins, if needed */
 	switch (node->join.jointype)
 	{
-		case JOIN_INNER:
 		case JOIN_SEMI:
+			hjstate->js.inner_unique = true;
+			/* fall through */
+		case JOIN_INNER:
 			break;
 		case JOIN_LEFT:
 		case JOIN_ANTI:
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 15742c5..3c21ffe 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -840,10 +840,11 @@ ExecMergeJoin(MergeJoinState *node)
 	}
 
 	/*
-	 * In a semijoin, we'll consider returning the first
-	 * match, but after that we're done with this outer tuple.
+	 * We'll consider returning the first match if the inner
+	 * is unique, but after that we're done with this outer
+	 * tuple.
 	 */
-	if (node->js.jointype == JOIN_SEMI)
+	if (node->js.inner_unique)
 		node->mj_JoinState = EXEC_MJ_NEXTOUTER;
 
 	qualResult = (otherqual == NIL ||
@@ -1486,6 +1487,8 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 	mergestate->js.ps.plan = (Plan *) node;
 	mergestate->js.ps.state = estate;
 
+	mergestate->js.inner_unique = node->join.inner_unique;
+
 	/*
 	 * Miscellaneous initialization
 	 *
@@ -1553,8 +1556,10 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 
 	switch (node->join.jointype)
 	{
-		case JOIN_INNER:
 		case JOIN_SEMI:
+			mergestate->js.inner_unique = true;
+			/* fall through */
+		case JOIN_INNER:
 			mergestate->mj_FillOuter = false;
 			mergestate->mj_FillInner = false;
 			break;
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index e66bcda..342c448 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -247,10 +247,10 @@ ExecNestLoop(NestLoopState *node)
 			}
 
 			/*
-			 * In a semijoin, we'll consider returning the first match, but
-			 * after that we're done with this outer tuple.
+			 * We'll consider returning the first match if the inner is
+			 * unique, but after that we're done with this outer tuple.
 			 */
-			if (node->js.jointype == JOIN_SEMI)
+			if (node->js.inner_unique)
 				node->nl_NeedNewOuter = true;
 
 			if (otherqual == NIL || ExecQual(otherqual, econtext, false))
@@ -310,6 +310,8 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
 	nlstate->js.ps.plan = (Plan *) node;
 	nlstate->js.ps.state = estate;
 
+	nlstate->js.inner_unique = node->join.inner_unique;
+
 	/*
 	 * Miscellaneous initialization
 	 *
@@ -354,8 +356,10 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
 
 	switch (node->join.jointype)
 	{
-		case JOIN_INNER:
 		case 

Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 2:34 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 So, overall consensus for the FDW hook location is just before the 
 set_cheapest()
 at standard_join_search() and merge_clump(), isn't it?

Yes, I think so.

 Let me make a design of FDW hook to reduce code duplications for each FDW 
 driver,
 especially, to identify baserel/joinrel to be involved in this join.

Great, thanks!

One issue, which I think Ashutosh alluded to upthread, is that we need
to make sure it's not unreasonably difficult for foreign data wrappers
to construct the FROM clause of an SQL query to be pushed down to the
remote side.  It should be simple when there are only inner joins
involved, but when there are all outer joins it might be a bit
complex.  It would be very good if someone could try to write that
code, based on the new hook locations, and see how it turns out, so
that we can figure out how to address any issues that may crop up
there.
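
To make that concrete, a hedged sketch (assumed table names) of the remote query
text a wrapper like postgres_fdw would have to build when pushing such a join
down:

-- local query over foreign tables ft1 and ft2 (assumed names):
--   SELECT ft1.a, ft2.b FROM ft1 LEFT JOIN ft2 ON ft1.id = ft2.id WHERE ft1.a > 10;
-- one possible remote query the FDW would need to construct:
SELECT r1.a, r2.b
  FROM remote_tab1 r1 LEFT JOIN remote_tab2 r2 ON r1.id = r2.id
 WHERE r1.a > 10;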

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-18 Thread Robert Haas
On Tue, Mar 17, 2015 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 This would have one significant drawback, which is that planning for
 large inheritance trees (many children) would probably get noticeably
 slower.  (But in the common case that constraint exclusion limits a
 query to scanning just one or a few children, the hit would be small.)

 That's a pretty big drawback.  I'm not sure whether it's big enough to
 sink the whole idea, but we really need to make planning time on large
 inheritance trees cheaper, not more expensive.

 Ah, but note the point about how there's no added cost for partitions that
 are removed by constraint exclusion.  That should mean that in practical
 use it's not a huge problem.  (If you're going to scan K partitions, you
 should not be surprised that planning time is O(K).  It will be anyway
 thanks to other things such as index selection.)

 Also, you're ignoring the prospect of getting better estimates and hence
 better plans through having stats that dynamically adapt to the set of
 partitions being scanned.  Given the lousy state of maintenance of
 whole-tree stats, I really think that this consideration might outweigh
 even a significant planning-time hit.  Shaving planning time by producing
 crappy estimates isn't usually a good tradeoff.

Perhaps so, but I know that the planning time of large inheritance
trees has been a major issue for some of EnterpriseDB's customers.  In
fact, EnterpriseDB has run into a number of customer situations where
planning time even for non-inheritance queries is substantially higher
than, shall we say, a competing commercial product.  With inheritance,
even people who aren't making comparisons with other products start to
get unhappy.  I've always been very pleased with the quality of plans
that our planner generates, but it's becoming increasingly clear to me
that at least one other product is able to provide good plans at a
significantly lower CPU cost, and inheritance is a particular trouble
spot.  I don't know exactly what we ought to do about that and perhaps
it's to one side of the issue you're raising here, but I do think it's
an issue that we (the PostgreSQL community) ought to be thinking
about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Left lateral join with for update and skip locked

2015-03-18 Thread Tom Lane
Benjamin Börngen-Schmidt benja...@boerngen-schmidt.de
writes:
 The LATERAL statement returns a result which I do not expect. It
 returns an end point multiple times for different start points. Why? I
 thought that the point selected by the lateral is locked by the FOR
 UPDATE, and if the lateral selects a point that is already locked it will
 be skipped by SKIP LOCKED.

It sounds like you think SKIP LOCKED means to skip rows locked by your own
transaction.  That's not what it does, AFAIK.  It skips rows you couldn't
get a lock on without waiting ... but if you already have a lock, that's
fine.
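
A quick sketch of that distinction (assumed table name):

-- session A:
BEGIN;
SELECT * FROM jobs WHERE id = 1 FOR UPDATE;             -- A holds the row lock

-- session B:
SELECT * FROM jobs FOR UPDATE SKIP LOCKED;              -- skips id = 1

-- back in session A, same transaction: its own lock counts as already
-- granted, so the row is NOT skipped.
SELECT * FROM jobs WHERE id = 1 FOR UPDATE SKIP LOCKED; -- still returns the row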

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Pavel Stehule
2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to:

 On 3/18/15 12:27 PM, Pavel Stehule wrote:

 On 3/17/15 8:06 PM, Alvaro Herrera wrote:

  My main question regarding this patch is whether the behavior with MD
 arrays is useful at all.  Suppose I give it this:

 alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
 3);
array_offset
 --
   3
 (1 fila)

 What can I do with the 3 value it returned?  Certainly not use it as
 an offset to get a slice of the original array.  The only thing that
 seems sensible to me here is to reject the whole thing with an error,
 that is, only accept 1-D arrays here.  We can later extend the function
 by allowing higher dimensionality as long as the second argument is an
 array one dimension less than the first argument.  But if we allow the
 case on its appearance, it's going to be difficult to change the
 behavior later.

  I designed this possibility (use ND arrays)
 mainly for info, if some value exists or not.


 Why not use =ANY() for that?


It is only a partial solution - array_offset uses the IS NOT DISTINCT FROM
operator.
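
For example (a sketch of the difference this makes with NULLs; array_offset is
the function proposed here):

SELECT NULL = ANY (ARRAY[1, NULL, 3]);          -- NULL: plain equality never matches
SELECT array_offset(ARRAY[1, NULL, 3], NULL);   -- 2: IS NOT DISTINCT FROM finds NULL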


  I think this behavior is correct (there is no other possibility), but
 it is only a corner case for this functionality - and if you think it is
 better to disallow it, I am not against that. My main focus is the 1-D
 case.


 A nonsensical answer for multi-dimensional arrays is worse than no answer
 at all.  I think raising an exception is better.


I have no problem with it.

Regards

Pavel




 .m



[HACKERS] Left lateral join with for update and skip locked

2015-03-18 Thread Benjamin Börngen-Schmidt
A few days ago I posted a question on general concerning the new feature 
SKIP LOCKED in PostgreSQL 9.5-dev.


The original question can be found here: 
http://www.postgresql.org/message-id/54f723c3.1020...@boerngen-schmidt.de



What I'm trying to achieve is to match a point from my data which meets 
certain conditions to a given point. The given points are selected by 
'Select id as start ', then the left lateral join should go 
through each given point and find a corresponding point which is not in 
the same region and is between 5 and 10 km away. If I understood the 
lateral join right, it is executed like a loop over every row previously 
selected in the non-lateral part of the statement.


The LATERAL statement returns a result which I do not expect. It 
returns an end point multiple times for different start points. Why? I 
thought that the point selected by the lateral is locked by the FOR 
UPDATE, and if the lateral selects a point that is already locked it will 
be skipped by SKIP LOCKED.


SELECT
   start,
   destination,
   ST_Distance(start_geom, end_geom) AS distance_meter
   FROM (
 SELECT id as start, geom as start_geom
 FROM de_sim_points_start
 WHERE NOT used AND rs = '057700032032'
 ORDER BY RANDOM()
 LIMIT 200
 FOR UPDATE SKIP LOCKED
   ) AS s
LEFT JOIN LATERAL (
 SELECT id as destination, geom as end_geom
 FROM de_sim_points_end
 WHERE NOT used AND rs IN (
   SELECT sk.rs
   FROM de_commuter_kreise ck
 INNER JOIN de_shp_kreise sk
   ON sk.rs = ck.rs AND ST_DWithin((SELECT ST_Union(geom) FROM
de_shp WHERE rs ='057700032032'), sk.geom, 5000)
   UNION
   SELECT cg.rs
   FROM de_commuter_gemeinden cg
 INNER JOIN de_shp_gemeinden sg
   ON sg.rs = cg.rs AND ST_DWithin((SELECT ST_Union(geom) FROM
de_shp WHERE rs = '057700032032'), sg.geom, 5000)
 )
AND NOT ST_DWithin(geom, start_geom, 2000) AND ST_DWithin(geom,
start_geom, 5000)
 FOR UPDATE SKIP LOCKED
 LIMIT 1
) AS e ON TRUE

What I think the query is doing:
1. It SELECTs the start points
2. Then, for each row, it selects a corresponding endpoint in the lateral join
2.1 skipping already locked endpoints and finding the next unlocked one
2.2 while it selects this endpoint, it is locked due to the FOR UPDATE
3. The result is presented

BUT what happens is that I get the same endpoint a couple of times. It 
seems to me that the lateral join does not evaluate the SKIP LOCKED 
correctly, since this endpoint occurs multiple times. Is this a bug or a 
feature?


p.s.
I also would be glad if you guys could tell me a better way to match 
points, as it would really help me with my master's thesis.


- Benjamin




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-18 Thread Kouhei Kaigai
 On Wed, Mar 18, 2015 at 2:34 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  So, overall consensus for the FDW hook location is just before the 
  set_cheapest()
  at standard_join_search() and merge_clump(), isn't it?
 
 Yes, I think so.
 
  Let me make a design of FDW hook to reduce code duplications for each FDW 
  driver,
  especially, to identify baserel/joinrel to be involved in this join.
 
 Great, thanks!
 
 One issue, which I think Ashutosh alluded to upthread, is that we need
 to make sure it's not unreasonably difficult for foreign data wrappers
 to construct the FROM clause of an SQL query to be pushed down to the
 remote side.  It should be simple when there are only inner joins
 involved, but when there are all outer joins it might be a bit
 complex.  It would be very good if someone could try to write that
 code, based on the new hook locations, and see how it turns out, so
 that we can figure out how to address any issues that may crop up
 there.

Here is an idea: provide a common utility function that breaks down
the supplied RelOptInfo of a joinrel into a pair of join-type and a list of
the baserels/joinrels involved in the join. It is intended to be
called by the FDW driver to list the underlying relations.
IIUC, root->join_info_list will provide information on how relations are
combined into the upper joined relations, thus I expect it is not an
unreasonably complicated way to solve this.
Once a RelOptInfo of the target joinrel is broken down into multiple sub-
relations (N >= 2 if all inner joins, otherwise N = 2), the FDW driver can
reference the RestrictInfos to be used in the join.

Anyway, I'll try to investigate the existing code in more detail today,
to clarify whether the above approach is feasible.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Marko Tiikkaja

On 3/18/15 12:27 PM, Pavel Stehule wrote:

On 3/17/15 8:06 PM, Alvaro Herrera wrote:


My main question regarding this patch is whether the behavior with MD
arrays is useful at all.  Suppose I give it this:

alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
3);
   array_offset
--
  3
(1 fila)

What can I do with the 3 value it returned?  Certainly not use it as
an offset to get a slice of the original array.  The only thing that
seems sensible to me here is to reject the whole thing with an error,
that is, only accept 1-D arrays here.  We can later extend the function
by allowing higher dimensionality as long as the second argument is an
array one dimension less than the first argument.  But if we allow the
case on its appearance, it's going to be difficult to change the
behavior later.


I designed this possibility (use ND arrays)
mainly for info, if some value exists or not.


Why not use =ANY() for that?


I am thinking, so this behave is correct (there is no other possible), but
it is only corner case for this functionality - and if you are thinking, so
better to disallow it, I am not against. My main focus is 1ND array.


A nonsensical answer for multi-dimensional arrays is worse than no 
answer at all.  I think raising an exception is better.



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-03-18 Thread Tom Lane
Jeevan Chalke jeevan.cha...@gmail.com writes:
 I am more concerned about this:

 1.
 postgres=# create or replace function
 f1(a abc.test.id%type) returns int as
 $$ select 1; $$
 language sql;
 ERROR:  schema abc does not exist

 Is that expected?

Yes, or at least, if it's not what we want it's not this patch's fault.
That behavior is pre-existing.

 Also what about pushing setup_parser_errposition_callback() inside 
 func_get_detail() as well, just to limit it for namespace lookup?

Adding a pstate parameter to func_get_detail would be rather invasive;
not sure it's worth it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2015 - mentors, students and admins.

2015-03-18 Thread Thom Brown
On 9 February 2015 at 20:52, Thom Brown t...@linux.com wrote:
 Hi all,

 Google Summer of Code 2015 is approaching.  I'm intending on registering
 PostgreSQL again this year.

 Before I do that, I'd like to have an idea of how many people are interested
 in being either a student or a mentor.

 I've volunteered to be admin again, but if anyone else has a strong interest
 of seeing the projects through this year, please let yourself be known.

 Who would be willing to mentor projects this year?

 Project ideas are welcome, and I've copied many from last year's submissions
 into this year's wiki page.  Please feel free to add more (or delete any
 that stand no chance or are redundant):
 https://wiki.postgresql.org/wiki/GSoC_2015

 Students can find more information about GSoC here:
 http://www.postgresql.org/developer/summerofcode

Any students planning on participating this year will need to register
by 19:00 UTC Friday next week (27th March).  Please ensure that you
also submit your proposal.  The proposals will be reviewed based on
how well planned they are (description, use case, implementation plan,
milestones etc.), the feasibility of completing it as part of a GSoC
project, and whether the proposal is desirable.  Please do not make
submissions without a clear proposal of what you plan to do as these
will not be considered.

If you need assistance with anything in the meantime, please get in contact.

Thanks

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 17, 2015 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Also, you're ignoring the prospect of getting better estimates and hence
 better plans through having stats that dynamically adapt to the set of
 partitions being scanned.  Given the lousy state of maintenance of
 whole-tree stats, I really think that this consideration might outweigh
 even a significant planning-time hit.  Shaving planning time by producing
 crappy estimates isn't usually a good tradeoff.

 Perhaps so, but I know that the planning time of large inheritance
 trees has been a major issue for some of EnterpriseDB's customers.  In
 fact, EnterpriseDB has run into a number of customer situations where
 planning time even for non-inheritance queries is substantially higher
 than, shall we say, a competing commercial product.  With inheritance,
 even people who aren't making comparisons with other products start to
 get unhappy.  I've always been very pleased with the quality of plans
 that our planner generates, but it's becoming increasingly clear to me
 that at least one other product is able to provide good plans at a
 significantly lower CPU cost, and inheritance is a particular trouble
 spot.  I don't know exactly what we ought to do about that and perhaps
 it's to one side of the issue you're raising here, but I do think it's
 an issue that we (the PostgreSQL community) ought to be thinking
 about.

Well, we know that the current approach to inheritance isn't very well
attuned to standard partitioning situations, because it treats every
inheritance child as a de novo problem.  I continue to maintain that
the right fix for that is a partitioning feature that forbids any schema
variation across partitions, which the planner would use to avoid doing
O(N) work when dealing with an N-partition table.  Worrying about
changes that would already be involving less than O(N) work is rather
pointless in this context, IMO.
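
For readers less familiar with the setup being discussed, a minimal sketch
(assumed names) of inheritance partitioning and the constraint exclusion that
keeps per-child planning work bounded:

CREATE TABLE measurements (ts date, val int);
CREATE TABLE measurements_2015_01
    (CHECK (ts >= '2015-01-01' AND ts < '2015-02-01')) INHERITS (measurements);
CREATE TABLE measurements_2015_02
    (CHECK (ts >= '2015-02-01' AND ts < '2015-03-01')) INHERITS (measurements);

-- With constraint_exclusion = partition (the default), only the matching
-- child is planned, so planning cost scales with the partitions scanned.
SELECT * FROM measurements WHERE ts = '2015-01-15';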

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 9:33 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 On Wed, Mar 18, 2015 at 2:34 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  So, overall consensus for the FDW hook location is just before the 
   set_cheapest()
  at standard_join_search() and merge_clump(), isn't it?

 Yes, I think so.

  Let me make a design of FDW hook to reduce code duplications for each FDW 
  driver,
  especially, to identify baserel/joinrel to be involved in this join.

 Great, thanks!

 One issue, which I think Ashutosh alluded to upthread, is that we need
 to make sure it's not unreasonably difficult for foreign data wrappers
 to construct the FROM clause of an SQL query to be pushed down to the
 remote side.  It should be simple when there are only inner joins
 involved, but when there are all outer joins it might be a bit
 complex.  It would be very good if someone could try to write that
 code, based on the new hook locations, and see how it turns out, so
 that we can figure out how to address any issues that may crop up
 there.

 Here is an idea that provides a common utility function that break down
 the supplied RelOptInfo of joinrel into a pair of join-type and a list of
 baserel/joinrel being involved in the relations join. It intends to be
 called by FDW driver to list up underlying relations.
  IIUC, root->join_info_list will provide information of how relations are
 combined to the upper joined relations, thus, I expect it is not
 unreasonably complicated way to solve.
 Once a RelOptInfo of the target joinrel is broken down into multiple sub-
  relations (N >= 2 if all inner joins, otherwise N = 2), FDW driver can
 reference the RestrictInfo to be used in relations join.

 Anyway, I'll try to investigate the existing code for more detail today,
 to clarify whether the above approach is feasible.

Sounds good.  Keep in mind that, while the parse tree will obviously
reflect the way that the user actually specified the join
syntactically, it's not the job of the join_info_list to make it
simple to reconstruct that information.  To the contrary,
join_info_list is supposed to be structured in a way that makes it
easy to determine whether *a particular join order is one of the legal
join orders* not *whether it is the specific join order selected by
the user*.  See join_is_legal().

For FDW pushdown, I think it's sufficient to be able to identify *any
one* legal join order, not necessarily the same order the user
originally entered.  For example, if the user entered A LEFT JOIN B ON
A.x = B.x LEFT JOIN C ON A.y = C.y and the FDW generates a query that
instead does A LEFT JOIN C ON a.y = C.y LEFT JOIN B ON A.x = B.x, I
suspect that's just fine.  Particular FDWs might wish to try to be
smart about what they emit based on knowledge of what the remote
side's optimizer is likely to do, and that's fine.  If the remote side
is PostgreSQL, it shouldn't matter much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP Patch for GROUPING SETS phase 1

2015-03-18 Thread Svenne Krap
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

This is a midway review; a later one will complete it. 

The patch applies against 8d1f239003d0245dda636dfa6cf0add13bee69d6 and builds 
correctly. Make installcheck-world fails, but it seems to be somewhere totally 
unrelated (TCL pl)... 

The documentation is very well-written and the patch implements the documented 
syntax.

I still need to check against the standard and I will run it against a 
non-trivial production load... hopefully I will finish up my review shortly 
after the weekend...



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Robert Haas
On Tue, Mar 17, 2015 at 6:27 PM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Mar 17, 2015 at 12:11 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 I'm still not sure the way the speculative locking works is the best
 approach. Instead of clearing xmin on super-deletion, a new flag on the heap
 tuple seems more straightforward. And we could put the speculative insertion
 token in t_ctid, instead of stashing it in the PGPROC array. That would
 again seem more straightforward.

 I see the appeal of that approach. What concerns me about that
 approach is that it makes it the problem of the inserter to confirm
 its insertion, even though overwhelmingly, speculative insertions
 succeed (the race window is small due to the pre-check). The current
 speculative token is legitimately a very short lived thing, a thing
 that I hesitate to write to disk at all. So our optimistic approach to
 inserting speculatively loses its optimism, which I don't like, if you
 know what I mean.

Modifying a shared buffer is not the same as writing to disk.

 If I'm reading this correctly, if there are multiple indexes that match the
 unique index inference specification, we pick the cheapest one. Isn't that
 unstable? Two backends running the same INSERT ON CONFLICT statement might
 pick different indexes, and the decision might change over time as the table
 is analyzed. I think we should have a more robust rule. Could we easily just
 use all matching indexes?

 Robert feels pretty strongly that there should be a costing aspect to
 this. Initially I was skeptical of this, but just did what he said all
 the same. But I've since come around to his view entirely because we
 now support partial unique indexes.

I think Heikki's concern is something different, although I am not
altogether up to speed on this and may be confused.  The issue is:
suppose that process A and process B are both furiously upserting into
the same table with the same key column but, because they have
different costing parameters, process A consistently chooses index X
and process B consistently chooses index Y.  In that situation, will
we deadlock, livelock, error out, bloat, or get any other undesirable
behavior, or is that A-OK?
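
For readers following along, a sketch of the scenario (using the ON CONFLICT
spelling that eventually shipped in 9.5 and assumed names; the patch under
discussion still spelled it IGNORE/UPDATE):

CREATE TABLE t (k int, v int);
CREATE UNIQUE INDEX t_k_full ON t (k);
CREATE UNIQUE INDEX t_k_partial ON t (k) WHERE v > 0;

-- Both unique indexes can satisfy an inference clause on (k); the question
-- is what happens if two backends consistently infer different ones.
INSERT INTO t (k, v) VALUES (1, 1)
ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v;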

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans

2015-03-18 Thread Grzegorz Parka

 I think someone already mentioned it, but it would be very neat if the
 optimizer could be pluggable. Then many different algorithms could be
 evaluated more easily.


Does it mean just to make the join order optimizer pluggable? If yes then
it is already pluggable as an extension. Is this the desired approach so
far?
I'm thinking of testing and improving SAIO as an extension before reaching
a satisfactory quality of code and returned plans.

Would the destination then be /contrib and later the main source tree, or
would it just stay as an extension?

Best regards,
Grzegorz

2015-02-27 15:03 GMT+01:00 k...@rice.edu k...@rice.edu:

 On Thu, Feb 26, 2015 at 10:59:50PM +0100, Grzegorz Parka wrote:
  Dear Hackers,
 
  I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of
  Computer Science at WUT, Poland. Last year I've been a bit into
  evolutionary algorithms and during my research I found out about GEQO in
  Postgres. I also found out that there are plans to try a different
 attempt
  to find optimal query plans and thought it could be a good thing to use
 it
  as a project for GSoC.
 
  I'm interested in one of old TODO items related to the optimizer -
  'Consider compressed annealing to search for query plans'. I believe this
  would be potentially beneficial to Postgres to check if such a heuristic
  could really choose better query plans than GEQO does. Judging by the
  results that simulated annealing gives on the travelling salesman
 problem,
  it looks like a simpler and potentially more effective way of
 combinatorial
  optimization.
 
  As deliverables of such a project I would provide a simple implementation
  of basic simulated annealing optimizer and some form of quantitative
  comparison with GEQO.
 
  I see that this may be considerably bigger than most of GSoC projects,
 but
  I would like to know your opinion. Do you think that this would be
  beneficial enough to make a proper GSoC project? I would also like to
 know
  if you have any additional ideas about this project.
 
  Best regards,
  Grzegorz Parka

 Hi,

 I think someone already mentioned it, but it would be very neat if the
 optimizer could be pluggable. Then many different algorithms could be
 evaluated more easily.

 Regards,
 Ken



Re: [HACKERS] MD5 authentication needs help -SCRAM

2015-03-18 Thread Alvaro Herrera
Abhijit Menon-Sen wrote:

 P.S. I don't know why the SRP code was removed from LibreSSL; nor am I
 sure how seriously to take that. It's possible that it's only because
 it's (still) rather obscure.

As I recall, the working principle of the LibreSSL guys is to remove
everything that can't be understood quickly, to reduce the code base to
the minimum required to support the basic features they want, and still
be sure that there are little or no security holes.  In a later stage
their intention is to re-add interesting features as they have time to
audit the code.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg9.4 relpages of child tables

2015-03-18 Thread Justin Pryzby
On Wed, Mar 18, 2015 at 12:11:22PM -0400, Tom Lane wrote:
 Justin Pryzby pry...@telsasoft.com writes:
  I believe there's been a behavior change, and not sure if it's deliberate.
  I don't think there's a negative consequence for our production use, but it
  confused me while summing relpages for analysis purposes, as our 9.4
  customers behaved differently.
 
 I suspect that you're getting confused because autovacuum kicked in on the

It seems you're right.  I was additionally confused because the autovacuum had
processed the most recent tables (triggered by insertion; partitioned by date),
but had not processed some of last month's tables (which I was querying for
relpages, since it's a complete month), since this DB was upgraded to pg9.4
last month.

Thanks,
Justin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg9.4 relpages of child tables

2015-03-18 Thread Justin Pryzby
I believe there's been a behavior change, and not sure if it's deliberate.  I
don't think there's a negative consequence for our production use, but it
confused me while summing relpages for analysis purposes, as our 9.4 customers
behaved differently.

Documentation indicates that in pg9.0, ANALYZE of a parent table included
statistics of its children.

Under both pg9.3 and 9.4, this returns no stats rows, after the parent table is
analyzed.  It returns stats if the child is analyzed.
SELECT * FROM pg_statistic WHERE starelid='.._2014_01'::regclass

However, in pg9.4, the child's pg_class.relpages is 0 (apparently, 1 for
indices) even after the parent is analyzed (and is an approximate number of
pages if the child is analyzed).

On pg93:
pryzbyj=# create table t (i int);
pryzbyj=# create table t2 (like t) inherits(t);
pryzbyj=# insert into t2(SELECT generate_series(1,10) ORDER BY RANDOM());
pryzbyj=# select relpages from pg_class where relname='t2';
 = 0
pryzbyj=# analyze t;
pryzbyj=# select relpages from pg_class where relname='t2';
 = 885

On pg94:
ts=# create table t (i int);
ts=# create table t2 (like t) inherits(t);
ts=# insert into t2(SELECT generate_series(1,10) ORDER BY RANDOM());
ts=# select relpages from pg_class where relname='t2';
 = 0
ts=# analyze t;
ts=# select relpages from pg_class where relname='t2'; -- this changed
 = 0
ts=# analyze t2;
ts=# select relpages from pg_class where relname='t2';
 = 443

Is that a deliberate change, and if so, is there any documentation of it?  I'd
prefer to avoid analyzing all our child tables, as all queries hit the parents,
which include statistics on the children.

Thanks,
Justin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 2:22 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Can you try this:

 diff --git a/src/backend/postmaster/bgworker.c
 b/src/backend/postmaster/bgworker.c
 index f80141a..39b919f 100644
 --- a/src/backend/postmaster/bgworker.c
 +++ b/src/backend/postmaster/bgworker.c
 @@ -244,6 +244,8 @@ BackgroundWorkerStateChange(void)
 rw->rw_terminate = true;
 if (rw->rw_pid != 0)
 kill(rw->rw_pid, SIGTERM);
 +   else
 +   ReportBackgroundWorkerPID(rw);
 }
 continue;
 }


 It didn't fix the problem.  IIUC, you have done this to ensure that
 if worker is not already started, then update it's pid, so that we
 can get the required status in WaitForBackgroundWorkerShutdown().
 As this is a timing issue, it can so happen that before Postmaster
 gets a chance to report the pid, backend has already started waiting
 on WaitLatch().

I think I figured out the problem.  That fix only helps in the case
where the postmaster noticed the new registration previously but
didn't start the worker, and then later notices the termination.
What's much more likely to happen is that the worker is started and
terminated so quickly that both happen before we create a
RegisteredBgWorker for it.  The attached patch fixes that case, too.

Assuming this actually fixes the problem, I think we should back-patch
it into 9.4.  To recap, the problem is that, at present, if you
register a worker and then terminate it before it's launched,
GetBackgroundWorkerPid() will still return BGWH_NOT_YET_STARTED, which
makes it seem like we're still waiting for it to start.  But when
or if the slot is reused for an unrelated registration, then
GetBackgroundWorkerPid() will switch to returning BGWH_STOPPED.  It's
hard to believe that's the behavior anyone wants.  With this patch,
the return value will always be BGWH_STOPPED in this situation.  That
has the virtue of being consistent, and practically speaking I think
it's the behavior that everyone will want, because the case where this
matters is when you are waiting for workers to start or waiting for
worker to stop, and in either case you will want to treat a worker
that was marked for termination before the postmaster actually started
it as already-stopped rather than not-yet-started.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


stop-notify-fix-v2.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg9.4 relpages of child tables

2015-03-18 Thread Tom Lane
Justin Pryzby pry...@telsasoft.com writes:
 I believe there's been a behavior change, and not sure if it's deliberate.  I
 don't think there's a negative consequence for our production use, but it
 confused me while summing relpages for analysis purposes, as our 9.4 customers
 behaved differently.

I don't see any difference in the behavior of HEAD and 9.0 on this point.

 Documentation indicates that in pg9.0, ANALYZE of a parent table included
 statistics of its children.

Well, ANALYZE on a parent table will collect statistics for that table
as well as some whole tree statistics that cover parent plus children;
but the latter are just data distribution stats entered into pg_statistic.
I don't see any indication that any PG version will update the childrens'
relpages values while doing that.

I suspect that you're getting confused because autovacuum kicked in on the
child and updated those stats behind your back.  For me, the child's
relpages reads as zero even after the ANALYZE, but if I wait a minute or
so, it changes to a nonzero value because the autovacuum daemon updated
it.
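
An easy way to confirm that, for anyone reproducing this, is to check the
autovacuum timestamps:

SELECT relname, last_autovacuum, last_autoanalyze
  FROM pg_stat_user_tables
 WHERE relname = 't2';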

See also the future directions thread nearby.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Robert Haas
On Tue, Mar 17, 2015 at 1:24 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 If we ever implement something like

 COMMENT ON CURRENT_DATABASE IS ...

 it will be useful, because you will be able to restore a dump into
 another database and have the comment apply to the target database.
 (Also, I wonder about
 ALTER USER foo IN DATABASE current_database ...
 because that will let us dump per-database user options too.)

+1 for both of those ideas.
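
Neither form exists yet, but a sketch of what a dump might then contain:

-- proposed, not implemented: applies to whichever database the dump is
-- restored into, instead of hard-coding the original database name
COMMENT ON CURRENT_DATABASE IS 'per-database comment';

-- proposed companion, so per-database role settings can be dumped too
ALTER USER foo IN DATABASE CURRENT_DATABASE SET work_mem = '64MB';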

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Left lateral join with for update and skip locked

2015-03-18 Thread Alvaro Herrera
Benjamin Börngen-Schmidt wrote:

 The LATERAL statement returns a result which I do not expect. It returns
 an end point multiple times for different start points. Why? I thought that
 the point selected by the lateral is locked by the FOR UPDATE and if the
 lateral selects a point that is already locked it will be skipped by
 SKIP LOCKED.

So you want the LATERAL to lock a row, such that that row is not
returned by the s arm of the left join in the same query because of
SKIP LOCKED?  That seems flawed to me: the row lock is considered
automatically granted if the would-be locker is the same transaction as
the lock holder.

I am too lazy to reverse engineer your schema.  Are de_sim_points_end
and de_sim_points_start views on the same table(s), or something like
that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-03-18 Thread Alvaro Herrera
Jeevan Chalke wrote:
 Álvaro,
 
 I think there are a few open questions here, and thus I am marking it back to 
 Waiting on Author.
 
 Please have your views on the review comments already posted.

For some reason I missed your previous email.

 Also make changes as Tom suggested about placing pstate at the beginning.

Sure.

 1.
 postgres=# create or replace function
 f1(a abc.test.id%type) returns int as
 $$ select 1; $$
 language sql;
 ERROR:  schema abc does not exist
 
 Is that expected?

Type resolution for function arguments happens at execution time,
in interpret_function_parameter_list() as called by CreateFunction().
We don't have a pstate at that point which is why location is not
reported.  Fixing that seems like a whole new project.

 2.
 Also what about pushing setup_parser_errposition_callback() inside 
 func_get_detail() as well, just to limit it for namespace lookup?

Hm, that sounds more reasonable.  Actually it means we have to drill
all the way down to FuncnameGetCandidates.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Robert Haas
On Tue, Mar 17, 2015 at 3:11 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 I've been thinking that it would be nice to be able to specify a constraint
 name. Naming an index directly feels wrong, as in relational and SQL
 philosophy, indexes are just an implementation detail, but naming a
 constraint is fair game. It would also be nice to be able to specify 'use
 the primary key'.

Intuitively, I think you should specify an operator name, not a
constraint name.  That's what we do for, e.g., exclusion constraints,
and it feels right.  People sometimes create and drop indexes (and
thus, perhaps, the constraints that depend on them) for maintenance
reasons where a change in semantics will be unwelcome.  But I don't
accept Peter's argument that it's OK to be indifferent to which
particular equality semantics are being used.
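
For comparison, this is how exclusion constraints already name operators
explicitly, which is the precedent being referred to (standard syntax, not part
of this patch):

CREATE EXTENSION IF NOT EXISTS btree_gist;   -- needed for = on int in GiST
CREATE TABLE reservations (
    room   int,
    during tsrange,
    EXCLUDE USING gist (room WITH =, during WITH &&)
);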

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Seriously? In my opinion it has to be possible to doubt whether a patch
 should be committed in a certain release without it being interpreted as a
 personal attack.

I don't think anyone's said anything in this thread that amounts to a
personal attack.  We have a difference of opinion on policy, and what
I'm saying is that the policy ultimately reduces to trusting individual
committers to use their best judgment.  If someone's going to tell me
that my judgment about when to push something is not acceptable, then
they probably ought to be calling for removal of my commit privileges.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans

2015-03-18 Thread Tom Lane
Grzegorz Parka grzegorz.pa...@gmail.com writes:
 I'm thinking of testing and improving SAIO as an extension before reaching
 a satisfactory quality of code and returned plans.

 Would then the destination be the /contrib and then main source tree or
 would it ever stay as an extension?

I'd like to push it into the main tree, replacing GEQO, if, as we suspect,
it turns out to dominate GEQO on speed/plan quality.  There's no reason
not to test it in the manner you describe though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MD5 authentication needs help -SCRAM

2015-03-18 Thread Stephen Frost
* Abhijit Menon-Sen (a...@2ndquadrant.com) wrote:
 As a followup, I spoke to an IETF friend who's used and implemented both
 SRP and SCRAM. He agrees that SRP is cryptographically solid, that it's
 significantly more difficult to implement (and therefore has a bit of a
 monoculture risk overall, though of course that wouldn't apply to us if
 we were to write the code from scratch).

There is also 'JPAKE':

http://en.wikipedia.org/wiki/Password_Authenticated_Key_Exchange_by_Juggling

Which had been in OpenSSH and OpenSSL and is still in NSS and Firefox
Sync.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 1:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 17, 2015 at 8:01 PM, Bruce Momjian br...@momjian.us wrote:
 Basically, the same rules apply to all commitfests, i.e. a committer can
 apply anything during that period.  I think the only restriction for the
 last commitfest is that the committer can not apply a new patch that
 would have been too big to be submitted to the last commitfest. If
 enough people feel that this committer behavior during the last
 commitfest is a problem, we can discuss changing that policy.

 One thing that's crystal clear here is that we don't all agree on what
 the policy actually is.

 Indeed.  In this case, since the patch in question is one that
 improves/simplifies a patch that's already in the current commitfest,
 I'm going to go ahead and push it.  If you want to call a vote on
 revoking my commit bit, go right ahead.

One might almost get the impression you don't think we're all on the
same team, here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Fabrízio de Royes Mello
On Wed, Mar 18, 2015 at 1:24 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Mar 17, 2015 at 1:24 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  If we ever implement something like
 
  COMMENT ON CURRENT_DATABASE IS ...
 
  it will be useful, because you will be able to restore a dump into
  another database and have the comment apply to the target database.

I think it's simple to implement, but how about pg_dump... do we need to add a
new option (like --use-current-database), or am I missing something?


  (Also, I wonder about
  ALTER USER foo IN DATABASE current_database ...
  because that will let us dump per-database user options too.)

 +1 for both of those ideas.


Can you explain more about this idea?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c

2015-03-18 Thread Alvaro Herrera
Michael Paquier wrote:

 I had a look at your modified version, and it looks good to me.

Thanks, pushed.  (Without the va_cols change proposed downthread.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Fabrízio de Royes Mello
On Wed, Mar 18, 2015 at 2:42 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:



 On Wed, Mar 18, 2015 at 2:24 PM, Robert Haas robertmh...@gmail.com
wrote:
 
  On Wed, Mar 18, 2015 at 1:23 PM, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
If we ever implement something like
   
COMMENT ON CURRENT_DATABASE IS ...
   
it will be useful, because you will be able to restore a dump into
another database and have the comment apply to the target database.
  
   I think it's simple to implement, but how about pg_dump... we need to
add
   new option (like --use-current-database) or am I missing something ?
 
  I think we'd just change it to use the new syntax, full stop.  I see
  no need for an option.
 

 Ok.


(Also, I wonder about
ALTER USER foo IN DATABASE current_database ...
because that will let us dump per-database user options too.)
  
   +1 for both of those ideas.
  
   Can you explain more about this idea?
 
  Uh, what do you want explained?
 

 There's no need, sorry. I just misunderstood it first.

 I'll provide the patches.


Just one last doubt. Do we expose a new function called current_database
like current_catalog, current_user, ... ?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 2:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 Seriously? In my opinion it has to be possible to doubt whether a patch
 should be committed in certain release without it being interpreted as a
 personal attack.

 I don't think anyone's said anything in this thread that amounts to a
 personal attack.  We have a difference of opinion on policy, and what
 I'm saying is that the policy ultimately reduces to trusting individual
 committers to use their best judgment.  If someone's going to tell me
 that my judgment about when to push something is not acceptable, then
 they probably ought to be calling for removal of my commit privileges.

Neither I nor anyone else is prepared to do that on the basis of what
happens to this one patch.  But if we adopt a project policy that says
a committer can ignore the contrary opinions of other people, even
when those other people include multiple other committers, this
project will not, in the long term, be successful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Seq Scan

2015-03-18 Thread Robert Haas
On Sat, Mar 14, 2015 at 1:04 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 # EXPLAIN SELECT DISTINCT bid FROM pgbench_accounts;
 ERROR:  too many dynamic shared memory segments

 This happens because we have maximum limit on the number of
 dynamic shared memory segments in the system.

 In function dsm_postmaster_startup(), it is defined as follows:

 maxitems = PG_DYNSHMEM_FIXED_SLOTS
 + PG_DYNSHMEM_SLOTS_PER_BACKEND * MaxBackends;

 In the above case, it is choosing parallel plan for each of the
 AppendRelation,
 (because of seq_page_cost = 1000) and that causes the test to
 cross max limit of dsm segments.

The problem here is, of course, that each parallel sequential scan is
trying to create an entirely separate group of workers.  Eventually, I
think we should fix this by rejiggering things so that when there are
multiple parallel nodes in a plan, they all share a pool of workers.
So each worker would actually get a list of plan nodes instead of a
single plan node.  Maybe it works on the first node in the list until
that's done, and then moves onto the next, or maybe it round-robins
among all the nodes and works on the ones where the output tuple
queues aren't currently full, or maybe the master somehow notifies the
workers which nodes are most useful to work on at the present time.
But I think trying to figure this out is far too ambitious for 9.5,
and I think we can have a useful feature without implementing any of
it.

But, we can't just ignore the issue right now, because erroring out on
a large inheritance hierarchy is no good.  Instead, we should fall
back to non-parallel operation in this case.  By the time we discover
the problem, it's too late to change the plan, because it's already
execution time.  So we are going to be stuck executing the parallel
node - just with no workers to help.  However, what I think we can do
is use a slab of backend-private memory instead of a dynamic shared
memory segment, and in that way avoid this error.  We do something
similar when starting the postmaster in stand-alone mode: the main
shared memory segment is replaced by a backend-private allocation with
the same contents that the shared memory segment would normally have.
The same fix will work here.

Even once we make the planner and executor smarter, so that they don't
create lots of shared memory segments and lots of separate worker
pools in this type of case, it's probably still useful to have this as
a fallback approach, because there's always the possibility that some
other client of the dynamic shared memory system could gobble up all
the segments.  So, I'm going to go try to figure out the best way to
implement this.
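
For illustration only, a minimal stand-alone sketch of that fallback
pattern; the helper names here are invented, and the real code would sit
on top of the dynamic shared memory APIs rather than plain malloc():

    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical stand-in for "try to create a DSM segment"; assumed
     * to return NULL once the system-wide segment limit is reached. */
    extern void *try_create_shared_segment(size_t size);

    typedef struct ParallelArea
    {
        void   *base;        /* shared segment, or private fallback */
        bool    is_private;  /* true => no workers can attach */
    } ParallelArea;

    /* If no segment is available, lay out the same contents in
     * backend-private memory, so the parallel node still executes --
     * just with no workers to help (compare stand-alone mode replacing
     * the main shared memory segment with a private allocation). */
    static bool
    parallel_area_init(ParallelArea *area, size_t size)
    {
        area->base = try_create_shared_segment(size);
        area->is_private = (area->base == NULL);
        if (area->is_private)
            area->base = malloc(size);
        if (area->base == NULL)
            return false;        /* out of memory */
        memset(area->base, 0, size);
        return true;
    }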

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 1:23 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
  If we ever implement something like
 
  COMMENT ON CURRENT_DATABASE IS ...
 
  it will be useful, because you will be able to restore a dump into
  another database and have the comment apply to the target database.

 I think it's simple to implement, but how about pg_dump... we need to add
 new option (like --use-current-database) or am I missing something ?

I think we'd just change it to use the new syntax, full stop.  I see
no need for an option.

  (Also, I wonder about
  ALTER USER foo IN DATABASE current_database ...
  because that will let us dump per-database user options too.)

 +1 for both of those ideas.

 Can you explain more about this idea?

Uh, what do you want explained?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-03-18 Thread Alvaro Herrera
Jeevan,

Thanks for the review.

Jeevan Chalke wrote:

 I think there are a few open questions here, and thus am marking it back to
 Waiting on Author.
 
 Please share your views on the review comments already posted.
 Also make changes as Tom suggested about placing pstate at the beginning.

Pushed with that one change.

 postgres=# create or replace function
 f1(a abc.test.id%type) returns int as
 $$ select 1; $$
 language sql;
 ERROR:  schema abc does not exist
 
 Is that expected?

We can leave this for a future patch.

 2.
 Also what about pushing setup_parser_errposition_callback() inside 
 func_get_detail() as well, just to limit it for namespace lookup?

Seemed messy enough that I left this as-is; and as Tom said elsewhere, I
think it's okay to have cursor position in other errors too.  At the
very least, the user will know for certain which function was being
processed when whatever failure occurred.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Robert Haas
On Wed, Mar 18, 2015 at 2:09 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 Just one last doubt. Do we expose a new function called current_database
 like current_catalog, current_user, ... ?

Why would we do that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Install shared libs in lib/ and bin/ with MSVC (Was: install libpq.dll in bin directory on Windows / Cygwin)

2015-03-18 Thread Alvaro Herrera
Michael Paquier wrote:
 On Wed, Mar 18, 2015 at 3:13 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:

  I mean, can't we just do the push unconditionally here?
 
 Why should we install unnecessary stuff? This complicates the
 installation contents, the point being to have only the shared libraries'
 DLLs installed in bin/, and to make things consistent with what MinGW
 does.

I was expecting that if no .dll or .lib file is expected to be
installed, then the file would not be present in the first place.
I just wanted to avoid what seems like fragile coding.

  Surely if there are no lib/dll files in the subdirectory, nothing will
  happen, right?  (I haven't actually tried.)
 
 No, it fails. And we should actually have a bin/lib that has been
 correctly generated. Do you think this is a problem? Normally we
 *should* fail IMO, meaning that the build process has broken what it
 should have done.

Makes sense.

I have pushed your patch; we'll see what the buildfarm thinks of it.
(Sadly, the number of MSVC members is rather small and they don't run
often.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-03-18 Thread Andreas Karlsson

On 03/17/2015 09:00 AM, Julien Tachoires wrote:

Here is a new version fixing this issue. I've added a new kind of TOC
entry for being able to handle pg_restore --no-tablespace case.


Looks good, but I think one minor improvement could be to set the tablespace
of the toast entries to the same as the tablespace of the table, to
reduce the number of SET default_tablespace commands. What do you think?


--
Andreas Karlsson




Re: [HACKERS] Strange assertion using VACOPT_FREEZE in vacuum.c

2015-03-18 Thread Michael Paquier
On Thu, Mar 19, 2015 at 2:44 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:

 I had a look at your modified version, and it looks good to me.

 Thanks, pushed.  (Without the va_cols change proposed downthread.)

Thanks a lot! I will shortly work on the rebase for the other patch.
-- 
Michael




Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 3:16 PM, Andres Freund and...@2ndquadrant.com wrote:
 For another, Andreas has chosen to lump together __int128 and unsigned
 __int128 into one test, where the latter really doesn't receive
 coverage.

 On my urging actually. It's pretty darn unlikely that only one variant
 will work.

I think that makes sense. Since we're targeting GCC here, and we're
comfortable with that, I guess that's okay after all.

 The reason we need a link test (vs just a compile test) is that gcc
 links to helper functions to do math - even if they're not present on
 the target platform. Compiling will succeed, but linking won't.

Okay. Attached revision has a few tweaks that reflect the status of
int128/uint128 as specialized types that are basically only useful for
this optimization, or other similar optimizations on compilers that
either are GCC, or aim to be compatible with it. I don't think
Andreas' V9 reflected that sufficiently.

Also, I now always use PolyNumAggState (the typedef), even for #define
HAVE_INT128 code, which I think is a bit clearer. Note that I have
generated a minimal diff, without the machine generated changes that
are ordinarily included in the final commit when autoconf tests are
added, mostly because I do not have the exact version of autoconf on
my development machine required to do this without creating irrelevant
cruft.

Marked Ready for committer.

A committer that has the exact right version of autoconf (GNU Autoconf
2.69), or perhaps Andreas, can build the machine-generated parts.
Andres may wish to take another look at the autoconf changes.
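
As an aside, the way a configure symbol like this typically gets consumed
is roughly the following sketch; the exact macro and typedef names are
assumptions here, inferred from the HAVE_INT128/int128 references visible
in the numeric.c hunks below, not necessarily the patch's final wording:

    /* e.g. in a central header -- a sketch, not the actual patch */
    #ifdef HAVE_GCC__INT128            /* set by the new configure test */
    typedef __int128 int128;
    typedef unsigned __int128 uint128;
    #define HAVE_INT128 1              /* what numeric.c then #ifdefs on */
    #endif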

-- 
Peter Geoghegan
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 509f961..2db04ab 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -125,6 +125,43 @@ undefine([Ac_cachevar])dnl
 ])# PGAC_TYPE_64BIT_INT
 
 
+# PGAC_HAVE_GCC__INT128
+# -
+# Check if __int128 is a working 128 bit integer type, and if so define
+# HAVE_GCC__INT128.  This is a GCC extension, unlike the other integer types,
+# which C99 defines.
+AC_DEFUN([PGAC_HAVE_GCC__INT128],
+[AC_CACHE_CHECK(for __int128, pgac_cv_have_gcc__int128,
+[AC_TRY_RUN([
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does__int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+return 0;
+  return 1;
+}
+
+main() {
+  exit(! does__int128_work());
+}],
+[pgac_cv_have_gcc__int128=yes],
+[pgac_cv_have_gcc__int128=no],
+[# If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+AC_COMPILE_IFELSE([AC_LANG_BOOL_COMPILE_TRY([], [sizeof(__int128) == 16])],
+  pgac_cv_have_gcc__int128=yes,
+  pgac_cv_have_gcc__int128=no)])])
+
+if test x$pgac_cv_have_gcc__int128 = xyes ; then
+  AC_DEFINE(HAVE_GCC__INT128, 1, [Define to 1 if the system has the type `__int128'.])
+fi])# PGAC_HAVE_GCC__INT128
+
 
 # PGAC_C_FUNCNAME_SUPPORT
 # ---
diff --git a/configure.in b/configure.in
index ca29e93..6b1c251 100644
--- a/configure.in
+++ b/configure.in
@@ -1771,6 +1771,10 @@ AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
 # C, but is missing on some old platforms.
 AC_CHECK_TYPES(sig_atomic_t, [], [], [#include signal.h])
 
+# Check for extension offering the integer scalar type __int128.  This is
+# supported for targets which have an integer mode wide enough to hold 128 bits.
+PGAC_HAVE_GCC__INT128
+
 # Check for various atomic operations now that we have checked how to declare
 # 64bit integers.
 PGAC_HAVE_GCC__SYNC_CHAR_TAS
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 715917b..b925d29 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -402,7 +402,10 @@ static void apply_typmod(NumericVar *var, int32 typmod);
 
 static int32 numericvar_to_int4(NumericVar *var);
 static bool numericvar_to_int8(NumericVar *var, int64 *result);
-static void int8_to_numericvar(int64 val, NumericVar *var);
+static void int64_to_numericvar(int64 val, NumericVar *var);
+#ifdef HAVE_INT128
+static void int128_to_numericvar(int128 val, NumericVar *var);
+#endif
 static double numeric_to_double_no_overflow(Numeric num);
 static double numericvar_to_double_no_overflow(NumericVar *var);
 
@@ -1414,7 +1417,7 @@ width_bucket_numeric(PG_FUNCTION_ARGS)
 	init_var(count_var);
 
 	/* Convert 'count' to a numeric, for ease of use later */
-	int8_to_numericvar((int64) count, count_var);
+	int64_to_numericvar((int64) count, count_var);
 
 	switch (cmp_numerics(bound1, bound2))
 	{
@@ -2083,14 +2086,14 @@ numeric_fac(PG_FUNCTION_ARGS)
 	init_var(fact);
 	init_var(result);
 
-	int8_to_numericvar(num, result);
+	int64_to_numericvar(num, result);
 
 	for (num = num - 1; num  1; num--)
 	{
 		/* this loop can take awhile, so allow it to be interrupted */
 		CHECK_FOR_INTERRUPTS();
 
-		int8_to_numericvar(num, fact);
+		int64_to_numericvar(num, fact);
 
 		mul_var(result, fact, 

Re: [HACKERS] Install shared libs in lib/ and bin/ with MSVC (Was: install libpq.dll in bin directory on Windows / Cygwin)

2015-03-18 Thread Michael Paquier
On Thu, Mar 19, 2015 at 3:30 AM, Alvaro Herrera wrote:
 Makes sense.

 I have pushed your patch; we'll see what the buildfarm thinks of it.

Thanks.

 (Sadly, the number of MSVC members is rather small and they don't run
 often.)

Once Windows 10 is out, it could be installed on a Raspberry Pi 2.
This would make a nice buildfarm member with MSVC 2015.
-- 
Michael




Re: [HACKERS] parallel mode and parallel contexts

2015-03-18 Thread Andres Freund
On 2015-03-18 12:02:07 -0400, Robert Haas wrote:
 diff --git a/src/backend/access/heap/heapam.c 
 b/src/backend/access/heap/heapam.c
 index cb6f8a3..173f0ba 100644
 --- a/src/backend/access/heap/heapam.c
 +++ b/src/backend/access/heap/heapam.c
 @@ -2234,6 +2234,17 @@ static HeapTuple
  heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
   CommandId cid, int options)
  {
 + /*
 +  * For now, parallel operations are required to be strictly read-only.
 +  * Unlike heap_update() and heap_delete(), an insert should never create
 +  * a combo CID, so it might be possible to relax this restrction, but
 +  * not without more thought and testing.
 +  */
 + if (IsInParallelMode())
 + ereport(ERROR,
 + (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 +  errmsg(cannot insert tuples during a parallel 
 operation)));
 +

Minor nitpick: Should we perhaps move the ereport to a separate
function? Akin to PreventTransactionChain()? Seems a bit nicer to not
have the same message everywhere.
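
For illustration, the kind of helper that suggestion amounts to; the
function name and message wording below are made up, modeled on
PreventTransactionChain() and on the ereport in the hunk above:

    #include "postgres.h"
    #include "access/xact.h"    /* IsInParallelMode(), added by this patch */

    /* One choke point for the error, instead of repeating the same
     * ereport at every call site. */
    static void
    PreventOperationDuringParallelMode(const char *what)
    {
        if (IsInParallelMode())
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                     errmsg("cannot %s during a parallel operation", what)));
    }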

 +void
 +DestroyParallelContext(ParallelContext *pcxt)
 +{
 + int i;
 +
 + /*
 +  * Be careful about order of operations here!  We remove the parallel
 +  * context from the list before we do anything else; otherwise, if an
 +  * error occurs during a subsequent step, we might try to nuke it again
 +  * from AtEOXact_Parallel or AtEOSubXact_Parallel.
 +  */
 + dlist_delete(pcxt-node);

Hm. I'm wondering about this. What if we actually fail below? Like in
dsm_detach() or its callbacks? Then we'll enter abort again at some
point (during proc_exit at the latest) but will not wait for the child
workers. Right?
 +/*
 + * End-of-subtransaction cleanup for parallel contexts.
 + *
 + * Currently, it's forbidden to enter or leave a subtransaction while
 + * parallel mode is in effect, so we could just blow away everything.  But
 + * we may want to relax that restriction in the future, so this code
 + * contemplates that there may be multiple subtransaction IDs in pcxt_list.
 + */
 +void
 +AtEOSubXact_Parallel(bool isCommit, SubTransactionId mySubId)
 +{
 + while (!dlist_is_empty(pcxt_list))
 + {
 + ParallelContext *pcxt;
 +
 + pcxt = dlist_head_element(ParallelContext, node, pcxt_list);
 + if (pcxt-subid != mySubId)
 + break;
 + if (isCommit)
 + elog(WARNING, leaked parallel context);
 + DestroyParallelContext(pcxt);
 + }
 +}

 +/*
 + * End-of-transaction cleanup for parallel contexts.
 + */
 +void
 +AtEOXact_Parallel(bool isCommit)
 +{
 + while (!dlist_is_empty(pcxt_list))
 + {
 + ParallelContext *pcxt;
 +
 + pcxt = dlist_head_element(ParallelContext, node, pcxt_list);
 + if (isCommit)
 + elog(WARNING, leaked parallel context);
 + DestroyParallelContext(pcxt);
 + }
 +}

Is there any reason to treat the isCommit case as a warning? To me that
sounds like a pretty much guaranteed programming error. If you're going
to argue that a couple of resource leakage warnings do similarly: I don't
think that counts for that much. For one, e.g. a leaked tupdesc has much
less serious consequences; for another, IIRC those warnings were introduced
after the fact.

 + * When running as a parallel worker, we place only a single
 + * TransactionStateData is placed on the parallel worker's
 + * state stack,

'we place .. is placed'

  /*
 + *   EnterParallelMode
 + */
 +void
 +EnterParallelMode(void)
 +{
 + TransactionState s = CurrentTransactionState;
 +
 + Assert(s-parallelModeLevel = 0);
 +
 + ++s-parallelModeLevel;
 +}
 +
 +/*
 + *   ExitParallelMode
 + */
 +void
 +ExitParallelMode(void)
 +{
 + TransactionState s = CurrentTransactionState;
 +
 + Assert(s-parallelModeLevel  0);
 + Assert(s-parallelModeLevel  1 || !ParallelContextActive());
 +
 + --s-parallelModeLevel;
 +
 + if (s-parallelModeLevel == 0)
 + CheckForRetainedParallelLocks();
 +}

Hm. Is it actually ok for nested parallel mode to retain locks on their
own? Won't that pose a problem? Or did you do it that way just because
we don't have more state? If so that deserves a comment explaining that
that's the case and why that's acceptable.
 @@ -1769,6 +1904,9 @@ CommitTransaction(void)
  {
   TransactionState s = CurrentTransactionState;
   TransactionId latestXid;
 + boolparallel;
 +
 + parallel = (s-blockState == TBLOCK_PARALLEL_INPROGRESS);

 + /* If we might have parallel workers, clean them up now. */
 + if (IsInParallelMode())
 + {
 + CheckForRetainedParallelLocks();
 + AtEOXact_Parallel(true);
 + s-parallelModeLevel = 0;
 + }

'parallel' looks strange when we're also, rightly so, do stuff like
checking 

Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 4:07 PM, Andres Freund and...@2ndquadrant.com wrote:
 Given that we don't rely on C99, I don't think that actually
 matters. Lots of our platforms build on pre C99 compilers... I think it
 makes sense to say that this currently only tests for a gcc extension
 and might be extended in the future. But nothing more.

Works for me.

-- 
Peter Geoghegan




Re: [HACKERS] Reduce pinning in btree indexes

2015-03-18 Thread Greg Stark
On Sat, Mar 14, 2015 at 4:07 PM, Simon Riggs si...@2ndquadrant.com wrote:

 -1 for a time based setting.

 After years of consideration, bloat is now controllable by altering
 the size of the undo tablespace.


Hm. Well, fwiw the situation is rather more complicated than that. You're
correct that you can create a fixed-size undo tablespace, in which case
Oracle will tell you approximately how long a retention period you can rely
on given that size. But alternatively you can set the undo tablespace to be
autoextend and Oracle will tune it to ensure there's enough retention to
cover the longest-running query in the system. You can also set a
time-based parameter (undo_retention) to ensure there's always at least
enough data retained to cover a fixed amount of time for flashback
(Oracle's equivalent to our time travel feature in ancient versions of
Postgres).


-- 
greg


Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Andres Freund
On 2015-03-18 15:59:52 -0700, Peter Geoghegan wrote:
 Okay. Attached revision has a few tweaks that reflect the status of
 int128/uint128 as specialized types that are basically only useful for
 this optimization, or other similar optimizations on compilers that
 either are GCC, or aim to be compatible with it. I don't think
 Andreas' V9 reflected that sufficiently.

Given that we don't rely on C99, I don't think that actually
matters. Lots of our platforms build on pre C99 compilers... I think it
makes sense to say that this currently only tests for a gcc extension
and might be extended in the future. But nothing more.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Andreas Karlsson

On 03/18/2015 11:59 PM, Peter Geoghegan wrote:

Okay. Attached revision has a few tweaks that reflect the status of
int128/uint128 as specialized types that are basically only useful for
this optimization, or other similar optimizations on compilers that
either are GCC, or aim to be compatible with it. I don't think
Andreas' V9 reflected that sufficiently.

Also, I now always use PolyNumAggState (the typedef), even for #define
HAVE_INT128 code, which I think is a bit clearer. Note that I have
generated a minimal diff, without the machine generated changes that
are ordinarily included in the final commit when autoconf tests are
added, mostly because I do not have the exact version of autoconf on
my development machine required to do this without creating irrelevant


Thanks,

I have attached a patch where I have run autoconf.

--
Andreas Karlsson
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 509f961..2db04ab 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -125,6 +125,43 @@ undefine([Ac_cachevar])dnl
 ])# PGAC_TYPE_64BIT_INT
 
 
+# PGAC_HAVE_GCC__INT128
+# -
+# Check if __int128 is a working 128 bit integer type, and if so define
+# HAVE_GCC__INT128.  This is a GCC extension, unlike the other integer types,
+# which C99 defines.
+AC_DEFUN([PGAC_HAVE_GCC__INT128],
+[AC_CACHE_CHECK(for __int128, pgac_cv_have_gcc__int128,
+[AC_TRY_RUN([
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does__int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+return 0;
+  return 1;
+}
+
+main() {
+  exit(! does__int128_work());
+}],
+[pgac_cv_have_gcc__int128=yes],
+[pgac_cv_have_gcc__int128=no],
+[# If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+AC_COMPILE_IFELSE([AC_LANG_BOOL_COMPILE_TRY([], [sizeof(__int128) == 16])],
+  pgac_cv_have_gcc__int128=yes,
+  pgac_cv_have_gcc__int128=no)])])
+
+if test x$pgac_cv_have_gcc__int128 = xyes ; then
+  AC_DEFINE(HAVE_GCC__INT128, 1, [Define to 1 if the system has the type `__int128'.])
+fi])# PGAC_HAVE_GCC__INT128
+
 
 # PGAC_C_FUNCNAME_SUPPORT
 # ---
diff --git a/configure b/configure
index 379dab1..f61d980 100755
--- a/configure
+++ b/configure
@@ -13803,6 +13803,77 @@ _ACEOF
 fi
 
 
+# Check for extension offering the integer scalar type __int128.  This is
+# supported for targets which have an integer mode wide enough to hold 128 bits.
+{ $as_echo $as_me:${as_lineno-$LINENO}: checking for __int128 5
+$as_echo_n checking for __int128...  6; }
+if ${pgac_cv_have_gcc__int128+:} false; then :
+  $as_echo_n (cached)  6
+else
+  if test $cross_compiling = yes; then :
+  # If cross-compiling, check the size reported by the compiler and
+# trust that the arithmetic works.
+cat confdefs.h - _ACEOF conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+static int test_array [1 - 2 * !(sizeof(__int128) == 16)];
+test_array [0] = 0;
+return test_array [0];
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile $LINENO; then :
+  pgac_cv_have_gcc__int128=yes
+else
+  pgac_cv_have_gcc__int128=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+else
+  cat confdefs.h - _ACEOF conftest.$ac_ext
+/* end confdefs.h.  */
+
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does__int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+return 0;
+  return 1;
+}
+
+main() {
+  exit(! does__int128_work());
+}
+_ACEOF
+if ac_fn_c_try_run $LINENO; then :
+  pgac_cv_have_gcc__int128=yes
+else
+  pgac_cv_have_gcc__int128=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+fi
+{ $as_echo $as_me:${as_lineno-$LINENO}: result: $pgac_cv_have_gcc__int128 5
+$as_echo $pgac_cv_have_gcc__int128 6; }
+
+if test x$pgac_cv_have_gcc__int128 = xyes ; then
+
+$as_echo #define HAVE_GCC__INT128 1 confdefs.h
+
+fi
+
 # Check for various atomic operations now that we have checked how to declare
 # 64bit integers.
 { $as_echo $as_me:${as_lineno-$LINENO}: checking for builtin __sync char locking functions 5
diff --git a/configure.in b/configure.in
index ca29e93..6b1c251 100644
--- a/configure.in
+++ b/configure.in
@@ -1771,6 +1771,10 @@ AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
 # C, but is missing on some old platforms.
 AC_CHECK_TYPES(sig_atomic_t, [], [], [#include signal.h])
 
+# Check for extension offering the integer scalar type __int128.  This is
+# supported for targets which have an integer mode wide enough to hold 128 bits.
+PGAC_HAVE_GCC__INT128
+
 # Check for various atomic operations now that we have checked how to declare
 # 64bit integers.
 PGAC_HAVE_GCC__SYNC_CHAR_TAS
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 715917b..b925d29 100644
--- a/src/backend/utils/adt/numeric.c
+++ 

[HACKERS] Help needed for PL/Ruby

2015-03-18 Thread Devrim Gündüz

Hi,

Background info first: PL/Ruby was originally maintained by Guy Decoux,
who passed away in 2008: https://www.ruby-forum.com/topic/166658 . After
his death, Akinori MUSHA forked the project and maintained it until
2010: https://github.com/knu/postgresql-plruby . The last release was in Jan
2010, and recent distros started throwing build errors.

I was having similar build issues while trying to RPMify PL/Ruby, and
finally stepped up to the plate and tried to fix those build issues by
forking the project:

https://github.com/devrimgunduz/postgresql-plruby/

I mainly applied patches from Fedora, and also did some basic cleanup.
However, I don't know Ruby and have limited C skills, so I need some
help on these:

* Complete the extension support: I committed initial infrastructure for
it, but I don't think it is ready yet.

* Fix the FIXME:
https://github.com/devrimgunduz/postgresql-plruby/blob/master/src/plpl.c#L844

* Documentation review and update: The most recent example is against
PostgreSQL 8.0. A recent Ruby example would be good, and we also need
updates for creating the language.

Any contributions are welcome. I recently released 0.5.5, which at least
is ready for testing.

I want to stress that I am not a Ruby guy, so this really is a community
effort for me.

Thanks in advance.

Regards,
-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR





Re: [HACKERS] Table-level log_autovacuum_min_duration

2015-03-18 Thread Fujii Masao
On Fri, Mar 6, 2015 at 1:07 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Fri, Mar 6, 2015 at 12:44 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Mar 5, 2015 at 9:49 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Thu, Mar 5, 2015 at 7:10 PM, Fujii Masao wrote:
 With the patch, VACUUM ANALYZE VERBOSE doesn't emit any verbose message.
 Why did you remove that functionality?

 Oops. Sorry about that. In gram.y, the combination of VacuumStmt with
 AnalyzeStmt overwrote the value of log_min_duration incorrectly. I
 also found another bug related to logging of ANALYZE not working
 correctly because of the use of IsAutoVacuumWorkerProcess() instead of
 VACOPT_VERBOSE (this is reducing the diffs of the patch btw). All
 those things are fixed in the attached.

 Thanks for updating the patch!

 Why does log_min_duration need to be set even when manual VACUUM command is
 executed? Per the latest version of the patch, log_min_duration is checked 
 only
 when the process is autovacuum worker. So ISTM that log_min_duration doesn't
 need to be set in gram.y. It's even confusing to me. Or if you're going to
 implement something like VACUUM VERBOSE DURATION n (i.e., verbose message
 is output if n seconds have been elapsed), that might be necessary, though...

 Thanks for reminding. The DURATION-like clause was exactly a point
 mentioned by Anzai-san upthread, and it made sense to me to be in-line
 with the other parameters controlling the freeze (discussion somewhat
 related to that =
 http://www.postgresql.org/message-id/CAB7nPqRZX7Pv2B-R7xHmAh52tfjAQGfy9btqwFstgQgXks=i...@mail.gmail.com)
 but we can live without it for this patch as VACOPT_VERBOSE is used
 only by manual VACUUM and not by autovacuum to choose the log elevel.

Are you planning to update the patch so that it's based on the commit 0d83138?

Regards,

-- 
Fujii Masao




[HACKERS] ERRCODE_T_R_DEADLOCK_DETECTED

2015-03-18 Thread Tatsuo Ishii
The error code is used in two places:

ereport(ERROR,
        (errcode(ERRCODE_T_R_DEADLOCK_DETECTED),
         errmsg("canceling statement due to conflict with recovery"),
         errdetail("User transaction caused buffer deadlock with recovery.")));

ereport(ERROR,
        (errcode(ERRCODE_T_R_DEADLOCK_DETECTED),
         errmsg("deadlock detected"),
         errdetail_internal("%s", clientbuf.data),
         errdetail_log("%s", logbuf.data),
         errhint("See server log for query details.")));

The latter is a normal deadlock and can be observed via stats because
pgstat_report_deadlock() is called.

The former is using the same error code but the meaning is pretty
different and users might be confused IMO.

I am not sure sharing the same error code is the best idea here.
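
To make that concrete: both cases reach clients as SQLSTATE 40P01
(deadlock_detected), so any driver or application retry logic keyed on
that code cannot tell an ordinary deadlock from a recovery conflict.
A minimal libpq sketch of such a check, for illustration only:

    #include <string.h>
    #include <libpq-fe.h>

    /* Returns 1 if the failed result reports SQLSTATE 40P01. Note that the
     * recovery-conflict "buffer deadlock" above reports the same code, so
     * this test cannot distinguish the two situations. */
    static int
    is_deadlock_error(const PGresult *res)
    {
        const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

        return sqlstate != NULL && strcmp(sqlstate, "40P01") == 0;
    }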

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] cancelling statement due to user request error occurs but the transaction has committed.

2015-03-18 Thread Bruce Momjian
On Wed, Sep 10, 2014 at 08:10:45PM -0400, Bruce Momjian wrote:
 On Tue, Jun 10, 2014 at 10:30:24AM -0400, Robert Haas wrote:
  On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   Robert Haas robertmh...@gmail.com writes:
   I don't agree with this analysis.  If the connection is closed after
   the client sends a COMMIT and before it gets a response, then the
   client must indeed be smart enough to figure out whether or not the
   commit happened.  But if the server sends a response, the client
   should be able to rely on that response being correct.  In this case,
   an ERROR is getting sent but the transaction is getting committed;
   yuck.  I'm not sure whether the fix is right, but this definitely
   seems like a bug.
  
   In general, the only way to avoid that sort of behavior for a post-commit
   error would be to PANIC ... and even then, the transaction got committed,
   which might not be the expectation of a client that got an error message,
   even if it said PANIC.  So this whole area is a minefield, and the only
   attractive thing we can do is to try to reduce the number of errors that
   can get thrown post-commit.  We already, for example, do not treat
   post-commit file unlink failures as ERROR, though we surely would prefer
   to do that.
  
  We could treated it as a lost-communication scenario.  The appropriate
  recovery actions from the client's point of view are identical.
  
   So from this standpoint, redefining SIGINT as not throwing an error when
   we're in post-commit seems like a good idea.  I'm not endorsing any
   details of the patch here, but the 2-foot view seems generally sound.
  
  Cool, that makes sense to me also.
 
 Did we ever do anything about this?

I have researched this issue originally reported in June of 2014 and
implemented a patch to ignore cancel while we are completing a commit. 
I am not clear if this is the proper place for this code, though a
disable_timeout() call on the line above suggests I am close.  :-)
(The disable_timeout disables internal timeouts, but it doesn't disable
cancels coming from the client.)

The first patch is for testing and adds a sleep(5) to the end of the
TRUNCATE command, to give the tester time to press Control-C from psql,
and enables log_duration so the cancel is checked.

The second patch is the patch that disables cancel when we are in the
process of committing;  before:

test= CREATE TABLE test(x INT);
CREATE TABLE
test= INSERT INTO test VALUES (3);
INSERT 0 1
test= TRUNCATE test;
^CCancel request sent
-- ERROR:  canceling statement due to user request
test= SELECT * FROM test;
 x
---
(0 rows)

and with both patches:

test= CREATE TABLE test(x INT);
CREATE TABLE
test= INSERT INTO test VALUES (3);
INSERT 0 1
test= TRUNCATE test;
^CCancel request sent
-- TRUNCATE TABLE
test= SELECT * FROM test;
 x
---
(0 rows)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
new file mode 100644
index 623e6bf..a5d66d8
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 13,18 
--- 13,19 
   *-
   */
  #include postgres.h
+ #include unistd.h
  
  #include access/genam.h
  #include access/heapam.h
*** ExecuteTruncate(TruncateStmt *stmt)
*** 1265,1270 
--- 1266,1272 
  
  		heap_close(rel, NoLock);
  	}
+ 	sleep(5);
  }
  
  /*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 26275bd..9147a79
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*** extern const struct config_enum_entry dy
*** 408,414 
  /*
   * GUC option variables that are exported from this module
   */
! bool		log_duration = false;
  bool		Debug_print_plan = false;
  bool		Debug_print_parse = false;
  bool		Debug_print_rewritten = false;
--- 408,414 
  /*
   * GUC option variables that are exported from this module
   */
! bool		log_duration = true;
  bool		Debug_print_plan = false;
  bool		Debug_print_parse = false;
  bool		Debug_print_rewritten = false;
*** static struct config_bool ConfigureNames
*** 1082,1088 
  			NULL
  		},
  		log_duration,
! 		false,
  		NULL, NULL, NULL
  	},
  	{
--- 1082,1088 
  			NULL
  		},
  		log_duration,
! 		true,
  		NULL, NULL, NULL
  	},
  	{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
new file mode 100644
index 110983f..82eca10
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample

Re: [HACKERS] Table-level log_autovacuum_min_duration

2015-03-18 Thread Michael Paquier
On Thu, Mar 19, 2015 at 12:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Are you planning to update the patch so that it's based on the commit 0d83138?

Yes... Very soon.
-- 
Michael




Re: [HACKERS] Parallel Seq Scan

2015-03-18 Thread Amit Kapila
On Wed, Mar 18, 2015 at 9:14 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Mar 18, 2015 at 2:22 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
 
  It didn't fix the problem.  IIUC, you have done this to ensure that
  if the worker is not already started, then update its pid, so that we
  can get the required status in WaitForBackgroundWorkerShutdown().
  As this is a timing issue, it can so happen that before Postmaster
  gets a chance to report the pid, backend has already started waiting
  on WaitLatch().

 I think I figured out the problem.  That fix only helps in the case
 where the postmaster noticed the new registration previously but
 didn't start the worker, and then later notices the termination.
 What's much more likely to happen is that the worker is started and
 terminated so quickly that both happen before we create a
 RegisteredBgWorker for it.  The attached patch fixes that case, too.


The patch fixes the problem, and now for Rescan we don't need to wait
for workers to finish.

 Assuming this actually fixes the problem, I think we should back-patch
 it into 9.4.

+1


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] deparsing utility commands

2015-03-18 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 One thing that Stephen commented on was the ALTER TABLE preparatory
 patch.  He said why not return ObjectAddress from the subcommand
 routines instead of just Oid/attnum?  The reason is that to interpret
 the address correctly you will still need a lot more context; the OID
 and attnum are only part of the truth anyway.  I think there are a small
 number of cases where we could meaningfully return an ObjectAddress and
 have the whole thing be useful, but I'm not sure it's worth the bother.

Alright, fair enough.

 In patch 0004, I removed most of the Stash() calls in ProcessUtility,
 instead adding one at the bottom that takes care of most of the simple
 cases.  That means that a developer adding support for something new in
 ProcessUtilitySlow without realizing that something must be added to the
 command stash will get an error fairly early, which I think is helpful.

Right, I like this.

 Patch 0021 (fixing a bug in SECURITY LABEL support) I'm not really sure
 about.  I don't like modifying a parse node during execution -- seems a
 bad habit.  It seems better to me to return the chosen security label as
 an ObjectAddress in ExecSecLabelStmt, and pass that as secondaryOid to
 the deparse_utility.c routine.

I agree, changing a parse node during execution definitely seems like a
bad idea, particularly if there's a way you could avoid doing so, which
it sounds like there is.  Any reason to not follow up on that?

 In patch 0023 (CREATE/ALTER POLICY support) I ran into trouble.  I
 represent the role in the json like this:
   tmp = new_objtree_VA(TO %{role:, }I, 0);
 which means that role names are quoted as identifiers.  This means it
 doesn't work as soon as we get a command referencing PUBLIC (which many
 in the regression test do, because when the TO clause is omitted, PUBLIC
 is the default).  So this ends up as PUBLIC and everything goes
 downhill.  I'm not sure what to do about this, except to invent a new
 %{} code.

How is this any different from the GRANT/REVOKE cases..?  Surely you
have to deal with 'public' being special there also.

 Subject: [PATCH 01/29] add secondaryOid to DefineTSConfiguration()

This looks fine to me.

 Subject: [PATCH 02/29] deparse/core: have ALTER TABLE return table OID and
  other data
[...]
 diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
 index 5ce4395..9675d21 100644
 @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
  /*
   * Store a default expression for column attnum of relation rel.
   */
 -void
 +Oid
  StoreAttrDefault(Relation rel, AttrNumber attnum,
Node *expr, bool is_internal)
  {

Missing a comment about the return value?

 @@ -2074,7 +2083,7 @@ StoreConstraints(Relation rel, List 
 *cooked_constraints, bool is_internal)
   int numchecks = 0;
   ListCell   *lc;
  
 - if (!cooked_constraints)
 + if (list_length(cooked_constraints) == 0)
   return; /* nothing to do */
  
   /*

While this is used in a few places, it's by far more common to use:

if (cooked_constraints == NIL)

or to keep it the way it is..


 diff --git a/src/backend/commands/tablecmds.c 
 b/src/backend/commands/tablecmds.c
 index 623e6bf..fc2c8a9 100644
 @@ -3416,78 +3418,95 @@ static void
   case AT_DropColumn: /* DROP COLUMN */
   ATExecDropColumn(wqueue, rel, cmd-name,
 -  cmd-behavior, false, false, 
 cmd-missing_ok, lockmode);
 +  cmd-behavior, false, 
 false,
 +  cmd-missing_ok, 
 lockmode);
   break;
   case AT_DropColumnRecurse:  /* DROP COLUMN with 
 recursion */
   ATExecDropColumn(wqueue, rel, cmd-name,
 -   cmd-behavior, true, false, 
 cmd-missing_ok, lockmode);
 +  cmd-behavior, true, 
 false,
 +  cmd-missing_ok, 
 lockmode);
   break;

[...]

   case AT_ReplicaIdentity:
 - ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) 
 cmd-def, lockmode);
 + ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) 
 cmd-def,
 +   lockmode);
   break;
   case AT_EnableRowSecurity:
   ATExecEnableRowSecurity(rel);

I realize the whole thing is changing anyway, but the random whitespace
changes don't make reviewing any easier.

 -static void
 +static AttrNumber
  ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
   ColumnDef *colDef, bool isOid,
   bool recurse, 

Re: [HACKERS] Parallel Seq Scan

2015-03-18 Thread Amit Kapila
On Wed, Mar 18, 2015 at 10:45 PM, Robert Haas robertmh...@gmail.com wrote:

 On Sat, Mar 14, 2015 at 1:04 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
  # EXPLAIN SELECT DISTINCT bid FROM pgbench_accounts;
  ERROR:  too many dynamic shared memory segments
 
  This happens because we have maximum limit on the number of
  dynamic shared memory segments in the system.
 
  In function dsm_postmaster_startup(), it is defined as follows:
 
  maxitems = PG_DYNSHMEM_FIXED_SLOTS
  + PG_DYNSHMEM_SLOTS_PER_BACKEND * MaxBackends;
 
  In the above case, it is choosing parallel plan for each of the
  AppendRelation,
  (because of seq_page_cost = 1000) and that causes the test to
  cross max limit of dsm segments.

 The problem here is, of course, that each parallel sequential scan is
 trying to create an entirely separate group of workers.  Eventually, I
 think we should fix this by rejiggering things so that when there are
 multiple parallel nodes in a plan, they all share a pool of workers.
 So each worker would actually get a list of plan nodes instead of a
 single plan node.  Maybe it works on the first node in the list until
 that's done, and then moves onto the next, or maybe it round-robins
 among all the nodes and works on the ones where the output tuple
 queues aren't currently full, or maybe the master somehow notifies the
 workers which nodes are most useful to work on at the present time.

Good idea. I think for this particular case, we might want to optimize
the work distribution such that each worker gets one independent relation
segment to scan.

 But I think trying to figure this out is far too ambitious for 9.5,
 and I think we can have a useful feature without implementing any of
 it.


Agreed.

 But, we can't just ignore the issue right now, because erroring out on
 a large inheritance hierarchy is no good.  Instead, we should fall
 back to non-parallel operation in this case.  By the time we discover
 the problem, it's too late to change the plan, because it's already
 execution time.  So we are going to be stuck executing the parallel
 node - just with no workers to help.  However, what I think we can do
 is use a slab of backend-private memory instead of a dynamic shared
 memory segment, and in that way avoid this error.  We do something
 similar when starting the postmaster in stand-alone mode: the main
 shared memory segment is replaced by a backend-private allocation with
 the same contents that the shared memory segment would normally have.
 The same fix will work here.

 Even once we make the planner and executor smarter, so that they don't
 create lots of shared memory segments and lots of separate worker
 pools in this type of case, it's probably still useful to have this as
 a fallback approach, because there's always the possibility that some
 other client of the dynamic shared memory system could gobble up all
 the segments.  So, I'm going to go try to figure out the best way to
 implement this.


Thanks.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Fabrízio de Royes Mello
On Wed, Mar 18, 2015 at 3:13 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Mar 18, 2015 at 2:09 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  Just one last doubt. Do we expose a new function called
current_database
  like current_catalog, current_user, ... ?

 Why would we do that?


We don't need it... it's another patch.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Fabrízio de Royes Mello
On Wed, Mar 18, 2015 at 3:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  Just one last doubt. Do we expose a new function called
current_database
  like current_catalog, current_user, ... ?

 There already is one.  But that would have nothing to do with the proposed
 patch anyway, because the bits of syntax in question are not expressions
 and you should not try to turn them into things that would call functions.


Yeah... I'll focus on the current scope.

Thanks,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 17, 2015 at 8:01 PM, Bruce Momjian br...@momjian.us wrote:
 Basically, the same rules apply to all commitfests, i.e. a committer can
 apply anything during that period.  I think the only restriction for the
 last commitfest is that the committer can not apply a new patch that
 would have been too big to be submitted to the last commitfest. If
 enough people feel that this committer behavior during the last
 commitfest is a problem, we can discuss changing that policy.

 One thing that's crystal clear here is that we don't all agree on what
 the policy actually is.

Indeed.  In this case, since the patch in question is one that
improves/simplifies a patch that's already in the current commitfest,
I'm going to go ahead and push it.  If you want to call a vote on
revoking my commit bit, go right ahead.

regards, tom lane




Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Andres Freund
On 2015-03-18 13:12:11 -0400, Tom Lane wrote:
 Indeed.  In this case, since the patch in question is one that
 improves/simplifies a patch that's already in the current commitfest,
 I'm going to go ahead and push it.  If you want to call a vote on
 revoking my commit bit, go right ahead.

Seriously? In my opinion it has to be possible to doubt whether a patch
should be committed in certain release without it being interpreted as a
personal attack.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 10:26 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-03-18 13:12:11 -0400, Tom Lane wrote:
 Indeed.  In this case, since the patch in question is one that
 improves/simplifies a patch that's already in the current commitfest,
 I'm going to go ahead and push it.  If you want to call a vote on
 revoking my commit bit, go right ahead.

 Seriously? In my opinion it has to be possible to doubt whether a patch
 should be committed in certain release without it being interpreted as a
 personal attack.

+1


-- 
Peter Geoghegan




Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Fabrízio de Royes Mello
On Wed, Mar 18, 2015 at 2:24 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Mar 18, 2015 at 1:23 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
   If we ever implement something like
  
   COMMENT ON CURRENT_DATABASE IS ...
  
   it will be useful, because you will be able to restore a dump into
   another database and have the comment apply to the target database.
 
  I think it's simple to implement, but how about pg_dump... we need to
add
  new option (like --use-current-database) or am I missing something ?

 I think we'd just change it to use the new syntax, full stop.  I see
 no need for an option.


Ok.


   (Also, I wonder about
   ALTER USER foo IN DATABASE current_database ...
   because that will let us dump per-database user options too.)
 
  +1 for both of those ideas.
 
  Can you explain more about this idea?

 Uh, what do you want explained?


There's no need, sorry. I just misunderstood it first.

I'll provide the patches.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Can pg_dump make use of CURRENT/SESSION_USER

2015-03-18 Thread Tom Lane
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
 Just one last doubt. Do we expose a new function called current_database
 like current_catalog, current_user, ... ?

There already is one.  But that would have nothing to do with the proposed
patch anyway, because the bits of syntax in question are not expressions
and you should not try to turn them into things that would call functions.

regards, tom lane




Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit

2015-03-18 Thread Andres Freund
On 2015-03-18 14:01:41 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Seriously? In my opinion it has to be possible to doubt whether a patch
  should be committed in certain release without it being interpreted as a
  personal attack.
 
 I don't think anyone's said anything in this thread that amounts to a
 personal attack.

From where I am sitting you appear somewhat offended. If not, great.

 If someone's going to tell me that my judgment about when to push
 something is not acceptable, then they probably ought to be calling
 for removal of my commit privileges.

I think this is making the issue far bigger than it is. I think every
committer and most contributors in this project have swallowed more
decisions they didn't agree with than they care to count. And that's
normal. Let's try to keep it a bit more low key.

 We have a difference of opinion on policy, and what I'm saying is that
 the policy ultimately reduces to trusting individual committers to use
 their best judgment.

I think the issue is much less, if at all, about trusting committers,
than about a different view of how to handle our problems with
release management. Quite apparently 9.4 didn't work out greatly, which
probably exacerbates preexisting (perhaps unknowingly so) differences in
opinion about how late new patches should be allowed.

There's also the problem, as you noted, that we're all fucking tired of the
neverending 2014-06 commitfest. Robert is, apparently at least, of the
opinion that that's best handled by no longer accepting new patches and
trying to reduce the size of the backlog. That doesn't sound entirely
unreasonable. It also doesn't sound unreasonable to deal with that
backlog by doing the stuff one deems important oneself.  I personally am
*really* worried about the current state, without having a short/mid-term
solution. I think this makes postgres really unattractive to
contribute to. And that has cost, and will cost, us a lot of help.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 11:41 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 AFAICS, there is no need to go and clear the tag after the insert has
 completed.

 Here's what I had in mind: the inserter tags the tuple with the speculative
 insertion token, by storing the token in the t_ctid field. If the inserter
 needs to super-delete the tuple, it sets xmax like in a regular deletion,
 but also sets another flag to indicate that it was a super-deletion.

 When another backend inserts, and notices that it has a potential conflict
 with the first tuple, it tries to acquire a hw-lock on the token. In most
 cases, the inserter has long since completed the insertion, and the
 acquisition succeeds immediately but you have to check because the token is
 not cleared on a completed insertion. It would be slightly faster for the
 second backend if the inserter actually removed the token after the
 insertion has completed: it wouldn't need to check the lock if the tuple was
 not tagged in the first place. But that'd be just a tiny optimization, for
 the rare case that there is a conflict, and surely not worth it.

 Hmm, I just realized a little fly in that ointment: if the inserter inserts
 enough tuples to wrap around the token counter (4 billion?) in a single
 transaction, another backend that inserts could confuse a new speculative
 insertion with one that completed a long time ago. The risk seems small
 enough that we could just accept it. I don't think anything particularly bad
 would happen on a false positive here. Or we could also use all 48 bits
 available in the t_ctid to push it truly in the realm of ain't-gonna-happen.
 Or we could use (relfilenode, block,  token) as the lock tag for the
 SpeculativeInsertionLock.

Maybe we'd use all 48 bits anyway, but it's not a major concern either
way. Speculative token locks (by design) are only held for an instant.
I think a spurious wait would be entirely benign, and very unlikely.

 Regarding the physical layout: We can use a magic OffsetNumber value above
 MaxOffsetNumber to indicate that the t_ctid field stores a token rather than
 a regular ctid value. And another magic t_ctid value to indicate that a
 tuple has been super-deleted. The token and the super-deletion flag are
 quite ephemeral, they are not needed after the inserting transaction has
 completed, so it's nice to not consume the valuable infomask bits for these
 things. Those states are conveniently not possible on an updated tuple, when
 we would need the t_ctid field for its current purpose.

You seem pretty convinced that this is the way to go. I'll try and
produce a revision that works this way shortly. I don't think that it
will be all that hard to come up with something to prove the idea out.

 I think Heikki's concern is something different, although I am not
 altogether up to speed on this and may be confused.  The issue is:
 suppose that process A and process B are both furiously upserting into
 the same table with the same key column but, because they have
 different costing parameters, process A consistently chooses index X
 and process B consistently chooses index Y.  In that situation, will
 we deadlock, livelock, error out, bloat, or get any other undesirable
 behavior, or is that A-OK?

 Right, that's what I had in mind.

Oh, I see. I totally failed to understand that that was the concern.

I think it'll be fine. The pre-check is going to look for a heap tuple
using one or the other of (say) a pair of equivalent indexes. We might
miss the heap tuple because we picked an index that had yet to have a
physical index tuple inserted, and then hit a conflict on optimistic
insertion (the second phase). But there is no reason to think that
that won't happen anyway. The ordering of operations isn't critical.

The one issue you might still have is a duplicate violation, because
you happened to infer the unique index that does not get to tolerate
unique violations as conflicts that can be recovered from (and there
was a race, which is unlikely). I don't really care about this,
though. You really are inferring one particular unique index, and for
reasons like this I think it would be a mistake to try to pretend that
the unique index is 100% an implementation detail. That's why I called
the new clause a unique index inference specification.

This hypothetical set of unique indexes will always have n-1 redundant
unique indexes - they must closely match. That's something that calls
into question why the user wants things this way to begin with. Also,
note that one unique index will consistently win, since the
insertion order is stable (the relcache puts them in OID order). So it
will not be all over the map.
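
To make the scenario concrete, here is a minimal sketch (the table and index
names are made up, and the ON CONFLICT syntax is the form proposed in this
patch series, so treat it as illustrative rather than something runnable on a
released server):

    CREATE TABLE upsert_demo (key int, val text);

    -- Two unique indexes with identical key columns: either one could
    -- satisfy the inference specification.
    CREATE UNIQUE INDEX upsert_demo_key_x ON upsert_demo (key);
    CREATE UNIQUE INDEX upsert_demo_key_y ON upsert_demo (key);

    -- Both sessions infer one of the equivalent indexes; the relcache's
    -- OID ordering keeps that choice stable across backends.
    INSERT INTO upsert_demo (key, val) VALUES (1, 'a')
        ON CONFLICT (key) IGNORE;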

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Heikki Linnakangas

On 03/18/2015 06:23 PM, Robert Haas wrote:

On Tue, Mar 17, 2015 at 6:27 PM, Peter Geoghegan p...@heroku.com wrote:

On Tue, Mar 17, 2015 at 12:11 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

I'm still not sure the way the speculative locking works is the best
approach. Instead of clearing xmin on super-deletion, a new flag on the heap
tuple seems more straightforward. And we could put the speculative insertion
token in t_ctid, instead of stashing it in the PGPROC array. That would
again seem more straightforward.


I see the appeal of that approach. What concerns me about that
approach is that it makes it the problem of the inserter to confirm
its insertion, even though overwhelmingly, speculative insertions
succeed (the race window is small due to the pre-check). The current
speculative token is legitimately a very short lived thing, a thing
that I hesitate to write to disk at all. So our optimistic approach to
inserting speculatively loses its optimism, which I don't like, if you
know what I mean.


Modifying a shared buffer is not the same as writing to disk.


AFAICS, there is no need to go and clear the tag after the insert has 
completed.


Here's what I had in mind: the inserter tags the tuple with the 
speculative insertion token, by storing the token in the t_ctid field. 
If the inserter needs to super-delete the tuple, it sets xmax like in a 
regular deletion, but also sets another flag to indicate that it was a 
super-deletion.


When another backend inserts, and notices that it has a potential 
conflict with the first tuple, it tries to acquire a hw-lock on the 
token. In most cases, the inserter has long since completed the 
insertion, and the acquisition succeeds immediately but you have to 
check because the token is not cleared on a completed insertion. It 
would be slightly faster for the second backend if the inserter actually 
removed the token after the insertion has completed: it wouldn't need to 
check the lock if the tuple was not tagged in the first place. But 
that'd be just a tiny optimization, for the rare case that there is a 
conflict, and surely not worth it.


Hmm, I just realized a little fly in that ointment: if the inserter 
inserts enough tuples to wrap around the token counter (4 billion?) in a 
single transaction, another backend that inserts could confuse a new 
speculative insertion with one that completed a long time ago. The risk 
seems small enough that we could just accept it. I don't think anything 
particularly bad would happen on a false positive here. Or we could also 
use all 48 bits available in the t_ctid to push it truly in the realm of 
ain't-gonna-happen. Or we could use (relfilenode, block,  token) as the 
lock tag for the SpeculativeInsertionLock.


Regarding the physical layout: We can use a magic OffsetNumber value 
above MaxOffsetNumber to indicate that the t_ctid field stores a token 
rather than a regular ctid value. And another magic t_ctid value to 
indicate that a tuple has been super-deleted. The token and the 
super-deletion flag are quite ephemeral, they are not needed after the 
inserting transaction has completed, so it's nice to not consume the 
valuable infomask bits for these things. Those states are conveniently 
not possible on an updated tuple, when we would need the t_ctid field 
for its current purpose.



If I'm reading this correctly, if there are multiple indexes that match the
unique index inference specification, we pick the cheapest one. Isn't that
unstable? Two backends running the same INSERT ON CONFLICT statement might
pick different indexes, and the decision might change over time as the table
is analyzed. I think we should have a more robust rule. Could we easily just
use all matching indexes?


Robert feels pretty strongly that there should be a costing aspect to
this. Initially I was skeptical of this, but just did what he said all
the same. But I've since come around to his view entirely because we
now support partial unique indexes.


I think Heikki's concern is something different, although I am not
altogether up to speed on this and may be confused.  The issue is:
suppose that process A and process B are both furiously upserting into
the same table with the same key column but, because they have
different costing parameters, process A consistently chooses index X
and process B consistently chooses index Y.  In that situation, will
we deadlock, livelock, error out, bloat, or get any other undesirable
behavior, or is that A-OK?


Right, that's what I had in mind.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: searching in array function - array_position

2015-03-18 Thread Alvaro Herrera
Pavel Stehule wrote:
 2015-03-18 12:41 GMT+01:00 Marko Tiikkaja ma...@joh.to:

  I think this behavior is correct (there is no other possibility), but it
  is only a corner case for this functionality - and if you think it is
  better to disallow it, I am not against that. My main focus is the 1-D
  array case.
 
  A nonsensical answer for multi-dimensional arrays is worse than no answer
  at all.  I think raising an exception is better.
 
 
 I have no problem with it.

Pushed after adding error checks there and fixing the docs to match.
Please verify.
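
For anyone verifying, a quick sketch of the behaviour the error checks are
meant to produce (shown here with the array_position name; the patch in this
thread used array_offset, and the exact error wording may differ):

    -- 1-D arrays: return the subscript of the first match.
    SELECT array_position(ARRAY['a','b','c','b'], 'b');
    --  2

    -- Multi-dimensional input is rejected instead of returning a
    -- meaningless offset.
    SELECT array_position('{{1,2},{3,4}}'::int[], 3);
    -- ERROR:  searching for elements in multidimensional arrays is not supported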

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Zero-padding and zero-masking fixes for to_char(float)

2015-03-18 Thread Bruce Momjian
In September, while researching the to_char() buffer overflow bugs fixed
in 9.4.1 (commit 0150ab567bcf5e5913e2b62a1678f84cc272441f), I found an
inconsistency in how to_char() does zero-padding for float4/8 values. 
Now that 9.4.1 is released and I am home for a while, I am ready to
address this.

For example, to_char(int4) properly pads with trailing zeros, e.g.

SELECT to_char(int4 '19', 'D' || repeat('9', 
1000));
--
19.00...

Numeric does the same thing:

SELECT to_char(numeric '999', 'D' || 
repeat('9', 1000));
--
999.0...

However, float4/8 do not supply the requested zero padding:

SELECT to_char(float4 '999', 'D' || repeat('9', 
1000));
--
9997952

and

SELECT to_char(float8 '999', 'D' || repeat('9', 
1000));
--
999.

float4/8 are padding to the internal precision, while int4/numeric are
padding based on the requested precision.  This is inconsistent.

The first attached patch fixes this, and also zeros the junk digits
which exceed the precision of the underlying type:

SELECT to_char(float4 '999', 'D' || repeat('9', 
1000));
--
990.0...

SELECT to_char(float8 '999', 'D' || repeat('9', 
1000));
--
999.

This junk digit zeroing matches the Oracle behavior:

SELECT to_char(1.123456789123456789123456789d, 
'9.9') as x from dual;
--
1.12345678912345680

Our output with the patch would be:

SELECT to_char(float8 '1.123456789123456789123456789', 
'9.9');
--
1.12345678912345000

which is pretty close.

The second patch adds regression tests for these.

I would like to apply this for 9.5 while I remember what I was doing,
but I guess now that I have written this email, I will be able to keep
it for 9.6 if people prefer.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
new file mode 100644
index 40a353f..25b247e
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
***
*** 113,125 
  #define DCH_MAX_ITEM_SIZ	   12		/* max localized day name		*/
  #define NUM_MAX_ITEM_SIZ		8		/* roman number (RN has 15 chars)	*/
  
- /* --
-  * More is in float.c
-  * --
-  */
- #define MAXFLOATWIDTH	60
- #define MAXDOUBLEWIDTH	500
- 
  
  /* --
   * Format parser structs
--- 113,118 
*** int4_to_char(PG_FUNCTION_ARGS)
*** 5214,5221 
  		/* we can do it easily because float8 won't lose any precision */
  		float8		val = (float8) value;
  
! 		orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1);
! 		snprintf(orgnum, MAXDOUBLEWIDTH + 1, "%+.*e", Num.post, val);
  
  		/*
  		 * Swap a leading positive sign for a space.
--- 5207,5213 
  		/* we can do it easily because float8 won't lose any precision */
  		float8		val = (float8) value;
  
! 		orgnum = psprintf("%+.*e", Num.post, val);
  
  		/*
  		 * Swap a leading positive sign for a space.
*** float4_to_char(PG_FUNCTION_ARGS)
*** 5414,5420 
  		numstr = orgnum = int_to_roman((int) rint(value));
  	else if (IS_EEEE(&Num))
  	{
- 		numstr = orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1);
  		if (isnan(value) || is_infinite(value))
  		{
  			/*
--- 5406,5411 
*** float4_to_char(PG_FUNCTION_ARGS)
*** 5428,5442 
  		}
  		else
  		{
! 			snprintf(orgnum, MAXDOUBLEWIDTH + 1, "%+.*e", Num.post, value);
  
  			/*
  			 * Swap a leading positive sign for a space.
  			 */
! 			if (*orgnum == '+')
! *orgnum = ' ';
! 
! 			numstr = orgnum;
  		}
  	}
  	else
--- 5419,5447 
  		}
  		else
  		{
! 			numstr = psprintf("%+.*e", Num.post, value);
! 
! 			/* prevent the display of imprecise/junk digits */
! 			if (Num.pre + Num.post > FLT_DIG)
! 			{
! int		digits = 0;
! char   *numstr_p;
! 
! for (numstr_p = numstr; *numstr_p && *numstr_p != 'e'; numstr_p++)
! {
! 	if (isdigit(*numstr_p))
! 	{
! 		if (++digits > FLT_DIG)
! 			*numstr_p = '0';
! 	}
! }
! 			}
  
  			/*
  			 * Swap a leading positive sign for a space.
  			 */
! 			if (*numstr == '+')
! *numstr = ' ';
  		}
  	}
  	else
*** float4_to_char(PG_FUNCTION_ARGS)
*** 5452,5467 
  			Num.pre += Num.multi;
  		}
  
! 		orgnum = (char *) palloc(MAXFLOATWIDTH + 1);
! 		snprintf(orgnum, MAXFLOATWIDTH + 1, "%.0f", fabs(val));
! 		

Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Andres Freund
On 2015-03-18 14:00:51 -0700, Peter Geoghegan wrote:
 Anyway, I think that it's not quite the same. For one thing, we're
 talking about a GCC extension, not a type described by C99. We don't
 care about snprintf support, for example.

I don't see that that has any consequence wrt Andreas' test.

 For another, Andreas has chosen to lump together __int128 and unsigned
 __int128 into one test, where the latter really doesn't receive
 coverage.

On my urging actually. It's pretty darn unlikely that only one variant
will work. Useless configure tests just cost time. We're testing a gcc
extension here, as you point out, it'll not just stop working for
unsigned vs signed.

The reason we need a link test (vs just a compile test) is that gcc
links to helper functions to do math - even if they're not present on
the target platform. Compiling will succeed, but linking won't.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 9:19 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Mar 17, 2015 at 3:11 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 I've been thinking that it would be nice to be able to specify a constraint
 name. Naming an index directly feels wrong, as in relational and SQL
 philosophy, indexes are just an implementation detail, but naming a
 constraint is a fair game. It would also be nice to be able to specify use
 the primary key.

 Intuitively, I think you should specify an operator name, not a
 constraint name.  That's what we do for, e.g., exclusion constraints,
 and it feels right.  People sometimes create and drop indexes (and
 thus, perhaps, the constraints that depend on them) for maintenance
 reasons where a change in semantics will be unwelcome.

I think we should use a constraint name. That is the plain reality of
what we're doing, and is less ambiguous than an operator. 99% of the
time you'll be happy with an unspecified, across-the-board IGNORE, or
won't be using exclusion constraints anyway (so we can infer a unique
index).

A constraint name covers all reasonable cases, since partial unique
indexes are otherwise covered (partial unique indexes do not have a
pg_constraint entry). Oracle has a hint for ignoring particular, named
unique indexes (not constraints). I realize that Oracle hints are not
supposed to affect semantics, but this is actually true (Google it).
This is a bit ugly, but less ugly than the hint, since as Heikki points
out we're only naming a constraint. Naming a constraint reflects the
reality of how the feature needs to work, and has a sort of precedent
from other systems.
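
To illustrate the distinction being drawn (the names here are hypothetical):
only a declared constraint gets a pg_constraint entry with a name the conflict
clause could refer to, while a partial unique index does not:

    CREATE TABLE t (key int, flag boolean);

    -- Declared constraint: backed by a unique index, and it has a row in
    -- pg_constraint, so it has a referenceable name.
    ALTER TABLE t ADD CONSTRAINT t_key_uniq UNIQUE (key);

    -- Partial unique index: no pg_constraint entry, so it could only ever be
    -- inferred from its columns and predicate.
    CREATE UNIQUE INDEX t_key_partial ON t (key) WHERE flag;

    SELECT conname FROM pg_constraint WHERE conrelid = 't'::regclass;
    --   conname
    --  ------------
    --   t_key_uniq
    --  (1 row)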

 But I don't
 accept Peter's argument that it's OK to be indifferent to which
 particular equality semantics are being used.

I am not suggesting actual indifference makes sense. I am leaving it
up to the definition of available indexes. And there are no known
cases where it could matter anyway, unless you consider the ===
operator for tuples to be a counter example. And you need multiple
conflicting unique indexes on the exact same attributes/expressions on
attributes to begin with. Isn't that a highly esoteric thing to have
to worry about? Perhaps to the extent that literally no one will ever
have to care anyway? It's an oddball use-case, if ever I saw one.

Note: the issue of caring about equality semantics across B-Tree
opclasses of the same type, and the issue of naming unique indexes are
separate issues, AFAICT. No one should confuse them. The only
crossover is that the oddball use-case mentioned could use the named
constraint thing as an escape hatch.

As I've said, I think it's misguided to try to make unique indexes
100% an implementation detail. It's going to fall apart in edge cases,
like the one with multiple unique indexes that I mentioned in my last
e-mail. No one thinks of them that way, including users.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and rsync

2015-03-18 Thread Bruce Momjian
On Fri, Mar  6, 2015 at 12:19:36PM -0500, Bruce Momjian wrote:
 On Fri, Mar  6, 2015 at 10:50:27AM +0300, Vladimir Borodin wrote:
  What I have done is to update the pg_upgrade instructions to add this
  required step.  Updated doc patch attached.  (I also added the --delete
  flag to rsync.)  Thanks so much for your detailed report.
  
  
  It seems to work fine now. The only thing that would be nice to change is
  to explicitly write that these instructions are correct for hot standby
  installations too.
  
  + para
  +  If you have Log-Shipping Standby Servers (xref
  +  linkend=warm-standby), follow these steps to upgrade them (before
  +  starting any servers):
  + /para
  
  Actually, I’ve entered this thread because it is not obvious from the 
  paragraph
  above or any other places.
 
 Oh, very good point.  I was trying to match the wording we use in the
 docs, but forgot that log shipping and streaming replication are
 specified separately.
 
 Updated patch attached.  You can view the output at:
 
   http://momjian.us/tmp/pgsql/pgupgrade.html
 
 Thanks much!

Patch applied to head.  While it will work all the way back to 9.0, the
instructions need a shaking out during beta.  I will blog that people
can try it.  Thanks to Stephen Frost for coming up with this solution.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GSoC - Idea Discussion

2015-03-18 Thread hitesh ramani
Hello devs,

As stated earlier, I was thinking of proposing the integration of Postgres and 
CUDA for faster execution of ORDER BY queries, by optimizing the sorting code 
to sort with CUDA. I saw and tried to run PG Strom and ran into issues. 
Moreover, PG Strom is implemented in OpenCL, not CUDA.

I have hardware to run CUDA, and I'm currently at a point where I have almost 
integrated Postgres and CUDA. This opens the gates for a lot of features that 
could be optimized through CUDA and parallel processing, though here I only 
want to focus on sorting, which keeps the work feasible for the time period.

From the research I did, I found CUDA is more efficient not just in parallel 
performance but in data transfer latency too. My idea is to create a branch of 
Postgres with the CUDA-integrated code.

As for feasibility, I think it's very much feasible because I've almost 
integrated CUDA execution, and the code then needs to be optimized for CUDA.

Please give your valuable suggestions and views on this.

Thanks and Regards,
Hitesh Ramani

Re: [HACKERS] parallel mode and parallel contexts

2015-03-18 Thread Andres Freund
Hi,

Reading the README first, the rest later. So you can comment on my
comments, while I actually look at the code. Parallelism, yay!

On 2015-03-18 12:02:07 -0400, Robert Haas wrote:
 +Instead, we take a more pragmatic approach: we try to make as many of the
 +operations that are safe outside of parallel mode work correctly in parallel
 +mode as well, and we try to prohibit the rest via suitable error
 checks.

I'd say that we'd try to prohibit a bunch of common cases. Among them
the ones that are triggerable from SQL. We don't really try to prohibit
many kinds of traps, as you describe.

 +  - The authenticated user ID and current database.  Each parallel worker
 +will connect to the same database as the initiating backend, using the
 +same user ID.

This sentence immediately makes me wonder why only the authenticated
user, and not the currently active role, is mentioned.

 +  - The values of all GUCs.  Accordingly, permanent changes to the value of
 +any GUC are forbidden while in parallel mode; but temporary changes,
 +such as entering a function with non-NULL proconfig, are potentially OK.

"potentially OK" sounds odd to me. Which danger are you seeing that isn't
relevant for norm

 +  - The combo CID mappings.  This is needed to ensure consistent answers to
 +tuple visibility checks.  The need to synchronize this data structure is
 +a major reason why we can't support writes in parallel mode: such writes
 +might create new combo CIDs, and we have no way to let other workers
 +(or the initiating backend) know about them.

Absolutely *not* in the initial version, but we really need to find a
better solution for this.

 +  - The currently active user ID and security context.  Note that this is
 +the fourth user ID we restore: the initial step of binding to the correct
 +database also involves restoring the authenticated user ID.  When GUC
 +values are restored, this incidentally sets SessionUserId and OuterUserId
 +to the correct values.  This final step restores CurrentUserId.

Ah. That's the answer for above. Could you just move it next to the
other user bit?

 +We could copy the entire transaction state stack,
 +but most of it would be useless: for example, you can't roll back to a
 +savepoint from within a parallel worker, and there are no resources
 +associated with the memory contexts or resource owners of intermediate
 +subtransactions.

I do wonder if we're not going to have to change this in the not too far
away future. But then, this isn't externally visible at all, so
whatever.

 +At the end of a parallel operation, which can happen either because it
 +completed successfully or because it was interrupted by an error, parallel
 +workers associated with that operation exit.  In the error case, transaction
 +abort processing in the parallel leader kills off any remaining workers, and
 +the parallel leader then waits for them to die.

Very slightly awkward because first you talk about successful *or* error
and then about abort processing.

 +  - Cleanup of pg_temp namespaces is not done.  The initiating backend is
 +responsible for this, too.

How could a worker have its own pg_temp namespace?

 +Lock Management
 +===
 +
 +Certain heavyweight locks that the initiating backend holds at the beginning
 +of parallelism are copied to each worker, which unconditionally acquires 
 them.
 +The parallel backends acquire - without waiting - each such lock that the
 +leader holds, even if that lock is self-exclusive.  This creates the unusual
 +situation that a lock which could normally only be held by a single backend
 +can be shared among several backends in a parallel group.
 +
 +Obviously, this presents significant hazards that would not be present in
 +normal execution.  If, for example, a backend were to initiate parallelism
 +while ReindexIsProcessingIndex() were true for some index, the parallel
 +backends launched at that time would neither share this state nor be excluded
 +from accessing the index via the heavyweight lock mechanism.  It is therefore
 +imperative that backends only initiate parallelism from places where it will
 +be safe for parallel workers to access the relations on which they hold 
 locks.
 +It is also important that they not afterwards do anything which causes access
 +to those relations to become unsafe, or at least not until after parallelism
 +has concluded.  The fact that parallelism is strictly read-only means that 
 the
 +opportunities for such mishaps are few and far between; furthermore, most
 +operations which present these hazards are DDL operations, which will be
 +rejected by CheckTableNotInUse() if parallel mode is active.
 +
 +Only relation locks, locks on database or shared objects, and perhaps the 
 lock
 +on our transaction ID are copied to workers.

perhaps?

 Advisory locks are not copied,
 +but the leader may hold them at the start of parallelism; they cannot
 +subsequently be manipulated 

Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2015-03-18 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 1:05 AM, Andres Freund and...@2ndquadrant.com wrote:
 I think it's a pretty direct copy of the 64bit code. I'm not entirely
 sure why this needs a AC_TRY_RUN with a compile fallback (for cross) and
 why a AC_TRY_LINK isn't sufficient? But then, you just copied that
 decision.

Good point.

Anyway, I think that it's not quite the same. For one thing, we're
talking about a GCC extension, not a type described by C99. We don't
care about snprintf support, for example. For another, Andreas has
chosen to lump together __int128 and unsigned __int128 into one test,
where the latter really doesn't receive coverage. This only makes
sense from the point of view of this optimization, since we need both
anyway, as int128_to_numericvar() uses the uint128 type. I think it's
fine to only use int128/uint128 for this optimization, and similar
optimizations, and consequently I think it's fine to lump the two
types together, but lets be clear that that's all we mean to do.

I'm going to keep things that way, and try and test unsigned __int128
too (but as part of the same, single configure test). It's not merely
a matter of mechanically copying what we do for int64 (or uint64),
though.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add LINE: hint when schemaname.typename is a non-existent schema

2015-03-18 Thread Jeevan Chalke
Álvaro,

I think there are a few open questions here, and thus I am marking it back to
Waiting on Author.

Please share your views on the review comments already posted.
Also make changes as Tom suggested about placing pstate at the beginning.

I am more concerned about this:

1.
postgres=# create or replace function
f1(a abc.test.id%type) returns int as
$$ select 1; $$
language sql;
ERROR:  schema abc does not exist

Is that expected?

2.
Also, what about pushing setup_parser_errposition_callback() inside
func_get_detail() as well, just to limit it to the namespace lookup?

Thanks

The new status of this patch is: Waiting on Author


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question about TEMP tables

2015-03-18 Thread David G. Johnston
On Tuesday, March 17, 2015, Воронин Дмитрий carriingfat...@yandex.ru
wrote:

   Make sure to show your full command(s) and the full, exact text of any
 errors.

 OK, I use PostgreSQL version 9.4.1.

 I create cluster 'main' and connect to it. After cluster init we have
 those shemas:

 postgres=# SELECT nspname FROM pg_namespace ;
   nspname
 
  pg_toast
  pg_temp_1
  pg_toast_temp_1
  pg_catalog
  public
  information_schema
 (6 rows)

 Owner of those schemas is postgres (OID 10).

 Now we try to create TEMP TABLE, for example:

 postgres=# CREATE TEMP TABLE temptable();
 CREATE TABLE

 Show namespaces:

 postgres=# SELECT nspname FROM pg_namespace ;
   nspname
 
  pg_toast
  pg_temp_1
  pg_toast_temp_1
  pg_catalog
  public
  information_schema
  pg_temp_2
  pg_toast_temp_2
 (8 rows)

 Now we create a new database testdb and connect to it:

 CREATE DATABASE testdb;
 \c testdb

 SHOW namespaces of testdb (we already connect to it):

 testdb=# SELECT nspname FROM pg_namespace ;
   nspname
 
  pg_toast
  pg_temp_1
  pg_toast_temp_1
  pg_catalog
  public
  information_schema
 (6 rows)

 OK, namespaces pg_temp_2 and pg_toast_temp_2 are not visible. But
 pg_temp_1 and pg_toast_temp_1 are visible. WHY?

 If we create some temp objects in testdb, Postgres will create namespaces
 pg_temp_3 and pg_toast_temp_3.

 Try to create temp table at pg_temp_1:


As I note below, you don't get to choose; you just say CREATE TEMP TABLE
schemaless_name


 CREATE TEMP TABLE pg_temp_1.temptable();
 ERROR: cannot create relations in temporary schemas of other sessions

 I get that error if I create some TEMP objects in the postgres database.

 --
 Best regards, Dmitry Voronin


Schemas are not global and so can vary between databases.

You do not specify the schema in which temp tables are created.  The system
auto-assigns them, and also creates them based on need.

Temporary objects only survive for the life of the session creating them.

Empty temp schemas are ugly but aside from ignoring/hiding them from your
viewer there isn't much worth doing.  The system will just recreate them if
you drop them manually.

It will create schemas numbered potentially up to the number of simultaneous
connections you allow.  They may have affinity to particular backends, but that
is an implementation detail you shouldn't care about.
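
A small illustration of that: the temp schema is assigned by the system on
first use, and you can look up which one your session got, but you never pick
it yourself:

    CREATE TEMP TABLE session_scratch (x int);

    SELECT nspname
    FROM pg_namespace
    WHERE oid = pg_my_temp_schema();
    --   nspname
    --  -----------
    --   pg_temp_2     (the number depends on which backend slot you got)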

David J.


Re: [HACKERS] Reduce pinning in btree indexes

2015-03-18 Thread Simon Riggs
On 18 March 2015 at 03:01, Jim Nasby jim.na...@bluetreble.com wrote:
 On 3/16/15 11:47 AM, Robert Haas wrote:

 I am sure there are more sophisticated things to be done here, but I
 guess my feeling is that time is a good way to go here for a first cut
 - lots of people have suggested it, and there's clearly a use case for
 it.  If the setting turns out to be popular, we can fine-tune it more
 once we've got some experience with it.  But I'm nervous about trying
 to to design something more complicated than that right now,
 especially so late in the release cycle.  We know that this setting,
 with time-based units, will meet the needs of the customer for whom it
 was developed, and that is a good-enough reason to think that time is
 a reasonable metric for this, even if we eventually have others.


 +1. As you said, time is easy for people to understand, and I think it will
 handle a large chunk of the use cases.

If it wasn't clear, I had already said that my idea was for the next
release, before Robert said that.


 I used to have this quote (or something close to it) on my whiteboard... I
 think it's appropriate here ;)

 The perfect is the enemy of the good. -Simon Riggs

Interesting to hear I was anybody's pinup ;-)

Not sure that is a saying of mine though, sounds more like Robert to me.

I do believe in something now, something better later. I'd call that
incremental benefit, as opposed to incremental development where the
benefit may not be realisable until the last point. Which is why I am
backing this patch for 9.5, even though the UI is less useful than I
think is eventually necessary.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-03-18 Thread Amit Kapila
On Tue, Mar 17, 2015 at 7:54 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Mar 17, 2015 at 1:42 AM, Amit Kapila amit.kapil...@gmail.com
wrote:
  The problem occurs in second loop inside DestroyParallelContext()
  where it calls WaitForBackgroundWorkerShutdown().  Basically
  WaitForBackgroundWorkerShutdown() just checks for BGWH_STOPPED
  status, refer below code in parallel-mode patch:
 
  + status = GetBackgroundWorkerPid(handle, pid);
  + if (status == BGWH_STOPPED)
  + return status;
 
  So if the status here returned is BGWH_NOT_YET_STARTED, then it
  will go for WaitLatch and will wait there forever.
 
  I think fix is to check if status is BGWH_STOPPED or
 BGWH_NOT_YET_STARTED,
  then just return the status.
 
  What do you say?

 No, that's not right.  If we return when the status is
 BGWH_NOT_YET_STARTED, then the postmaster could subsequently start the
 worker.

 Can you try this:

 diff --git a/src/backend/postmaster/bgworker.c
 b/src/backend/postmaster/bgworker.c
 index f80141a..39b919f 100644
 --- a/src/backend/postmaster/bgworker.c
 +++ b/src/backend/postmaster/bgworker.c
 @@ -244,6 +244,8 @@ BackgroundWorkerStateChange(void)
 rw-rw_terminate = true;
 if (rw-rw_pid != 0)
 kill(rw-rw_pid, SIGTERM);
 +   else
 +   ReportBackgroundWorkerPID(rw);
 }
 continue;
 }


It didn't fix the problem.  IIUC, you have done this to ensure that
if the worker is not already started, then its pid gets reported, so that we
can get the required status in WaitForBackgroundWorkerShutdown().
As this is a timing issue, it can happen that before the postmaster
gets a chance to report the pid, the backend has already started waiting
on WaitLatch().


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-03-18 Thread Kouhei Kaigai
 Robert Haas robertmh...@gmail.com writes:
  On Tue, Mar 17, 2015 at 10:11 AM, Kouhei Kaigai kai...@ak.jp.nec.com 
  wrote:
  It might be an idea: if the foreign-scan path is not wiped out regardless of
  the estimated cost, we will be able to construct an entirely remote-join path
  even if the intermediate path is more expensive than the local join.
  A problem is how to distinguish these special paths from the usual paths that
  were eliminated at the previous stage because they were more expensive.
 
  Any solution that is based on not eliminating paths that would
  otherwise be discarded based on cost seems to me to be unlikely to be
  feasible.  We can't complicate the core path-cost-comparison stuff for
  the convenience of FDW or custom-scan pushdown.
 
 I concur.  I'm not even so worried about the cost of add_path as such;
 the real problem with not discarding paths as aggressively as possible
 is that it will result in a combinatorial explosion in the number of
 path combinations that have to be examined at higher join levels.

I'm inclined to agree. It is also the conclusion of the discussion with Hanada-san
yesterday, due to the number of paths to be considered and the combinatorial
problems, as you mentioned above.

So, the overall consensus for the FDW hook location is just before set_cheapest()
in standard_join_search() and merge_clump(), isn't it?
Let me make a design of the FDW hook that reduces code duplication for each FDW
driver, especially for identifying the baserel/joinrel involved in this join.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers