Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Ali Akbar
2014-10-25 10:29 GMT+07:00 Ali Akbar the.ap...@gmail.com:

 I fixed a small issue in the regress tests and I enhanced tests for varlena
 types and null values.

 Thanks.

 it is about 15% faster than the original implementation.

 15% faster than array_agg(scalar)? I haven't verified the performance, but
 because the internal array data and null bitmap are copied as-is, that will
 be faster.

 2014-10-25 1:51 GMT+07:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi Ali


 I checked the code. I am thinking the code organization is not good:
 accumArrayResult is too long now, and makeMdArrayResult will not work when
 arrays were joined (it is not consistent now). I don't like the usage of
 state->is_array_accum in array_userfunc.c -- it is a signal of wrong
 wrapping.

 Yes, I was thinking the same. Attached is a WIP patch to reorganize the code.
 makeMdArrayResult works now, with the supplied arguments acting as overrides of
 the default values calculated from ArrayBuildStateArray.

 In array_userfunc.c, because we don't want to override ndims, dims and
 lbs, I copied array_agg_finalfn and only changed the call to
 makeMdArrayResult (we don't use makeArrayResult because we want to set
 release to false). Another alternative is to create a new
 makeArrayResult-like function that has a bool release parameter.


Adding makeArrayResult1 (do you have a better name for this?), which accepts
a bool release parameter. array_agg_finalfn becomes cleaner, and there is no
duplicate code in array_agg_anyarray_finalfn.
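
For reference, a minimal sketch of what such a wrapper could look like,
assuming the makeMdArrayResult() signature from the existing arrayfuncs.c
(an illustration only, not the code in the attached patch):

Datum
makeArrayResult1(ArrayBuildState *astate, MemoryContext rcontext, bool release)
{
	int		ndims;
	int		dims[1];
	int		lbs[1];

	/* if no elements were presented, return an empty array */
	ndims = (astate->nelems > 0) ? 1 : 0;
	dims[0] = astate->nelems;
	lbs[0] = 1;

	/* identical to makeArrayResult(), except the caller controls release */
	return makeMdArrayResult(astate, ndims, dims, lbs, rcontext, release);
}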


 next question: there is a function array_append(anyarray, anyelement).
 Isn't it time to define array_append(anyarray, anyarray) now?


 There is array_cat(anyarray, anyarray):

 /*-----------------------------------------------------------------------------
  * array_cat :
  *		concatenate two nD arrays to form an nD array, or
  *		push an (n-1)D array onto the end of an nD array
  *----------------------------------------------------------------------------
  */


Regards,
-- 
Ali Akbar
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz</literallayout>(3 rows)</entry>
      <row>
       <entry>
        <indexterm>
+        <primary>array_agg</primary>
+       </indexterm>
+       <function>array_agg(<replaceable class="parameter">anyarray</replaceable>)</function>
+      </entry>
+      <entry>
+       any
+      </entry>
+      <entry>
+       the same array type as input type
+      </entry>
+      <entry>input arrays, aggregated into higher-order multidimensional array. Rejects NULL and empty array as input.</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
         <primary>average</primary>
        </indexterm>
        <indexterm>
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)
 </programlisting>
The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 	type = exprType((Node *) tent->expr);
 	if (sublink->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(exprType((Node *) tent->expr)))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(exprType((Node *) tent->expr)))));
+		}
 	}
 }
 else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 	type = subplan->firstColType;
 	if (subplan->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(subplan->firstColType))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+		

Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Pavel Stehule
2014-10-25 8:19 GMT+02:00 Ali Akbar the.ap...@gmail.com:

 2014-10-25 10:29 GMT+07:00 Ali Akbar the.ap...@gmail.com:

 I fixed a small issue in the regress tests and I enhanced tests for varlena
 types and null values.

 Thanks.

 it is about 15% faster than the original implementation.

 15% faster than array_agg(scalar)? I haven't verified the performance, but
 because the internal array data and null bitmap are copied as-is, that will
 be faster.

 2014-10-25 1:51 GMT+07:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi Ali


 I checked the code. I am thinking the code organization is not good:
 accumArrayResult is too long now, and makeMdArrayResult will not work when
 arrays were joined (it is not consistent now). I don't like the usage of
 state->is_array_accum in array_userfunc.c -- it is a signal of wrong
 wrapping.

 Yes, I was thinking the same. Attached is a WIP patch to reorganize the
 code. makeMdArrayResult works now, with the supplied arguments acting as overrides
 of the default values calculated from ArrayBuildStateArray.

 In array_userfunc.c, because we don't want to override ndims, dims and
 lbs, I copied array_agg_finalfn and only changed the call to
 makeMdArrayResult (we don't use makeArrayResult because we want to set
 release to false). Another alternative is to create a new
 makeArrayResult-like function that has a bool release parameter.


 Adding makeArrayResult1 (do you have a better name for this?), which accepts
 a bool release parameter. array_agg_finalfn becomes cleaner, and there is no
 duplicate code in array_agg_anyarray_finalfn.


 makeArrayResult1 - I have no better name now

I found one more minor detail.

You reuse the array_agg_transfn function. Inside is the message
"array_agg_transfn called in non-aggregate context". It is not correct for
array_agg_anyarray_transfn.

Probably the specification of dims and lbs in array_agg_finalfn is useless when
you expect a natural result. An automatic computation similar to
array_agg_anyarray_finalfn should work there too.

makeMdArrayResultArray and appendArrayDatum should be marked as static



 next question: there is a function array_append(anyarray, anyelement).
 Isn't it time to define array_append(anyarray, anyarray) now?


 There is array_cat(anyarray, anyarray):

 /*-----------------------------------------------------------------------------
  * array_cat :
  *		concatenate two nD arrays to form an nD array, or
  *		push an (n-1)D array onto the end of an nD array
  *----------------------------------------------------------------------------
  */


 Regards,
 --
 Ali Akbar



Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Pavel Stehule
2014-10-25 8:33 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:



 2014-10-25 8:19 GMT+02:00 Ali Akbar the.ap...@gmail.com:

 2014-10-25 10:29 GMT+07:00 Ali Akbar the.ap...@gmail.com:

 I fixed a small issue in the regress tests and I enhanced tests for varlena
 types and null values.

 Thanks.

 it is about 15% faster than the original implementation.

 15% faster than array_agg(scalar)? I haven't verified the performance, but
 because the internal array data and null bitmap are copied as-is, that will
 be faster.

 2014-10-25 1:51 GMT+07:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi Ali


 I checked the code. I am thinking the code organization is not good:
 accumArrayResult is too long now, and makeMdArrayResult will not work when
 arrays were joined (it is not consistent now). I don't like the usage of
 state->is_array_accum in array_userfunc.c -- it is a signal of wrong
 wrapping.

 Yes, I was thinking the same. Attached is a WIP patch to reorganize the
 code. makeMdArrayResult works now, with the supplied arguments acting as overrides
 of the default values calculated from ArrayBuildStateArray.

 In array_userfunc.c, because we don't want to override ndims, dims and
 lbs, I copied array_agg_finalfn and only changed the call to
 makeMdArrayResult (we don't use makeArrayResult because we want to set
 release to false). Another alternative is to create a new
 makeArrayResult-like function that has a bool release parameter.


 Adding makeArrayResult1 (do you have a better name for this?), which accepts
 a bool release parameter. array_agg_finalfn becomes cleaner, and there is no
 duplicate code in array_agg_anyarray_finalfn.


  makeArrayResult1 - I have no better name now

 I found one more minor detail.

 You reuse the array_agg_transfn function. Inside is the message
 "array_agg_transfn called in non-aggregate context". It is not correct for
 array_agg_anyarray_transfn.

 Probably the specification of dims and lbs in array_agg_finalfn is useless when
 you expect a natural result. An automatic computation similar to
 array_agg_anyarray_finalfn should work there too.

 makeMdArrayResultArray and appendArrayDatum should be marked as static


I am thinking the change of makeArrayResult is wrong. It is designed for 1D
arrays. When we expect an ND result somewhere, then we should use
makeMdArrayResult there. So makeArrayResult should return a 1D array in
all cases. Otherwise the difference between makeArrayResult and makeMdArrayResult
isn't significant and we have a problem with naming.

Regards

Pavel






 next question: there is a function array_append(anyarray, anyelement).
 Isn't it time to define array_append(anyarray, anyarray) now?


 There is array_cat(anyarray, anyarray):

 /*-----------------------------------------------------------------------------
  * array_cat :
  *		concatenate two nD arrays to form an nD array, or
  *		push an (n-1)D array onto the end of an nD array
  *----------------------------------------------------------------------------
  */


 Regards,
 --
 Ali Akbar





Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Ali Akbar

  makeArrayResult1 - I have no better name now

 I found one more minor detail.

 You reuse the array_agg_transfn function. Inside is the message
 "array_agg_transfn called in non-aggregate context". It is not correct for
 array_agg_anyarray_transfn.


Fixed.


 Probably the specification of dims and lbs in array_agg_finalfn is useless when
 you expect a natural result. An automatic computation similar to
 array_agg_anyarray_finalfn should work there too.

 makeMdArrayResultArray and appendArrayDatum should be marked as static


ok, marked all of that as static.


 I am thinking the change of makeArrayResult is wrong. It is designed for
 1D arrays. When we expect an ND result somewhere, then we should use
 makeMdArrayResult there. So makeArrayResult should return a 1D array in
 all cases. Otherwise the difference between makeArrayResult and makeMdArrayResult
 isn't significant and we have a problem with naming.


I'm thinking of it like this:
- if we want to accumulate arrays normally, use accumArrayResult and
makeArrayResult. If we accumulate scalars the result will be a 1D array; if we
accumulate arrays, the resulting dimension is incremented by 1.
- if we somehow want to affect the normal behavior and change the
dimensions, use makeMdArrayResult.

Searching through the postgres code, other than internal use in
arrayfuncs.c, makeMdArrayResult is used only
in src/pl/plperl/plperl.c:plperl_array_to_datum, while converting a perl
array to a postgres array.

So if we somehow accumulate arrays other than in array_agg, I think the
most natural way is using accumArrayResult and then makeArrayResult.

CMIIW

Regards,
--
Ali Akbar
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz</literallayout>(3 rows)</entry>
      <row>
       <entry>
        <indexterm>
+        <primary>array_agg</primary>
+       </indexterm>
+       <function>array_agg(<replaceable class="parameter">anyarray</replaceable>)</function>
+      </entry>
+      <entry>
+       any
+      </entry>
+      <entry>
+       the same array type as input type
+      </entry>
+      <entry>input arrays, aggregated into higher-order multidimensional array. Rejects NULL and empty array as input.</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
         <primary>average</primary>
        </indexterm>
        <indexterm>
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)
 </programlisting>
The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 	type = exprType((Node *) tent->expr);
 	if (sublink->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(exprType((Node *) tent->expr)))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(exprType((Node *) tent->expr)))));
+		}
 	}
 }
 else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 	type = subplan->firstColType;
 	if (subplan->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(subplan->firstColType))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(subplan->firstColType))));
+		}
 	}
 }
 else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 

Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Pavel Stehule
2014-10-25 10:16 GMT+02:00 Ali Akbar the.ap...@gmail.com:

  makeArrayResult1 - I have no better name now

 I found one more minor detail.

 You reuse the array_agg_transfn function. Inside is the message
 "array_agg_transfn called in non-aggregate context". It is not correct for
 array_agg_anyarray_transfn.


 Fixed.


 Probably the specification of dims and lbs in array_agg_finalfn is useless when
 you expect a natural result. An automatic computation similar to
 array_agg_anyarray_finalfn should work there too.

 makeMdArrayResultArray and appendArrayDatum should be marked as static


 ok, marked all of that as static.


 I am thinking the change of makeArrayResult is wrong. It is designed for
 1D arrays. When we expect an ND result somewhere, then we should use
 makeMdArrayResult there. So makeArrayResult should return a 1D array in
 all cases. Otherwise the difference between makeArrayResult and makeMdArrayResult
 isn't significant and we have a problem with naming.


 I'm thinking of it like this:
 - if we want to accumulate arrays normally, use accumArrayResult and
 makeArrayResult. If we accumulate scalars the result will be a 1D array; if we
 accumulate arrays, the resulting dimension is incremented by 1.
 - if we somehow want to affect the normal behavior and change the
 dimensions, use makeMdArrayResult.

 Searching through the postgres code, other than internal use in
 arrayfuncs.c, makeMdArrayResult is used only
 in src/pl/plperl/plperl.c:plperl_array_to_datum, while converting a perl
 array to a postgres array.

 So if we somehow accumulate arrays other than in array_agg, I think
 the most natural way is using accumArrayResult and then makeArrayResult.



OK, there are more variants and I can't decide. But I am not satisfied
with this API. We did something wrong in the structure. makeMdArrayResult is now
ugly.





 CMIIW

 Regards,
 --
 Ali Akbar



Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2014-10-25 Thread Thom Brown
On 24 October 2012 18:17, Tom Lane t...@sss.pgh.pa.us wrote:

 Marco Nenciarini marco.nenciar...@2ndquadrant.it writes:
  Please find the attached refreshed patch (v2) which fixes the loose ends
  you found.

 Attached is a v3 patch that updates the syntax per discussion, uses what
 seems to me to be a saner (more extensible) catalog representation, and
 contains assorted other code cleanup.  I have not touched the
 documentation at all except for catalogs.sgml, so it still explains the
 old syntax.  I have to stop working on this now, because I've already
 expended more time on it than I should, and it still has the serious
 problems mentioned in
 http://archives.postgresql.org/message-id/16787.1351053...@sss.pgh.pa.us
 and
 http://archives.postgresql.org/message-id/28389.1351094...@sss.pgh.pa.us

 I'm going to mark this Returned With Feedback for the current CF.


Does anyone have any intention of resurrecting this at this stage?

-- 
Thom


Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Ali Akbar
2014-10-25 15:43 GMT+07:00 Pavel Stehule pavel.steh...@gmail.com:



 2014-10-25 10:16 GMT+02:00 Ali Akbar the.ap...@gmail.com:

  makeArrayResult1 - I have no better name now

 I found one more minor detail.

 You reuse the array_agg_transfn function. Inside is the message
 "array_agg_transfn called in non-aggregate context". It is not correct for
 array_agg_anyarray_transfn.


 Fixed.


 Probably the specification of dims and lbs in array_agg_finalfn is useless when
 you expect a natural result. An automatic computation similar to
 array_agg_anyarray_finalfn should work there too.

 makeMdArrayResultArray and appendArrayDatum should be marked as
 static


 ok, marked all of that as static.


 I am thinking the change of makeArrayResult is wrong. It is designed for
 1D arrays. When we expect an ND result somewhere, then we should use
 makeMdArrayResult there. So makeArrayResult should return a 1D array in
 all cases. Otherwise the difference between makeArrayResult and makeMdArrayResult
 isn't significant and we have a problem with naming.


 I'm thinking of it like this:
 - if we want to accumulate arrays normally, use accumArrayResult and
 makeArrayResult. If we accumulate scalars the result will be a 1D array; if we
 accumulate arrays, the resulting dimension is incremented by 1.
 - if we somehow want to affect the normal behavior and change the
 dimensions, use makeMdArrayResult.

 Searching through the postgres code, other than internal use in
 arrayfuncs.c, makeMdArrayResult is used only
 in src/pl/plperl/plperl.c:plperl_array_to_datum, while converting a perl
 array to a postgres array.

 So if we somehow accumulate arrays other than in array_agg, I think
 the most natural way is using accumArrayResult and then makeArrayResult.


 OK, there are more variants and I can't decide. But I am not satisfied
 with this API. We did something wrong in the structure. makeMdArrayResult is now
 ugly.


One approach that I can think of is to cleanly separate the structures and
API. We don't touch the existing ArrayBuildState, accumArrayResult,
makeArrayResult and makeMdArrayResult, and we create new functions:
ArrayBuildStateArray, accumArrayResultArray, makeArrayResultArray and
makeArrayResultArrayCustom.

But if we do that, the array(subselect) implementation will not be as
simple as in the current patch. We must also change all the ARRAY_SUBLINK-related
accumArrayResult calls.

Regarding the current patch implementation, the specific typedefs are declared
as:
+typedef struct ArrayBuildStateArray
+{
+	ArrayBuildState astate;
...
+} ArrayBuildStateArray;

so it necessitates rather ugly code like this:
+	result->elemtype = astate->astate.element_type;

Can we use the C11 feature of unnamed structs, like this?
+typedef struct ArrayBuildStateArray
+{
+	ArrayBuildState;
...
+} ArrayBuildStateArray;

so the code can be a little less ugly by doing it like this:
+	result->elemtype = astate->element_type;

I don't know whether all currently supported compilers implement this
feature...
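
For what it's worth, here is a self-contained illustration of the idea (names
invented for the demo). Note that standard C11 only allows *untagged*
anonymous structs; embedding a named type like ArrayBuildState anonymously,
as in the snippet above, is a Microsoft/GNU extension (-fms-extensions),
which makes the portability question even harder:

#include <stdio.h>

typedef struct Demo
{
	struct { int element_type; };	/* C11 anonymous struct: members lifted up */
	int		extra;
} Demo;

int
main(void)
{
	Demo	d = { .element_type = 42, .extra = 7 };

	/* element_type is accessed directly, without a nested member name */
	printf("%d %d\n", d.element_type, d.extra);
	return 0;
}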


Can you suggest a better structure for this?

If we can't settle on a better structure, I think I'll reimplement
array_agg without the performance improvement, using deconstruct_array and
the unchanged accumArrayResult & makeMdArrayResult.

Thanks,
-- 
Ali Akbar


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-25 Thread Alvaro Herrera
Peter Geoghegan wrote:
 On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek p...@2ndquadrant.com wrote:

  Ugh, you want to auto-magically detect what value is behind the EXCLUDED
  based on how/where it's used in the UPDATE? That seems like quite a bad
  idea.
 
 That's *exactly* how DEFAULT works within UPDATE targetlists. There
 might be a few more details to work out here, but not terribly many,
 and that's going to be true no matter what. 95%+ of the time, it'll
 just be val = EXCLUDED anyway.

Petr's thought mirrors mine.  What are you going to do the other 5% of
the time?  Is there some other way to refer to the columns of the
excluded row?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-25 Thread Alvaro Herrera
Robert Haas wrote:
 On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby jim.na...@bluetreble.com wrote:
  On 10/24/14, 12:21 PM, Robert Haas wrote:
  - What should we call dsm_unkeep_mapping, if not that?
 
  Only option I can think of beyond unkeep would be
  dsm_(un)register_keep_mapping. Dunno that it's worth it.
 
 Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
 since it's arranging to keep it by unregistering it from the resource
 owner.  And then we could call the new function
 dsm_register_mapping().  That has the appeal that "unregister" is a
 word, whereas "unkeep" isn't, but it's a little confusing otherwise,
 because the sense is reversed vs. the current naming.  Or we could
 just leave dsm_keep_mapping() alone and call the new function
 dsm_register_mapping().  A little non-orthogonal, but I think it'd be
 OK.

I do think that dsm_keep_mapping is a strange name for what it does.  If
it were a single function maybe it'd be okay because you read the
comments and know what it does, but if we have to invent the unkeep
stuff then it gets weird enough that a better solution seems worthwhile
to me.  So +1 for renaming it to something else.  I like the pin
analogy we use for buffers; so if you pin a mapping (dsm_pin_mapping)
then it doesn't go away automatically with the resowner, and then you
unpin it (dsm_unpin_mapping) to make the system collect it.

(Not sure what this means for the per-segment equivalent function.  It
might be worth keeping both namings consistent.)
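
For illustration, the renamed API could be declared along these lines
(hypothetical names following the pin analogy; 9.4 ships
dsm_keep_mapping()/dsm_keep_segment()):

typedef struct dsm_segment dsm_segment;		/* opaque, as in dsm.h */

/* keep the mapping alive after its resource owner is released */
extern void dsm_pin_mapping(dsm_segment *seg);

/* hand the mapping back to resource-owner bookkeeping for cleanup */
extern void dsm_unpin_mapping(dsm_segment *seg);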

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Pavel Stehule
2014-10-25 12:20 GMT+02:00 Ali Akbar the.ap...@gmail.com:

 2014-10-25 15:43 GMT+07:00 Pavel Stehule pavel.steh...@gmail.com:



 2014-10-25 10:16 GMT+02:00 Ali Akbar the.ap...@gmail.com:

  makeArrayResult1 - I have no better name now

 I found one more minor detail.

 You reuse the array_agg_transfn function. Inside is the message
 "array_agg_transfn called in non-aggregate context". It is not correct for
 array_agg_anyarray_transfn.


 Fixed.


 Probably the specification of dims and lbs in array_agg_finalfn is useless
 when you expect a natural result. An automatic computation similar to
 array_agg_anyarray_finalfn should work there too.

 makeMdArrayResultArray and appendArrayDatum should be marked as
 static


 ok, marked all of that as static.


 I am thinking the change of makeArrayResult is wrong. It is designed
 for 1D arrays. When we expect an ND result somewhere, then we should use
 makeMdArrayResult there. So makeArrayResult should return a 1D array in
 all cases. Otherwise the difference between makeArrayResult and makeMdArrayResult
 isn't significant and we have a problem with naming.


 I'm thinking of it like this:
 - if we want to accumulate arrays normally, use accumArrayResult and
 makeArrayResult. If we accumulate scalars the result will be a 1D array; if we
 accumulate arrays, the resulting dimension is incremented by 1.
 - if we somehow want to affect the normal behavior and change the
 dimensions, use makeMdArrayResult.

 Searching through the postgres code, other than internal use in
 arrayfuncs.c, makeMdArrayResult is used only
 in src/pl/plperl/plperl.c:plperl_array_to_datum, while converting a perl
 array to a postgres array.

 So if we somehow accumulate arrays other than in array_agg, I think
 the most natural way is using accumArrayResult and then makeArrayResult.


 OK, there are more variants and I can't decide. But I am not satisfied
 with this API. We did something wrong in the structure. makeMdArrayResult is now
 ugly.


 One approach that I can think of is to cleanly separate the structures and
 API. We don't touch the existing ArrayBuildState, accumArrayResult,
 makeArrayResult and makeMdArrayResult, and we create new functions:
 ArrayBuildStateArray, accumArrayResultArray, makeArrayResultArray and
 makeArrayResultArrayCustom.


Yes, I am thinking about a separate path that will be joined in
constructMdArray.

In reality, there are two different array builders with different APIs, and it is
better to respect that.



 But if we do that, the array(subselect) implementation will not be as
 simple as in the current patch. We must also change all the ARRAY_SUBLINK-related
 accumArrayResult calls.

 Regarding the current patch implementation, the specific typedefs are declared
 as:
 +typedef struct ArrayBuildStateArray
 +{
 +	ArrayBuildState astate;
 ...
 +} ArrayBuildStateArray;

 so it necessitates rather ugly code like this:
 +	result->elemtype = astate->astate.element_type;

 Can we use the C11 feature of unnamed structs, like this?


No, as far as I know, C11 is prohibited.


 +typedef struct ArrayBuildStateArray
 +{
 +	ArrayBuildState;
 ...
 +} ArrayBuildStateArray;

 so the code can be a little less ugly by doing it like this:
 +	result->elemtype = astate->element_type;

 I don't know whether all currently supported compilers implement this
 feature...


 Can you suggest a better structure for this?

 If we can't settle on a better structure, I think I'll reimplement
 array_agg without the performance improvement, using deconstruct_array and
 the unchanged accumArrayResult & makeMdArrayResult.


Can you check it? We can test how much performance we lose. As a second
benefit we can get numbers for introducing the new optimized array builder.

Regards

Pavel



 Thanks,
 --
 Ali Akbar



Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-10-25 Thread Amit Kapila
On Tue, Oct 7, 2014 at 11:10 AM, Dilip kumar dilip.ku...@huawei.com wrote:

On 26 September 2014 12:24, Amit Kapila wrote:

 I don't think this can handle cancel requests properly because

 you are just setting it in GetIdleSlot() what if the cancel

 request came during GetQueryResult() after sending sql for

 all connections (probably that's the reason why Jeff is not able

 to cancel the vacuumdb when using parallel option).



 You are right, I have fixed it in the latest patch; please check the latest
patch @ (
4205e661176a124faf891e0a6ba9135266363...@szxeml509-mbs.china.huawei.com)



*************** handle_sigint(SIGNAL_ARGS)
*** 358,363 ****
--- 358,364 ----

  /* Send QueryCancel if we are processing a database query */
  if (cancelConn != NULL)
  {
+ inAbort = true;
  if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
  fprintf(stderr, _("Cancel request sent\n"));
  else

Do we need to set the inAbort flag only in case PQcancel is successful?
Basically, if PQcancel fails for any reason, I think the behaviour
can be undefined, as the executing thread can assume that the cancel is
done.
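
In other words, something along these lines, where the flag is set only once
PQcancel() reports success (a sketch against the patch's inAbort/cancelConn
variables, not the patch itself):

/* Send QueryCancel if we are processing a database query */
if (cancelConn != NULL)
{
	if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
	{
		inAbort = true;			/* mark abort only after a successful cancel */
		fprintf(stderr, _("Cancel request sent\n"));
	}
	else
		fprintf(stderr, _("Could not send cancel request: %s"), errbuf);
}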

*************** consoleHandler(DWORD dwCtrlType)
*** 391,396 ****
--- 392,399 ----
  	EnterCriticalSection(&cancelConnLock);
  	if (cancelConn != NULL)
  	{
+ 		inAbort = true;
+

You have set this flag in the Windows handler; however, it
is never used in the Windows case. Are you expecting any use of this
flag on Windows?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2014-10-25 Thread Alvaro Herrera
Thom Brown wrote:
 On 24 October 2012 18:17, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Marco Nenciarini marco.nenciar...@2ndquadrant.it writes:
   Please find the attached refreshed patch (v2) which fixes the loose ends
   you found.
 
  Attached is a v3 patch that updates the syntax per discussion, uses what
  seems to me to be a saner (more extensible) catalog representation, and
  contains assorted other code cleanup.  I have not touched the
  documentation at all except for catalogs.sgml, so it still explains the
  old syntax.  I have to stop working on this now, because I've already
  expended more time on it than I should, and it still has the serious
  problems mentioned in
  http://archives.postgresql.org/message-id/16787.1351053...@sss.pgh.pa.us
  and
  http://archives.postgresql.org/message-id/28389.1351094...@sss.pgh.pa.us
 
  I'm going to mark this Returned With Feedback for the current CF.
 
 
 Does anyone have any intention of resurrecting this at this stage?

Not in this room.  Do you?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2014-10-25 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Thom Brown wrote:

  Does anyone have any intention of resurrecting this at this stage?
 
 Not in this room.  Do you?

I should have added some more context so that people realize that this
room contains the 2ndQuadrant people involved in writing this patch.
Also I wanted to say that I find the remaining problems as outlined by
Tom very interesting and I would consider attacking them, except that I
have a few other time commitments at the moment.  But if there's anyone
out there with an inclination towards interesting problems, it might be
possible to get them to lend a hand here.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of accept incoming network connections prompts on OS X

2014-10-25 Thread Peter Eisentraut
On 10/24/14 10:27 AM, Tom Lane wrote:
 Some clients (eg JDBC) don't support Unix-socket connections AFAIK, so
 this seems like a rather restricted solution.

While this is a valid point, they're actually working on fixing that.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of accept incoming network connections prompts on OS X

2014-10-25 Thread Peter Eisentraut
On 10/24/14 9:39 AM, Tom Lane wrote:
 Peter, Dave: maybe you have tweaked things to keep listen_addresses
 empty and rely only on Unix-socket connections?

I can confirm that I do get the popup when starting an installed
postmaster with the default settings.

Given that this doesn't affect make check anymore, I'm unsure about
this patch.  There is a lot of magic in the configure change.  I don't
know what to pass as the configure option argument, so can't really
evaluate that.  I'd like to see an explanation for what is done there.

I'm afraid there is security ridicule potential.  We are essentially
adding an option to patch out an operating system security feature that
the user chose.  Some might find that neat and ship binaries built that
way.  Because it's --with-codesign and not
--with-codesign-for-devel-dont-use-in-production.

Have we dug deep enough into the firewall configuration to evaluate
other options?  Can we, for example, exclude a port range?

I could see adding this as a contrib script if we don't find a better way.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function array_agg(array)

2014-10-25 Thread Ali Akbar

 Can you check it? We can test how much performance we lose. As a second
 benefit we can get numbers for introducing the new optimized array builder.


array_agg(anyarray) with deconstruct_array, unchanged accumArrayResult and
makeMdArrayResult:

 INSERT 0 1
 Time: 852,527 ms
 INSERT 0 1
 Time: 844,275 ms
 INSERT 0 1
 Time: 858,855 ms
 INSERT 0 1
 Time: 861,072 ms
 INSERT 0 1
 Time: 952,006 ms
 INSERT 0 1
 Time: 953,918 ms
 INSERT 0 1
 Time: 926,945 ms
 INSERT 0 1
 Time: 923,692 ms
 INSERT 0 1
 Time: 940,916 ms
 INSERT 0 1
 Time: 948,700 ms
 INSERT 0 1
 Time: 933,333 ms
 INSERT 0 1
 Time: 948,869 ms
 INSERT 0 1
 Time: 847,113 ms
 INSERT 0 1
 Time: 908,572 ms



Total: 12776.83

Avg: 912,63


with the last patch (v10):

 INSERT 0 1
 Time: 643,339 ms
 INSERT 0 1
 Time: 608,010 ms
 INSERT 0 1
 Time: 610,465 ms
 INSERT 0 1
 Time: 613,931 ms
 INSERT 0 1
 Time: 616,466 ms
 INSERT 0 1
 Time: 634,754 ms
 INSERT 0 1
 Time: 683,566 ms
 INSERT 0 1
 Time: 656,665 ms
 INSERT 0 1
 Time: 630,096 ms
 INSERT 0 1
 Time: 607,564 ms
 INSERT 0 1
 Time: 610,353 ms
 INSERT 0 1
 Time: 626,816 ms
 INSERT 0 1
 Time: 610,450 ms
 INSERT 0 1
 Time: 614,342 ms



Total: 8842,7
 Avg: 631,6


It's 30% faster (I tried a varlena element, text): 912,63 ms down to 631,6 ms
on average. I tried several times and the ~30% difference is consistent.


Quick & dirty non-optimized patch and the test script attached.

Regards,
--
Ali Akbar
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..f59738a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12046,6 +12046,22 @@ NULL baz</literallayout>(3 rows)</entry>
      <row>
       <entry>
        <indexterm>
+        <primary>array_agg</primary>
+       </indexterm>
+       <function>array_agg(<replaceable class="parameter">anyarray</replaceable>)</function>
+      </entry>
+      <entry>
+       any
+      </entry>
+      <entry>
+       the same array type as input type
+      </entry>
+      <entry>input arrays, aggregated into higher-order multidimensional array. Rejects NULL and empty array as input.</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
         <primary>average</primary>
        </indexterm>
        <indexterm>
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 2f0680f..8c182a4 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -2238,6 +2238,11 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
  array
 ---
  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31,2412,2413}
+
+SELECT ARRAY(SELECT array(select i) FROM generate_series(1,5) a(i));
+ array
+---
+ {{1},{2},{3},{4},{5}}
 (1 row)
 </programlisting>
The subquery must return a single column. The resulting
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 41e973b..0261fcb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -108,12 +108,16 @@ exprType(const Node *expr)
 	type = exprType((Node *) tent->expr);
 	if (sublink->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(exprType((Node *) tent->expr)))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(exprType((Node *) tent->expr)))));
+		}
 	}
 }
 else if (sublink->subLinkType == MULTIEXPR_SUBLINK)
@@ -139,12 +143,16 @@ exprType(const Node *expr)
 	type = subplan->firstColType;
 	if (subplan->subLinkType == ARRAY_SUBLINK)
 	{
-		type = get_array_type(type);
-		if (!OidIsValid(type))
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("could not find array type for data type %s",
-							format_type_be(subplan->firstColType))));
+		if (!OidIsValid(get_element_type(type)))
+		{
+			/* not array, so check for its array type */
+			type = get_array_type(type);
+			if (!OidIsValid(type))
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("could not find array type for data type %s",
+								format_type_be(subplan->firstColType))));
+		}
 	}
 }
 else if (subplan->subLinkType == MULTIEXPR_SUBLINK)
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..8fc8b49 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -668,10 +668,16 @@ build_subplan(PlannerInfo *root, Plan *plan, PlannerInfo *subroot,
 
 		Assert(!te->resjunk);
 		Assert(testexpr == NULL);
-		

[HACKERS] [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates

2014-10-25 Thread Andreas Karlsson

Hi,

There was recently talk about whether we should start using 128-bit integers
(where available) to speed up the aggregate functions over integers
which use numeric for their internal state. So I hacked together a
patch for this to see what the performance gain would be.


Previous thread: 
http://www.postgresql.org/message-id/20141017182500.gf2...@alap3.anarazel.de


What the patch does is switch the aggregate state from numeric to
int128, and then convert from the 128-bit integer to the result type in
the final function.


The functions where we can make use of int128 states are:

- sum(int8)
- avg(int8)
- var_*(int2)
- var_*(int4)
- stdev_*(int2)
- stdev_*(int4)

The initial benchmark results look very promising. When summing 10 
million int8 I get a speedup of ~2.5x and similarly for var_samp() on 10 
million int4 I see a speed up of ~3.7x. To me this indicates that it is 
worth the extra code. What do you say? Is this worth implementing?


The current patch still requires work. I have not written the detection 
of int128 support yet, and the patch needs code cleanup (for example: I 
used an int16_ prefix on the added functions, suggestions for better 
names are welcome). I also need to decide on what estimate to use for 
the size of that state.


The patch should work and pass make check on platforms where __int128_t 
is supported.
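
For the detection, one plausible approach (an assumption on my part, not
necessarily what the final patch will do) is a configure-time compile probe
along these lines, defining HAVE_INT128 only when the test program builds:

/* conftest.c: does the compiler support __int128_t arithmetic? */
int
main(void)
{
	__int128_t	a = 20141025;
	__uint128_t	b = 25;

	a = a * a;					/* exercise 128-bit multiplication */
	a += (__int128_t) b;
	return (int) (a >> 64);		/* force use of the high half */
}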


The simple benchmarks:

CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1,
10000000) x;


Before:

# SELECT sum(x) FROM test_int8;
       sum
----------------
 50000005000000
(1 row)

Time: 2521.217 ms

After:

# SELECT sum(x) FROM test_int8;
       sum
----------------
 50000005000000
(1 row)

Time: 1022.811 ms

CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1,
10000000) x;


Before:

# SELECT var_samp(x) FROM test_int4;
      var_samp
--------------------
 8333334166666.6667
(1 row)

Time: 3808.546 ms

After:

# SELECT var_samp(x) FROM test_int4;
      var_samp
--------------------
 8333334166666.6667
(1 row)

Time: 1033.243 ms

Andreas
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index 2d6a4cb..65a3d08
*** a/src/backend/utils/adt/numeric.c
--- b/src/backend/utils/adt/numeric.c
*************** typedef int16 NumericDigit;
*** 119,124 ****
--- 119,129 ----
   * The weight is arbitrary in that case, but we normally set it to zero.
   */
  
+ /* FIXME: Do this properly */
+ typedef __int128_t int128;
+ typedef __uint128_t uint128;
+ #define HAVE_INT128 1
+ 
  struct NumericShort
  {
  	uint16		n_header;		/* Sign + display scale + weight */
*************** static void apply_typmod(NumericVar *var
*** 389,394 ****
--- 394,402 ----
  static int32 numericvar_to_int4(NumericVar *var);
  static bool numericvar_to_int8(NumericVar *var, int64 *result);
  static void int8_to_numericvar(int64 val, NumericVar *var);
+ #ifdef HAVE_INT128
+ static void int16_to_numericvar(int128 val, NumericVar *var);
+ #endif
  static double numeric_to_double_no_overflow(Numeric num);
  static double numericvar_to_double_no_overflow(NumericVar *var);
  
*************** numeric_accum_inv(PG_FUNCTION_ARGS)
*** 2775,2783 ****
--- 2783,2867 ----
   * routines for SUM and AVG of these datatypes.
   */
  
+ #ifdef HAVE_INT128
+ typedef struct Int16AggState
+ {
+ 	bool	calcSumX2;	/* if true, calculate sumX2 */
+ 	int64	N;			/* count of processed numbers */
+ 	int128	sumX;		/* sum of processed numbers */
+ 	int128	sumX2;		/* sum of squares of processed numbers */
+ } Int16AggState;
+ 
+ /*
+  * Prepare state data for a numeric aggregate function that needs to compute
+  * sum, count and optionally sum of squares of the input.
+  */
+ static Int16AggState *
+ makeInt16AggState(FunctionCallInfo fcinfo, bool calcSumX2)
+ {
+ 	Int16AggState *state;
+ 	MemoryContext agg_context;
+ 	MemoryContext old_context;
+ 
+ 	if (!AggCheckCallContext(fcinfo, &agg_context))
+ 		elog(ERROR, "aggregate function called in non-aggregate context");
+ 
+ 	old_context = MemoryContextSwitchTo(agg_context);
+ 
+ 	state = (Int16AggState *) palloc0(sizeof(Int16AggState));
+ 	state->calcSumX2 = calcSumX2;
+ 
+ 	MemoryContextSwitchTo(old_context);
+ 
+ 	return state;
+ }
+ 
+ /*
+  * Accumulate a new input value for numeric aggregate functions.
+  */
+ static void
+ do_int16_accum(Int16AggState *state, int128 newval)
+ {
+ 	if (state->calcSumX2)
+ 		state->sumX2 += newval * newval;
+ 
+ 	state->sumX += newval;
+ 	state->N++;
+ }
+ 
+ /*
+  * Remove an input value from the aggregated state.
+  */
+ static void
+ do_int16_discard(Int16AggState *state, int128 newval)
+ {
+ 	if (state->calcSumX2)
+ 		state->sumX2 -= newval * newval;
+ 
+ 	state->sumX -= newval;
+ 	state->N--;
+ }
+ #endif
+ 
  Datum
  int2_accum(PG_FUNCTION_ARGS)
  {
+ #ifdef HAVE_INT128
+ 	Int16AggState *state;
+ 
+ 	state = PG_ARGISNULL(0) ? NULL : (Int16AggState *) PG_GETARG_POINTER(0);
+ 
+ 	/* Create the state data on the first call */
+ 	if (state == NULL)
+ 		state = makeInt16AggState(fcinfo, 

Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-25 Thread Robert Haas
On Fri, Oct 24, 2014 at 6:48 PM, Peter Geoghegan p...@heroku.com wrote:
 You're conflating the user-visible syntax with the parse tree
 representation in way that is utterly without foundation.  I don't
 have a position at this point on which parse-analysis representation
 is preferable, but it's completely wrong-headed to say that you
 can't make NEW.x produce the same parse-analysis output that your
 CONFLICTING(x) syntax would have created.  Sure, it might be harder,
 but it's not that much harder, and it's definitely not an unsolvable
 problem.

 I don't believe I did. The broader point is that the difficulty in
 making that work reflects the conceptually messiness, from
 user-visible aspects down. I can work with the difficulty, and I may
 even be able to live with the messiness, but I'm trying to bring the
 problems with it to a head sooner rather than later for good practical
 reasons. In all sincerity, my real concern is that you or the others
 will change your mind when I actually go and implement an OLD.* style
 syntax, and see the gory details. I regret it if asking this is asking
 too much of you, but FYI that's the thought process behind it.

I think what's more likely is that we'll complain about the way you
chose to implement it.  I don't believe your contention (in the other
email) that "Support
for an OLD.* style syntax would have to exist at *all* stages of query
execution, from parse analysis through to rewriting, planning, and
execution."  I think if your design for implementing that syntax
requires that, and your design for some other syntax doesn't require
that, then you're not thinking clearly enough about what needs to
happen in parse analysis.  Make the parse analysis for this syntax
emit the same representation that you would have had it emit in the
other syntax, and you won't have this problem.

 What if we spelled
 EXCLUDING/CONFLICTING as follows:

 INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
 EXCLUDED || 'this works' WHERE another_col != EXCLUDED;

 What do you think of that?

I am in complete agreement with the comments made by Petr and Alvaro.

 You're wrong.  The choice of which index to use is clearly wildly
 inappropriately placed in the parse analysis phase, and if you think
 that has any chance of ever being committed by anyone, then you are
 presuming the existence of a committer who won't mind ignoring the
 almost-immediate revert request that would draw from, at the very
 least, Tom.

 Why? This has nothing to do with optimization.

 That is false.  If there is more than one index that could be used,
 the system should select the best one.  That is an optimization
 decision per se.  Also, if a plan is saved - in the plancache, say, or
 in a view - the query can be re-planned if the index it depends on is
 dropped, but there's no way to redo parse analysis.

 Generating index paths for the UPDATE is a waste of cycles.
 Theoretically, there could be an (a, b, c) unique index and a (c,b,a)
 unique index, and those two might have a non-equal cost to scan. But
 that almost certainly isn't going to happen in practice, since that's
 a rather questionable indexing strategy, and even when it does, you're
 going to have to insert into all the unique indexes a good proportion
 of the time anyway, making the benefits of that approach pale in
 comparison to the costs.

I don't care whether you actually generate index-paths or not, and in
fact I suspect it makes no sense to do so.  But I do care that you do
a cost comparison between the available indexes and pick the one that
looks cheapest.  If somebody's got a bloated index and builds a new
one, they will want it to get used even before they drop the old one.

Even if that weren't an issue, though, the fact that you can't
re-parse but you can re-plan is a killer point AFAICS. It means you
are borked if the statement gets re-executed after the index you
picked is dropped.

 From my point of view, I spent a significant amount of time making the
 patch more or less match your proposed design for unique index
 inference. It is discouraging to hear that you think I'm not
 cooperating with community process. I'm doing my best. I think it
 would be a bad idea for me to not engage with the community for an
 extended period at this point. There were plenty of other issues
 addressed by V1.3 that were not the CONFLICTING()/EXCLUDING thing that
 you highlighted (or the other thing you highlighted around where to do
 unique index inference).

I think this gets at a point that has been bugging me and, perhaps,
other people here.  You often post a new patch with some fixes but
without fixes for the issues that reviewers have indicated are
top-of-mind for them.  Sometimes, but not always, you say something
like I know this doesn't fix X but I'd like comments on other aspects
of the patch.  Even when you do, though, it can be difficult to
overlook something that appears to be a fundamental structural problem
to comment on 

Re: [HACKERS] Reducing lock strength of adding foreign keys

2014-10-25 Thread Noah Misch
On Fri, Oct 24, 2014 at 12:07:42PM -0400, Robert Haas wrote:
 I think instead of focusing on foreign keys, we should rewind a bit
 and think about the locking level required to add a trigger.

Agreed.

 http://www.postgresql.org/message-id/ca+tgmoy4glsxzk0tao29-ljtcuj0sl1xwcwq51xb-hfysgi...@mail.gmail.com
 http://www.postgresql.org/message-id/20893.1393892...@sss.pgh.pa.us
 http://www.postgresql.org/message-id/20140306224340.ga3551...@tornado.leadboat.com
 
 As far as triggers are concerned, the issue of skew between the
 transaction snapshot and what the ruleutils.c snapshots do seems to be
 the principal issue.  Commit e5550d5fec66aa74caad1f79b79826ec64898688
 changed pg_get_constraintdef() to use an MVCC snapshot rather than a
 current MVCC snapshot; if that change is safe, I am not aware of any
 reason why we couldn't change pg_get_triggerdef() similarly.

pg_get_triggerdef() is fine as-is with concurrent CREATE TRIGGER.  The
pg_get_constraintdef() change arose to ensure a consistent result when
concurrent ALTER TABLE VALIDATE CONSTRAINT mutates a constraint definition.
(Reducing the lock level of DROP TRIGGER or ALTER TRIGGER, however, would
create the analogous problem for pg_get_triggerdef().)

 So I tentatively propose (and with due regard for the possibility
 others may see dangers that I've missed) that a reasonable goal would
 be to lower the lock strength required for both CREATE TRIGGER and ADD
 FOREIGN KEY from AccessExclusiveLock to ShareRowExclusiveLock,
 allowing concurrent SELECT and SELECT FOR SHARE against the tables,
 but not any actual write operations.

+1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2014-10-25 Thread Thom Brown
On 25 October 2014 13:28, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Thom Brown wrote:
  On 24 October 2012 18:17, Tom Lane t...@sss.pgh.pa.us wrote:
 
   Marco Nenciarini marco.nenciar...@2ndquadrant.it writes:
Please find the attached refreshed patch (v2) which fixes the loose
 ends
you found.
  
   Attached is a v3 patch that updates the syntax per discussion, uses
 what
   seems to me to be a saner (more extensible) catalog representation, and
   contains assorted other code cleanup.  I have not touched the
   documentation at all except for catalogs.sgml, so it still explains the
   old syntax.  I have to stop working on this now, because I've already
   expended more time on it than I should, and it still has the serious
   problems mentioned in
  
 http://archives.postgresql.org/message-id/16787.1351053...@sss.pgh.pa.us
   and
  
 http://archives.postgresql.org/message-id/28389.1351094...@sss.pgh.pa.us
  
   I'm going to mark this Returned With Feedback for the current CF.
  
 
  Does anyone have any intention of resurrecting this at this stage?

 Not in this room.  Do you?


I'm not qualified to, but I'm happy to make time to test it when it next
gets picked up.  My email was really just bumping the topic.
-- 
Thom


Re: [HACKERS] Getting rid of accept incoming network connections prompts on OS X

2014-10-25 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Given that this doesn't affect make check anymore, I'm unsure about
 this patch.  There is a lot of magic in the configure change.  I don't
 know what to pass as the configure option argument, so can't really
 evaluate that.  I'd like to see an explanation for what is done there.

As I said, I'd not written any docs.  The argument that would get passed
there is just a name identifying the signing certificate you want to use.
Most of the documentation would be about how to create such a cert, no
doubt.  (It's pretty simple to make a self-signed cert using Apple's
keychain utility, but it would require some explanation.)

 I'm afraid there is security ridicule potential.  We are essentially
 adding an option to patch out an operating system security feature that
 the user chose.  Some might find that neat and ship binaries built that
 way.  Because it's --with-codesign and not
 --with-codesign-for-devel-dont-use-in-production.

Yeah, that would be a risk :-(.  However, for the typical case of a
self-signed certificate, nothing much would happen because no one
else's machine would even have the same certificate let alone trust it.

 Have we dug deep enough into the firewall configuration to evaluate
 other options?  Can we, for example, exclude a port range?

Not that I've been able to detect.  Any simple way to do that would
presumably open up exactly the security hole Apple is trying to close,
so I'd bet against there being one.  (It is annoying that the firewall
triggers on ports bound to 127.0.0.1, though --- it's not apparent why
that's a security risk.  Perhaps there's some way to adjust that choice?)

 I could see adding this as a contrib script if we don't find a better way.

Meh.  That's just a less convenient packaging of the same code, with
the same potential for misuse.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of accept incoming network connections prompts on OS X

2014-10-25 Thread Tom Lane
I wrote:
 Peter Eisentraut pete...@gmx.net writes:
 Have we dug deep enough into the firewall configuration to evaluate
 other options?  Can we, for example, exclude a port range?

 Not that I've been able to detect.  Any simple way to do that would
 presumably open up exactly the security hole Apple is trying to close,
 so I'd bet against there being one.  (It is annoying that the firewall
 triggers on ports bound to 127.0.0.1, though --- it's not apparent why
 that's a security risk.  Perhaps there's some way to adjust that choice?)

And a bit of experimentation later: it seems that on Yosemite (and
probably earlier OS X versions), localhost maps to all three of these
addresses:
127.0.0.1
::1
fe80:1::1
Binding to 127.0.0.1 does not trigger the firewall popup.  Binding
to ::1 doesn't, either.  But binding to fe80:1::1 does.  So the
easy fix, for a default installation, is to keep the postmaster
from binding to that last address.

I'm not sufficiently up on my IPv6 to be sure exactly what that third
address does.  Perhaps it is a bug in the firewall logic that it
considers that address external?  If it *is* externally accessible,
what the heck is the OS doing including it in localhost?

(Not sure if it's relevant, but I've got IPv6 set to link-local only
in network preferences.)
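
For anyone wanting to check their own box, a small standalone getaddrinfo()
probe (generic POSIX code, not part of any patch; scope ids are not shown)
lists what localhost resolves to:

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

int
main(void)
{
	struct addrinfo hints, *res, *p;
	char		buf[INET6_ADDRSTRLEN];

	memset(&hints, 0, sizeof(hints));
	hints.ai_socktype = SOCK_STREAM;

	if (getaddrinfo("localhost", NULL, &hints, &res) != 0)
		return 1;

	for (p = res; p != NULL; p = p->ai_next)
	{
		void	   *addr;

		if (p->ai_family == AF_INET)
			addr = &((struct sockaddr_in *) p->ai_addr)->sin_addr;
		else
			addr = &((struct sockaddr_in6 *) p->ai_addr)->sin6_addr;
		printf("%s\n", inet_ntop(p->ai_family, addr, buf, sizeof(buf)));
	}
	freeaddrinfo(res);
	return 0;
}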

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How ugly would this be? (ALTER DATABASE)

2014-10-25 Thread Greg Stark
On 24 Oct 2014 20:28, Robert Haas robertmh...@gmail.com wrote:

 You could perhaps try to create a command that would move a schema
 between two databases in the same cluster.  It's fraught with
 practical difficulties because a single backend can't be connected to
 both databases at the same time, so how exactly do you make the
 required catalog changes all in a single transaction?  But if you
 imagine that you have an infinite pool of top-notch PostgreSQL talent
 with unbounded time to work on this problem and no other, I bet
 somebody could engineer a solution.

I think the bigger problem is the dependent objects. Things like data types
which might exist in both databases but with different oids.

If you simplify the problem to only handle tables and indexes, and only if
they use only system types and other such objects, then it seems doable, but
that creates a lot of pitfalls for users.

I would do it in three steps more like pg_upgrade. 1) copy the schema to
the new database and note the new oids and relfilenodes. 2) copy or move
the data files over. 3) drop the old schema. Each of those can be done in
separate transactions or even backends -- the intermediate states just have
some empty tables in one of the schemas.

It's not clear to me what state the databases should be in during step 2
though. A simple lock would not be sufficient. Perhaps there needs to be
something like indisvalid for tables. That might be handy for pg_bulkload
as well.


Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2014-10-25 Thread Thom Brown
On 25 October 2014 19:19, Thom Brown t...@linux.com wrote:

 On 25 October 2014 13:28, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Thom Brown wrote:
  On 24 October 2012 18:17, Tom Lane t...@sss.pgh.pa.us wrote:
 
   Marco Nenciarini marco.nenciar...@2ndquadrant.it writes:
Please find the attached refreshed patch (v2) which fixes the loose
 ends
you found.
  
   Attached is a v3 patch that updates the syntax per discussion, uses
 what
   seems to me to be a saner (more extensible) catalog representation,
 and
   contains assorted other code cleanup.  I have not touched the
   documentation at all except for catalogs.sgml, so it still explains
 the
   old syntax.  I have to stop working on this now, because I've already
   expended more time on it than I should, and it still has the serious
   problems mentioned in
  
 http://archives.postgresql.org/message-id/16787.1351053...@sss.pgh.pa.us
   and
  
 http://archives.postgresql.org/message-id/28389.1351094...@sss.pgh.pa.us
  
   I'm going to mark this Returned With Feedback for the current CF.
  
 
  Does anyone have any intention of resurrecting this at this stage?

 Not in this room.  Do you?


 I'm not qualified to, but I'm happy to make time to test it when it next
 gets picked up.  My email was really just bumping the topic.


I should mention that the latest patch no longer applies against git
master, so I can't test it in its current form.
-- 
Thom


[HACKERS] Possible problem with shm_mq spin lock

2014-10-25 Thread Haribabu Kommi
Hi Hackers,

I am thinking of a possible problem with the shm_mq structure's spin lock.
This is used for protecting the shm_mq structure.

If the process receives a SIGQUIT signal while executing code under the
spin lock, it can lead to a deadlock.

SIGQUIT -> proc_exit -> shm_mq_detach -> try to acquire the spin lock.
The spin lock has already been taken by the same process.

It is very difficult to reproduce the problem because the code under
the lock is very minimal.
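
To make the failure mode concrete, here is a self-contained sketch of the
pattern in plain C (an ordinary atomic flag, not PostgreSQL's s_lock
machinery); it hangs by design:

#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <unistd.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

static void
spin_acquire(void)
{
    while (atomic_flag_test_and_set(&lock))
        ;                           /* spin until the flag clears */
}

static void
on_sigquit(int signo)
{
    static const char msg[] = "handler: acquiring the lock...\n";

    (void) signo;
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    spin_acquire();                 /* never returns: the lock is already held */
}

int
main(void)
{
    signal(SIGQUIT, on_sigquit);

    spin_acquire();                 /* main code path takes the lock ... */
    raise(SIGQUIT);                 /* ... and is interrupted while holding it */
    return 0;                       /* never reached */
}

Since the lock is not recursive, the handler can never obtain it; the
process spins forever inside the handler.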
Please let me know if I missed anything.

Regards,
Hari Babu
Fujitsu Australia




[HACKERS] snapshot too large error when initializing logical replication (9.4)

2014-10-25 Thread Steve Singer
I sometimes get the error "snapshot too large" from my logical
replication walsender process in response to a CREATE_REPLICATION_SLOT.


This is in SnapBuildExportSnapshot in snapbuild.c

newxcnt is 212 at that point

I have max_connections = 200

procArray->maxProcs = 212

Should we be testing
newxcnt > GetMaxSnapshotXidCount()

instead of
newxcnt >= GetMaxSnapshotXidCount()
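
Which guard is right presumably hinges on how the xid array is sized; a
minimal sketch of the difference, in plain C with hypothetical names:

#include <stdio.h>

#define MAX_XIDS 212                /* stand-in for GetMaxSnapshotXidCount() */

int
main(void)
{
    unsigned int xids[MAX_XIDS];    /* valid slots are 0 .. MAX_XIDS-1 */
    int          newxcnt = 0;

    for (unsigned int xid = 1; xid <= 213; xid++)
    {
        /* With ">=", the 213th xid is rejected and the array never
         * overflows; with ">", xids[212] would be written one slot past
         * the end.  Relaxing ">=" to ">" is only safe if the array is
         * actually allocated with one extra slot. */
        if (newxcnt >= MAX_XIDS)
        {
            printf("snapshot too large at xid %u\n", xid);
            break;
        }
        xids[newxcnt++] = xid;
    }
    return 0;
}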








[HACKERS] logical decoding - reading a user catalog table

2014-10-25 Thread Steve Singer

My logical decoding plugin is occasionally getting this error:

could not resolve cmin/cmax of catalog tuple

I get this when my output plugin is trying to read one of the user-defined
catalog tables (user_catalog_table=true).



I am not sure if this is a bug in the time-travel support in the logical
decoding support, or if I'm just using it wrong (i.e. not taking a
sufficient lock on the relation or something).




This is the interesting part of the stack trace

#4  0x0091bbc8 in HeapTupleSatisfiesHistoricMVCC (htup=0x7fffcf42a900,
    snapshot=0x7f786ffe92d8, buffer=10568) at tqual.c:1631
#5  0x004aedf3 in heapgetpage (scan=0x28d7080, page=0) at heapam.c:399
#6  0x004b0182 in heapgettup_pagemode (scan=0x28d7080,
    dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:747
#7  0x004b1ba6 in heap_getnext (scan=0x28d7080,
    direction=ForwardScanDirection) at heapam.c:1475
#8  0x7f787002dbfb in lookupSlonyInfo (tableOid=91754, ctx=0x2826118,
    origin_id=0x7fffcf42ab8c, table_id=0x7fffcf42ab88,
    set_id=0x7fffcf42ab84) at slony_logical.c:663
#9  0x7f787002b7a3 in pg_decode_change (ctx=0x2826118, txn=0x28cbec0,
    relation=0x7f787a3446a8, change=0x7f786ffe3268) at slony_logical.c:237
#10 0x007497d4 in change_cb_wrapper (cache=0x28cbda8, txn=0x28cbec0,
    relation=0x7f787a3446a8, change=0x7f786ffe3268) at logical.c:704



Here is what the code in lookupSlonyInfo is doing
--

  /* look up Slony's sl_table in the slony schema and scan it with the
   * catalog snapshot */
  sltable_oid = get_relname_relid("sl_table", slony_namespace);

  sltable_rel = relation_open(sltable_oid, AccessShareLock);
  tupdesc = RelationGetDescr(sltable_rel);
  scandesc = heap_beginscan(sltable_rel,
                            GetCatalogSnapshot(sltable_oid), 0, NULL);

  reloid_attnum = get_attnum(sltable_oid, "tab_reloid");
  if (reloid_attnum == InvalidAttrNumber)
      elog(ERROR, "sl_table does not have a tab_reloid column");

  set_attnum = get_attnum(sltable_oid, "tab_set");
  if (set_attnum == InvalidAttrNumber)
      elog(ERROR, "sl_table does not have a tab_set column");

  tableid_attnum = get_attnum(sltable_oid, "tab_id");
  if (tableid_attnum == InvalidAttrNumber)
      elog(ERROR, "sl_table does not have a tab_id column");

  while ((tuple = heap_getnext(scandesc, ForwardScanDirection)))







Re: [HACKERS] Possible problem with shm_mq spin lock

2014-10-25 Thread Andres Freund
Hi,

On 2014-10-26 08:52:42 +1100, Haribabu Kommi wrote:
 I am thinking of a possible problem with the shm_mq structure's spin lock.
 This is used for protecting the shm_mq structure.

 If the process receives a SIGQUIT signal while executing code under the
 spin lock, it can lead to a deadlock.

 SIGQUIT -> proc_exit -> shm_mq_detach -> try to acquire the spin lock.
 The spin lock has already been taken by the same process.

 It is very difficult to reproduce the problem because the code under
 the lock is very minimal.
 Please let me know if I missed anything.

I think you missed the following bit in postgres.c:

/*
 * quickdie() occurs when signalled SIGQUIT by the postmaster.
 *
 * Some backend has bought the farm,
 * so we need to stop what we're doing and exit.
 */
void
quickdie(SIGNAL_ARGS)
{
...
	/*
	 * We DO NOT want to run proc_exit() callbacks -- we're here because
	 * shared memory may be corrupted, so we don't want to try to clean up our
	 * transaction.  Just nail the windows shut and get out of town.  Now that
	 * there's an atexit callback to prevent third-party code from breaking
	 * things by calling exit() directly, we have to reset the callbacks
	 * explicitly to make this work as intended.
	 */
	on_exit_reset();
..
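
The point of on_exit_reset() is easiest to see in miniature. A rough,
self-contained sketch of the idea (hypothetical names, not the actual
ipc.c code):

#include <stdio.h>

#define MAX_EXIT_CBS 20

typedef void (*exit_cb) (int code);

static exit_cb on_exit_list[MAX_EXIT_CBS];
static int     on_exit_index = 0;

static void
my_on_exit(exit_cb fn)
{
    if (on_exit_index < MAX_EXIT_CBS)
        on_exit_list[on_exit_index++] = fn;
}

static void
my_on_exit_reset(void)
{
    on_exit_index = 0;              /* forget every registered callback */
}

static void
my_shmem_exit(int code)
{
    /* run callbacks in reverse registration order, like proc_exit does */
    while (--on_exit_index >= 0)
        on_exit_list[on_exit_index](code);
}

static void
detach_cb(int code)
{
    printf("detach callback ran (code %d)\n", code);
}

int
main(void)
{
    my_on_exit(detach_cb);
    my_on_exit_reset();             /* the quickdie() path: nothing runs */
    my_shmem_exit(0);               /* prints nothing */

    my_on_exit(detach_cb);
    my_shmem_exit(0);               /* normal path: the callback runs */
    return 0;
}

The real registry lives in src/backend/storage/ipc/ipc.c.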

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Possible problem with shm_mq spin lock

2014-10-25 Thread Haribabu Kommi
On Sun, Oct 26, 2014 at 10:17 AM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2014-10-26 08:52:42 +1100, Haribabu Kommi wrote:
 I am thinking of a possible problem with the shm_mq structure's spin lock.
 This is used for protecting the shm_mq structure.

 If the process receives a SIGQUIT signal while executing code under the
 spin lock, it can lead to a deadlock.

 SIGQUIT -> proc_exit -> shm_mq_detach -> try to acquire the spin lock.
 The spin lock has already been taken by the same process.

 It is very difficult to reproduce the problem because the code under
 the lock is very minimal.
 Please let me know if I missed anything.

 I think you missed the following bit in postgres.c:

 /*
  * quickdie() occurs when signalled SIGQUIT by the postmaster.
  *
  * Some backend has bought the farm,
  * so we need to stop what we're doing and exit.
  */
 void
 quickdie(SIGNAL_ARGS)
 {
 ...
 	/*
 	 * We DO NOT want to run proc_exit() callbacks -- we're here because
 	 * shared memory may be corrupted, so we don't want to try to clean up our
 	 * transaction.  Just nail the windows shut and get out of town.  Now that
 	 * there's an atexit callback to prevent third-party code from breaking
 	 * things by calling exit() directly, we have to reset the callbacks
 	 * explicitly to make this work as intended.
 	 */
 	on_exit_reset();

Thanks for the details. I am sorry, it is not proc_exit; it is the exit
callback functions that can cause the problem.

The following is the call stack where the problem can happen, if the signal
handler is called after the spin lock is taken by the worker.

Breakpoint 1, 0x0072dd83 in shm_mq_detach ()
(gdb) bt
#0  0x0072dd83 in shm_mq_detach ()
#1  0x0072e7db in shm_mq_detach_callback ()
#2  0x00726d71 in dsm_detach ()
#3  0x00726c43 in dsm_backend_shutdown ()
#4  0x00727450 in shmem_exit ()
#5  0x007272fc in proc_exit_prepare ()
#6  0x00727501 in atexit_callback ()
#7  0x0030ff435da2 in exit () from /lib64/libc.so.6
#8  0x006ddaec in bgworker_quickdie ()
#9  signal handler called
#10 0x0072ce9a in shm_mq_sendv ()


Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Possible problem with shm_mq spin lock

2014-10-25 Thread Tom Lane
Haribabu Kommi kommi.harib...@gmail.com writes:
 Thanks for the details. I am sorry, it is not proc_exit; it is the exit
 callback functions that can cause the problem.

 The following is the call stack where the problem can happen, if the signal
 handler is called after the spin lock is taken by the worker.

 Breakpoint 1, 0x0072dd83 in shm_mq_detach ()
 (gdb) bt
 #0  0x0072dd83 in shm_mq_detach ()
 #1  0x0072e7db in shm_mq_detach_callback ()
 #2  0x00726d71 in dsm_detach ()
 #3  0x00726c43 in dsm_backend_shutdown ()
 #4  0x00727450 in shmem_exit ()
 #5  0x007272fc in proc_exit_prepare ()
 #6  0x00727501 in atexit_callback ()
 #7  0x0030ff435da2 in exit () from /lib64/libc.so.6
 #8  0x006ddaec in bgworker_quickdie ()

Or in other words, Robert broke it.  This control path should absolutely
not occur: the entire point of the on_exit_reset call in quickdie() is to
prevent any callbacks from being executed when we get to shmem_exit().
DSM-related functions DO NOT get an exemption.

regards, tom lane




Re: [HACKERS] Possible problem with shm_mq spin lock

2014-10-25 Thread Haribabu Kommi
On Sun, Oct 26, 2014 at 12:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Haribabu Kommi kommi.harib...@gmail.com writes:
 Thanks for the details. I am sorry, it is not proc_exit; it is the exit
 callback functions that can cause the problem.

 The following is the call stack where the problem can happen, if the signal
 handler is called after the spin lock is taken by the worker.

 Breakpoint 1, 0x0072dd83 in shm_mq_detach ()
 (gdb) bt
 #0  0x0072dd83 in shm_mq_detach ()
 #1  0x0072e7db in shm_mq_detach_callback ()
 #2  0x00726d71 in dsm_detach ()
 #3  0x00726c43 in dsm_backend_shutdown ()
 #4  0x00727450 in shmem_exit ()
 #5  0x007272fc in proc_exit_prepare ()
 #6  0x00727501 in atexit_callback ()
 #7  0x0030ff435da2 in exit () from /lib64/libc.so.6
 #8  0x006ddaec in bgworker_quickdie ()

 Or in other words, Robert broke it.  This control path should absolutely
 not occur: the entire point of the on_exit_reset call in quickdie() is to
 prevent any callbacks from being executed when we get to shmem_exit().
 DSM-related functions DO NOT get an exemption.

The reset_on_dsm_detach function is called to remove the DSM-related
callbacks. It's my mistake -- I am really sorry, the code I was using is
wrong. Sorry for the noise.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Getting rid of accept incoming network connections prompts on OS X

2014-10-25 Thread Peter Eisentraut
On 10/25/14 2:52 PM, Tom Lane wrote:
 And a bit of experimentation later: it seems that on Yosemite (and
 probably earlier OS X versions), localhost maps to all three of these
 addresses:
   127.0.0.1
   ::1
   fe80:1::1
 Binding to 127.0.0.1 does not trigger the firewall popup.  Binding
 to ::1 doesn't, either.  But binding to fe80:1::1 does.  So the
 easy fix, for a default installation, is to keep the postmaster
 from binding to that last address.
 
 I'm not sufficiently up on my IPv6 to be sure exactly what that third
 address does.  Perhaps it is a bug in the firewall logic that it
 considers that address external?

I think that's exactly it.  I have filed a bug with Apple about it.

For the time being, I think it's a reasonable workaround to comment out
the line in /etc/hosts.





Re: [HACKERS] Getting rid of accept incoming network connections prompts on OS X

2014-10-25 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 10/25/14 2:52 PM, Tom Lane wrote:
 And a bit of experimentation later: it seems that on Yosemite (and
 probably earlier OS X versions), localhost maps to all three of these
 addresses:
 127.0.0.1
 ::1
 fe80:1::1
 Binding to 127.0.0.1 does not trigger the firewall popup.  Binding
 to ::1 doesn't, either.  But binding to fe80:1::1 does.  So the
 easy fix, for a default installation, is to keep the postmaster
 from binding to that last address.
 
 I'm not sufficiently up on my IPv6 to be sure exactly what that third
 address does.  Perhaps it is a bug in the firewall logic that it
 considers that address external?

 I think that's exactly it.  I have filed a bug with Apple about it.

 For the time being, I think it's a reasonable workaround to comment out
 the line in /etc/hosts.

Hmm ... I was about to object that that seemed likely to break other
stuff, but on poking around I notice that my non-laptop Yosemite machine
has no such entry at all.  (That one was originally Mountain Lion, and was
upgraded to Mavericks and then Yosemite, whereas my laptop was Mavericks
on delivery.)  Even more interesting, there's a /etc/hosts~orig file on
my laptop that does not have the entry.

A little bit of data gathering later:
wife's laptop (10.9.5): has it
dromedary (10.6.8): has it
prairiedog (10.4.11): doesn't have it

So it looks like Apple has been using this for a while, but it's not really
essential to anything.  What worries me a bit is that the evidence on my
laptop suggests there may be code somewhere (like System Preferences) that
will edit the file, so that a manual removal might not stick.
We shall see.

regards, tom lane




Re: [HACKERS] PostgreSQL Service Name Enhancement - Wildcard support for LDAP/DNS lookup

2014-10-25 Thread Doyle, Bryan
Tom, 

I believe there are two main concerns that you raise, addressed below:

First:

 It needs to be a more constrained syntax.  
 One possibility is to insist that the wildcard be only a part
 of the name string, eg

 [myservers-%]
 host=%.domain.com
 port=5433
 user=admin

* This counter-proposal is getting closer to the more complex matching 
requirements that I was attempting to de-scope, but it should certainly be 
considered. I can see where someone may want to have a different LDAP/DNS 
domain or something like that through prefix convention, though we would likely 
want to restrict to one % symbol in the service definition, yes? I am fine 
with including this capability in the patch, provided the general [%] case is 
still supported (see below for expanded reasons).
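
To pin down what restricting the entry to a single % could mean
mechanically, here is a minimal, self-contained sketch in plain C (a
hypothetical helper, not anything that exists in libpq) that matches a
service name against such a pattern:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if 'name' matches 'pattern', where 'pattern' contains at
 * most one '%' standing for one or more arbitrary characters. */
static bool
service_pattern_match(const char *pattern, const char *name)
{
    const char *pct = strchr(pattern, '%');
    size_t      prefix_len, suffix_len, name_len;

    if (pct == NULL)
        return strcmp(pattern, name) == 0;      /* no wildcard: exact match */
    if (strchr(pct + 1, '%') != NULL)
        return false;                           /* reject a second '%' */

    prefix_len = (size_t) (pct - pattern);
    suffix_len = strlen(pct + 1);
    name_len = strlen(name);

    /* the '%' must cover at least one character (an assumption here) */
    if (name_len < prefix_len + suffix_len + 1)
        return false;

    return strncmp(name, pattern, prefix_len) == 0 &&
           strcmp(name + name_len - suffix_len, pct + 1) == 0;
}

int
main(void)
{
    printf("%d\n", service_pattern_match("myservers-%", "myservers-db1")); /* 1 */
    printf("%d\n", service_pattern_match("myservers-%", "other-db1"));     /* 0 */
    printf("%d\n", service_pattern_match("%", "anything"));                /* 1 */
    return 0;
}

Here '%' is required to consume at least one character; whether it should
also match the empty string is one of the behaviors that would need to be
nailed down.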

Second:

 since we check service names
 before other possibilities such as host/database names, the entry would
 then proceed to capture every possible connection request

* I should have explicitly covered the case where no service name is provided 
at connection time - if a service name is not specified in the connection 
string/connection parameters at all, I would propose that this wildcard entry 
not match (even if service names are processed first) and that normal 
processing proceed. As a comparable, the '%' in a LIKE statement doesn't 
match a NULL after all. I don't think having a blank replacement value would 
make much sense either. Please inform me if I am not addressing some part of 
your concern with this mindset.


Some additional comments:

* It is in fact desirable for us (and likely others) to capture all service 
names in one entry; I expect to utilize it exclusively once implemented. I 
would like to look up all service name entries from a single LDAP location if a 
previous entry in the file does not short circuit the match. To do this, I am 
explicitly looking to use a [%] entry in our service name file - the prefix 
requirement is not consistent with our environmental requirements. The service 
name file is a client construct that can be overridden by the caller if they 
desire, but keep in mind that this type of feature is targeted for 
controlled/managed environments.

* Adding additional processing logic for 'myserver-%' would only make this more 
flexible for other use cases and would certainly meet the goals of this 
proposal, so I am fine including it in scope if the [%] is also allowed per 
above. When these other wildcard w/ prefix entries can match above the general 
[%] entry, it could be compelling (see my first email regarding entry order 
considerations). Those not wanting the [%] could choose to not implement it and 
stick to something closer to the prefix approach you have in mind.


Summary of Open Questions:

* (From Above) For prefix wildcards, OK to restrict to one % replacement?

* Do the above points address initial concerns regarding service names being 
processed before host/db names?

* If both prefix/non-prefix are allowed, what should be the behavior for cases 
where [prefix-%] matches and fails to connect/lookup and then [%] is also 
located further down?
** Without additional discussion, I would assume that it would attempt a 
connection for consistency. Again, people can choose to not use both features 
together. Another option would be to somehow introduce a stop processing flag 
in the service entries on connection/lookup failures, which may be generally 
useful even when wildcards are not in use.


Thanks for the reply and your thoughts on this proposal so far. I am looking 
forward to the continued conversation.

Bryan




Re: [HACKERS] make pg_controldata accept -D dirname

2014-10-25 Thread Michael Paquier
On Sat, Oct 25, 2014 at 1:20 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Argh, looks like I forgot the actual code changes required.
 I just noticed that pg_controldata and pg_resetxlog don't check for extra
 arguments:
 $ pg_resetxlog data fds sdf sdf
 Transaction log reset
I think it would also be good to add a set of TAP tests for
all those things. So attached are two patches: one to complete the
tests of pg_controldata, and a second one to add TAP tests for
pg_resetxlog.
-- 
Michael
From b196d4159d7fce4f84222659a3b445c86cd2b5e8 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Sun, 26 Oct 2014 13:08:05 +0900
Subject: [PATCH 1/2] Add more TAP tests for pg_controldata

Those checks are related to extra arguments and the newly-introduced
option -D.
---
 src/bin/pg_controldata/t/001_pg_controldata.pl | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_controldata/t/001_pg_controldata.pl b/src/bin/pg_controldata/t/001_pg_controldata.pl
index 35ad10a..f9cdcec 100644
--- a/src/bin/pg_controldata/t/001_pg_controldata.pl
+++ b/src/bin/pg_controldata/t/001_pg_controldata.pl
@@ -1,7 +1,7 @@
 use strict;
 use warnings;
 use TestLib;
-use Test::More tests => 6;
+use Test::More tests => 10;
 
 my $tempdir = TestLib::tempdir;
 
@@ -11,6 +11,14 @@ program_options_handling_ok('pg_controldata');
 command_fails(['pg_controldata'], 'pg_controldata without arguments fails');
 command_fails([ 'pg_controldata', 'nonexistent' ],
 	'pg_controldata with nonexistent directory fails');
+command_fails([ 'pg_controldata', 'arg1', 'arg2', 'arg3' ],
+	'pg_controldata with too many arguments fails');
+command_fails([ 'pg_controldata', '-D' ],
+	'pg_controldata -D with no directory defined fails');
+command_fails([ 'pg_controldata', '-D', 'arg1', 'arg2', 'arg3' ],
+	'pg_controldata -D with too many arguments fails');
 system_or_bail "initdb -D '$tempdir'/data -A trust >/dev/null";
 command_like([ 'pg_controldata', "$tempdir/data" ],
 	qr/checkpoint/, 'pg_controldata produces output');
+command_like([ 'pg_controldata', '-D', "$tempdir/data" ],
+	qr/checkpoint/, 'pg_controldata produces output');
-- 
2.1.2

From 7d92cbfd0607fbf35902408e3a82bf2ca60bf71f Mon Sep 17 00:00:00 2001
From: Michael Paquier mpaqu...@otacoo.com
Date: Sat, 25 Oct 2014 08:14:37 +
Subject: [PATCH 2/2] Add TAP tests for pg_resetxlog

Those checks are related to extra arguments, the use of option -D with
a data directory as well as the generation of output for this utility.
---
 src/bin/pg_resetxlog/.gitignore            |  1 +
 src/bin/pg_resetxlog/Makefile              |  6 ++++++
 src/bin/pg_resetxlog/t/001_pg_resetxlog.pl | 24 ++++++++++++++++++++++++
 3 files changed, 31 insertions(+)
 create mode 100644 src/bin/pg_resetxlog/t/001_pg_resetxlog.pl

diff --git a/src/bin/pg_resetxlog/.gitignore b/src/bin/pg_resetxlog/.gitignore
index 6b84208..68d8890 100644
--- a/src/bin/pg_resetxlog/.gitignore
+++ b/src/bin/pg_resetxlog/.gitignore
@@ -1 +1,2 @@
 /pg_resetxlog
+/tmp_check/
diff --git a/src/bin/pg_resetxlog/Makefile b/src/bin/pg_resetxlog/Makefile
index 6610690..5ebf8f4 100644
--- a/src/bin/pg_resetxlog/Makefile
+++ b/src/bin/pg_resetxlog/Makefile
@@ -33,3 +33,9 @@ uninstall:
 
 clean distclean maintainer-clean:
 	rm -f pg_resetxlog$(X) $(OBJS)
+
+check: all
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_resetxlog/t/001_pg_resetxlog.pl b/src/bin/pg_resetxlog/t/001_pg_resetxlog.pl
new file mode 100644
index 000..4e6e0a7
--- /dev/null
+++ b/src/bin/pg_resetxlog/t/001_pg_resetxlog.pl
@@ -0,0 +1,24 @@
+use strict;
+use warnings;
+use TestLib;
+use Test::More tests => 10;
+
+my $tempdir = TestLib::tempdir;
+
+program_help_ok('pg_resetxlog');
+program_version_ok('pg_resetxlog');
+program_options_handling_ok('pg_resetxlog');
+command_fails(['pg_resetxlog'], 'pg_resetxlog without arguments fails');
+command_fails([ 'pg_resetxlog', 'nonexistent' ],
+	'pg_resetxlog with nonexistent directory fails');
+command_fails([ 'pg_resetxlog', 'arg1', 'arg2', 'arg3' ],
+	'pg_resetxlog with too many arguments fails');
+command_fails([ 'pg_resetxlog', '-D' ],
+	'pg_resetxlog -D with no directory defined fails');
+command_fails([ 'pg_resetxlog', '-D', 'arg1', 'arg2', 'arg3' ],
+	'pg_resetxlog -D with too many arguments fails');
+system_or_bail "initdb -D '$tempdir'/data -A trust >/dev/null";
+command_like([ 'pg_resetxlog', "$tempdir/data" ],
+	qr/Transaction/, 'pg_resetxlog produces output');
+command_like([ 'pg_resetxlog', '-D', "$tempdir/data" ],
+	qr/Transaction/, 'pg_resetxlog produces output');
-- 
2.1.2




Re: [HACKERS] Index scan optimization

2014-10-25 Thread Haribabu Kommi
On Tue, Sep 23, 2014 at 10:38 PM, Rajeev rastogi
rajeev.rast...@huawei.com wrote:
 On 22 September 2014 19:17, Heikki Linnakangas wrote:

 On 09/22/2014 04:45 PM, Tom Lane wrote:
  Heikki Linnakangas hlinnakan...@vmware.com writes:
  On 09/22/2014 07:47 AM, Rajeev rastogi wrote:
  So my proposal is to skip the condition check on the first scan key
 condition for every tuple.
 
  The same happens in a single-column case. If you have a query like
  SELECT * FROM tbl2 where id2 > 'a', once you've found the start
  position of the scan, you know that all the rows that follow match
 too.
 
  ... unless you're doing a backwards scan.

 Sure. And you have to still check for NULLs. Have to get the details
 right..

 I have finished implementing the discussed optimization.
 I got a performance improvement of around 30% on the schema and data shared
 in the earlier mail.

 I also tested the index scan cases where our optimization does not apply, and
 observed that there is no effect on those queries because of this change.

 Change details:
 I have added a new flag SK_BT_MATCHED as part of sk_flags (ScanKey
 structure); the value used for this is 0x0004, which was unused.
 Inside the function _bt_first, once we finish finding the start scan position
 based on the first key, I append the flag SK_BT_MATCHED to the first key.
 Then in the function _bt_checkkeys, during the key comparison, I check
 whether the key has the SK_BT_MATCHED flag set; if yes, there is no need for
 further comparison. But if the tuple has a NULL value, then even if this flag
 is set, we continue with the comparison (this handles Heikki's point about
 checking NULLs).
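
As a self-contained illustration of the skip (plain C with a stand-in
flag, not the actual btree code): once the start position for a condition
like "value > bound" has been located in sorted input, a forward scan can
mark the condition as matched and stop re-testing it for every row.

#include <stdbool.h>
#include <stdio.h>

#define FIRSTKEY_MATCHED 0x0004     /* stand-in for the SK_BT_MATCHED bit */

int
main(void)
{
    int  sorted[] = {1, 3, 5, 7, 9, 11};
    int  nrows = 6;
    int  bound = 4;
    int  flags = 0;
    int  comparisons = 0;

    for (int i = 0; i < nrows; i++)
    {
        bool match;

        if (flags & FIRSTKEY_MATCHED)
            match = true;           /* skip the comparison entirely */
        else
        {
            comparisons++;
            match = (sorted[i] > bound);
            if (match)
                flags |= FIRSTKEY_MATCHED;  /* set once at the start position */
        }

        if (match)
            printf("row %d qualifies\n", sorted[i]);
    }
    printf("comparisons performed: %d\n", comparisons);    /* 3, not 6 */
    return 0;
}

The NULL caveat above is why the real check must still fall through to a
full comparison when the indexed column is NULL.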

Hi,

I reviewed index scan optimization patch, the following are the observations.

- Patch applies cleanly.
- Compiles without warnings
- All regress tests are passed.

There is a good performance gain with the patch in almost all scenarios.

I have a question regarding the setting of the matched key flag: only the
first key is set as matched, even if we have multiple index conditions. Is
there any reason behind that?

If any volatile function is present in the index condition, the index
scan itself is not chosen. Is there any need to handle that case
similarly to NULLs?

Thanks for the patch.

Regards,
Hari Babu
Fujitsu Australia

